Masaryk University Faculty of Informatics

Accelerating Chemical Simulation through Model Modification

Doctoral Thesis

Jana Pazúriková

Brno, Fall 2017


Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Jana Pazúriková

Advisor: Luděk Matyska


Acknowledgement

The years of my doctoral studies have been filled with life lessons and people who helped me to get through them. Let's tackle those lessons first. Working on such a huge project for years, mostly alone, as I have, often can be (and sure was) overwhelming. I have fought self-doubt and the impostor syndrome almost every week and I have not won. However, if you open up and share, you find out that everyone has their doubts. And suddenly, you are not alone.

I have invested enormous time and energy into the research of one of the problems and the results did not correspond with the effort. I ended up with more problems to solve than I had at the beginning and little to show for years of focus. Science can be like that sometimes; even the most elegant solutions might not work despite reasonable expectations. Leaving behind so much of my work, so much of my time, has been the most challenging and at the same time the most freeing thing I have done. And I learned not to linger on my already spent time, just as I try not to linger on material things.

And then a side project turned out nicely and offered an unexpected opportunity leading to an elegant method. And I learned you should always keep a door open to another direction. After that I came across a problem that seemed too easy at first glance, not scientific enough. And I learned that if you dig deeper and do things properly, you can come up with a solution that surprises you with its simplicity. Just because it's science, it does not have to be complicated.

During this journey, several people supported me and helped me to get up and move forward again and again. Professor Matyska has often calmed my worries and offered a new insight. The scientists I have co-operated with, Aleš Křenek, Radka Svobodová and Vojtěch Spiwok, have always been supportive of me as a scientist-in-training and provided me with valuable feedback. My colleagues in the Laboratory of Advanced Network Technologies, Víťa, Pavel, Milan, Fila and Honza, have shared their experience and life lessons during our conversations in the kitchen.

I would not be where I am now without my family. Their love and care have shaped me as a person. And finally, Tomáš has stood right beside me the whole time as an infinite source of hugs that soothe my soul and lift my spirit.

This work was supported by the Czech Science Foundation (15-17269S) and LM2015047 Czech National Infrastructure for Biological Data. Computational resources were provided by the CESNET LM2015042 and the CERIT Scientific Cloud LM2015085, provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures”.


Abstract

Computer modeling and simulations have supported research in almost all scientific areas. Especially for life sciences, they offer new insights for thinking and new possibilities for experimenting. One of the most powerful computational methods available in simulations of chemical processes is molecular dynamics. In this type of simulation, particles move due to their potential and kinetic energy. Several models differing in scale and the level of approximation describe those particles and their potential energy, ranging from quantum to coarse-grained models.

Two main issues hamper the wider application of molecular dynamics simulations: the cost and the accuracy of models. High computational demands stem from the necessity of long simulation timescales, due to factors given either by the physical reality of our world or by the requirements put on scientific methods. Except for the quantum model of particles and interactions, all others approximate reality rather coarsely. Complex nonlinear behavior is replaced by a scalar constant or an empirical function, all to simulate faster and therefore reach longer timescales and larger systems. Computer science can contribute to solving these issues as their subproblems often relate to high performance computing, numerical methods, optimization problems, and others.

In this dissertation, we propose a methodology to deal with such demanding computations. Many of them share a common concept: the evaluation of a computationally expensive function in a loop. The usual approaches aim to accelerate the function, but they are beginning to hit their limits. We, on the contrary, turn our attention to the loop. First, we build a model of the problem that emphasizes the iterative character of the computation. Then we modify the execution of the loop in various ways that lead to the acceleration of the whole computation. We apply this methodology to three demanding computational problems from chemical simulations: two regarding the long timescales and one regarding the accuracy. We change the scheme of the loop to add another level of parallelization, we reduce the number of the function's evaluations by reusing calculations from previous iterations, or we omit iterations that do not contribute to the result.

In the first problem, we applied a parallel-in-time computation scheme to increase the scalability of molecular dynamics simulation. Even though we successfully performed the first such simulation of a biomolecular system, many issues need to be addressed before routine use. We offer their rigorous analysis supported with experiments.

In the second problem, we approximated the calculation of the mean square distance between many molecular structures, a demanding part of metadynamics. Metadynamics pushes the molecular dynamics simulation forward to quicken the occurrence of rare events. Our method manages to reduce its overhead significantly without a loss of accuracy.

In the third problem, we simplified the algorithm for the parametrization of atomic charges, basically an optimization problem. By systematic analysis, we found out that state-of-the-art approaches are needlessly complicated. We developed a method that reaches or surpasses their accuracy and runs faster.

The achieved results demonstrate that the proposed methodology focusing on the iterative evaluation of the bottleneck function can be successfully applied to difficult computational problems. The results also show that even mature computational problems like molecular dynamics simulations can benefit from a systematic application of state-of-the-art computer science.

Keywords

modelling, simulation, computational chemistry, approximation, optimization, acceleration

Contents

Acronyms 1

Glossary 2

1 Introduction 7
  1.1 Simulating Chemical Processes 7
    1.1.1 Accuracy of Models 9
    1.1.2 Computational Demands 11
  1.2 Motivation and Current Limits 13
  1.3 Acceleration and Its Difficulties 15
  1.4 Our Approach 16
    1.4.1 Structure of the Thesis 18

2 Parallel-in-Time Molecular Dynamics 19
  2.1 Problem 20
    2.1.1 Molecular Dynamics 20
    2.1.2 Parallel-in-Time Computation 24
    2.1.3 Motivation 29
    2.1.4 Problem Description and Solution Proposal 29
  2.2 Model of Computation 29
  2.3 Modified Model of Computation 31
  2.4 Analysis of Problematic Aspects 32
    2.4.1 Prototype and Setup of Experiments 33
    2.4.2 Differences between Gravitational and Electrostatic N-body Problem 37
    2.4.3 Application of Molecular Dynamics to Parallel-in-Time Integration Scheme 42
    2.4.4 Overview of Problematic Aspects 45
    2.4.5 Theoretical Speedup 46
  2.5 Conclusion 48
    2.5.1 Future Work 49

3 Approximation of Mean Square Distance Computations in Metadynamics 51
  3.1 Problem 52
    3.1.1 Metadynamics 52
    3.1.2 Motivation 59
    3.1.3 Problem Description and Solution Proposal 59
  3.2 Model of Computation 61
  3.3 Approximated Model of Computation 63
  3.4 Accuracy and Speedup 66
    3.4.1 Number of MSD Computations 66
    3.4.2 Implementation 67
    3.4.3 Datasets and Computational Details 68
    3.4.4 Theoretical Speedup 71
    3.4.5 Practical Speedup 73
    3.4.6 Accuracy Evaluation 79
  3.5 Conclusion 84
    3.5.1 Future Work 85

4 Simplified Optimization Method in Atomic Charges Parametrization 87
  4.1 Problem 88
    4.1.1 Atomic Charges 88
    4.1.2 Electronegativity Equalization Method and its Parametrization 91
    4.1.3 Motivation 92
    4.1.4 Problem Description and Solution Proposal 93
  4.2 Model of Computation 93
  4.3 Analysis of the Optimization Space and the Fitness Landscape 96
    4.3.1 Optimization Space 96
    4.3.2 Fitness Landscape 97
  4.4 Simplified Model of Computation 105
  4.5 Accuracy and Speedup 107
    4.5.1 Implementation 107
    4.5.2 Evaluation Methods 108
    4.5.3 Accuracy Evaluation 109
    4.5.4 Theoretical Speedup 111
    4.5.5 Practical Speedup and Scalability 112
  4.6 Conclusion 113
    4.6.1 Future Work 114

5 Conclusions 115

Bibliography 117

A Author’s Publications 145


List of Tables

2.1 List of datasets. 35
3.1 Percentage of computation time spent in metadynamics (MTD), CV evaluation (CV), distance computation (MSD), and rotation matrix computation (DSYEVR). 60
3.2 Values of the current-close structure distance threshold ε influence K, the average number of steps between the reassignment of the close structure. 72
3.3 Speedup for the whole simulation for various ε and thus various maximal speed-ups of MSD computations. 73
3.4 Comparison of original and close structure trajectories. 84
4.1 Locality-related measures for random sample with local search. 101
4.2 Average fitnesses of starting vectors and vectors after 100 local iterations. 102
4.3 Quality of Best Parameters Calculated by Guided Minimization. 109
4.4 How Many Solutions of Linear System (LS) Are Needed in Each Method? 112


List of Figures

1.1 Models of particles. 9
2.1 Computational flow of the parareal method. 26
2.2 Temperature fluctuations in SiT simulations of salt ions with coarse (G) and fine (F) function. 38
2.3 Temperature fluctuations in PiT simulations of salt ions. 39
2.4 Temperature fluctuations in SiT simulations of peptide in water with coarse (G) and fine (F) function. 40
2.5 Temperature fluctuations in PiT simulation of peptide in water. 40
2.6 Temperature fluctuations in SiT simulations of 27 water molecules with coarse (G) and fine (F) function. 41
2.7 Temperature fluctuations in PiT simulation of 27 water molecules. 41
2.8 Temperature fluctuations in SiT simulation of water. 43
2.9 Temperature fluctuations in the PiT water simulations with starting structure equilibrated by ℱ. 44
2.10 Temperature fluctuations in the PiT water simulations with starting structure equilibrated by 풢1. 44
2.11 Temperature fluctuations in the PiT protein in water simulations. 45
3.1 Filling potential energy minima with a bias potential. 54
3.2 The original method computes the distances to many reference structures in every step. 61
3.3 Close structure method expensively computes usually only the distance between the current and the close structure. 64
3.4 A ring of cyclooctane and its most common conformations. Taken from [5]. 68
3.5 Visualization of three-dimensional Isomap projections of reference structures generated by Brown et al. [28] 69
3.6 A molecule of Trp-cage. Taken from PDB [147]. 71
3.7 Speedup and strong scalability of cyclooctane simulations, the close structure method with ε = 0.01 nm². 75
3.8 Speedup and strong scalability of the Trp-cage simulations, the close structure method with ε = 0.01 nm². 76
3.9 Speedup and strong scalability of cyclooctane simulations, the close structure method with ε = 0.001 nm². 77
3.10 Speedup and strong scalability of the Trp-cage simulations, the close structure method with ε = 0.001 nm². 78
3.11 The values of collective variables for simulation with the original method (left) and with the close structure method (right) for MSD computation. 80
3.12 Visualizations of cyclooctane energy landscape explored by metadynamics simulations with the original method (A) and the close structure method (B), ε = 0.005 nm², for MSD computation. 81
3.13 Visualizations of Trp-cage energy landscapes explored by metadynamics simulations with the original method (A) and the close structure method (B), ε = 0.01 nm², for MSD computation. 82
4.1 Distribution of fitness values in R landscape for set1. 100
4.2 Distribution of fitness values in avg(RMSDa) landscape for set1. 100
4.3 The objective value in the process of local search. 103
4.4 The objective value in the process of local search. 104
4.5 Correlation of Ligandexpo's QM Charges and EEM Charges Trained on Set3. 110
4.6 Computation time for all *MIN methods. 113

List of Algorithms

1 Molecular dynamics simulation 21
2 Computation of forces 22
3 Model of molecular dynamics 30
4 Parallel-in-time model of molecular dynamics 31
5 Molecular dynamics simulation enhanced with metadynamics 55
6 Molecular dynamics simulation enhanced with metadynamics and property map collective variables 58
7 Mean square distance computation between the current and the reference structure 58
8 The model of computing property map collective variables 62
9 Approximated model of computing the collective variables with the close structure 65
10 Algorithm for EEM Parametrization with DE(MIN) or GA(MIN) 95
11 The main loop of calibration method 96
12 Guided minimization algorithm (GDMIN) 106


Acronyms

CV collective variable.

DE differential evolution.

EEM Electronegativity Equalization Method.

GA genetic algorithm.

MD molecular dynamics.

MM molecular mechanics.

MSD mean square distance/deviation.

MTD metadynamics.

NL neighborlist.

PDB Protein Data Bank.

PiT parallel-in-time.

QM quantum mechanics.

RMSD root mean square deviation.

SiT sequential-in-time.


Glossary

atomic charge  A real number representing the contribution of electrons and their charge, related to one atom.

blow-up  The state with extremely large force that causes the failure of the integrator. The reasons the system gets into such a state include insufficient prior energy minimization, large timestep, inappropriate pressure or temperature control, unsuitable constraints, and more [62].

bonded interactions  Interactions between bonded atoms; they include interactions due to bond, angle, and torsion angle.

collective variable  A degree of freedom of the simulated system that encapsulates its relevant motions, used in many enhanced sampling methods.

computation time  The sum of time durations of the algorithm run as measured by the clocks of all processors that took part.

conformation  A distinct 3D structure of a protein in a local minimum or in a saddle point of the energy surface.

energy landscape  A multidimensional plane that describes the total energy of the system, including both its potential and kinetic energy, with respect to the atoms' coordinates.

enhanced sampling method  A method solving the sampling problem.

explicit solvent  The solvent is represented by explicit atoms of solvent molecules that take part in the calculation in the same way as atoms of the molecule of interest.

force field  A set of parameters and equations that empirically describes the interactions of atoms.

fs  femtosecond, 10⁻¹⁵ s.

implicit solvent  The solvent is not explicitly represented by its molecules; its effects are included in the parameters of the system.

long-range interactions  Interactions that range to long distances; in the context of this work this means electrostatic interactions.

metadynamics  An enhanced sampling method that adds an artificial bias to the potential and thus disfavours already visited states.

molecular dynamics  A method of computational chemistry that calculates the movements of atoms caused by their interactions; this work deals with molecular dynamics simulations with the molecular mechanics model.

molecular mechanics  A model that considers atoms as undivided points of mass; it describes their interactions with empirical functions and does not explicitly account for electrons or quantum effects.

molecule of interest  A molecule of biological or chemical interest, e. g., a protein, a virus, a nanotube.

non-bonded interactions  Interactions between non-bonded atoms; they include van der Waals interactions and electrostatic interactions, and are usually computed only between pairs of atoms within the given distance radius.

potential energy  Also called potential; a function of atoms' coordinates that describes the interatomic interactions.

protein folding  A biochemical process in which a protein folds, i. e., changes its 3D structure in fluent movement; often simulated with molecular dynamics.

sampling problem  The energy surface usually has many local minima split by energy barriers that are crossed only with some probability; it can take quite long for a molecular dynamics simulation to cover all minima, i. e., to sample the space of the energy surface, hence the sampling problem.

short-range interactions  Interactions that range to short distances; they include all bonded interactions and van der Waals interactions.

simulation time  The time duration of the simulated process; often the number of steps × the length of a single timestep.

solvent  Usually molecules of water and salt ions that surround the molecule of interest in the simulation.

strong scaling  A function that maps the increasing number of computational resources that solve a problem of fixed size to the wallclock time of the computation; it shows how the wallclock time (ideally) decreases as the resources grow while the problem size remains the same.

wallclock time  The time duration of the algorithm run as measured by the wall clock.

weak scaling  A function that maps the increasing number of computational resources that solve a problem of increasing size to the wallclock time of the computation; it shows how the wallclock time ideally remains the same as the resources grow and the problem size grows.


1 Introduction

This thesis deals with the acceleration of chemical simulations by modifying the model of their computation. In this chapter, we explain all mentioned concepts: chemical simulations, their acceleration, and our methodology for approaching this task. First, we introduce simulations of chemical processes and their main issues, including the requirement for acceleration and increasing accuracy. Second, we outline the limits of the current state of the art and explain the motivation to push them further. Third, we list common approaches for acceleration, show the potential contribution of computer science and describe the related difficulties. And finally, we propose our methodology for approaching demanding computational problems and we present three problems of chemical simulations that we want to accelerate by applying the methodology.

1.1 Simulating Chemical Processes

Computer modeling and simulation have proven to be immensely valuable to science and research. They enable more complex models and, through fast computation, push the spatial and temporal limits of simulations. With the increasing number of computational resources and the growing amount of data, they have changed the techniques of research in many areas of science. A lot of researchers claim they have become the third pillar of science, alongside theory and experimentation.

With the routine use of these techniques, new areas of research have emerged. Especially life sciences benefit from a new way of looking at things. Computer simulations make it easier to scale down and observe processes closely than would be possible through experiments. They make it possible to simulate events we could not witness due to technology constraints, danger, high cost, or long timescale. And in addition to that, they can be arbitrarily modified and played with. Many of these artificial additions and what-if experiments have proven to be helpful.

Computational chemistry, one of the rather novel scientific fields, focuses on computational ways of researching chemical structures and their behavior [106, 82]. Chemical processes are caused by underlying physical processes and these can be expressed by mathematics. Their simulations are therefore usually based on equations. However, only the level of the quantum mechanics (QM) model describes the reality of the world quite accurately (although not perfectly, e. g., because of numerical, not analytical methods for solving equations); anything above this scale has to be approximated to a smaller or greater extent. Nevertheless, despite the inaccuracy, these models are highly beneficial.

Chemical simulations can be static (e. g., computing some property of a molecule from its structure) or dynamic (observing a specific process or computing a macroscopic property of the process). Sometimes, the models themselves include components that have to be computed. For example, the model of molecular mechanics (MM) that approximates atoms as charged mass points works with the theoretical concept of partial atomic charge¹. These charges are not measured, they are computed [66].

Computer scientists can find challenging problems in computational chemistry, regarding for example high performance computing, numerical methods, optimization problems, or graph theory. Moreover, problems of computational chemistry sometimes have their analogies in other computational sciences, physics being the closest field.

One of the most powerful tools for studying chemical processes is molecular dynamics (MD) [82, 41]. This well-known computational method simulates the movements of particles over time caused by their interactions (potential energy) and the influence of the environment (kinetic energy). A continuous process is naturally discretized with the integration scheme, usually the velocity Verlet method. Within each timestep, the potential energy and the forces it determines are calculated. Particles are then moved to new positions using Newton's second law. The kinetic energy is introduced to the system during the preparation, when it is heated to the required temperature. During the simulation, the thermostat and the barostat often control the environment and keep the conditions around the target values.

1. We explain terms from chemistry as they come. Moreover, the most important ones are listed in the Glossary at the beginning of the thesis.


Figure 1.1: Models of particles. From left to right, quantum mechanics model, molecular mechanics model and coarse-grained model. Taken from http://www.riken.jp/csrp-mol/research.html in September 2017.

By observing the movement of particles over time, scientists can study processes such as protein folding, enzyme reactions, the passing of an object through a membrane, and many others. Several models describe what the particles are and how they interact at various levels of approximation, ranging from quantum models to the coarse-grained view. The applicability and usefulness of molecular dynamics simulations have been proven many times; details follow in section 1.2. Even more common use is hampered by two main reasons: high cost and problems with the models' accuracy. We elaborate on both in the next sections.

1.1.1 Accuracy of Models

As mentioned before, the models describing the particles and their interactions in molecular dynamics simulations differ in the level of approximation, coarsening the accuracy while moving up the scale, as shown in Figure 1.1.

The quantum mechanics (QM) model considers all subatomic particles (electrons, protons, neutrons) and their behavior is described by Schrödinger's equation [103]. Although even this model includes some approximations (e. g., the Born-Oppenheimer approximation or methods approximating the wave function), it is considered to describe reality quite accurately [106].


In the molecular mechanics (MM) model we get to a higher level of granularity: particles are atoms, and their behavior is determined by the kinetic energy given by temperature and the potential energy described by the force field. This compact set of empirical functions, parameters, and constants approximates the interatomic interactions [82]. However, during the design and development of a force field, researchers often focus only on one type of molecules. They tune the constants and parameters for the highest accuracy with molecules of this particular type, typically proteins, due to their importance and the widespread interest in their behavior [110]. Because of this limited focus, the accuracy of the force field for molecules of other types is lower. Small molecules, including drugs, often suffer from this issue [70, 214]. However, the force field accuracy for proteins also has its limitations [154, 151, 135].

Moreover, atoms in the molecular mechanics model are represented as charged mass points. The partial atomic charge replaces the contribution of the electrons which are omitted in this model. The charge of an atom depends on the structure of the molecule and the atom's position within it, therefore it slightly differs in each conformation. Since this is a theoretical concept, the partial charges cannot be measured, only computed. The quantum mechanics model offers a way for their calculation, which is considered to be accurate, but extremely expensive [106]. Therefore, cheaper empirical methods have been developed to approximate the quantum charges with physical laws and trained parameters. However, this adds another approximation to the chemical simulations that employ atomic charges. We focus on atomic charges in chapter 4.

Furthermore, the simulated system needs to be carefully prepared. Usually, it undergoes the following procedure. The structure of the molecule is selected, typically from a database such as the Protein Data Bank [147]. Sometimes, the structure might be incorrect (missing atoms) or the force field parameters might be missing for nonstandard elements in some parts of molecules. When everything is correct, the structure undergoes energy minimization, the setting of simulation parameters (e. g., periodic boundary conditions), the addition of solvent and ions to approximate water or the internal cell environment, and further minimization. Heating up the system, usually to 300 K, introduces the kinetic energy. The system then has to be equilibrated, with temperature, potential energy, pressure or volume all fluctuating around the desired values. Molecules of the solvent should freely move around the molecule of interest. Only after this careful preparation can we perform the production simulation. Otherwise, numerical instabilities will occur during the simulation, or it will not correspond to the experiments.

Hybrid QM-MM models combine quantum and molecular mechanics [209]. The parts of the simulated system that are of major interest are simulated with quantum mechanics and the rest is approximated by the molecular mechanics model. Therefore, these simulations combine the accuracy of quantum models with the performance of molecular mechanics, which requires an order of magnitude fewer operations for the evaluation of interactions.

Coarse-grained models shift the approximation even further, modeling several atoms together as a bead [6]. However, this makes it possible to simulate timescales orders of magnitude longer than with molecular mechanics.

1.1.2 Computational Demands

Computational demands of molecular dynamics simulations depend on three factors: the complexity of the model (complicated interactions mean demanding computation), the size of the simulated system (more particles mean more interactions), and the length of the simulation (longer simulation time means longer computation time).

The first two factors are intertwined, as the model determines what is considered a particle. The same molecule can be simulated at the level of protons and electrons, atoms as mass points, or beads including several atoms [103]. In the molecular mechanics model, the evaluation of the electrostatic interactions takes the most time. As these interactions range to large distances, they need to be calculated between (almost) every pair of atoms in the system. Therefore, the complexity of a naive algorithm grows with the square of the number of atoms in the system. Current methods that approximate electrostatic interactions have managed to decrease that, but this part of the potential energy computation still remains a bottleneck.

The third factor, simulation timescales, causes the most challenging complications. Its issues have their foundation in the tiny timestep of the integration scheme. In a simulation with the molecular mechanics model, the integration scheme has to capture the fastest movements between atoms—oscillations of X-H bonds, where H is a hydrogen atom and X is an atom of an arbitrary element. Due to the frequency of this movement, 10¹⁴ s⁻¹, the timestep needs to be in the order of femtoseconds, i. e., 10⁻¹⁵ s. This means that a microsecond simulation contains a billion steps.

Moreover, the simulation timescales usually grow for chemically or biologically relevant simulations. We state the following four reasons for that; they stem from the physical reality of the world or from the requirements put on MD as a scientific method.

First, the chemical process itself can take long, for example, the passing of a virus through the cell's membrane.

Second, the chemical process includes or is triggered by a rare event, for example, a change in the conformation of a protein. Every molecule's behavior depends on its potential and kinetic energies. The energy landscape E(x), a function of the atoms' coordinates determining their energy, includes energy minima and barriers between them in multidimensional space. Naturally, the molecule tends to stay in an energy minimum or move towards it. Any crossing of an energy barrier happens only rarely, sometimes only under specific conditions (e. g., higher temperature or the presence of a catalyst), with the probability inversely proportional to the height of the barrier. Thus, the rarer the occurrence of the relevant event, the longer the simulation necessary to capture it.

Third, the size of the molecular system influences the required length of the simulation. Generally, large systems need more time for significant changes to manifest. For example, at the scale of a whole cell, nothing obvious can happen during one picosecond. Microseconds, milliseconds or even whole seconds of simulation time would be necessary to observe some changes. Therefore, the larger the system, the longer the simulation.

The fourth reason stems from the characteristics demanded of scientific methods. In order to draw general conclusions from the simulation of a certain process, the simulation has to include all the relevant behavior. That means it has to explore the whole energy landscape and visit all states of the molecule, ideally even more than once to capture the probability of crossing from one minimum to another. A molecule with complex behavior has more energy minima and the probabilities of traversing between them are not the same; some crossings might be preferred (due to a lower barrier or a higher concentration of a catalyst that needs to be present). Again, the more energy minima and the more heterogeneity between them, the longer the simulation.

It is difficult, if not impossible, to overcome these four reasons for long simulation timescales without jeopardizing the accuracy and scientific soundness of the simulation. The relation between the size of the system and its simulation's timescale has been addressed by increasing the level of the model's approximation: shifting to coarse-grained models, as these decrease the number of particles in the system. Some development has been done regarding the second reason; several methods accelerate the occurrence of rare events by artificial interventions. One of them, metadynamics (MTD), adds an artificial bias into the simulation and thus pushes the atoms in the required direction. The right direction is described by several variables, such as the distance between particular atoms. Even though the trajectory of a simulation enhanced with metadynamics does not correspond with reality, most of the important properties (such as probabilities of various conformations) can be calculated from it. We focus on metadynamics in chapter 3.

Overall, the necessity of long simulation timescales determines the computational demands of molecular dynamics to a greater extent than the size of the system. Moreover, it is more difficult to overcome.
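To make the timestep arithmetic above concrete, here is a back-of-the-envelope sketch in Python; the throughput of 100 ns of simulation per day is an assumed, purely illustrative figure, not a measurement from this thesis.

fs = 1e-15                    # one femtosecond in seconds
timestep = 1 * fs             # integration timestep on the order of 1 fs
simulated = 1e-6              # target: one microsecond of simulation time

steps = simulated / timestep  # 1e9 integration steps, "a billion steps"

ns_per_day = 100              # assumed throughput, illustrative only
days = (simulated / 1e-9) / ns_per_day
print(f"{steps:.0e} steps, about {days:.0f} days at {ns_per_day} ns/day")

Even under such an optimistic throughput, the wallclock time is dominated by the sheer number of iterations of the force-evaluation loop, which is exactly the loop our methodology targets.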

1.2 Motivation and Current Limits

To justify the motivation for the acceleration of chemical simulations, we list several interesting processes simulated in recent years in atomic resolution. Then we show the limits of the current state of the art in terms of the size of the simulated system and the timescale.

Researchers from life sciences used molecular dynamics simulations with the molecular mechanics model to study various interesting processes, for example:

∙ confirming the structure of HIV capsid [216],

∙ examining the behavior of protein connected to Alzheimer’s disease [14, 104],

13 1. Introduction

∙ explaining the properties of a nanomaterial [86, 35, 17],

∙ predicting the influence of genes to later cancer development [95],

∙ investigating the delivery of anticancer drugs [85],

∙ exploring the conformations of B-DNA [141],

∙ observing folding of RNA [37],

∙ analyzing the drug resistance mechanism of hepatitis C [138].

These simulations made it possible to observe and analyze various processes closely and thus facilitate research in life and materials sciences. This benefit does not diminish for larger systems or longer timescales; actually, quite the opposite. Larger and longer all-atom simulations might shed some light on even more relevant processes, which are currently researched through coarse-grained models.

As explained in section 1.1.2, the necessity for long simulation timescales is the essential problem of these simulations, more pressing than the growing size of the simulated systems due to the problematic strong scaling. The only way to reach longer timescales is to accelerate the simulation (how fast the interesting parts happen) or its computation (how much time it takes to finish the calculation), naturally without losing accuracy and correctness.

Currently, the largest all-atom simulations consisted of a billion atoms [180, 172], although their timescales reached only thousands of picoseconds. The longest simulations reached several microseconds [102, 73, 165, 206, 150, 45], or even a millisecond scale [107, 177], sometimes with the help of enhanced sampling methods that “push the simulation forward” [132, 181, 149] or by many short simulations connected with Markov models [181, 176]. Martinez-Rosell et al. [116] predict that a one-second simulation timescale will be reached before 2022. In order to reach this timescale, several issues need to be addressed [136, 46]. Computer science can contribute to the solution, as we aim to show in this dissertation.

1.3 Acceleration and Its Difficulties

The approaches aiming to accelerate chemical simulations or their related computations can be divided into the following three categories, loosely ordered by the level of abstraction: acceleration through the approximation of the model of the problem, through the modification of the model of the computation, or through the modification of the implementation. We explain all of them on two examples: (1) the calculation of electrostatic interactions between atoms and (2) the calculation of atomic charges.

We can accelerate through the approximation of the model of the chemical concept at hand. The development of these approximations is an interdisciplinary task, requiring expertise from computational chemistry, often mathematics, and to a smaller extent also computer science. For example (1), the exact calculation of electrostatic interactions includes the evaluation of Coulomb's potential over all pairs of atoms within the system. Several methods have been developed to approximate this expensive pair-wise calculation, some even preceding the era of computers [51]. For example (2), the exact calculation of atomic charges is based on quantum mechanics and thus extremely computationally demanding. Several empirical methods have been developed to approximate it with parameters [66].

We can accelerate the simulations through the modification of the model of the computation. In this case, the changes do not relate to what we compute but to how we compute it. Clearly, computer science can contribute at this level with various decomposition techniques, parallel and distributed execution, problem representations, numerical methods and approximations, and many others. For example (1), the fast multipole method for the calculation of electrostatic interactions has been executed on a massively parallel supercomputer [4]. For example (2), many empirical methods for atomic charges parametrization are basically optimization problems, a class of problems computer science knows well how to solve.

Parallel execution of simulations suffers from problematic strong scaling [136]: with a fixed-size system and an increasing number of resources, the computation time stops decreasing after the saturation of the spatial decomposition. On the contrary, the weak scaling of molecular dynamics simulations is almost ideal: with a system of increasing size and a growing number of resources, the computation time remains almost the same.

And finally, we can accelerate through the modification of the implementation. This includes namely code optimizations, vectorizations, and offloading to GPUs, many-core processors, or specialized hardware. For example (1), many methods for electrostatic calculation have been implemented for GPUs [2]. And for example (2), empirical methods for atomic charges parametrization include the solving of linear systems, a routine highly optimized in libraries such as BLAS.

All these approaches share common difficulties given by the characteristics of chemical processes or their simulations. Due to the chaotic character of these processes, there is not a single correct trajectory of atoms; e. g., the folding of a protein looks a bit different each time. Molecular dynamics simulations share this characteristic: even the tiniest changes in the input cause significant changes in the output. This obviously presents a challenge for testing. The evaluation of correctness and accuracy often has to be indirect: we calculate some macroscopic property and compare it to reference simulations or to experimental results. Moreover, this type of simulation is highly sensitive to any sudden changes in the otherwise contiguous process. These can be caused by numerical instabilities (due to errors in the molecular structure, charges, force field, insufficient minimization or equilibration, too long a timestep of the integrator, inappropriate parameters of the simulation, and others) or by any artificial interventions in the simulation. Either way, they can cause a blow-up [62]: the failure of the integrator due to extremely large forces as two atoms get too far apart or too close to each other.

1.4 Our Approach

In this thesis, we propose a methodology to deal with the acceleration of chemical (and other) simulations by modifying the model of their computation. Many of these models of computation follow the same structure: the evaluation of a demanding function in a loop². Usually, the acceleration efforts focus on speeding up the bottleneck function by changes in the code, new faster methods, or parallel execution with spatial decomposition.

2. The evaluation of the function creates a bottleneck in the computation, thus we often call it the bottleneck function.


However, these approaches are reaching their limits, especially with highly optimized codes and a growing number of computational resources. Quite often, it is extremely difficult, if not impossible, to further accelerate the bottleneck function.

Therefore, we propose to focus on the loop. First, we build a model of the problem that emphasizes the iterative evaluation of the bottleneck function. That makes it easier to look at the character of the computation and to look for approximations of the problem's computation, not approximations of the problem itself. Then we modify the execution of the loop in a way that provides some acceleration of the whole computation. We apply the modified model back to the problem and demonstrate the preserved accuracy while speeding up the computation, or increased accuracy while not slowing down.

We have applied this methodology to three demanding computational problems from the area of chemical simulations. Two deal with long timescales, the last one deals with the accuracy of the molecular mechanics model. In each problem, we employed a different modification of the loop execution: first, we reframed the loop in a way that increases the scalability; second, we approximated the bottleneck function by reusing calculations from previous iterations; and third, we omitted some iterations due to their redundancy.

The first problem touches on the problematic strong scaling of long molecular dynamics simulations. They calculate the computationally demanding interatomic potential in a loop over many timesteps. By modifying this loop with a parallel-in-time computation scheme, we can shift the point of saturation further and efficiently use massively parallel computational resources.

The second problem investigates another way of dealing with long simulation timescales. Metadynamics, an enhanced sampling method, “pushes the simulation forward” by adding artificial energy. This way, it accelerates the occurrence of rare events. However, it sometimes brings quite a substantial cost. With a specific setting of metadynamics, a sum of computationally demanding mean square distances between the simulated molecular structure and a large set of reference structures has to be computed in every step of the simulation. By reusing the calculations from the previous timesteps, we can approximate the calculation and thus reduce the number of bottleneck function calls.


The third problem deals with the accuracy of the molecular mechanics model. Partial atomic charges can be calculated by empirical methods instead of expensive quantum-based methods. However, parameters have to be obtained by optimization over a training set. During this parametrization, in each step of the optimization method, a large number of linear systems has to be solved. By a rigorous analysis of the optimization problem and the fitness landscape, we reduced the number of iterations by removing those that could not contribute to the result.

Overall, our methodology has provided a useful viewpoint on demanding computational problems. Furthermore, we show that computer science and its state-of-the-art methods can contribute to the solution of complex problems from other research fields, well beyond the issues regarding the implementation of the code.
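To make the shared pattern concrete, the following sketch (all names are ours, not taken from the thesis implementations) shows the generic model of an expensive function evaluated in a loop, together with miniature versions of two of the loop modifications: reusing a previous expensive result while it remains valid (the second problem) and skipping iterations that cannot contribute (the third problem). The parallel-in-time restructuring used in the first problem is sketched later, in section 2.1.2.

def run_original(state, n_steps, expensive_f):
    """Baseline pattern: the bottleneck function is evaluated in every iteration."""
    for _ in range(n_steps):
        state = expensive_f(state)
    return state

def run_with_reuse(state, n_steps, expensive_f, cheap_step, needs_refresh):
    """Second-problem flavour: keep an expensive result and reuse it while it stays valid."""
    reference = expensive_f(state)
    for _ in range(n_steps):
        if needs_refresh(state, reference):
            reference = expensive_f(state)     # recompute only occasionally
        state = cheap_step(state, reference)   # cheap update based on the cached result
    return state

def run_with_skipping(candidates, expensive_f, contributes, result=0.0):
    """Third-problem flavour: evaluate the bottleneck only where it can affect the result."""
    for c in candidates:
        if contributes(c):
            result += expensive_f(c)
    return result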

1.4.1 Structure of the Thesis

This thesis continues as follows. The following chapters focus on the three problems of chemical simulations to which we have contributed with our methodology. Each chapter follows a similar structure. After setting the background and defining the problem, we build the model of computation emphasizing the evaluation of the bottleneck function in a loop. Then we explain how we modify the model in a way that accelerates the whole computation. And finally, we evaluate the speed-up and accuracy of our modified model through experiments. The final chapter of this thesis summarizes our main accomplishments in solving each problem and explains the contributions regarding computer science.

2 Parallel-in-Time Molecular Dynamics

In the first problem we have researched, we applied a parallel-in-time computation scheme to the molecular dynamics algorithm.

Molecular dynamics (MD) simulates the movements of particles caused by their potential and kinetic energy. These simulations of chemical processes require long timescales due to the character of these processes, high barriers in their energy landscape, the system's size, or the complexity of behavior, as explained in section 1.1.2. However, achieving such timescales is not easy. Parallel execution through spatial domain decomposition improves mostly the weak scaling, while the strong scaling is of higher importance here. Using massively parallel supercomputers, it happens more and more often that the spatial domain is saturated—it is not efficient to decompose the task further to more cores because the overhead completely cancels out or even exceeds the performance gain of the parallelization.

Parallel-in-space computation focuses on the acceleration of the execution of a single step. Long MD simulations can take millions or billions of steps due to the tiny timestep of the integration scheme. Parallel computation along this loop might bring additional speed-up. Decomposition of the temporal domain is a bit unintuitive and much more complicated. However, 50 years of research in parallel-in-time approaches in mathematics and numerous successful applications in other computational problems suggest it is possible and fulfills the aim of efficiently using the extensive computational power available nowadays.

In this work, we apply the parallel-in-time method PFASST [48] to molecular dynamics simulations of biomolecular systems, which is, to the best of our knowledge, the very first such attempt in this domain. We systematically analyze the issues that prevent a straightforward application of PFASST to MD code.

This chapter continues as follows. First, we introduce the area of molecular dynamics simulations and the existing approaches to its acceleration. Second, we present methods for parallel-in-time computation and their applications in many computational problems, with a focus on molecular dynamics simulations. Third, we build a model of the molecular dynamics algorithm that emphasizes the bottleneck function and its evaluation in many iterations. Fourth, we modify this model with the parallel-in-time scheme. Fifth, we demonstrate the behavior of parallel-in-time molecular dynamics on simulations of four molecular systems, including one biomolecular system. We elaborate on various aspects and issues that arise. And finally, we conclude with suggestions for possible directions of future research in this area.

2.1 Problem

In this section, we introduce simulations of molecular dynamics and focus on one of their major problems: the necessity of long timescales. We present known methods that deal with this problem and we introduce approaches applied in other scientific fields which served as an inspiration for our proposed solution, such as parallel-in-time computation.

2.1.1 Molecular Dynamics

Molecular dynamics is a deterministic, chaotic, continuous, dynamic, equation-based, widely-used computational method. It represents an electrostatic N-body problem: the simulation deals with the interactions between N particles (electrons, protons, atoms, beads). The particles move according to Newton's second law of motion due to the forces determined by the interactions and the influence of the environment. In this work, when we talk about molecular dynamics simulations, we always mean MD with the molecular mechanics model, if not specified otherwise. That means the particles are atoms and their interactions are described by a set of empirical functions, parameters and constants, together called a force field.

The simulation of molecular dynamics repeats two main steps in a loop, see Algorithm 1, line 3. In the first step (Algorithm 1, line 4), the potential energy of the system is calculated as a function of the topology and the current positions of atoms x. The force field approximates the value of the system's energy with empirical functions for each type of interaction, see Algorithm 2: bonded interactions consisting of bond (lines 2-3) and angle (lines 4-7) oscillations, nonbonded short-range van der Waals interactions (lines 8-9) and nonbonded long-range electrostatic interactions (lines 10-11). Various force fields [39, 111, 68, 169] differ in their focus; their developers have obtained (computationally or experimentally) parameters and constants for a specific type of molecules, typically proteins.

In the second step (Algorithm 1, line 7), the forces f (determined by the energy) are integrated with Newton's second law of motion in the velocity Verlet integration scheme [196] to obtain the positions of atoms in the next step.

The atoms tend to increase or decrease their velocity spontaneously. This is caused by the transformation of kinetic energy to potential energy (e. g., to trigger a conformation change) or by the error of the integration scheme (relative force error around 1% [117]). The velocities determine the instantaneous temperature of the system. Ideally, the average temperature should remain constant even though the instantaneous one naturally fluctuates; however, the numerical error from the integrator can cause a steady increase or decrease. In order to prevent that, the thermostat [82] rescales the velocities every few steps to compensate and draw the system close to the target temperature. Such control of the environment is quite common in simulations; further controlled measures include pressure or volume (Algorithm 1, lines 5-6).

Algorithm 1 Molecular dynamics simulation
 1: function do_md(x)
 2:     init()
 3:     loop through time steps
 4:         f ← compute_md_forces(x, force_field)
 5:         if control_environment then
 6:             f ← control_environment(f, target_values)
 7:         x ← integrate(f)
 8:     finalize()
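A minimal Python sketch of the loop in Algorithm 1 with the velocity Verlet integrator may help to make the flow concrete; compute_forces stands for the force-field evaluation of Algorithm 2, and thermostat or barostat control is only indicated by a comment. This is an illustration, not the code of any of the MD packages discussed later.

import numpy as np

def velocity_verlet_step(x, v, f, masses, dt, compute_forces):
    """One velocity Verlet step: update positions, recompute forces, update velocities."""
    a = f / masses[:, None]
    x_new = x + v * dt + 0.5 * a * dt**2
    f_new = compute_forces(x_new)
    v_new = v + 0.5 * (a + f_new / masses[:, None]) * dt
    return x_new, v_new, f_new

def do_md(x, v, masses, dt, n_steps, compute_forces):
    """Sketch of Algorithm 1: evaluate forces and integrate in a loop of timesteps."""
    f = compute_forces(x)
    for _ in range(n_steps):
        x, v, f = velocity_verlet_step(x, v, f, masses, dt, compute_forces)
        # a thermostat/barostat would rescale velocities or the box here every few steps
    return x, v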

Computational demands of molecular dynamics simulations grow with the number of atoms and the number of simulation steps. The naive calculation of the interatomic potential scales as 풪(Natoms²) due to the calculation of the electrostatic potential between all pairs of atoms.


Algorithm 2 Computation of forces
 1: function compute_md_forces(x, force_field)
 2:     loop through all bonds
 3:         f = compute_bond_interactions(x, force_field)
 4:     loop through all angles
 5:         f += compute_angle_interactions(x, force_field)
 6:     loop through all torsion angles
 7:         f += compute_torsion_angle_interactions(x, force_field)
 8:     loop through all pairs of atoms within cutoff
 9:         f += compute_van_der_Waals_interactions(x, force_field)
10:     loop through all pairs of atoms
11:         f += compute_electrostatic_interactions(x, force_field)
12:     return f
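For illustration only, a naive Python version of the electrostatic part of Algorithm 2 (lines 10-11); the nested loop over atom pairs is what produces the 풪(Natoms²) cost discussed in the text below, and the unit constant k_e is a simplification.

import numpy as np

def naive_electrostatic_forces(x, q, k_e=1.0):
    """O(N^2) pairwise Coulomb forces; x is an (N, 3) array of positions, q an (N,) array of charges."""
    n = len(x)
    f = np.zeros_like(x)
    for i in range(n):
        for j in range(i + 1, n):
            r_vec = x[i] - x[j]
            r = np.linalg.norm(r_vec)
            f_ij = k_e * q[i] * q[j] * r_vec / r**3   # force on atom i due to atom j
            f[i] += f_ij
            f[j] -= f_ij                              # Newton's third law
    return f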

Methods approximating its evaluation based on the Fourier space or a hierarchical division of space achieve scaling of 풪(Natoms log Natoms) [76, 43, 16]; those based on a multigrid approach even 풪(Natoms) [60, 174], albeit with a large multiplicative constant. The parallelization through the spatial domain and the almost ideal weak scaling make it possible to simulate even billions of atoms [180].

On the other hand, the timescales of simulations are more difficult to handle. As explained in section 1.1.2, a long simulation time is required in the case of a complex or rare process or a large system, and the timestep needs to remain extremely small to capture all movements. Naturally, a huge effort has been put into exploring how to accelerate molecular dynamics simulations to enable longer timescales. Various methods and approaches make them run faster through changes regarding the code, the approximation of the model, the change in the computation, or through parallelization.

The acceleration through the approximation of the model focuses on interventions to the MD algorithm that omit some parts of the computation. It includes enhanced sampling methods (more in chapter 3), discrete molecular dynamics [162], implicit models of solvent [133], and coarse-grained or multiscale models instead of molecular mechanics [6].


The acceleration through the code focuses on changes to the MD algorithm or its implementation. It includes

∙ faster methods for calculating electrostatic potential [92],

∙ optimizations on the code level, like vectorizations [136],

∙ or offloading the demanding part to other architectures like GPU [136, 96, 71, 59], many-core processors [81, ch.20], [125] or specialized hardware [178, 91].

These tasks exceed the level of sole implementation; they often require changes in the algorithm flow or the structure of data and thus are connected also to the next type.

The acceleration through parallelization is a usual choice in the case of computationally demanding tasks. The spatial domain of MD decomposes rather well; all MD software tools [136, 153, 34, 159] run in parallel through OpenMP, MPI, or both. The weak scaling of MD simulations is not an issue: with a sufficient number of cores we can easily simulate millions or billions of atoms. Unfortunately, the temporal domain is more difficult to handle. With traditional integration schemes, it does not scale well with a growing number of resources; it just takes more time to reach longer simulation timescales. Nevertheless, some techniques have been developed that parallelize the temporal domain as well. They are based on the concept of many short MD simulations running in parallel and then connecting them, or of predicting the behavior from the previous simulations.

The project Copernicus [163, 164] and lately Perez et al. [149] compute MD parallel in time, although in a different manner than the mathematical methods presented in the next section. They run many short simulations simultaneously and gradually build a Markov chain of the molecule's states. In addition to the exploration of the energy landscape, a number of short simulations transitioning from one minimum to another allows for an accurate assessment of transition probabilities. They achieve remarkable strong scaling and efficiently use highly distributed computational resources. This approach is suitable for the simulation of processes that consist of many metastable states separated by short rare transitions, e. g., protein folding or conformation space exploration.


Yu et al. [215] take into account the behavior from prior related simulations to guide the system. A prediction algorithm forecasts the state of the system at the beginning of several time intervals. That gives the initial value, and the classical MD simulation through the interval follows. If the prediction does not comply with the simulation's result at all intervals, the data from the mispredicted point in time are discarded. The prediction algorithm adapts and the simulation continues from the latest correct state. The method has been evaluated on nanotubes pulled by an external force; they achieved high strong scaling and speed-up. However, the method relies heavily on the reliability of the prediction: the presented high speed-up and strong scalability depend on no mispredictions occurring during the simulation.

Except for this “coarse-grained” time parallelism, there are other methods to parallelize the temporal domain that compute solutions in several time points concurrently. We introduce them in the next section.

2.1.2 Parallel-in-Time Computation In parallel-in-time (PiT) computation, different processors calculate results in different time points, as opposed to the traditional parallel- in-space computation where different processors calculate results for different points in space. Spatial decomposition has to deal with interactions at the neighboring parts and synchronization, the decom- position of the temporal domain has a more complicated major issue: the sequential character of time. Most of the computational research problems sequentially solve the initial value problem, i. e., the re- sult from the previous time points determines the result in the next time point. Usually, the computation follows sequential-in-time (SiT) scheme. However, several mathematical techniques are able to handle this requirement. Basically, they shift the sequential computation to a cheap function (referred to as 풢) that approximates solutions in several time points. Then they apply in parallel the usual, expensive function (referred to as ℱ) to improve the accuracy. It is important to note that these methods bring additional computation, i. e., they increase the number of floating point operations necessary to get the result. However, as we apply them only after the saturation of the spatial domain decom-

and their scheme makes it possible to divide the labor among a higher number of cores, this is not a problem. The aim is to reach the result sooner, i. e., to decrease the walltime of the computation, and to take advantage of huge computational resources. With massively parallel supercomputers, this situation occurs more and more often, leading to extensive research in this area and many applications to specific computational research problems. In this section, we first present the most common parallel-in-time methods, focusing on those applied to MD or similar problems. Then we show their applications in many research problems and, last, we elaborate on the applications to molecular dynamics simulations.

Parallel-in-Time Methods

Several mathematical methods have been developed that calculate parallel in time: they compute the results of a time-dependent differential equation in a few (successive) time points simultaneously. We adopt the classification from the review article [54], although this division is not strict; the existing methods often combine approaches or can be formulated in multiple ways.

The first parallel-in-time method by Nievergelt, 1964, [130] later became the basis for multiple shooting methods. They divide the time interval into many subintervals, roughly approximate the initial value for each subinterval (e. g., with a large timestep or a simpler integration method), solve exactly the results for the intermediate time points, and force continuity with interpolation. The difference between the predicted, approximated solution and an accurate one is propagated to the next iteration. Thus, as the number of iterations approaches the number of subintervals, the solutions converge to those that would be obtained by a sequential-in-time (SiT) computation. This concept of serializing the parallel-in-time computation with a large number of iterations exists also in other types of methods.

The parareal method presented in 2001 [108, 13, 112] has been widely applied [128] and extensively studied [191, 190, 8, 12]. It can be formulated as a multiple shooting method. The solution λ_n (determined by ℱ(λ_{n−1}) in SiT methods) is approximated by the coarse function 풢 at the beginning of the subintervals and then corrected by the fine function ℱ in all steps. The sequential nature of the computation is shifted from

the expensive function to the coarse approximation, and ℱ can be evaluated in parallel for several time points, see Figure 2.1. Unfortunately, its parallel efficiency is limited by 1/K, where K is the number of iterations.

Figure 2.1: Computational flow of the parareal method, where each subinterval includes only one timestep. Notice the sequential computation with the cheap 풢 and the parallel computation of the expensive ℱ in several time points simultaneously. The error ∆_n^k = ℱ(λ_{n−1}^k) − 풢(λ_{n−1}^k) propagates into the subsequent iteration.
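To make the scheme in Figure 2.1 concrete, the following minimal sketch implements the basic parareal update λ_{n+1}^{k+1} = 풢(λ_n^{k+1}) + ℱ(λ_n^k) − 풢(λ_n^k) for a generic initial value problem. The propagators `coarse_step` and `fine_step` (standing in for 풢 and ℱ) and the toy problem at the end are hypothetical illustrations of the general method, not the prototype described later in this chapter.

```python
def parareal(y0, n_intervals, n_iters, coarse_step, fine_step):
    """Basic parareal iteration for an initial value problem.

    coarse_step(y) and fine_step(y) propagate a state over one subinterval
    with the cheap (G) and the expensive (F) propagator, respectively.
    In a real implementation the fine_step calls of one iteration run in
    parallel; here they are written as a plain loop for clarity.
    """
    # Initial guess: one sequential sweep with the coarse propagator G.
    lam = [y0]
    for _ in range(n_intervals):
        lam.append(coarse_step(lam[-1]))

    for _ in range(n_iters):
        # Parallelizable part: expensive F applied to the current guesses.
        fine = [fine_step(lam[n]) for n in range(n_intervals)]
        # Coarse values at the same points, needed for the correction term.
        coarse_old = [coarse_step(lam[n]) for n in range(n_intervals)]
        # Cheap sequential correction sweep with G.
        new = [y0]
        for n in range(n_intervals):
            new.append(coarse_step(new[-1]) + fine[n] - coarse_old[n])
        lam = new
    return lam


if __name__ == "__main__":
    # Toy problem dy/dt = -y: F uses ten explicit Euler substeps per
    # subinterval, G a single large Euler step (a longer "timestep").
    dt, substeps = 0.1, 10
    fine_step = lambda y: y * (1.0 - dt / substeps) ** substeps
    coarse_step = lambda y: y * (1.0 - dt)
    print(parareal(1.0, n_intervals=8, n_iters=2,
                   coarse_step=coarse_step, fine_step=fine_step))
```

After k iterations the solution is exact on the first k subintervals, which illustrates the serialization property mentioned above: with as many iterations as subintervals, the result coincides with the sequential-in-time computation.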

The second group of parallel-in-time methods, based on waveform relaxation, divides the spatial domain into parts and computes the solution in all time points within these parts. The basic idea comes from the 19th century [155], with further advancements later [105, 203]. Moreover, combinations of the waveform and multiple shooting methods were developed, dividing both the spatial and the temporal domain [113].

The third group of parallel-in-time methods, multigrid, works simultaneously on the entire domains, both spatial and temporal [77]. The recently developed PFASST [48] resembles parareal in structure but combines it with the spectral deferred correction method (SDC) and the full approximation scheme (FAS). SDC methods solve differential equations to high-order accuracy by iterative application


of correction equations approximated with a low-order numerical method to the provisional solution [186]. PFASST combines parareal iterations and SDC iterations (also called sweeps) and interpolates the results of these iterations in both time and space with FAS, known from multigrid methods. However, due to its complexity, a systematic mathematical analysis of its convergence has appeared only recently, and solely for linear problems [24]. Thus, the general conditions of its convergence are not known.

All of the above methods use iterations to ensure convergence. The last group of methods solves the problem directly, like the predictor-corrector method [122, 58], where the two main steps can be executed in parallel.

Applications

Many research areas have successfully adopted parallel-in-time methods for their computational problems, for example

∙ simulations of fluid flow [208, 142],

∙ computational fluid dynamics [167, 168],

∙ simulations based on Navier-Stokes equation [53, 42],

∙ quantum control [114],

∙ quantum dynamics [118],

∙ stochastic chemical kinetics [49],

∙ power system dynamics simulations [67],

∙ reservoir simulations [56],

∙ simulations based on the heat equation [57],

∙ simulations of particles in electric/magnetic field [212],

∙ simulations of electrical activity in neural tissue [20],

∙ simulations of N-body gravitational problem [185, 187, 211, 93].


Out of these, the most relevant to our work is the N-body gravitational problem, which in some ways resembles MD simulations; details follow in section 2.4.2. The PFASST method has been applied to extremely large simulations (10⁹ particles) on massively parallel supercomputers (more than 250 000 cores) [211, 185].

The combination of PiT methods and molecular dynamics has been examined several times already. Baffico et al. [11] performed the first MD simulation with the parareal algorithm. The paper examined the possible speed-ups and the suitability of the parareal scheme for MD with an MM model and MD with a QM model. Shorter and longer timesteps were applied for the fine and coarse functions, respectively. They concluded that this approach is worth exploring and that, with many possible choices of fine and coarse functions, we are “limited only by our imagination”.

Waisman and Fish [205] combined the multigrid method for electrostatics calculation [174, 183, 195] and the waveform relaxation method into a space-time multilevel method with an implicit integration scheme and sped up an MD simulation of a polymer melt.

Audouze et al. [9] and similarly Chen et al. [38] modified the parareal algorithm to include multiple timesteps. Thus, they could use different timesteps for the coarse and fine functions in combination with different equations approximating the potentials. They evaluated it on a 3D silicon diamond lattice with 216 atoms and achieved a speed-up of 4 to 11, depending on the number of subintervals.

In 2013, Bulin published a master’s thesis [29] comparing the waveform relaxation method and the parareal method. According to his analysis, the waveform relaxation method is “useless for this kind of problems” due to slow convergence. For the parareal algorithm, he stated that it is not suitable for large-scale computing due to low speed-up; however, different coarse functions (faster, yet still reasonably accurate and numerically stable) could improve this.

It is worth noting that none of these publications dealt with biomolecular systems and the force fields typical for them. Moreover, the molecular systems used for evaluation were often just “toy” systems, small in the number of atoms and limited in the range of interactions. However, they confirm that MD can be combined with a PiT method.


2.1.3 Motivation

As shown and explained in sections 1.2 and 1.1.2, molecular dynamics simulations provide invaluable insight into many processes of biological and chemical interest, and their benefits grow with the size of the temporal domain. Longer simulations can capture rarer or more complex processes of larger systems, all aspects that usually correspond with the relevance for life sciences. In order to achieve longer simulation timescales in reasonable computation time, we need to accelerate the computation.

2.1.4 Problem Description and Solution Proposal

The combination of the required timescale of the simulation and the tiny timestep of the integration scheme leads to a large number of steps. So far, acceleration efforts have focused on the fast execution of a single step, either by optimizing the code or by parallel execution through spatial domain decomposition. However, the strong scaling of these approaches is hitting its limit while the number of computational resources still grows. The spatial domain cannot be decomposed ad infinitum; at a certain point the cores become saturated and adding more would be meaningless, as any performance gain would be annulled or even exceeded by communication and synchronization overhead. At that point, we still have more cores at our disposal, we just do not know how to use them efficiently to get the result faster and make longer simulations possible. Therefore, other means of decomposition are worth exploring. Inspired by many other computational problems, including those similar to molecular dynamics simulations, we propose to compute parallel in time.

2.2 Model of Computation

We build a model of the molecular dynamics algorithm that emphasizes the iterative computation and the bottleneck function. Algorithm 3 combines the relevant parts of Algorithms 1 and 2. Molecular dynamics includes two main iterations within its basic algorithm. First, it loops over time steps (line 2) and within each timestep it computes the short-range interactions, i. e., all bonded and


Algorithm 3 Model of molecular dynamics
1: function do_md(x)
2:   loop through time steps
3:     f = compute_short_range_forces(x, force_field)
4:     loop through pairs of atoms or grid points in parallel
5:       f += compute_electrostatic_forces(x, force_field)
6:     x ← integrate(f)

van der Waals interactions (line 3), then the electrostatic interactions (lines 4-5), and finally integrates the forces (line 6). Second, in order to calculate the electrostatic interactions, it loops over pairs of atoms, grid points, or other method-specific items (line 4). The first loop corresponds to the temporal domain, the second one to the spatial domain.

The evaluation of the electrostatic interactions as a whole (lines 4 and 5 of Algorithm 3) presents the bottleneck, as the simulation usually spends 70-80% of the computation time there. This is caused by both lines. Due to the large number of atom pairs or grid points, the methods scale as 풪(N²) (naive), 풪(N log N) (Fourier-based methods and methods based on a hierarchical division of space), or 풪(N) but with a huge multiplicative constant (multigrid methods). The calculation of the interaction itself is rather expensive, see the Coulomb potential in Equation (2.1), as it includes a reciprocal distance between atoms.

U_{\mathrm{Coulomb}} = \sum_{i} \sum_{j} \frac{q_i q_j}{|x_i - x_j|}   (2.1)

where q is the partial atomic charge, x are the atoms’ coordinates, and i and j loop over the atoms’ indices.

As reducing the computation time of a single iteration by parallel execution decomposing the spatial domain has reached its limits and the number of computational resources continues to grow, the parallelization of the temporal domain seems to be the natural next step.
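For illustration, a naive evaluation of Equation (2.1) and of the corresponding forces might look as follows; the double loop over atom pairs is exactly the 풪(N²) bottleneck discussed above, and the optional cutoff argument is the kind of parameter that later distinguishes a coarse from a fine function. The routine is a self-contained sketch (physical constants and periodic boundary conditions omitted), not code from any MD package.

```python
import numpy as np

def coulomb_energy_and_forces(x, q, cutoff=None):
    """Naive O(N^2) evaluation of the Coulomb sum of Equation (2.1).

    x      : (N, 3) array of atomic coordinates
    q      : (N,)   array of partial atomic charges
    cutoff : if given, pairs farther apart than this distance are skipped,
             trading accuracy for speed (the coarse-function idea).
    Physical constants and periodic boundary conditions are omitted.
    """
    n = len(q)
    energy = 0.0
    forces = np.zeros_like(x, dtype=float)
    for i in range(n):
        for j in range(i + 1, n):           # each pair counted once
            rij = x[i] - x[j]
            r = np.linalg.norm(rij)
            if cutoff is not None and r > cutoff:
                continue
            energy += q[i] * q[j] / r       # the 1/r term of Equation (2.1)
            fij = q[i] * q[j] * rij / r**3  # force on atom i due to atom j
            forces[i] += fij
            forces[j] -= fij
    return energy, forces
```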

2.3 Modified Model of Computation

In order to reduce the computation time of MD simulations, we want to increase the ability to run in parallel by decomposing also the temporal domain. Somewhat nonintuitively, this in fact increases the amount of computation (the number of floating point operations). However, as this computation can efficiently use more cores, it can run faster than a parallel execution with only spatial decomposition.

Algorithm 4 outlines the structure of the molecular dynamics algorithm changed to parallel-in-time execution. The first loop, over time steps, runs in parallel (line 2). Line 3 represents the overhead of the parallel-in-time method. As this changed model does not rely on a specific method, the overhead might mean an additional loop of the method’s iterations (as in parareal), a loop over different levels of a hierarchy (as in multigrid methods), or a loop over different parts of the given method (in other methods).

Algorithm 4 Parallel-in-time model of molecular dynamics
1: function do_md(x)
2:   loop through time steps in parallel
3:     loop through iterations/levels/parts of parallel-in-time method
4:       f = compute_short_range_forces(x, force_field)
5:       loop through pairs of atoms or grid points in parallel
6:         f += compute_electrostatic_forces(x, force_field)
7:       x ← integrate(f)

The method integrate(f) on line 7 might be explicit or included in the parallel-in-time method. Either way, the atoms have to be shifted into new positions according to the computed forces.

The combination of a parallel-in-time method and molecular dynamics raises several issues. The PiT methods include the concept of coarse and fine functions, where the coarse function 풢 cheaply approximates the solution and the fine function ℱ then improves the accuracy. In the context of MD, these functions might differ in four main aspects:

1. simplification of the model, e. g., discrete MD or coarse-grained MD for 풢 vs. classical MD for ℱ;


2. different parameters of MD, e. g., longer timestep for 풢;

3. different methods for electrostatics evaluation, e. g., cutoff for 풢, Fourier-based for ℱ;

4. different parameters of the method calculating electrostatics, e. g., a smaller cutoff or a lower required accuracy for 풢.

A longer timestep has been the choice in many of the state-of-the-art experiments combining PiT and MD. We have chosen the third and the fourth option in our experiments, as the timestep length in MD is not a particularly flexible parameter: prolonging it does not only lower the accuracy, it can destabilize the whole simulation to the point of a blow-up. Electrostatic interactions do not have such a dramatic effect unless they are completely omitted, and a reduction in their accuracy also means a substantial reduction in cost.

The second issue is the interpolation between solutions computed by 풢 and ℱ. Clearly, this applies only if the solutions do not exist at the same time and space points. That can happen if the coarse function does not follow the all-atom model (but, e. g., a coarse-grained one), or if it has a longer timestep. If the functions differ only in the methods used for the evaluation of electrostatics, the interpolation is not necessary.

We elaborate on these and further issues in the next section, supporting our findings with computer experiments.

2.4 Analysis of Problematic Aspects

In this section, we systematically analyze the aspects of parallel-in-time molecular dynamics that hamper a straightforward application of the parallel-in-time scheme. Several issues, complicated and many of them out of the scope of computer science, have emerged during our research. They stem from the following two areas. First, the gravitational N-body problem, which has been successfully combined with PFASST, differs from the electrostatic N-body problem, the root of molecular dynamics. These differences pose potential issues that need to be addressed. Second, the application of molecular dynamics to the parallel-in-time scheme includes several domain-specific choices to be made.


In order to show how these problematic aspects influence parallel-in-time MD simulations, we ran experiments with our prototype on four molecular systems. We have chosen the instantaneous temperature as a measure describing the numerical stability of the simulation, as it can quickly indicate the existence of a numerical problem. Temperature naturally fluctuates to a moderate extent, but higher peaks or a steadily rising or falling temperature should raise a warning flag. A large number of factors can cause such abnormal changes: a wrong structure, insufficient minimization, too short an equilibration, a missing thermostat or, the factor we are interested in, the destabilizing effect of the parallel-in-time integration scheme. We have carefully set up the molecular systems to avoid the other causes, as described in detail in section 2.4.1.

From now on, we use the following nomenclature for simplicity. Sequential-in-time (SiT) simulations explicitly denote common molecular dynamics simulations with a usual integrator, such as velocity Verlet. On the other hand, parallel-in-time (PiT) simulations run with a parallel-in-time integrator, such as PFASST.

We apply three levels of accuracy to the simulations. We have three types of functions that follow the standard molecular dynamics algorithm but differ in the method for calculating long-range electrostatic interactions or in its parameters. The fine function ℱ is accurate but expensive, the coarse function 풢1 is less accurate but cheap, and the coarser 풢2 is even less accurate and cheaper. We select one function for SiT simulations and a pair of functions (ℱ with 풢1, or ℱ with 풢2) for PiT simulations. The specific settings of each function differ in each dataset. Moreover, we ran simulations with the thermostat, marked as T+, and without the thermostat, marked as T−.
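For reference, the instantaneous temperature used as the stability measure above is derived from the kinetic energy of the atoms; a minimal sketch in SI units follows (the exact bookkeeping of constrained degrees of freedom depends on the simulation setup and is only indicated here):

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann constant [J/K]

def instantaneous_temperature(masses, velocities, n_constraints=0):
    """Instantaneous temperature T = 2 E_kin / (N_dof * k_B).

    masses        : (N,)   atomic masses [kg]
    velocities    : (N, 3) atomic velocities [m/s]
    n_constraints : number of constrained degrees of freedom to subtract
    """
    e_kin = 0.5 * np.sum(masses * np.sum(velocities**2, axis=1))
    n_dof = 3 * len(masses) - n_constraints
    return 2.0 * e_kin / (n_dof * K_B)
```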

2.4.1 Prototype and Setup of Experiments

Prototype

We have selected PFASST as the PiT method to combine with molecular dynamics. Its successful application to the gravitational N-body problem [185, 211] gave a reasonable expectation that it would also fit the electrostatic N-body problem that lies at the center of MD. We implemented our prototype as a combination of the PFASST code [152] for the PiT scheme, the LAMMPS code [158, 159] for molecular dynamics,

and our own code for domain-specific method calls, all in C++. The prototype has been developed to investigate whether the application of the PFASST scheme as the integration method of MD is viable. As a first step, we needed to analyze the viability and the convergence. Therefore, we did not focus on performance and did not mind the overhead of the PiT scheme. Currently, the prototype runs slower than the SiT simulations, mainly due to an inefficient implementation and only a basic tuning of the PFASST settings. A theoretical analysis of the potential speed-up follows in section 2.4.5.

Several of the experimental results presented in the next section confirm that the prototype calculates accurately. First, the lattice formation of salt ions was sustained during a whole PiT simulation. Second, even when working with very coarse methods, the simulations with a large number of iterations did not exhibit any suspicious behavior, suggesting they converged. We have not evaluated the accuracy in terms of whether PiT simulations explore the same energy landscape or simulate the same rare events, as that would require long simulations unfeasible with our low-performing prototype.

Datasets and Setup

For the experiments showing the main problematic aspects of the combination of MD and the PiT scheme, we have selected four datasets varying in bond types, size, and types of atoms and molecules. All simulations have been conducted with a common biomolecular force field, such as Charmm [111], including bonded, van der Waals, and electrostatic interactions. The functions ℱ and 풢 differ in the method calculating the electrostatic interactions or in its parameters, such as the cutoff or the required accuracy. Generally, ℱ computes with the accuracy at the standard level or above it, 풢1 calculates below the standard level of accuracy, and 풢2 approximates only roughly, at a level deep below the standard. We have approximated the performance gain of the coarse functions by comparing the time spent on the calculation of nonbonded interactions in our experiments. Although these numbers depend highly on the implementation of the MD code, LAMMPS in this case, they give a basic idea of the relation between accuracy loss and performance gain.


In all PiT experiments, we compute four intervals in parallel, each containing ten 2 fs steps. After obtaining the result at the last time point, we move this “computational window” forward until we reach the required number of simulation steps. The number of iterations is two in most experiments; the few exceptions are clearly noted. As mentioned before, the maximal number of iterations needed to ensure the convergence of PFASST is not known. In order to achieve a speed-up, it should not exceed the number of intervals computed in parallel. Table 2.1 summarizes the four datasets we use in our experiments and their main features.
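A minimal sketch of this sliding computational window, assuming a hypothetical pit_solve routine that runs the parallel-in-time scheme (e. g., PFASST) over one window of four intervals with ten steps each:

```python
def run_windowed_pit(state, total_steps, pit_solve,
                     intervals=4, steps_per_interval=10, timestep_fs=2.0):
    """Slide a parallel-in-time window over the whole simulation.

    pit_solve(state, intervals, steps_per_interval, timestep_fs) is assumed
    to return the state at the last time point of the window after the
    chosen number of PiT iterations has been performed.
    """
    window = intervals * steps_per_interval     # 40 MD steps per window here
    steps_done = 0
    while steps_done < total_steps:
        state = pit_solve(state, intervals, steps_per_interval, timestep_fs)
        steps_done += window
    return state
```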

Table 2.1: List of datasets.

type                number of atoms    feature
salt ions           27 040             no covalent bonds
water               81                 small size
water               4 500              typical solvent
peptide in water    32 000             a biomolecular system

The lattice of 27 040 salt ions represents a molecular system without covalent bonds. If simulated without the thermostat, it resembles rather closely the problems of N-body gravitational simulations. We prepared the system according to [44], ran an energy minimization, gradually heated the system up to 300 K, and ran a 10 ps equilibration with the thermostat. The parameters of the force field were taken from [44]. The nonbonded interactions were modeled by the Buckingham potential (short-range) and the Coulomb potential (long-range) calculated with the Ewald method. The coarse function 풢1 calculates Coulomb within 8 Å with the accuracy of 0.11, which is slightly below the standard. The fine ℱ calculates up to 18 Å with the accuracy of 0.001, which is above the standard. 풢1 should run 3.5 times faster than ℱ.

1. The setting determining the accuracy of the Ewald and other methods in LAMMPS is a dimensionless number that represents the maximal relative root mean square error in per-atom forces, compared to the reference force between two particles one Å apart [100].


27 water molecules represent a small molecular system, suitable for examining the effect of the PiT scheme in the case of naturally high fluctuations of the environment measures. We generated the system with Moltemplate [83], ran an energy minimization, gradually heated it up to 300 K, and ran a 5 ps equilibration with the thermostat. The water molecules are of the TIP3 model; nonbonded interactions were modeled by the Lennard-Jones CHARMM potential (short-range) and the Coulomb potential (long-range). The simple cutoff method was chosen to calculate the electrostatic interactions. The coarse function 풢1 had the cutoff at 8 Å, which is slightly below the standard, the fine function ℱ at 18 Å, which is above the standard. 풢1 should calculate around seven times faster than ℱ.

A larger box of 1500 water molecules represents an important part of molecular systems: water is used as a common solvent in many biochemical simulations, as it significantly influences the processes. We took the benchmark example from LAMMPS [159] and prepared it like the previous systems. The water molecules were of the SPCE model; nonbonded interactions were modeled by the Lennard-Jones cutoff potential and the Coulomb potential calculated by the PPPM method. The fine function ℱ calculated the electrostatics up to a 9.8 Å cutoff with the accuracy of 0.0001, which is standard. The coarse function 풢1 calculated up to 9.8 Å with the accuracy of 0.01, which is slightly below the standard, and should calculate only 1.1 times faster. The coarser function 풢2 calculated up to 3.0 Å with the accuracy of 1.0, which is deeply below the standard, almost completely omitting the electrostatic interactions. However, it should calculate 12.5 times faster than ℱ.

A peptide (small protein) in water represents a biomolecular system. We took the benchmark example of rhodopsin in a water solvent from LAMMPS [159] and prepared it with a short heating up to 300 K and an equilibration with a thermostat and a barostat. The CHARMM force field is used with PPPM as the method for electrostatics. The fine function ℱ used a cutoff of 18 Å with a 0.0001 accuracy of P3M, which is above the standard. The coarse function 풢1 used a cutoff of 8 Å and an accuracy of 0.01, which is below the standard; it should calculate five times faster. The coarser function 풢2 calculated the electrostatics only within a 3.0 Å radius with the


accuracy of 1.0, which is deeply below the standard; it should calculate 38 times faster.

All experiments ran for 5000 steps with a timestep of 2 fs.

2.4.2 Differences between the Gravitational and the Electrostatic N-body Problem

The N-body problem deals with interactions between N bodies. The gravitational N-body problem [1] deals with interactions of celestial bodies governed by gravity. The electrostatic N-body problem, as modeled by molecular dynamics, shares one part with it and differs in three aspects.

The Coulomb potential (Equation (2.1) on page 30) is an analogy to Newton’s law of gravity; both share the same or similar methods for their fast calculation. However, the interatomic potential includes more than just one type of interaction, such as bonded or van der Waals interactions, as opposed to the sole gravitational potential. Naturally, that complicates the whole computation, as the superposition of many interactions acts nonlinearly.

Furthermore, the distances between atoms are restricted in both directions, while celestial bodies can both collide and drift far apart. Atoms cannot get too close to each other; the potentials (bond or van der Waals) are designed to build up a substantial force pulling them apart. Moreover, the distances between bonded atoms cannot grow far beyond a “normal” length. Molecular dynamics with the model of molecular mechanics does not permit the breaking and forming of bonds. Such behavior, if not prevented by the potential or special constraints, would result in a blow-up of the simulated system.

And finally, simulations of molecular dynamics usually control the environment: the temperature, and the pressure or volume. This corresponds with the fact that MD simulates the living world, which functions only within a narrow interval of temperatures. Simulations of celestial bodies do not need to restrict these measures.

Even though the parallel-in-time integration scheme has been successfully applied to the gravitational N-body problem, these factors and their influence need to be analyzed. This subsection continues with three cases of experiments with various molecular systems. Each one focuses on a different factor and we explain its influence.


Salt

To observe a molecular system as close as possible to the systems from gravitational N-body problems, we simulated salt ions, i. e., a system without covalent bonds. The atoms still cannot collide, but we could analyze the influence of the environment control without the effects of bonded interactions. Moreover, the salt ions form a lattice, which would break in the case of major negative interventions in the simulation.

Figure 2.2: Temperature fluctuations in SiT simulations of salt ions with the coarse (G) and fine (F) function. The thermostat (T+) keeps the temperature closely around the target. Without it (T-), the system is a bit less stable.

First, see the normal behavior in the SiT simulations in Figure 2.2. With the thermostat, the temperature stays in close proximity to the target value of 300 K, suggesting a well-equilibrated environment. Even without the thermostat, we can see only a modest and slow heating up in the 풢1 simulation and cooling down in the ℱ simulation. The bump at the beginning of the 풢 simulations can be attributed to the equilibration process, as explained in the next section.

The parallel-in-time simulations also kept the temperature around the target value with a similar extent of fluctuations, see Figure 2.3. Moreover, the lattice of the salt ions did not break. As the scheme achieved this convergence after only two iterations, we consider that the PFASST integration scheme works for a molecular system without covalent bonds. Moreover, the control of the environment itself does not cause any major problems to the numerical stability of the parallel-in-time integration.


Figure 2.3: Temperature fluctuations in PiT simulations of salt ions. The thermostat (T+) keeps the temperature stable; without it (T-), the system starts to heat up.

Peptide in Water

However, molecular systems without covalent bonds are more of an exception than the rule in usual MD simulations. Therefore, we ran experiments with the protein in water. Again, first see the normal behavior in Figure 2.4. The temperature fluctuates to a larger extent compared to the salt system, but still within reasonable intervals. The thermostat is able to keep the temperature stable in simulations with both functions. Without it, the 풢1 simulation starts to overheat gradually, while the ℱ function manages to keep it stable due to its high accuracy.

Parallel-in-time simulations of a protein in water raise some issues. While the thermostat manages to keep the temperature stable, when we turn it off, the system quickly overheats (Figure 2.5). As the PiT simulations of salt ions did not exhibit such behavior with the same level of accuracy of the 풢 functions, we can speculate that the addition of covalent bonds caused this. The PiT integration scheme seems to magnify any destabilizing effect on the system.


Figure 2.4: Temperature fluctuations in SiT simulations of the peptide in water with the coarse (G) and fine (F) function. The thermostat (T+) keeps the temperature closely around the target. Without it (T-), the 풢 simulation slowly heats up.

Figure 2.5: Temperature fluctuations in the PiT simulation of the peptide in water. The thermostat (T+) manages to keep the temperature stable, but without it (T-) the system quickly overheats.

Small Water System

To confirm this, we ran a simulation with a small molecular system that naturally fluctuates wildly: 27 molecules of water, i. e., 81 atoms. First, see the normal behavior in Figure 2.6. Notice the large extent of the fluctuations, almost ±100 K. This is caused by the size of the system: any change in the velocity of an atom naturally influences the temperature much more if the whole system is smaller.


Figure 2.6: Temperature fluctuations in SiT simulations of 27 water molecules with the coarse (G) and fine (F) function. Even with the thermostat (T+), the temperature fluctuates wildly.

Figure 2.7: Temperature fluctuations in the PiT simulation of 27 water molecules. The thermostat (T+) manages to keep the temperature stable, but without it (T-) the system quickly overheats.

Parallel-in-time simulations of such a small system worked in a similar manner to the previous example, see Figure 2.7. Again, the thermostat (T+) kept the temperature around the target value; the large extent of the oscillations corresponds to that of the SiT simulations. Without the thermostat (T-), the system quickly overheats. We also ran experiments with a system of a large number of water molecules; the results are in the next section.


To conclude, the differences between the gravitational N-body problem and the electrostatic N-body problem do not make it impossible to apply PFASST as the parallel-in-time method for molecular dynamics. However, the PiT scheme magnifies numerical instabilities during the simulation. The thermostat manages to keep the destabilizing effect of the PiT scheme under control, at least with the reasonably accurate coarse functions we used in these experiments.

2.4.3 Application of Molecular Dynamics to the Parallel-in-Time Integration Scheme

In order to combine molecular dynamics and the parallel-in-time scheme, we need to choose domain-specific components for the coarse and the fine function and the initial value. We show these problematic aspects on the medium-sized molecular system of water molecules and on the system of a protein in water.

The choice of the fine and coarse functions is not an easy task. In order to get some speed-up, 풢 has to be much cheaper than ℱ (more on the theoretical speed-up in section 2.4.5). However, lower cost means less accuracy, and too little accuracy means numerical instabilities. As the calculation of electrostatic interactions takes most of the computation time of molecular dynamics simulations, it is natural to look for a performance cut there. Typically, the settings of MD simulations offer the specification of a cutoff within which the electrostatic interactions are calculated and/or the maximal allowed relative error of the given method. All other parts of the MD algorithm (calculation of bonded interactions, control of the environment, integration) remain the same. As the electrostatic interactions influence the behavior of the molecule quite extensively, it might be difficult to find the balance between performance and accuracy.

The initial starting structure also poses a problem. Stable molecular dynamics simulations require equilibrating the structure beforehand. However, if we equilibrate with ℱ and then run the simulation with 풢, or the other way around, this causes a numerical instability at the beginning of the simulation. First, see the normal behavior in the SiT simulation of a water system. Figure 2.8 shows the temperature fluctuations of SiT simulations when we equilibrated the system with a different function than the one we used for the simulation itself.

Figure 2.8: Temperature fluctuations in SiT simulations of water. The equilibration (eq) was done with a different function than the simulation itself; 풢2 is an even less accurate function than 풢1. The bump in the temperature at the beginning evens out after a few hundred steps.

To better show the influence of the coarse function, we experimented with two coarse functions, 풢1 and 풢2, where 풢2 is even less accurate than 풢1 (it almost completely ignores the electrostatic interactions). The thermostat in the simulation with ℱ is able to cope with the instability in the case of 풢1 used in the equilibration, but in the case of 풢2, we can see the bump. Both coarse functions need some time to deal with the inconsistency caused by the ℱ equilibration, but the bump is higher and takes longer to even out in the case of 풢2.

Parallel-in-time simulations again confirm that they magnify the instabilities, see Figures 2.9 and 2.10. After the initial bump, the simulations were not able to get back to the target temperature unless the number of iterations got close to the number of steps computed in parallel. So, the inaccuracy of the coarse function and the subsequent numerical errors can be compensated by a higher number of iterations.

However, these instabilities depend greatly on the accuracy of the coarse function. We ran experiments with a protein in water where we kept 풢1 rather accurate, only slightly below the common values for the electrostatics cutoff; on the other hand, we increased the accuracy of ℱ above the usual. The PiT simulations done with these functions do not show any bumps and maintain the target temperature from the beginning, see Figure 2.11. The simulation with the starting structure equilibrated by the 풢 functions looked similar.



Figure 2.9: Temperature fluctuations in the PiT water simulations with starting structure equilibrated by ℱ. On the left, simulations were run with ℱ and 풢1. On the right, simulations were run with ℱ and 풢2. k is the number of iterations in the PiT scheme.


Figure 2.10: Temperature fluctuations in the PiT water simulations with starting structure equilibrated by 풢1. On the left, simulations were run with ℱ and 풢1. On the right, simulations were run with ℱ and 풢2. k is the number of iterations in the PiT scheme.



Figure 2.11: Temperature fluctuations in the PiT protein in water simulations. The starting structure was equilibrated with ℱ. As 풢2 is less accurate than 풢1, the simulations do not recover so well after the initial bump.

On the other hand, 풢2 with its low accuracy caused a higher initial bump, and not even 5000 steps sufficed to recover from it. Also, the fluctuations of the temperature remained quite wild during the entire run.

To conclude, the choice of the fine and coarse functions is crucial for the later behavior of PiT simulations in terms of both accuracy and performance. The equilibration of the starting structure affects mainly the simulations with an inaccurate coarse function; otherwise, the thermostat deals with the initial bump efficiently within a few hundred steps.

2.4.4 Overview of Problematic Aspects

We analyzed the potentially problematic aspects of combining MD and the PiT scheme. Based on our experiments, we conclude the following.

∙ Control of the environment (thermostat) does not interfere with the PiT scheme.

∙ More than one potential between N bodies makes the problem more complex, but still manageable.


∙ Any spontaneous wilder fluctuations are magnified by the PiT scheme, making the system more prone to overheating and potentially to a blow-up.

∙ The equilibration of the starting structure needs to be long enough, but the specific function used does not matter. After the initial bump in the temperature, the thermostat manages to compensate within a few hundred steps. These steps can be omitted from the resulting trajectory, so they do not affect further analysis.

∙ Water behaves rather specifically: it shows less stability than the system with the protein for a comparable accuracy of the used functions. No such strange behavior was observed in SiT simulations.

∙ The selection of ℱ and 풢 crucially influences the PiT simulation. Even simulations with a coarse function of accuracy deep below the standard still managed to bring the temperature down to the expected value.

However, the question of correctness, i. e., whether PiT and SiT simulations simulate the same thing, emerges. We elaborate on that later in the discussion of future work.

2.4.5 Theoretical Speedup

Here, we outline the potential speed-up brought by the PFASST method to molecular dynamics simulations. Before further analysis, it is crucial to realize that the overall cost of PFASST (and of any parallel-in-time method) greatly exceeds the cost of the SiT computation; it simply requires more floating point operations. Any speed-up can occur only if it enables the saturation of a larger number of computational resources and thus reduces the wallclock time of the computation. Even then, PFASST cannot provide optimal efficiency [185], thus it should be applied only after the saturation of the spatial domain decomposition.

To show the conditions under which a speed-up is possible, we compare the cost of computing T steps, where the parallel-in-time computation executes on T cores. The cost of the sequential-in-time computation is C_s = T Υ_ℱ, where Υ_ℱ is the cost of one evaluation of the fine function ℱ. The cost of the parallel-in-time computation depends on


the cost of both functions, the number of iterations K_p, and the overhead Υ_o, see Equation (2.2).

C_p = T \Upsilon_{\mathcal{G}} + K_p \left( \Upsilon_{\mathcal{G}} + \Upsilon_{\mathcal{F}} + \Upsilon_o \right)   (2.2)

where T is the number of cores computing parallel in time, Υ is the cost of a function, 풢 is the coarse function, ℱ is the fine function, o denotes the overhead, and K_p is the number of iterations of PFASST necessary for convergence. The computation speeds up if C_s > C_p, or

(1 - \alpha)\, T > K_p \left( 1 + \alpha + \beta \right)   (2.3)

which follows from dividing C_s > C_p by Υ_ℱ, where α = Υ_풢/Υ_ℱ is the ratio between the cost of the coarse and the fine function, and β = Υ_o/Υ_ℱ is the ratio between the cost of the overhead and the fine function. Clearly, to achieve a high speed-up:

∙ the number of iterations K_p should be low,

∙ the number of time points in which the computation executes parallel in time, T, should be high and T > K_p,

∙ the coarse function should be much cheaper than the fine function, i. e., α ≪ 1,

∙ the overhead should be low, i. e., β ≪ 1.

For example, if 풢 calculates five times faster than ℱ, i. e., α = 0.2, the overhead costs 10% of ℱ’s cost, i. e., β = 0.1, and we calculate four intervals in parallel with two iterations, the theoretical speed-up reaches 1.23. If we manage to calculate ten intervals in parallel, i. e., T = 10, with just two iterations, the speed-up exceeds three.

This seems to be a rather modest speed-up. To put these numbers into perspective, we can compare them with results published at a conference with an A+ rank. Speck et al. [185] analyzed the contribution of PFASST in the computation of spherical vortices. The spatial domain decomposition managed to saturate 8192 cores simulating four million particles. With PFASST, they achieved a speed-up of 7 by using 262 144 (32 times more) cores. The poor efficiency of parallel-in-time algorithms restricts their application to the situation after the saturation of the spatial domain and only when a large number of additional computational resources is available.
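The cost model of Equations (2.2) and (2.3) is easy to evaluate; a minimal sketch, assuming the speed-up is taken simply as C_s/C_p with the costs expressed in units of Υ_ℱ:

```python
def theoretical_speedup(T, K_p, alpha, beta):
    """Speed-up C_s / C_p implied by the cost model of Equation (2.2).

    T     : number of intervals (cores) computed in parallel in time
    K_p   : number of PiT iterations needed for convergence
    alpha : cost ratio coarse/fine    (Y_G / Y_F)
    beta  : cost ratio overhead/fine  (Y_o / Y_F)
    Costs are expressed in units of Y_F, so C_s = T.
    """
    c_s = T
    c_p = T * alpha + K_p * (1.0 + alpha + beta)
    return c_s / c_p


def speeds_up(T, K_p, alpha, beta):
    """Condition (2.3): the PiT computation beats the SiT one."""
    return (1.0 - alpha) * T > K_p * (1.0 + alpha + beta)


# With alpha = 0.2, beta = 0.1, T = 4 and K_p = 2, condition (2.3) holds,
# i.e., the PiT run is predicted to be faster than the SiT one.
assert speeds_up(T=4, K_p=2, alpha=0.2, beta=0.1)
```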

2.5 Conclusion

In this work, we have addressed the limitations of strong scaling coupled with the necessity for long simulation timescales in molecular dynamics simulations by combining them with a parallel-in-time computation scheme. Traditional means of acceleration, such as faster methods for computing the bottleneck function or parallel-in-space computation, focus on speeding up the execution of a single step. However, the spatial domain cannot be divided ad infinitum; at a certain point the cores become saturated and more resources do not bring additional speed-up. With the growing number of computational resources, e. g., in massively parallel supercomputers, this saturation happens more and more often. Thus, we shifted our focus from the acceleration of a single step to the loop through the timesteps of the simulation. By introducing a parallel scheme along the temporal domain, we can get to the results faster.

During our research on how to combine parallel-in-time computation with molecular dynamics, we came across several difficult problems. Even though we have not managed to solve all of them and present here ready-for-production software, we verified that the combination is possible and that, from the point of view of macroscopic properties (like temperature), the simulations seem to give results corresponding to those obtained in sequential-in-time simulations. Furthermore, we systematically analyzed all aspects that need to be addressed, a viewpoint missing in the literature. We showed how these aspects manifest in experiments on four datasets, one of them a protein in water. To our knowledge, it was the first parallel-in-time MD simulation of a biomolecular system. We plan to publish our analysis in a scientific journal.

Two major challenging questions emerged from our analysis. First, how to choose the fine and the coarse functions necessary for the PiT computation in a way that balances the performance and accuracy differences; this issue lies in the field of computational chemistry. Second, whether the parallel-in-time simulations would simulate the same rare events as the sequential-in-time simulations and how to evaluate this. This issue partially belongs to the field of computer science, being analogous to the evaluation of correctness, accuracy, and numerical stability of parallel versus sequential execution, or to space search.


We have not conducted experiments investigating this hypothesis, as their length in combination with the low performance of our prototype makes them unfeasible.

Moreover, these questions are intertwined; the choice of the coarse function will probably influence the correctness of the parallel-in-time simulation. This issue would require interdisciplinary research with experts from both computational chemistry and computer science.

In this research problem, we modified the model computing many sequential iterations through time steps by introducing parallelism in time. This combination allowed us to discover several new difficult problems that need to be addressed before its common use. Our thorough analysis supported by many experiments can provide a starting point for future research.

2.5.1 Future Work

This work opens several interesting directions for future research.

As mentioned above, the evaluation of the correctness of PiT simulations belongs to computer science. The numerical stability of parallel-in-time schemes has been analyzed [191, 190, 128, 8, 12], but the specific combination with molecular dynamics introduces new challenges. Our experiments with temperature as a measure of stability provide only a basic verification. Although the instantaneous temperature and its fluctuations might be a rather reliable indicator of numerical instabilities, they do not catch all possible problems. A stable temperature does not necessarily mean that the simulation does what it should, and, on the other hand, a slightly overheating system can still simulate quite accurately. Many MD simulations are done to capture a rare event, such as a conformation change. Will these happen also in PiT simulations? Or will it take longer than usual for them to occur? Can a too coarse function prevent them from happening, or will the fine function ensure them? Moreover, the trajectories of PiT and SiT simulations of the same system will differ due to the chaotic character of MD, rounding errors, and nondeterministic parallel execution. However, even different trajectories can cover the same space, similarly as various methods for space search eventually find all possible solutions. The techniques for this comparison exist. As we have turned our interest to the research of

rare events (see chapter 3) and we needed the same techniques for the evaluation there, we outline them in section 3.4.6.

The question of the appropriate selection of the fine and the coarse function also offers intriguing options for future research, albeit not completely within the scope of computer science. An approach worth exploring would be to set the fine function as MD with a quantum model and the coarse function as MD with a molecular mechanics model, resulting in hybrid parallel-in-time simulations.

Furthermore, the sole implementation of the software combining a parallel-in-time method and a molecular dynamics code has proven to be a non-trivial task. Several issues need attention, including the close mutual code integration, diminishing the overhead of the parallel-in-time method, and the combination of parallel-in-space and parallel-in-time computation. Since the testing and evaluation themselves pose a complicated aspect due to the chaotic character of MD, these tasks will require intensive and high-quality engineering.

3 Approximation of Mean Square Distance Computations in Metadynamics

In the second problem we have researched, we approximated the computation of mean square distances between the simulated molecular structure and many structures given beforehand. Such a computation creates a bottleneck in molecular simulations extended by metadynamics.

Molecular dynamics simulates the movements of atoms caused by their interactions and by the environment [82]. Due to its high cost, the usual computational speed reaches (at most hundreds of) nanoseconds of simulated time per day of computation time. However, as many biologically and chemically relevant processes would need unfeasibly long simulations, the scientific community has researched methodologically new approaches to study them.

As mentioned in section 1.1.2, one of the reasons for long simulation timescales is the occurrence of rare events. Molecules strongly favor staying in the energetic minima, thus they spend most of the simulation time fidgeting there. On the contrary, many interesting phenomena (chemical reactions, changes of conformation, binding of a ligand to an active site of a protein, etc.) consist of or are triggered by a rare event: the transition over an energy barrier and the fall into another minimum. Since these crossings rarely occur spontaneously, the simulation takes longer with the higher height or number of barriers.

However, in order to study phenomena at the atomic scale, a specific, contiguous trajectory from a molecular dynamics simulation is not always necessary; the focus lies in exploring as much of the energy landscape as possible and observing the transitions. Therefore, we can afford to meddle with the simulation to achieve longer timescales, even if it means a non-realistic trajectory, as long as we are able to reconstruct the energy landscape of the unbiased simulation. Several methods “push the simulation forward” with artificial facilitations [132, 156]. One of them, metadynamics [98], steers the simulation by adding a bias potential to the system. It fills the energy minima with “computational sand”, making it easier to cross the peaks.


Naturally, these interventions come at a cost. Some overhead is expected, although it is no longer negligible in some promising techniques. A substantial slowdown of the simulation jeopardizes the benefit of the rare events’ acceleration. That is also the case of our research problem: numerous mean square distance computations in every step introduce significant computational demands into the metadynamics simulation. Here we propose a method that approximates these computations by reusing the calculations from the previous steps. This diminishes the number of bottleneck function calls and speeds up the simulation.

This chapter continues as follows. First, we describe metadynamics as a method for the acceleration of rare events in molecular dynamics simulations. Second, we introduce the computational problem at hand, analyze the bottleneck, and build a model that emphasizes the loop. Third, we propose our approximative model that reduces the number of bottleneck function calls. And last, we evaluate the accuracy and speed-up of the proposed approximated model.

3.1 Problem

In this section we introduce metadynamics, describe the computational problem at hand, and explain the motivation to solve it.

3.1.1 Metadynamics

Molecular dynamics simulations have become a common tool for researching biologically and chemically relevant processes. Their rather high computational cost grows with the number of atoms and time steps. Of these two factors, the size of the temporal domain causes more problems, as it is more difficult to parallelize. With highly optimized and GPU-accelerated codes [136, 192, 59], massively parallel computational resources [211, 50, 129], or specific hardware [179, 178, 115, 91], it is currently possible to simulate up to a few microseconds [102, 73, 165, 206, 150, 45]. Longer timescales are hardly, if at all, possible [177], although many processes of scientific interest take longer.

The simulation can meaningfully characterize the process only if it visits all relevant states of the molecule [21]. They usually correspond


to minima in the energy landscape, separated by barriers. As the transitions over these barriers happen only rarely (molecules tend to stay in the energy minima), the length of the simulation grows with their height and number. In other words, we have to wait through long stretches of the molecule’s fidgeting in local minima of the energy landscape before a traverse over a high-energy barrier happens. Therefore, the sampling of the energy landscape is highly heterogeneous: over-examined in the minima and under-examined in the high-energy states.

Many techniques have been developed to enhance this sampling; for a review see [199, 189, 97]. The methods shorten the slow and uninteresting parts with

∙ non-physical interventions in alchemical simulations [182] or simulated annealing [21];

∙ switching the coordinates between several same-start simulations with different parameters in replica exchange [193], e. g., temperature in parallel tempering [120];

∙ parallel execution of many short simulations and building a Markov chain to trace the process [101, 163];

∙ biasing the potential with artificial additions in metadynamics [98], umbrella sampling [200], steered molecular dynamics [65], and many others [97, p.126601:2].

Metadynamics (MTD) “pushes the simulation forward” with a bias potential that disfavors the already visited states [15]. The bias potential “floods” the local minima and its forces pull the system away from the sampled states; for a visualization, see Figure 3.1. We can say that metadynamics fills the valleys of the energy landscape with computational sand, making it easier to cross the mountain peaks even with random chaotic movement. Towards the end of a well-functioning metadynamics simulation, the system diffuses between several local minima in a random walk. Conveniently, metadynamics keeps the continuity of the simulation, and the process still somewhat corresponds to reality, as the bias potential does not cause non-physical artifacts, it just facilitates the sampling. Algorithm 5 depicts how the metadynamics code fits into a typical molecular dynamics algorithm.


Figure 3.1: Filling potential energy minima with a bias potential. The simulation starts at x = 0, and the bias potential accumulates in this area first. After 20 steps the energy barrier at x = −1.75 is overcome and the bias potential starts filling the minimum at x = −4. After approx. 300 steps the state space is explored. Taken from [98], image copyright (2002) National Academy of Sciences.

In every step (line 3), after the usual computation of the forces determined by the force field potential (line 4), metadynamics adds the bias force determined by the bias potential (line 5). The algorithm follows with the usual integration (line 6) and the loop continues with the next iteration.

In order to put the sand into the right places, metadynamics needs several variables that somehow describe the simulated process. These are called collective variables (CVs). The bias potential accumulates Gaussian hills placed at the current CV values every few steps, see Equation (3.1) [97]. The choice and design of collective variables crucially influence the success of metadynamics simulations [15, 97]. CVs must comply with several requirements: a) they must distinguish between different states of the system throughout the process; b) they must include all relevant motions accompanying the process; c) they must be limited in number, as the accuracy of metadynamics decreases and the computational demands increase with the growing number of CVs [99, 15]; d) they must be differentiable in order to evaluate the bias force.
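As a toy illustration of requirement d), consider the simplest collective variable, the distance between two atoms, together with the analytic gradient needed for the bias force; this is a hypothetical helper chosen only because its derivative is easy to write down:

```python
import numpy as np

def distance_cv(x, i, j):
    """Collective variable S(x) = |x_i - x_j| and its gradient dS/dx.

    x : (N, 3) array of atomic coordinates; i, j are atom indices.
    The gradient is what requirement d) asks for: it lets the bias
    force on the CV be propagated back onto the atoms (chain rule).
    """
    rij = x[i] - x[j]
    s = np.linalg.norm(rij)
    grad = np.zeros_like(x)
    grad[i] = rij / s
    grad[j] = -rij / s
    return s, grad
```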


Algorithm 5 Molecular dynamics simulation enhanced with metadynamics
1: function do_md(x)
2:   init()
3:   loop through time steps
4:     f ← compute_md_forces(x, force_field)
5:     f += compute_mtd_forces(x, mtd_params)
6:     x ← integrate(f)
7:   finalize()

as the accuracy of metadynamics decreases and the computational demands increase with the growing number of CVs [99, 15]; d) they need to be differentiable in order to evaluate the bias force. To illustrate points a) and b), we take the simulation from [26]. Branduardi et al. studied how one molecule (TMA, tetramethylammonium) penetrates the gorge of another molecule (AChE, acetylcholinesterase). At first, they chose one collective variable: the distance between the centers of the two molecules. Obviously, this variable follows the progress of the process. However, the metadynamics simulation indicated a problem as it got stuck at a certain point until some other rare event, uncaptured by this CV, happened. Further analysis showed that a part of AChE needs to open up like a gate for TMA to enter. After adding a second collective variable that described this opening, the metadynamics worked fine.

V_{bias}(S(x), t) = \sum_{t' = \tau_G, 2\tau_G, \ldots} w_{t'} \prod_i \exp\left( -\frac{\left( S_i(x(t)) - s_i^{t'} \right)^2}{2\sigma^2} \right)    (3.1)

where S is the function computing collective variables, τ_G is the frequency of Gaussian hill addition, w_{t'} and σ are the height and the width of the Gaussian hill, and s_i^{t'} is the value of the i-th CV at time t', i.e., S_i(x(t')). Clearly, the design or choice of collective variables requires a substantial understanding of the chemical and physical background of the studied process. This becomes especially difficult in complex processes like protein folding.
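The accumulation of Gaussian hills in Equation (3.1) can be sketched in a few lines of C++. This is only an illustration, not Plumed code; the Hill container, a single hill height w, and a single width σ shared by all CVs are simplifying assumptions of this sketch.

#include <cmath>
#include <vector>

// One deposited Gaussian hill: the CV values s_i^{t'} recorded at deposition time.
struct Hill {
  std::vector<double> center;  // s_i^{t'} for every collective variable i
};

// Evaluate the bias potential of Equation (3.1) at the current CV values S(x(t)),
// assuming a common hill height w and width sigma for all CVs (a simplification).
double bias_potential(const std::vector<double>& cv,
                      const std::vector<Hill>& hills,
                      double w, double sigma) {
  double v = 0.0;
  for (const Hill& h : hills) {                    // sum over deposition times t' = tau_G, 2*tau_G, ...
    double g = 1.0;
    for (std::size_t i = 0; i < cv.size(); ++i) {  // product over collective variables
      const double d = cv[i] - h.center[i];
      g *= std::exp(-d * d / (2.0 * sigma * sigma));
    }
    v += w * g;
  }
  return v;
}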

Property Map Collective Variables

Simple collective variables such as distances or angles between specific atoms, and their combinations, can describe only a limited number of processes. For more complex ones, more sophisticated approaches have been developed, like the path [27] or the more general property map [188] collective variable. They include reference structures, a series of snapshots of the molecular system throughout the process. Notice that we do not need a contiguous process to obtain them; we can use non-physical actions, high temperatures, annealing techniques or even manual construction in visualization software. The value of the collective variable depends on the distances between the current coordinates of the molecular system and the coordinates of these reference structures, see Equation (3.2). It averages the values of the property q_i of the closest reference structures, weighing their contribution by their distance to the current structure. This property can be anything from an index (as in path collective variables), through the number of torsion angles of a specific type, to the value of a low-dimensional projection obtained by a dimensionality reduction technique.

S(x) = \frac{\sum_{i=1}^{N} q_i \exp\left(-\lambda D(x, a_i)\right)}{\sum_{i=1}^{N} \exp\left(-\lambda D(x, a_i)\right)}    (3.2)

where x is the current structure (i.e., the coordinates of the molecular system), a_i is the i-th reference structure, N is the number of reference structures, q_i is an arbitrary property of the given reference structure, D() is the distance function, and λ is a tuning parameter. Property map collective variables apparently need the distances between the current structure and the reference structures for their evaluation, see Equation (3.2). As the number of reference structures can be rather high and the contribution to the CV value decreases exponentially with the distance, in many cases it suffices to compute the distance to the few closest reference structures kept in the neighbor list (NL).


To catch up with the movement of the system in the simulation, this list needs to be updated every few steps by computing the distance to all reference structures and picking the closest ones. Neighbor lists work well for smooth processes. If the system undergoes abrupt and quick changes, the necessity to update them (almost) every step brings no acceleration and they lose their purpose; in such cases we omit the neighbor list, which means computing the distances to all reference structures in every step. Algorithm 6 shows the deeper level of the metadynamics code. In order to compute the bias forces, we first need to compute the values of the collective variables, the most demanding part of this algorithm. Only then can we recompute the bias potential, add another Gaussian hill to the bias potential if t = kτ_G, and calculate the bias force. These computations follow Equation (3.1). The method compute_cvs(x, ref_structures) starts with the computation of distances between the current structure and the reference structures in the neighbor list. Notice that besides the distances, we also have to calculate the derivatives ∂D(x, a_i)/∂x, as these are needed to compute the bias force (Equation (3.1)). Every few steps, we update the neighbor list by computing the distance to all the reference structures and selecting the closest ones. The evaluation of CV values with Equation (3.2) is the simplest step in this method.
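For concreteness, a minimal sketch of the CV evaluation in Equation (3.2) over a neighbor list might look as follows; the Neighbor structure and the assumption that the distances D(x, a_i) are already available are hypothetical simplifications, not the Plumed implementation.

#include <cmath>
#include <vector>

// Precomputed data for one reference structure in the neighbor list.
struct Neighbor {
  double q;         // property q_i assigned to this reference structure
  double distance;  // D(x, a_i) to the current structure
};

// Property map collective variable, Equation (3.2):
// S(x) = sum_i q_i exp(-lambda D) / sum_i exp(-lambda D), summed over the neighbor list.
double property_map_cv(const std::vector<Neighbor>& neighbors, double lambda) {
  double num = 0.0, den = 0.0;
  for (const Neighbor& n : neighbors) {
    const double w = std::exp(-lambda * n.distance);
    num += n.q * w;
    den += w;
  }
  return num / den;
}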

Computing Mean Square Distance Between Molecular Systems

The distance function does not impose any special requirements, except for differentiability. In computational life sciences, the traditional choice for the distance between molecular systems is the mean square distance/deviation (MSD) [64, ch.16]. This work, therefore, focuses on this metric, although the principles of the approximated model might be applicable to other distance functions, as explained later in section 3.3. Algorithm 7 depicts how MSD is computed. First, we have to superimpose the structures: translate and rotate them so that their mutual distance is the lowest possible. Only after that do we calculate the Euclidean distance between the coordinates of the fitted systems and its derivative with respect to the coordinates.


Algorithm 6 Molecular dynamics simulation enhanced with metadynamics and property map collective variables
1: function compute_mtd_forces(x)
2:   CVs ← compute_cvs(x, ref_structures)
3:   Vbias, Fbias ← Equation (3.1)
4:   return Fbias

1: function compute_cvs(x, ref_structures)
2:   loop through ai in neighlist
3:     distance[i], derivatives[i] ← compute_distance(x, ai)
4:   if step + 1 % neighstride == 0 then
5:     neighlist ← ref_structures
6:   if step % neighstride == 0 then
7:     neighlist ← update_neighlist(distances, ref_structures)
8:   loop through the cvs
9:     cvj ← Equation (3.2)
10:  return cvs

Algorithm 7 Mean square distance computation between the current and the reference structure
1: function compute_distance(x, a)
2:   tx ← compute_translation_vector(x)
3:   ta ← compute_translation_vector(a)
4:   Rxa ← compute_rotation_matrix(x, a)
5:   distance ← ∑_j^{Natoms} ‖xj − tx − Rxa(aj − ta)‖
6:   derivatives ← calculate_derivatives(x, a)
7:   return distance, derivatives


Shifting both molecules to the same location in the coordinate system means fixing the center of the molecule's mass, i.e., subtracting the center-of-mass point from all coordinates. In contrast to this straightforward computation, finding a rotation that minimizes the distance is not trivial. Several methods have been developed, nicely summarized in [55]. Among the best known and most applied are Kabsch's algorithm based on singular value decomposition [88] and Kearsley's algorithm that utilizes quaternions [89]. It has been shown that methods based on these two principles are numerically equivalent [40]. They both depend on a demanding matrix operation, singular value decomposition and matrix diagonalization, respectively, so their performances also match.
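As an illustration of the superposition step, a sketch of a Kabsch-style fit using singular value decomposition is shown below. It assumes the Eigen library, uniform atom weights, and already centered structures, and it is not the Kearsley/quaternion implementation used in Plumed.

#include <Eigen/Dense>

// Mean square distance between two already centered structures (3 x N matrices of
// coordinates), after finding the optimal rotation by Kabsch's SVD-based algorithm.
double msd_after_superposition(const Eigen::Matrix3Xd& x, const Eigen::Matrix3Xd& a) {
  // Covariance of the two coordinate sets.
  Eigen::Matrix3d h = a * x.transpose();
  Eigen::JacobiSVD<Eigen::Matrix3d> svd(h, Eigen::ComputeFullU | Eigen::ComputeFullV);
  // Correct for a possible reflection so that R is a proper rotation.
  double d = (svd.matrixV() * svd.matrixU().transpose()).determinant() > 0 ? 1.0 : -1.0;
  Eigen::Matrix3d s = Eigen::Matrix3d::Identity();
  s(2, 2) = d;
  Eigen::Matrix3d r = svd.matrixV() * s * svd.matrixU().transpose();
  // Mean square deviation of the fitted coordinates.
  return (x - r * a).squaredNorm() / static_cast<double>(x.cols());
}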

3.1.2 Motivation

The acceleration of MSD computation would obviously speed up MD simulations enhanced by MTD with path or property map CVs. Reducing the computational overhead brought by metadynamics makes it possible to simulate at the computational speed of classical MD while reaching longer simulation timescales thanks to MTD. Faster metadynamics means shorter computation time, or the same computation time with increased accuracy by adding more reference structures. The motivation for faster simulation presented in the previous chapter holds here as well: faster means more frequent runs, feasible longer timescales, feasible larger molecules, and an increasing potential for further development of simulation methods. The increased accuracy relates to a more accurate representation of the real world: with more reference structures, we can explore a broader energy landscape and simulate various alternative pathways. Moreover, the methodological advance can enable or facilitate research in this area, as outlined in section 3.5.1.

3.1.3 Problem Description and Solution Proposal

Metadynamics, a method for accelerating rare events in molecular dynamics simulations, needs a few preselected degrees of freedom, collective variables, to drive the simulation in the right direction. Their choice depends on the chemical and physical characteristics of the simulated process and becomes more difficult for complex processes like protein folding.

Path or property map collective variables show promising results even in these cases. They are based on the comparison to a set of predefined reference structures, snapshots of the system throughout the process. Clearly, the advantage of enhanced sampling comes at an increased computational cost. With property map collective variables in particular, the large number of needed MSD computations between the current structure and the reference structures can take a substantial portion of the computation time. To analyze that, we ran metadynamics simulations with Gromacs as the MD tool and Plumed as the MTD tool on two molecular systems: a small, 24-atom cyclooctane and a larger, 304-atom Trp-cage protein. We ran both with one MPI process and one OpenMP thread; for further computational details see section 3.4.3. We profiled the simulation with the intrinsic detailed timers available in Plumed logging and with Intel VTune Profiler. Table 3.1 summarizes the percentage of computation time spent in various metadynamics-related parts of the code. In the case of cyclooctane, the huge cost of metadynamics is attributed to two factors: the small cost of the molecular dynamics simulation due to the small size of the molecule, and no neighbor list due to abrupt changes in the molecule's structure. The mini-protein Trp-cage presents a more common case, but even with a middle-sized neighbor list, distance computations take almost half of the computation time.

Table 3.1: Percentage of computation time spent in metadynamics (MTD), CV evaluation (CV), distance computation (MSD), and rotation matrix computation (DSYEVR).

              MTD     CV      MSD   DSYEVR
cyclooctane   99.7%   98.5%   93%   70%
Trp-cage      53%     50%     43%   23%

Plumed, a standard software among metadynamics codes [201], implements Kearsley's method [89], which obtains the rotation matrix from the eigendecomposition of a quaternion-based matrix. The related matrix diagonalization is done by the LAPACK routine DSYEVR [23].


Figure 3.2: The original method computes the distances to many reference structures in every step.

Note that according to Table 3.1, the metadynamics-related computation spends a lot of time in this already highly optimized function. Parallel execution of Plumed, implemented through MPI, shows modest scalability; details are in section 3.4.5. It can help to some extent with larger molecules, but more processes do not bring any performance improvement with small molecules. Offloading to a GPU is far from trivial, as we outlined in [52]. Therefore, to speed up the computation, we propose to focus on modifying the whole method in a way that reduces the number of distance computations.

3.2 Model of Computation

We build the model of the original method for CV evaluation to emphasize the iterative computation of the bottleneck function. First, we explain through Algorithm 8 the relevant parts of the metadynamics-related computation, combining the methods in Algorithms 6 and 7. Second, we elaborate on the bottleneck function and the iteration. As Algorithm 8 and Figure 3.2 show, in order to compute the property map collective variables, we need to compute the distance between the current structure and all reference structures in the neighbor list. First, the translation vector, i.e., the center of mass, is calculated for x and a_i.


Algorithm 8 The model of computing property map collective variables
1: function compute_cvs(x, ref_structures)
2:   loop through ai in neighlist
3:     Raix ← compute_rotation_matrix(x, ai)
4:     distance ← Equation (3.3)
5:     derivatives ← Equation (3.4)

Shifting the structures does not pose any computational or numerical challenge; therefore, we omit it for simplicity from further equations and assume that all structures have been translated beforehand. Second, we need to find the rotation matrix R_{a_i x} that fits the reference structure a_i to the current structure x. Only then can we evaluate the distance (Equation (3.3)) and its derivatives with respect to the coordinates of the current structure (Equation (3.4)). Out of these three, the calculation of the rotation matrix takes the most time, thus being the bottleneck function.

D(x, a_i) = \sum_{j}^{N_{atoms}} w_j \| d_j \|^2, \qquad d_j = x_j - R_{a_i x} a_{ij}    (3.3)

where x are the coordinates of the current structure, a_i are the coordinates of the i-th reference structure, R_{a_i x} is the rotation matrix fitting the reference structure onto the current structure, and w is a set of displacement weights describing the contribution of an atom displacement to the total MSD.

\frac{\partial D(x, a_i)}{\partial x_k} = 2 w_k d_k + w'_k \left( \sum_j -2 w_j d_j \right) + \left( \sum_j -2 w_j d_j \times a_{ij} \right) \otimes \frac{\partial R_{a_i x}}{\partial x_k}    (3.4)

where w'_k is a set of alignment weights describing the contribution of an atom to the center of mass influencing the translation, and ⊗ is the element-wise multiplication.


Neighbor lists, not detailed in Algorithm 8, can reduce the number of distance computations in some cases. Instead of computing the distance between the current structure and all N reference structures, we take only the closest M reference structures and neglect the contribution of the farther ones. As the contribution to the CV value decreases exponentially with the distance, it is safe to assume that the accuracy is kept if the list indeed includes only the closest structures. The regular updates of the neighbor list every L steps work well with smooth processes, but L drops to 1 in the case of sudden changes. Nevertheless, even with neighbor lists, the number of bottleneck function calls remains rather high. In a usual microscopic step of the molecular dynamics simulation, M or even N distances are computed. Reducing this number while keeping the overall accuracy has been our goal in the development of an approximated model.

3.3 Approximated Model of Computation

In order to reduce the number of bottleneck function calls, we have devised an approximation that utilizes the contiguity of molecular dynamics simulations. As the molecular structure moves, rotates, and changes smoothly due to the tiny timestep in the integration scheme of the molecular dynamics simulation, the rotation matrices should also change their values only slightly each step. Therefore, the rotation matrices saved in one step should approximate the alignment between the reference structures and the current structure to a large extent in the next few steps, until the molecule changes its shape too much or moves too far.

We start as usual by computing the distance to all reference structures, see Figure 3.3. Then we introduce a close structure y and save the coordinates of the current structure and the rotation matrices R_{a_i y}. In the next steps, we reuse the saved rotation matrices and approximate R_{x a_i} ≈ R_{xy} R_{y a_i}, replacing expensive diagonalization with cheaper multiplication. We compute the distances and derivatives with modified equations, see Equations (3.5) and (3.6). We continue while the current structure x and the close structure y remain near each other. When their distance exceeds the threshold ε, we reassign y ← x and recompute accurately with the original Equations (3.3) and (3.4). Again, we save the rotation matrices and continue.


Figure 3.3: The close structure method usually computes expensively only the distance between the current and the close structure.

Algorithm 9 summarizes the procedure. The approximated model does not eliminate the loop over the reference structures, nor does it lower the number of iterations. Instead, it replaces the call to an expensive function with a cheaper approximation based on calculations from the previous iterations.

\tilde{D}(x, a_i) = \sum_{j}^{N_{atoms}} w_j \| \tilde{d}_j \|^2, \qquad \tilde{d}_j = x_j - R_{xy} R_{a_i y} a_{ij}    (3.5)

where y is the close structure, R_{xy} is the rotation matrix that fits the close structure to the current structure, and R_{a_i y} is the saved rotation matrix that fits the close structure to the reference structure.

\frac{\partial \tilde{D}(x, a_i)}{\partial x_k} = 2 w_k \tilde{d}_k + w'_k \left( \sum_j -2 w_j \tilde{d}_j \right) + \left( \sum_j -2 w_j \tilde{d}_j \times R_{a_i y} a_{ij} \right) \otimes \frac{\partial R_{xy}}{\partial x_k}    (3.6)


Algorithm 9 Approximated model of computing the collective variables with the close structure
1: function compute_cvs(x, ref_structures)
2:   Dxy, Rxy, ∂Rxy/∂x ← compute_distance(x, y)
3:   if Dxy > ε then
4:     loop through ai in neighlist
5:       Rxai ← compute_rotation_matrix(x, ai)
6:       save Rxai
7:       distance ← Equation (3.3)
8:       derivatives ← Equation (3.4)
9:   else
10:    loop through ai in neighlist
11:      Rxai ← RxyRyai
12:      distance ← Equation (3.5)
13:      derivatives ← Equation (3.6)
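A minimal sketch of the control flow of Algorithm 9 follows. The names and the FitFunction callback are hypothetical; the expensive fit (a Kabsch or Kearsley routine) and a cache pre-sized to the number of reference structures are assumed to be provided by the caller.

#include <Eigen/Dense>
#include <functional>
#include <vector>

// Cached data for the close structure y: its coordinates and the saved rotation
// matrices R_{y a_i} to every reference structure (one entry per reference).
struct CloseStructureCache {
  Eigen::Matrix3Xd y;
  std::vector<Eigen::Matrix3d> r_ya;
};

// Type of the expensive fit routine (e.g., a Kabsch or Kearsley implementation):
// fit(to, from) returns the rotation R with R * from approximately equal to to.
using FitFunction =
    std::function<Eigen::Matrix3d(const Eigen::Matrix3Xd&, const Eigen::Matrix3Xd&)>;

// Sketch of the close structure idea: return R_{x a_i} for all reference structures,
// using the approximation R_{x a_i} ~ R_{x y} R_{y a_i} while x stays within eps of y.
std::vector<Eigen::Matrix3d> rotations_to_references(
    const Eigen::Matrix3Xd& x, const std::vector<Eigen::Matrix3Xd>& refs,
    CloseStructureCache& cache, double eps, const FitFunction& fit) {
  std::vector<Eigen::Matrix3d> r(refs.size());
  const Eigen::Matrix3d r_xy = fit(x, cache.y);
  const double d_xy = (x - r_xy * cache.y).squaredNorm() / x.cols();
  if (d_xy > eps) {
    // Reassignment step: recompute all rotations expensively and refresh the cache.
    cache.y = x;
    for (std::size_t i = 0; i < refs.size(); ++i) {
      r[i] = fit(x, refs[i]);
      cache.r_ya[i] = r[i];  // with y == x, R_{y a_i} equals the freshly computed R_{x a_i}
    }
  } else {
    // Usual step: one expensive fit (to y) plus cheap matrix multiplications.
    for (std::size_t i = 0; i < refs.size(); ++i) r[i] = r_xy * cache.r_ya[i];
  }
  return r;
}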

The neighbor list updates, not detailed in Algorithm 9 for simplicity, can be employed in the usual manner alongside the close structure approximation. In the close structure reassignment steps, we accurately recompute the rotation matrices for all reference structures. During the NL update, we select the closest structures according to an approximated distance. However, as the current structure is at most ε nm² away from the close structure, the approximated rotation matrices for all reference structures should closely resemble the accurate ones. This approximated model of distance computation can be applied to problems with other distance functions. The first option includes variants of MSD that calculate the rotation matrix but then do not use the Euclidean distance, for example [90]. The second, more general option would include distance computations that incorporate an element analogous to the rotation matrix: something demanding to calculate, but easy to approximate with information from the previous steps. With the close structure approximation, we managed to reduce the number of distance computations in almost all steps from M in the original model to 1.

We compare the theoretical performance of the original and the approximated model thoroughly in the next section.

3.4 Accuracy and Speedup

We start this section by comparing the number of expensive MSD computations required in the original and the close structure method. After that, we describe the implementation, datasets, and details of computer experiments. We have evaluated the theoretical and practical speed-up in [146] and the accuracy in [144]. Here we present the main findings with further argumentation based on additional experiments and more detailed explanation of the chemical background.

3.4.1 Number of MSD Computations

We have developed the close structure method with the aim of reducing the number of expensive MSD computations. We consider an MSD computation expensive if it involves the calculation of the rotation matrix by matrix diagonalization as in [89] or by singular value decomposition as in [88]. Therefore, all distance computations in the original method are considered to be expensive. The close structure method cheaply approximates the rotation matrix in most cases, replacing costly matrix diagonalization with a multiplication.

In the original method, in a usual step, we need M (the size of the neighbor list, or M = 0 in the case of no neighbor list) distance computations. In a neighbor list update step, every L steps (L = 1 in the case of no neighbor list), we compute N (the number of reference structures) distances. On average, we need M + (N − M)/L expensive distance computations per step.

In the close structure method, in a usual step, we need only one MSD computation; the rest is approximated. In a close structure reassignment step, we expensively recompute the rotation matrices for all N reference structures. On average, we need 1 + N/K expensive distance computations per step, where the close structure is reassigned (on average) every K steps.


Clearly, the acceleration of distance computations depends on whether M + (N − M)/L > 1 + N/K. For the simulations with the neighbor list, we can assume M ≪ N. Moreover, the size of the neighbor list M usually corresponds to its update stride L. The relation between N and K depends on the threshold ε, the molecular system, and its movements. For the simulations without the neighbor list, the comparison reduces to N > 1 + N/K, which clearly holds as K ≫ 1. Therefore, the number of distance computations is reduced approximately by a factor on the order of N. For the simulations with the neighbor list, the specific values of M, L, N, and K determine the acceleration. If L < K, then generally (N − M)/L > N/K. If N/K is close to one, the number of distance computations can be reduced by up to a factor on the order of M.
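These per-step averages are easy to check with a few lines of arithmetic; the sketch below simply evaluates the two expressions for the cyclooctane values used later in section 3.4.3 and 3.4.4 (N = 512, no neighbor list, K = 10 000).

#include <cstdio>

// Average number of expensive MSD computations per step in the original method
// (M distances per usual step, N distances every L-th step).
double original_calls(double m, double l, double n) { return m + (n - m) / l; }

// Average in the close structure method (one expensive fit per step, all N
// recomputed every K-th step on average).
double close_structure_calls(double n, double k) { return 1.0 + n / k; }

int main() {
  // Cyclooctane values from the text: N = 512, M = 0, L = 1 (no neighbor list),
  // K = 10000 for eps = 0.01 nm^2.
  std::printf("original:        %.2f per step\n", original_calls(0, 1, 512));            // 512.00
  std::printf("close structure: %.2f per step\n", close_structure_calls(512, 10000));    // ~1.05
  return 0;
}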

3.4.2 Implementation

To implement the close structure method, we modified Plumed [201], a standard C++ software package for metadynamics. It works as a plugin to common molecular dynamics codes like Gromacs [74, 2], NAMD [153], or LAMMPS [158]. We changed two classes: PathMSDBase, which represents path or property map collective variables, and RMSD, which encompasses (R)MSD computation. The code is available at https://github.com/jpazurikova/plumed2/ and we are currently in the process of integrating it into the next version (v2.4) of Plumed. For the accuracy evaluation, we ported these changes into Plumed v2.1 combined with Gromacs v4.5.7. For the performance evaluation, we combined Gromacs v5.1.4 with Plumed v2.3. In addition to the implementation of the close structure method, we also optimized the code. As the close structure method had diminished the cost connected with matrix diagonalization, new bottlenecks appeared, such as the loop calculating the derivatives. We eliminated these with simple optimizations: enforcing the vectorization of loops and inlining of template functions. For a meaningful comparison of the performance of the original and modified codes, we applied the same level of code optimization to the CV evaluation in the original code as well.
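As an illustration of the kind of optimization meant here (not the actual Plumed code), a hot loop can be marked for vectorization with an OpenMP SIMD pragma, assuming a compiler with OpenMP 4.0 support:

#include <cstddef>
#include <vector>

// Accumulate weighted squared displacements (a simplified, per-atom scalar version of
// the sum in Equation (3.3)). The pragma asks the compiler to vectorize the loop;
// the reduction clause keeps the accumulation correct.
double weighted_square_norm(const std::vector<double>& d, const std::vector<double>& w) {
  const std::size_t n = d.size();
  double sum = 0.0;
  #pragma omp simd reduction(+ : sum)
  for (std::size_t j = 0; j < n; ++j)
    sum += w[j] * d[j] * d[j];
  return sum;
}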


We did not continue with more aggressive optimizations, such as changes to data structures (unfortunately, Plumed uses an array-of-structures design for its data structures, instead of the more easily vectorizable structure-of-arrays). We aim to include our method in the official Plumed software, where software design takes priority over performance in order to keep the code maintainable and extensible. We did include these aggressive optimizations in the prototype code when comparing the GPU implementation of the close structure method in [52]; their influence on performance is evaluated in section 3.4.5.

3.4.3 Datasets and Computational Details

We tested and evaluated our method on two molecular systems: a non-symmetric trans,trans-1,2,4-trifluorocyclooctane (referred to as cyclooctane throughout the work) and the Trp-cage miniprotein (construct TC5b, PDB ID: 1L2Y) [126]. The code was compiled with the Intel compiler v17 and the AVX_256 SIMD instruction set. All performance evaluation experiments were executed on one machine with 2x 8-core Intel Xeon E5-2650 v2 2.6 GHz; even multiple MPI processes were assigned to the same machine.

Cyclooctane

Cyclooctane is a small, 24-atom molecule that forms different ring conformations (crown, boat, and boat-chair) separated by high-energy barriers, see Figure 3.4. Transitions happen rarely, but quickly; thus it would require a long molecular dynamics simulation to observe them.

Figure 3.4: A ring of cyclooctane and its most common conformations. Taken from [5].


Figure 3.5: Visualization of three-dimensional Isomap projections of reference structures generated by Brown et al. [28]

Biasing, such as metadynamics, successfully accelerates these rare events. Brown et al. generated 8375 conformations of the molecule by a systematic geometry-building algorithm [28]. Then they reduced this high-dimensional space with Isomap [197] into three projections, see the visualization in Figure 3.5. We used these three low-dimensional projections as properties for the property map collective variables. For reference structures, we selected 512 of the 8375 conformations by a cut-off filter. We did not employ a neighbor list; distances to all reference structures were computed in every step. We set up the simulation as described in [144] and [188]. The molecule was modeled by the General AMBER Force Field (GAFF) [207] and partial atomic charges were calculated by the RESP method [18]. We simulated in vacuum by stochastic dynamics; supplementary material C in [144] shows that the chosen integrator had almost no influence on the energy landscape.

We have chosen cyclooctane for testing and evaluation because of the high computational cost of the metadynamics-related code. As mentioned before (Table 3.1), metadynamics takes 99.7% of the simulation computation time and MSD calculation takes 93%. Therefore, it would greatly benefit from any acceleration of the distance computation.

Trp-cage

Trp-cage is a small, 304-atom protein, see Figure 3.6. It diffuses between its conformations and folds relatively slowly, more due to the complexity of the free energy surface than due to the heights of individual energy barriers [79, 87]. We obtained reference structures from a 50 ns parallel tempering simulation at 19 temperatures (280.0–502.3 K) after short minimizations and equilibrations. Structures sampled at 502.3 K (5001 structures) were pooled, and 2120 reference structures were selected out of the 5001 using a simple cut-off filter. As for cyclooctane, we reduced them with Isomap to two-dimensional projections and used these values as properties in the property map collective variables. As Trp-cage moves slowly even during the conformational changes, we were able to apply the neighbor list without a notable loss in accuracy: we selected the 50 closest structures every 50 steps. The NMR structure of the Trp-cage mini-protein (PDB ID: 1L2Y) [126] was used as a starting structure. After minimization and equilibration, it was simulated in implicit solvent modeled by the generalized Born model with the hydrophobic solvent-accessible surface area term and the Still algorithm. Bonds were constrained using the LINCS algorithm [75], noncovalent interactions were treated without cut-offs, and the timestep was set to 2 fs. The temperature (300 K) was controlled by the Parrinello-Bussi thermostat [32]. We have chosen Trp-cage for testing and evaluation as it represents a more common biomolecular system with a still rather high metadynamics-related cost (43% of computation time in MSD computation). In order to reduce the cost of molecular dynamics, we simulated it in an implicit solvent, considering the influence of the solvent without the addition of explicit water molecules. This approach might affect the correspondence with reality, as discrepancies in the probability of two major folding pathways have been discovered [87].


Figure 3.6: A molecule of Trp-cage. Taken from PDB [147].

However, as we do not aim to predict the free energy exactly, it suits our need to evaluate the computational demands and compare the landscapes between the two simulation methods. The benefit of the acceleration of the distance computation would not reach the degree possible with cyclooctane; nevertheless, we could get closer to the computational speed of the MD simulation while keeping the advantage of MTD.

3.4.4 Theoretical Speedup

First, we analyze the maximal theoretical speed-up of the function computing distances. Then, we assess the speed-up of the whole simulation using Amdahl's law. For cyclooctane, N = 512, M = 0, L = 1. For Trp-cage, N = 2120, M = 50, L = 50. Values of K are difficult to predict. They depend greatly on the threshold ε, the molecular system and its movements during the simulation time, yielding differences of orders of magnitude. Even for the same system, the average K differs with various starting structures, longer simulation times, or other simulation parameters. Table 3.2 shows the values of K for simulations of 100 000 steps. Here and in the experiments evaluating the practical speed-up, we apply ε = 0.01 nm² and ε = 0.001 nm².


Table 3.2: Values of the current-close structure distance threshold ε influence K, the average number of steps between the reassignment of the close structure.

ε (nm²)        0.001   0.005    0.01      0.05
cyclooctane    370     4000     10 000    100 000
Trp-cage       55      500      3000      20 000

As we have shown in [144], even ε = 0.05 nm² achieves reasonable accuracy. For cyclooctane, we need in each step on average

∙ 512 MSD computations with the original method;

∙ 1 + 512/10000 = 1.05 with the close structure method and ε = 0.01 nm²;

∙ and 1 + 512/370 = 2.38 MSD computations with the close structure method and ε = 0.001 nm².

The theoretical speed-up for MSD computation reaches 512/1.05 = 488 and 512/2.38 = 215, respectively. For Trp-cage, we need in each step on average

∙ 50 + (2120 − 50)/50 = 91.4 MSD computations with the original method;

∙ 1 + 50/50 + 2120/3000 = 2.7 with the close structure method and ε = 0.01 nm²;

∙ and 1 + 50/50 + 2120/55 = 40.5 with the close structure method and ε = 0.001 nm².

The theoretical speed-up for MSD computation reaches 91.4/2.7 = 34 and 91.4/40.5 = 2.3, respectively. Clearly, the acceleration of MSD computation can speed up the whole simulation only to some extent, as it takes only a portion of the computation time. Table 3.1 shows what percentage of time is spent in metadynamics, in CV evaluation, in MSD computation, and in matrix diagonalization.


For our purposes here, the numbers regarding the MSD computation suffice: 93% for the cyclooctane simulation, 43% for the Trp-cage simulation. We calculated the speed-up of the whole simulation according to Amdahl's law, see Equation (3.7). Table 3.3 summarizes the theoretical speed-up of the whole simulation for various ε, i.e., various K influencing the maximal speed-up of MSD computation, see Table 3.2. We can observe that the whole simulation should accelerate more with higher ε due to less frequent close structure reassignments. However, notice the unintuitively slow growth of the acceleration of the whole simulation compared to the corresponding speed-up of MSD computation. For example, for cyclooctane, moving from ε = 0.001 to ε = 0.01 more than doubles the speed-up of MSD computation (from 215 to 488), yet the acceleration of the whole simulation shifts only from 13.5 to 13.9. This is caused by the term p/s in the denominator of Amdahl's law.

S = \frac{1}{(1 - p) + \frac{p}{s}}    (3.7)

where p is the original portion of time spent in the given function and s is the speed-up of the accelerated function.
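Equation (3.7) is straightforward to evaluate; for instance, the cyclooctane and Trp-cage entries of Table 3.3 for ε = 0.01 nm² follow from p = 0.93, s ≈ 488 and p = 0.43, s ≈ 34, respectively. A minimal check:

#include <cstdio>

// Amdahl's law, Equation (3.7): overall speed-up when a fraction p of the run time
// is accelerated by a factor s and the remaining (1 - p) is unchanged.
double amdahl(double p, double s) { return 1.0 / ((1.0 - p) + p / s); }

int main() {
  // Cyclooctane: 93% of time in MSD computation, MSD speed-up ~488 (eps = 0.01 nm^2).
  std::printf("cyclooctane: %.1f\n", amdahl(0.93, 488.0));  // ~13.9
  // Trp-cage: 43% of time in MSD computation, MSD speed-up ~34.
  std::printf("Trp-cage:    %.2f\n", amdahl(0.43, 34.0));   // ~1.72
  return 0;
}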

Table 3.3: Speedup for the whole simulation for various ε and thus various maximal speed-ups of MSD computations

ε (nm²)        0.001   0.005   0.01    0.05
cyclooctane    13.5    13.88   13.9    13.92
Trp-cage       1.3     1.67    1.72    1.73

The theoretical speed-up only shows the limits for acceleration. We can expect a speed-up of at most one order of magnitude for the cyclooctane simulation. For Trp-cage, due to the cost of the molecular dynamics part, we cannot even achieve a twofold speed-up.

3.4.5 Practical Speedup

In this subsection, we evaluate the actual speed-up of the simulations and compare it to the theoretical one. Moreover, we analyze whether the close structure method interferes with the scalability of parallel execution.

We ran simulations without metadynamics (Gromacs), with metadynamics and the original method for MSD computation, code optimizations included (Gromacs+Plumed), and with metadynamics modified with the close structure method and code optimizations (Gromacs+modified Plumed). Furthermore, we executed them for various numbers of MPI processes and OpenMP threads. As the speed measure, we took ns/day as stated by Gromacs in its logs. As the reference point for the speed-up, we considered Gromacs with the original Plumed run with one MPI process and one OpenMP thread. Out of four runs for each variant and MPI/OpenMP configuration, we consider the minimal running time, i.e., the highest speed, to eliminate random interference from the operating system's background activity. We evaluated the speed-up on one core with the thresholds ε = 0.01 nm², see the upper parts of Figures 3.7 and 3.8, and 0.001 nm², see Figures 3.9 and 3.10. Cyclooctane simulations with the close structure method almost reached the maximal theoretical speed-up with both thresholds: 12.6 compared to the theoretical 14 for ε = 0.01, and 12 compared to the theoretical 13 for ε = 0.001. As expected, the performance decreases with a lower threshold. Although we did not reach the performance of the MD simulation without MTD, due to Plumed overhead like initialization, logging, Gaussian hill addition and CV evaluation, we managed to reduce the immense overhead of metadynamics with the original method by an order of magnitude. Trp-cage simulations with the close structure method achieved a modest speed-up of 1.4 for ε = 0.01, compared to the theoretical 1.7. In the case of ε = 0.001, the performance was the same as with the original method. As expected, the performance decreases with a lower threshold as K drops from 3000 to 55. However, the practical and theoretical speed-ups differ more than in the cyclooctane simulation. We attribute this to a smaller portion of computation time spent in the matrix diagonalization function, as shown in Table 3.1. The distance computation takes 43% of the computation time; out of that, only half is spent in diagonalization, as opposed to three-quarters in the case of cyclooctane. The rest is taken by the computation of the distance and especially the derivatives, which scale with the size of the molecule.


Figure 3.7: Speedup and strong scalability of cyclooctane simulations, the close structure method with ε = 0.01 nm². Notice the logarithmic scale of the x axis; the reference point is the blue bar with the original method. The gray bar with molecular dynamics simulation without metadynamics puts into perspective how metadynamics increases the cost.

Figure 3.8: Speedup and strong scalability of the Trp-cage simulations, the close structure method with ε = 0.01 nm². The reference point is the blue bar with the original method. The grey bar with molecular dynamics simulation without metadynamics puts into perspective how metadynamics increases the cost.

Figure 3.9: Speedup and strong scalability of cyclooctane simulations, the close structure method with ε = 0.001 nm2. Notice the logarithmic scale of x axis, the reference point is the blue bar with the original method. The grey bar with molecular dynamics simulation without metadynamics puts into perspective how metadynamics increases the cost.

We reduced the number of calls to DSYEVR with the close structure method, but the other half of the computation remained unchanged: both Gromacs+Plumed and Gromacs+modified Plumed contain the same code optimizations. Nevertheless, with ε = 0.01, we got much closer to the performance of Gromacs alone, thus gaining the advantage of enhanced sampling for only a small cost. The lower parts of Figures 3.7 and 3.8 show the strong scaling for various combinations of MPI processes and OpenMP threads. First, compare the performances with the same total number of threads, i.e., 1/4 with 2/2, 1/8 with 2/4, and 1/16 with 2/8. In all cases, for the original method, two MPI processes perform better than one process with twice as many threads. The reason stems from the implementation of the CV calculation in Plumed. The loop over the reference structures in the neighbor list (line 2 in Algorithm 8) is parallelized only with MPI. The second process takes half of the demanding computations and brings an improvement in performance. As the close structure method accelerates the computation within the loop over the reference structures, the second MPI process does not bring any advantage.


Figure 3.10: Speedup and strong scalability of the Trp-cage simulations, the close structure method with ε = 0.001 nm2. The reference point is the blue bar with the original method. The gray bar with molecular dynamics simulation without metadynamics puts into perspective how metadynamics increases the cost.

In the case of Trp-cage, it even slows down the computation due to the communication and synchronization cost. The decreasing speed with more resources (8 total threads for cyclooctane, 16 for Trp-cage) suggests that the overhead caused by parallelization exceeds its benefits. Second, compare the performances for one MPI process, i.e., 1/4, 1/8, and 1/16. Any speed-up here can come only from faster molecular dynamics. For the cyclooctane simulations, there is so little computation that a single core suffices, thus the performance stays about the same for all three variants. For the Trp-cage simulation, some improvement appears, as molecular dynamics requires more computation due to the protein's size. However, it cannot sufficiently saturate 16 threads, thus the performance drops. The simulations with the close structure method also exhibit this trend. Overall, the close structure method reduces the gain from another MPI process but does not interfere with the general scalability behavior.


3.4.6 Accuracy Evaluation

The close structure method introduces an approximation into the MSD computation. The changes in distance values affect the values of the property map collective variables, the height of the Gaussian hills representing the bias potential, and, by extension, the trajectories of the atoms. We need to evaluate whether these changes mean some loss of accuracy or whether we are simulating the same thing as before. We compared the simulations done with the original method and with the close structure method. Due to the chaotic nature of molecular dynamics simulations, even tiny changes caused by the close structure approximation result in significant changes in the trajectory. Therefore, we could not just literally compare the trajectories; they were different. However, the main goal of molecular dynamics and metadynamics simulations is to explore the energy landscape. If the two simulations sampled the same landscape, we can say that the accuracy has been sustained. We compared:

1. collective variables by the original and close structure method on the same set of x and reference structures,

2. visualizations of the energy landscapes based on collective variables,

3. essential coordinates [3] of both trajectories.

We elaborate on each approach in the following sections.

CV Values on the Same Trajectory

This approach shows how the different distance computation method influences the values of the collective variables. We ran 50-ns simulations with the close structure method and various ε. Then, with the Gromacs option -rerun, we were able to evaluate the values of the collective variables with the distance computed by the original method. In other words, we evaluated Equation (3.2) with D(x, a_i) computed first with Equation (3.5), then with Equation (3.3), for the same set of x and a_i. This way we obtained two vectors of corresponding CV values over many time steps. The Pearson correlation coefficient between them was R > 0.9999 for all cyclooctane simulations, which is effectively a perfect correlation considering the numerical precision. For the Trp-cage simulation, we got R = 0.98; after further analysis, we attributed the error to the neighbor list.

Figure 3.11: The values of collective variables for the simulation with the original method (left) and with the close structure method (right) for MSD computation. Clearly, they differ at corresponding time points.
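The comparison itself is a plain Pearson correlation over the two CV series; a minimal sketch:

#include <cmath>
#include <vector>

// Pearson correlation coefficient between two equally long series of CV values,
// e.g., the CVs from the rerun with the original method and with the close structure.
double pearson(const std::vector<double>& a, const std::vector<double>& b) {
  const std::size_t n = a.size();
  double ma = 0.0, mb = 0.0;
  for (std::size_t i = 0; i < n; ++i) { ma += a[i]; mb += b[i]; }
  ma /= n; mb /= n;
  double cov = 0.0, va = 0.0, vb = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    cov += (a[i] - ma) * (b[i] - mb);
    va += (a[i] - ma) * (a[i] - ma);
    vb += (b[i] - mb) * (b[i] - mb);
  }
  return cov / std::sqrt(va * vb);
}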

Visual Comparison of the Energy Landscape

This approach shows whether the collective variables computed in both simulations explored the energy landscape to the same extent. Due to the chaotic character of molecular dynamics simulations, the structures x at specific time points differ between the two simulations. Therefore, the values of the collective variables at the same time points do not correspond and are not correlated, see Figure 3.11. However, if we take the whole sets of these values from all time points in complete simulations and disregard the information about time, they should map the same space. Because the property map CVs employ the low-dimensional projections found by Isomap, the values of the collective variables represent the values of the low-dimensional projections of a given x. Thanks to their low number (three for cyclooctane and two for Trp-cage), we can visualize them, see Figures 3.12 and 3.13. For cyclooctane, the 3D visual representations are nearly indistinguishable. The Trp-cage energy landscapes also show a great resemblance, especially around the global minimum [−0.5; 0.0]. The visible differences are caused by slightly different conformations in the trajectories being mapped to the same points in the projected space. We conclude that simulations with both methods for MSD computation explore a very similar energy landscape.


Figure 3.12: Visualizations of cyclooctane energy landscape explored by metadynamics simulations with the original method (A) and the close structure method (B), ε = 0.005 nm2, for MSD computation. Created with tools Mayavi2, vrml2pov, and PovRay.


Figure 3.13: Visualizations of Trp-cage energy landscapes explored by metadynamics simulations with the original method (A) and the close structure method (B), ε = 0.01 nm², for MSD computation. Created with Metadyn View, available at http://metadyn.vscht.cz.

Trajectory Comparison by Essential Coordinates

As mentioned before, the literal comparison of the trajectories has no meaning, since they are different due to small changes of the CVs propagating all the way up to the atoms' positions. However, Amadei et al. [3] proposed a linear dimensionality reduction technique for trajectories, and de Groot et al. [63] studied it further. We adopt these techniques to compare the so-called "essential" motions, i.e., the true conformational behavior. This way, we can find out whether the trajectories include the same basic movements (even if not happening at the same time), disregarding harmonic oscillations. First, according to Amadei [3], we computed the covariance matrix of the coordinate motion

C = \mathrm{cov}(x) = \left\langle (x - \langle x \rangle)(x - \langle x \rangle)^T \right\rangle    (3.8)

where ⟨⟩ denotes the average over the whole trajectory.


Then, we diagonalized the matrix C = UΛU^T with real eigenvalues Λ and the matrix of orthogonal eigenvectors U. Amadei [3] shows that after sorting the eigenvalues in descending order, only a limited number n of the most significant ones represent the essential motion. The remaining ones correspond to harmonic oscillations with a narrow distribution and can be neglected for the purpose of trajectory comparison. Let us denote by u_i and v_i the n normalized essential eigenvectors corresponding to trajectories x and y. The trajectories can be compared by two characteristics:

\mathrm{overlap}(x, y) = \frac{1}{n} \sum_{i,j=1}^{n} (u_i v_j^T)^2    (3.9)

which yields the value of 1 if the vector sets are the same, even in a different order, and decreases to 0 when they differ, hence focusing on similarities, and

\mathrm{penalty}(x, y) = \frac{1}{n^2} \sum_{i,j=1}^{n} |i - j| \, (u_i v_j^T)^2    (3.10)

which is 0 for identical eigenvector sequences and increases when similar eigenvectors are farther from each other in the ordered sequence, hence emphasizing the differences between the trajectories. De Groot concludes [63] that trajectories can be considered similar if the overlap is above 0.4 and the penalty below 1.2 simultaneously. We evaluated these characteristics for the trajectories of the simulations with the original method and with the close structure method. Table 3.4 summarizes the computed overlap and penalty for various coverages of the motions in the trajectories (the percentage of the sum of eigenvalues). Relevant studies [3, 202, 63] conclude that the essential movements sum up to 70–90% of the eigenvalues' sum, consistent with our findings. If we increase the coverage, we get an even higher overlap, meaning that not only the essential movements correspond, but the minor oscillations match as well. With cyclooctane, we can observe a destabilizing effect of close structure reassignments with ε = 0.01 nm². They cause a slight discontinuity in the simulation, resulting in a worse overlap.


Table 3.4: Comparison of original and close structure trajectories. “cvrg” is percentage of the sum of eigenvalues used for comparison, “overlap” and “penalty” are Eq. 3.9 and Eq. 3.10 evaluated for the given eigenvalue coverage.

                cyclooctane 50 ns                  Trp-cage 200 ns
ε (nm²)         0.01              0.001            0.01
cvrg       overlap  penalty  overlap  penalty  overlap  penalty
60%        0.34     0.10     0.53     0.21     0.33     0.11
70%        0.44     0.14     0.53     0.16     0.54     0.20
80%        0.42     0.11     0.56     0.16     0.63     0.21
90%        0.44     0.13     0.56     0.12     0.73     0.21
95%        0.50     0.14     0.64     0.14     0.69     0.15
99%        0.66     0.16     0.74     0.16     0.80     0.15

With ε = 0.001 nm², the more frequent reassignments minimize the discontinuities, getting close to the essential motions of the original trajectory. The results for the Trp-cage protein concur with the previous findings: a good match starting at 70% coverage, improving with increasing percentage. Again, even the minor oscillations match. Altogether, we conclude that the computation with the close structure yields trajectories that explore the same energy surface as the original Plumed method. According to these results, we can recommend tuning ε so that the close structure reassignments happen every few hundred steps, striking a balance between the accuracy and the speed-up.
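Given the n essential eigenvectors of both trajectories, Equations (3.9) and (3.10) reduce to a double loop over scalar products; a sketch assuming the Eigen library:

#include <cstdlib>
#include <Eigen/Dense>

// Overlap (Equation (3.9)) and penalty (Equation (3.10)) between two sets of n
// essential eigenvectors, stored as the rows of u and v (each row one normalized
// eigenvector of the respective covariance matrix).
void overlap_and_penalty(const Eigen::MatrixXd& u, const Eigen::MatrixXd& v,
                         double& overlap, double& penalty) {
  const int n = static_cast<int>(u.rows());
  overlap = 0.0;
  penalty = 0.0;
  for (int i = 0; i < n; ++i) {
    for (int j = 0; j < n; ++j) {
      const double dot = u.row(i).dot(v.row(j));  // u_i v_j^T
      overlap += dot * dot;
      penalty += std::abs(i - j) * dot * dot;
    }
  }
  overlap /= n;                            // 1/n   * sum_ij (u_i v_j^T)^2
  penalty /= static_cast<double>(n) * n;   // 1/n^2 * sum_ij |i - j| (u_i v_j^T)^2
}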

3.5 Conclusion

We addressed the high computational overhead of the method accelerating the occurrence of rare events in molecular dynamics simulations. The original algorithm loops over many reference structures in each step and expensively calculates their distance to the current structure. Out of this calculation, finding a rotation matrix superimposing both structures to the closest fit takes a substantial part. We approximated it by introducing a close structure.


The loop still goes over many reference structures in each step, but instead of the expensive calculation, we cheaply approximate the vast majority of distance computations by reusing rotation matrices from previous timesteps. We evaluated both the theoretical and practical speed-up on two molecular systems [146]. The first one, the small molecule of cyclooctane, was chosen due to the high computational cost introduced by metadynamics (a slowdown of over 300 compared to the MD simulation without MTD). We managed to reduce this by one order of magnitude, virtually reaching the assessed maximal possible speed-up. The second one, the miniprotein Trp-cage, represents a more common biomolecular system, where we can expect only a modest improvement in performance due to the high cost of MD. Nevertheless, with the acceleration brought by our method, we got closer to the performance of the MD simulation with the additional benefit of MTD. Naturally, we also evaluated the accuracy of our approach by comparing trajectories from both simulations [144]. Due to the chaotic nature of molecular dynamics simulations, a literal comparison is meaningless. However, with more sophisticated approaches, we showed that they explore the same energy landscape and thus reach the same level of accuracy. In this research problem, we modified the model by introducing a new element that trades a large part of the computational burden for saving data in memory and reusing it. This approach, well known in computer science, fits nicely into the problem of metadynamics simulations because of their "forgiveness" in terms of accuracy: the error caused by the integration scheme, force field approximation, and rounding exceeds the error caused by the close structure approximation, given that the threshold is chosen reasonably.

3.5.1 Future Work

Our method opens the possibility of further research in several areas. First, the concept of reusing a rotation matrix (or analogous data) in structure-structure distance computations can be applied to other distance computations, as outlined in section 3.3.


Second, the method can be applied to other enhanced-sampling methods combined with path or property map collective variables, such as umbrella sampling or steered MD, as in [25, 109, 137]. And third, the acceleration, especially for small molecules with abrupt changes such as drugs, makes it possible to run their metadynamics simulations much more cheaply. In addition to faster MTD simulations, new approaches can be applied. Instead of using the bias potential to push the simulation forward, we can use it as a correction to the interatomic potential, correcting the inaccuracy of the force field (as explained in section 1.1.1). If we incorporate the information from quantum mechanics regarding the small molecule into MD as the bias potential of MTD, we can correct the errors of MM force fields for small molecules. Representing the correction as the MTD bias potential would be highly impractical with the original method due to its high computational cost. Our method opens up the possibility of this approach.

4 Simplified Optimization Method in Atomic Charges Parametrization

In the third problem we have researched, we simplified the optimization method within the atomic charges parametrization. Atomic charges, theoretical real numbers, approximate the charge distribution around an atom; their values influence molecular movements and interactions in simulations and molecular properties in modeling. Ideally, we would always compute them by quantum mechanics methods. However, their extensive computational demands make that difficult, if not impossible, for larger molecules (a thousand atoms and more). Moreover, the atomic charges depend not only on the molecular topology, but also on its three-dimensional structure; thus they change with each conformational change. Therefore, fast methods for their computation are required. Empirical methods have been developed as a cheaper alternative. They incorporate parameters that should mimic the quantum computation. The process of obtaining the parameters, the parametrization, consists of the optimization problem of finding such parameters that determine charges as close as possible to the quantum charges from the training set. This optimization problem has long been approached by least squares minimization, although recently more advanced techniques like evolutionary or genetic algorithms have appeared. Modern heuristics should cope better with heterogeneous training sets as they search the whole multi-dimensional space. However, they bring along a substantial computational cost due to many iterations and an expensive fitness function evaluation that includes solving many systems of linear equations. As this highly optimized routine is difficult to accelerate, it is beneficial to look at the loops. The choice of a modern heuristic method is not backed by any argumentation in the literature. Therefore, we propose to first systematically analyze the optimization space and find the requirements for its search. Then we can devise a new (or modify an existing) method that would effectively solve the problem without superfluous bottleneck function evaluations.


This chapter continues as follows. First, we describe the problem: its background, the state-of-the-art approaches, their limitations, and the motivation to overcome them. Second, we build the model of the current solutions to emphasize the bottleneck. Third, we analyze the optimization problem and find its characteristics and the requirements for an effective search. We use these to simplify the model in the fourth section. Fifth, we implement both the original and our simplified model and evaluate their performance in accuracy and computational cost. And last, we summarize the main findings and suggest directions for future research.

4.1 Problem

In this section, we introduce atomic charges and the process of their computation, and briefly present where and why the state-of-the-art approaches fall short in solving the most challenging part.

4.1.1 Atomic Charges

Many biologically and chemically relevant processes at the atomic level happen because of electrons and their distribution. A change in the distribution can mean forming or breaking a bond between atoms, the basis of every chemical reaction. The interaction between electrons, either within bonds between atoms or in long-range electrostatics, determines the movements of atoms. Thus, an accurate representation of the electron distribution is essential for accurate modeling and simulation of atoms. The electron density distribution, based on the probability of the electrons' occurrence in molecular orbitals, describes reality quite precisely [106]. However, it has many weak points: a difficult representation, an intertwined quantum mechanics model, a high number of particles in the simulated system, and extensive computational requirements. Therefore, it is convenient to reduce the electron distribution of an atom into one real number, the partial atomic charge, specifically determined for the given atom within the given molecule [66]. This rough but extremely useful approximation has a vast number of applications [103]: molecular dynamics simulations with the molecular mechanics model (protein folding [148], virus behavior [216, 61]), drug design (docking [47], pharmacophore design [7], virtual screening [139]), prediction of chemical properties [204, 7], and others [198].


As a theoretical concept, atomic charges cannot be measured, but they can be computed. For their validation, we exploit the fact that the electron distribution determines several static properties of chemical compounds, e.g., the dissociation constant of an acid in solution [84]. Comparing the values of these properties computed with atomic charges against those measured in experiments can confirm that this approximation of the electron distribution concurs with the physics of reality [140]. Quantum mechanics (QM) models are considered to describe reality quite well, although at a high computational cost [106, ch.5.6]. As there are many ways to reduce a complex characteristic into a real number, there are many QM methods for computing atomic charges. They differ in three major components [82]: the QM theory, the basis set, and the population method. The QM theory provides a numerical method for solving the Schrödinger equation; the most common are Hartree-Fock (HF) [184] and B3LYP [19]. The basis set is a linear combination of functions that approximate molecular orbitals; for example, the STO-6G basis set consists of six Gaussian primitive functions. The population method partitions the density to particular atoms; the most common are Natural Population Analysis (NPA) [171], Mulliken Population Analysis (MPA) [124], and Atoms in Molecules (AIM) [10]. The QM method's name is composed of all three parts, e.g., B3LYP/6-311G/NPA denotes the method based on the B3LYP theory using the 6-311G basis set and Natural Population Analysis. Different combinations mean slightly different values of the atomic charge even for the same atom in the same molecule. The particular choice can be driven by the suitability for the application or compliance with the force field, although it will always contain a certain degree of arbitrariness [72, p.437]. Due to the computational demands of QM methods, e.g., 20 CPU days for the charge calculation of the protein ubiquitin [80], cheaper and faster empirical methods have been developed. These work with simpler physico-chemical equations and empirical constants, i.e., parameters. The charge of an atom within a molecule is determined by the parameters of the atom's type (specific for the atom's element and often its highest bond order), the distances to other atoms, and their charges. With the parameters' values known, the computation of atomic charges for all atoms in a molecule means only solving one system of linear equations.

atoms in a molecule means only solving one system of linear equations. Such fast computation of charges has made it possible to routinely use and recompute atomic charges in modeling and simulation tasks. Several equation systems have been devised, differing in the number of parameters and the ratios between components (EEM [123], SFKEEM [36], ABEEM [213], MPEOE [131], QEq [170], and many others); the most common is the Electronegativity Equalization Method (EEM) [123]. To find the parameters' values, we need a training set of molecules and their QM charges. Then, we calibrate the parameters to produce empirical charges as close as possible to the training QM charges. Ideally, in the parametrization, we would like to obtain parameters applicable to some type of molecules with high accuracy, i. e., the empirical charges would resemble the charges we would obtain by a long QM computation. This clearly assumes that the parameters together with the equation system somehow capture the principal components of the QM computation for the given molecules and approximate them with high accuracy when applied to a similar molecule. This generalization is, of course, possible only to some extent. And this extent is determined by the quality of the parametrization process: the choice of the equation system and calibration method, and the quality of the training set.

We would like to comment on the importance of the training set used in parametrization. Although these issues are out of the scope of this work, they are worth mentioning as they show the difficulties connected with the parametrization. As in any other algorithm depending on a training set, its quality determines the quality of the outcome. Therefore, we need to carefully select molecules for the training set. With molecules that have something in common, the parameters will capture the aspects of their similarities and work well. However, with a too narrow choice, we end up with too specific, less applicable parameters. Furthermore, the structures of the molecules have to be validated and checked for errors [166], as these can skew the parameters' values. And lastly, some attention must be paid to the QM computation: the selection of its components, the validation of molecular structures, and a careful choice of the software's parameters. The empirical charges computed with the trained parameters will "reproduce" the QM charges, so no matter how accurate the parametrization is, if the QM charges include some error, the empirical charges will not perform well either.


4.1.2 Electronegativity Equalization Method and its Parametrization

The Electronegativity Equalization Method [123], the most common empirical method, stands on the electronegativity equalization principle [175]. It states that when a molecule is formed, the electron distribution spreads around the atoms and their electronegativities (the ability to keep electrons near) equalize. This equalization, together with the charge conservation principle (the total charge of a molecule is constant) and a linear approximation of electronegativity, forms the system of linear equations shown in Equation (4.1). To compute charges, we use precomputed A, B, κ to find q_i for the atoms in an arbitrary molecule with the same atom types by solving the linear system. To parametrize, we use the QM charges q_i of molecules from the training set to find A, B, and κ in the calibration procedure, an optimization problem in many dimensions. The fitness function evaluation consists of two steps. First, the empirical charges determined by the given parameters are computed for each molecule in the training set. Second, an objective function such as the (squared) Pearson correlation coefficient or the root mean square deviation (RMSD) assesses how closely the empirical charges approximate the QM ones.

 −1 −1      2B1 κ/R1,2 ··· κ/R1,N 1 q1 −A1   κ/R−1 2B ··· κ/R−1 1  q   −A   2,1 2 2,N   2   2   . . . . .   .   .   ......  ·  .  =  .  (4.1)        R−1 R−1 ··· B   q  −A  κ/ N,1 κ/ N,2 2 N 1  N   N 1 1 ··· 1 0 −χ Q where i, j denote the given atom, q charge, R the interatomic distance, κ parameter

EEM parametrization (the whole process of finding new parameters, including creating the training set, choosing the equation system and the calibration method, and computing and validating the parameters) has been combined with various calibration approaches

(solely the optimization method used) to find the parameters' values: the least squares method [123], differential evolution [134], a genetic algorithm, or their combinations with local optimizations [119, 30, 36, 166].

The least squares method (LR) optimizes only through one dimension: κ. The parameters A and B are directly computed via least squares minimization. The method is extremely fast, but it fails for training sets that are heterogeneous in terms of molecular type variability [166]. Optimization methods searching through all dimensions should deal with such sets better. They employ modern heuristics searching the space globally and local optimization methods for polishing. Ouyang et al. applied differential evolution (DE) [134]; no local method was mentioned. Others [119, 36] applied a genetic algorithm (GA) and minimized the result using the simplex method [127]. Bultinck et al. also applied simplex to each member of the new population at every step [30]. However, they did not justify their choice of the calibration method and evaluated it only on small homogeneous sets. A systematic analysis of the problem and a comprehensive evaluation is missing in the literature.

4.1.3 Motivation

The motivation for accuracy is quite straightforward. High-quality parameters mean accurate charges, and accurate charges mean an increased accuracy of the models and simulations they take part in. Accurate models represent the physics of reality better, and conclusions drawn from them apply to a greater extent.

The motivation to keep the computational cost low is a bit more complex. As the parametrization needs to be done only once in a while, one could argue that its cost does not matter unless it is extensive. Nevertheless, we suggest that a low cost brings substantial advantages. First, even if we need to parametrize only rarely, with cheap computation we can run several computations with slightly different settings on the same training set and choose the best performing parameters. Thus, we can obtain parameters of higher quality. Second, we can parametrize more often. Especially with databases of validated molecules and computed QM charges for common ones, we could routinely compute parameters for more specific types of


molecules. Third, using computationally expensive methods when simpler and cheaper ones would suffice is a waste of resources and time. And last but not least, overcomplicated methods require more time to understand, implement, debug, extend, and tune for routine usage, all without increasing the accuracy or making the computation faster.

4.1.4 Problem Description and Solution Proposal

The choice of the calibration method influences the atomic charge parameters greatly. Current approaches have problems with accuracy (least squares and heterogeneous datasets) or with the computational cost (modern heuristics, sometimes combined with local search). However, there is a common source of these problems: no thorough analysis of the optimization space or of the requirements put on the calibration method. Not surprisingly, some attention paid to the characteristics of the problem can help us devise a method (or modify an existing one) that will effectively solve it with just the necessary computational complexity.

4.2 Model of Computation

We build the model of atomic charges calibration so that we can analyze the problem methodically. As the main underlying goal is to run computations faster, we want to emphasize the bottleneck. As mentioned before, the calibration is basically an optimization problem.

First, we specify the basic components as they are included in state-of-the-art heuristic methods. We omit least squares minimization from further analysis. Its limitations in accuracy have been documented in detail [166] and its cost is unbeatable. Much of the following also applies to least squares calibration, but the discussions about the iterative character of the optimization are relevant only for recent, advanced methods.

The decision variables corresponding to the EEM parameters (κ, A_i, and B_i for all atom types) are represented directly as a vector of real numbers. Their a priori known usual values constrain them softly, and we construct the initial solution with that in mind.


The fitness function has many problem-specific parts. It is evaluated indirectly in two steps: first, we compute the EEM charges for all molecules in the training set, and second, we compare them to the QM charges of those same molecules with the objective function. That is usually done by correlation coefficients (Pearson, Spearman) or distances (RMSD, Euclidean). Sometimes, weights are included to bias the objective function towards the most used atom types. Ideally, we would like the objective function to take into account several aspects: no negative correlations, many high squared correlations, and a low RMSD. Choosing the objective function based on one aspect can lead to overfitting, especially in the case of too many optimization iterations. For example, if we aim for the lowest possible RMSD, the method can worsen the correlation considerably just to get the RMSD lower by one hundredth. Generally, a low RMSD also implies a high correlation, but not vice versa. Because evaluating the fitness function means solving the linear system for each molecule in the training set and then computing the objective function, it is quite expensive. As it needs to be evaluated in every iteration of the method, it is the bottleneck of the whole optimization.

The search strategy of the state-of-the-art heuristic methods differs for genetic algorithms (mutations and local search for intensification, recombination for diversification) and for differential evolution (combined intensification and diversification in one evolution step).

Algorithm 10 shows the modern heuristics approach to atomic charges calibration. As we want to emphasize the bottleneck, Algorithm 11 describes the main loop of the optimization algorithm. We disregard the search strategies of the global methods for simplicity. The iteration here is the part of the method that evaluates the fitness function for a vector of parameters. First, we generate a new vector of parameters (line 2 in Algorithm 11). This can be done by evolution in DE, by recombination or mutation in GA, or by a step of a local optimization method like simplex, conjugate gradients, or NEWUOA, as described in Algorithm 10. Second, we solve the system of linear equations for each molecule in the training set to compute their empirical charges (line 3 in Algorithm 11). Third, we compare the empirical charges to the quantum charges from the training set using the objective function (line 4 in Algorithm 11). And last, we decide whether to save the current vector and whether to stop or continue (line 5). This view on the method emphasizes that the only expensive step here is the computation of the empirical charges, the first part of the fitness function evaluation.
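As an illustration of the cheap, second part of the fitness evaluation, the two objective functions used later in this chapter can be sketched as follows; eem, qm, and atom_types are assumed to be flat arrays over all atoms of the training set, not names used in the actual implementation.

import numpy as np

def pearson_r2(eem, qm):
    """Squared Pearson correlation between empirical and QM charges."""
    r = np.corrcoef(eem, qm)[0, 1]
    return r * r

def avg_rmsd_per_atom_type(eem, qm, atom_types):
    """Per-atom-type average RMSD, avg(RMSD_a)."""
    eem, qm, atom_types = map(np.asarray, (eem, qm, atom_types))
    rmsds = [np.sqrt(np.mean((eem[atom_types == t] - qm[atom_types == t]) ** 2))
             for t in np.unique(atom_types)]
    return float(np.mean(rmsds))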


Algorithm 10 Algorithm for EEM Parametrization with DE(MIN) or GA(MIN)
 1: function find_parameters(method)
 2:     population ← generate_random_population()
 3:     ∀ x ∈ population: R(x) ← evaluate(x)
 4:     if method is *MIN then
 5:         ∀ x ∈ (subset ⊂ population): x ← local_opt(x)
 6:     best ← find_the_best(population)
 7:     switch method do
 8:         case de: best ← de(population)
 9:         case ga: best ← ga(population)
10:     best ← local_opt(best)
11:     return best

 1: function evaluate(x)
 2:     ∀ molecule ∈ training_set: eem_charges ← eem(x)
 3:     R(x) ← compare(eem_charges, qm_charges)

 1: function de(population)
 2:     loop
 3:         r1, r2 ← select_random(population)
 4:         trial ← combine(r1, r2)
 5:         R(trial) ← evaluate(trial)
 6:         if method is *MIN then
 7:             trial ← local_opt(trial)
 8:         if R(trial) < R(best) then
 9:             best ← trial
10:     return best

 1: function ga(population)
 2:     loop
 3:         parents ⊂ population
 4:         parents ← local_opt(parents)
 5:         population ← generate_new_population(parents)
 6:         ∀ x ∈ population: R(x) ← evaluate(x)
 7:         this_generation_best ← find_the_best(population)
 8:         if R(this_generation_best) < R(best) then
 9:             best ← this_generation_best
10:     return best

Algorithm 11 The main loop of the calibration method
 1: loop through iterations of the global/local method
 2:     x ← generate_new_vector(population)
 3:     ∀ molecule ∈ set.molecules: EEM ← compute_charges(x)
 4:     R ← compute_objective_fcn(EEM, set.QM_charges)
 5:     decide_next_action(x, R)

4.3 Analysis of the Optimization Space and the Fitness Landscape

In this section, we analyze the optimization space and the fitness landscape of the calibration problem.

4.3.1 Optimization Space

The optimization space of atomic charges calibration has a few properties that imply difficulties in its search. It is continuous, constrained (but only softly; the constraints represent usual intervals, not hard, enforced limits), nonlinear, and multidimensional (2 × the number of atom types (A and B) + 1 (κ); a usual practical parametrization searches for 15-30 parameters). High-quality solutions do not have any special features or recognizable patterns to take advantage of as a priori knowledge. The problem is not decomposable due to the influences between parameters and charges. For a clearer explanation, let's take a methane molecule, CH4. The value of the hydrogen's parameter determines the value of its charge. Furthermore, the hydrogen's charge influences the charge of the carbon. During the calibration, with known QM charges, this influence manifests as a change in the value of the carbon's parameter. Therefore, it is not possible to find the ideal parameters for one atom


type by solving a separate subproblem. We must always search for all parameters as a whole because it is their specific combination that does or does not perform well.

Furthermore, the calibration has an indirect evaluation of the fitness function that includes solving many linear systems (scaling with the size of the training set), regardless of the particular choice of the objective function. Thus, the cost of each fitness function evaluation and the high number of evaluations needed during the calibration create a bottleneck. We have one characteristic that makes things easier: we do not search for the global minimum, a deep enough local one suffices for further applications.

4.3.2 Fitness Landscape

The fitness landscape is the surface in a multi-dimensional space where the heights are given by the fitness function value of a given vector. With two-dimensional vectors, it can easily be imagined as an earth-like landscape with smooth or rugged valleys, steep or slowly rising hills, deep trenches or wells, plateaus, and saddle points. Exploratory fitness landscape analysis [157, 210] has measures to capture the characteristics of landscapes for various optimization problems. It started with combinatorial problems [173]; variants for continuous problems appeared later [194, 121]. A better insight into the character of the fitness landscape and the optimization problem means we can apply a suitable method for its search according to general properties of algorithms [173] or detailed frameworks, such as [69, 78, 22].

Almost all measures for continuous landscapes work with a sequence of solutions. Due to the continuity, we had to sample the space. We applied two approaches on two datasets: the homogeneous set1 with eight atom types (17 parameters) and the heterogeneous set3 with 15 atom types (31 parameters), datasets DTP_small and CCD_gen from [166]. First, in random search, we generated a million-vector sample with Latin Hypercube Sampling (LHS) and kept the random ordering. We refer to this sample as RSS. And second, as the previous approach did not effectively sample deep minima, we generated a 1000-vector sample with LHS and then ran 1000 iterations of the local optimization method NEWUOA [160, 161] on each vector. We refer to these samples as LSS_i for i = 1 . . . 1000.


The number of iterations has been determined experimentally: more iterations fail to discover a deep minimum if it was not found by the 1000th iteration (although they can improve one already found). NEWUOA has been selected due to its easy application, good performance, and low number of fitness function evaluations, as these are expensive in our problem.
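A sketch of this sampling protocol using SciPy is shown below; since SciPy does not ship NEWUOA, the related derivative-free Powell method stands in for it, and fitness, lower, and upper are assumed to be the objective function and the usual per-parameter intervals.

import numpy as np
from scipy.stats import qmc
from scipy.optimize import minimize

def lhs_with_local_searches(fitness, lower, upper, n_vectors=1000,
                            max_evals=1000, seed=0):
    """Generate an LHS sample and run a capped local search from each vector."""
    dim = len(lower)
    sampler = qmc.LatinHypercube(d=dim, seed=seed)
    starts = qmc.scale(sampler.random(n=n_vectors), lower, upper)

    searches = []
    for x0 in starts:
        # Powell stands in for NEWUOA: both are derivative-free local methods.
        res = minimize(fitness, x0, method="Powell",
                       options={"maxfev": max_evals})
        searches.append((x0, res.x, res.fun))
    return searches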

Measures

We computed several measures to analyze the samples and describe the landscapes for two objective functions for the vector evaluation: the Pearson correlation coefficient R and the per-atom-type average RMSD, avg(RMSD_a), between the QM charges from the training set and the EEM charges computed with the parameters represented by the vector. We note here that for the sake of simplicity we talk about a minimization problem in both cases, even though searching for the highest possible R is a maximization problem (albeit easily converted to a minimization one by subtracting the correlation from 1). We were interested in

∙ locality (how close are objective values for two close vectors?),

∙ deep minima (how many are there?),

∙ distribution of fitness values (is the landscape rugged and steep or smooth and slow-changing?).

Locality describes to what extent close solutions have close objective function values. Usually, it is computed as a correlation between the solutions' distances and their objective values' distances. The many variants of locality-based measures differ in how the pairs of solutions are selected [173]: e. g., all solutions are compared to the global optimum (fitness-distance correlation, FDC), or solutions are compared in pairs within a random walk sequence (random walk correlation, RWC).

In the random search sample, we computed RWC(RSS), taking consecutive pairs in the random walk and their fitnesses. In the random sample with local searches, we computed several measures based on locality. First, we computed FDC(LSS_i) for all i, taking that search's minimum as the global one. Then we computed an average of these localities, i. e.,


avg(FDC(LSS_i)), further referred to as eL_LSS. Second, we compared the average localities within "good" and "bad" searches, divided by whether or not they found a deep minimum. We refer to those as eL_good and eL_bad. Third, we computed the locality L_LSS-whole = FDC(LSS_1...1000) within the whole set of LSS_1...1000, comparing the vectors to the best vector found.

By deep minima, we mean a Pearson correlation coefficient above 0.9 or an average per-atom-type RMSD below 0.15. Their existence within the sample and their number give a rough image of the landscape. Furthermore, we were interested in the basin size, i. e., how large the area is from which a started local search ends up in a deep minimum.

The distribution of fitness values gives a basic overview of high and low peaks in the landscape and their relative occurrence. The average fitnesses of the vectors at the beginning of the local search (the starting vectors), divided by whether they end successfully (Fe_start-good) or unsuccessfully (Fe_start-bad) in a local minimum, might give us a clue about the relation between the fitness of a vector and its later performance in the local search. In addition to that, we investigated how this relation changes after 100 iterations of local search through Fe_after100-good and Fe_after100-bad. Moreover, the sequence of fitness values within the process of the local search shows how rugged and steep the landscape is and whether there are plateaus around the minima. These measures should give us a sufficient description of the fitness landscape and the optimization problem to devise a suitable method.
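For reference, one common formulation of the fitness-distance correlation used above can be sketched as follows; a minimization problem is assumed, and the best vector of the given sample plays the role of the optimum.

import numpy as np

def fitness_distance_correlation(vectors, fitnesses):
    """Correlation between the distance to the best vector and the fitness."""
    vectors, fitnesses = np.asarray(vectors), np.asarray(fitnesses)
    best = vectors[np.argmin(fitnesses)]
    distances = np.linalg.norm(vectors - best, axis=1)
    return np.corrcoef(distances, fitnesses)[0, 1]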

Results and Discussion

The random search with random ordering has shown only a little information: zero locality and, out of a million samples, only one vector with its fitness in a deep local minimum. The distribution of both landscapes shows the vast majority of vectors to be of low quality, see Figures 4.1 and 4.2. This suggests that the minima are scarce and the landscape is not smooth. Moreover, the minima are not wide wells with perpendicular walls; some downhill process is required to reach them. The random search clearly does not sample the space around local minima sufficiently. Therefore, we included random sampling with local search as the second approach to the sampling.


Figure 4.1: Distribution of fitness values in the R landscape for set1. Notice the logarithmic scale of the y-axis.

Figure 4.2: Distribution of fitness values in the avg(RMSD_a) landscape for set1. Notice the logarithmic scale of the y-axis and the uneven x-axis. Fitnesses above 100 were omitted; they represented around 10% of all values.


Table 4.1: Locality-related measures for random sample with local search.

dataset              eL_LSS        eL_good       eL_bad        L_LSS-whole
set1 R               0.65 ± 0.33   0.8 ± 0.13    0.53 ± 0.38   0.66
set3 R               0.67 ± 0.38   0.92 ± 0.06   0.59 ± 0.41   0.72
set1 avg(RMSD_a)     0.41 ± 0.27   0.56 ± 0.25   0.29 ± 0.21   0.005
set3 avg(RMSD_a)     0.39 ± 0.33   0.69 ± 0.28   0.29 ± 0.26   0.007

Random sampling with local search managed to sample the parts of the landscape near local minima. Almost half of the 1000 local minimizations for the homogeneous set1 (432 and 421 for R and avg(RMSD_a), respectively) and a quarter for the heterogeneous set3 (238 and 234 for R and avg(RMSD_a), respectively) ended in deep minima. Therefore, we can describe these relevant parts of the landscape in closer detail. Table 4.1 shows the computed locality-related measures for both datasets and both landscapes. Due to the high standard deviation in eL_LSS, we can assume that the locality varies throughout the landscapes, depending on the position. If we put the vector into the basin of attraction of a local minimum, high locality helps the local search to fall into it. However, recognizing these basins is quite difficult: the average fitness of successful starting vectors (Fe_start-good) does not differ much from the fitness of unsuccessful starting vectors (Fe_start-bad), see Table 4.2. For avg(RMSD_a), the slightly better values and smaller standard deviation for successful starting vectors give starting vectors with lower fitness values a better chance. After 100 iterations of local optimization on those starting vectors, the results reveal a more prominent difference in fitnesses, see Fe_after100-good and Fe_after100-bad. Even such a short try-out can help to determine whether a vector is promising and worth a longer local optimization. Figure 4.3 (a) shows how avg(RMSD_a) drops in case of a nearby local minimum and stays the same if we did not hit the basin of attraction for set3. In the case of the R objective function, it takes more iterations, but the difference is still notable, as part (b) shows. The landscape of R is smoother (rather high L_LSS-whole); the curve displaying the objective value during the local search does not exhibit many steep changes, as Figure 4.4 (b) shows. NEWUOA changes the vector a bit in each iteration (although not in a random way), so two neighboring vectors have a close distance in the optimization space. However, their objective values can differ substantially in case of the avg(RMSD_a) landscape, as we can see in Figure 4.4 (a). The local minima lie in a plateau in both landscapes.

Almost all successful searches ended in a different local minimum. We clustered the deep local minima with HDBscan [33], a density-based clustering algorithm, with minimum cluster size 2. HDBscan found only 6 clusters out of almost 500 minima in the landscapes corresponding to set1 and 4 clusters out of almost 250 minima in the landscapes corresponding to set3, considering the vast majority of local minima to be solitary points. Therefore, we can assume a large number of different local minima widely spread through the landscape with narrow basins of attraction.

Even though the parameters' values depend on the training set, we have shown that the fitness landscapes based on the same objective function share similar properties even for different datasets. This means that the characteristics of the fitness landscape apply generally to this particular optimization problem and the given objective function.
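The clustering step can be reproduced roughly as follows; reference [33] describes the algorithm, not a particular implementation, so the hdbscan Python package and the array minima of deep local minima are assumptions of this sketch.

import numpy as np
import hdbscan

def cluster_minima(minima, min_cluster_size=2):
    """Cluster deep local minima; the label -1 marks solitary (noise) points."""
    minima = np.asarray(minima)
    labels = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size).fit_predict(minima)
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    n_solitary = int(np.sum(labels == -1))
    return n_clusters, n_solitary, labels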


Table 4.2: Average fitnesses of starting vectors and vectors after 100 local iterations.

dataset              Fe_start-good   Fe_start-bad    Fe_after100-good   Fe_after100-bad
set1 R               0.04 ± 0.46     0.02 ± 0.28     0.48 ± 0.48        0.02 ± 0.3
set3 R               0.07 ± 0.35     -0.008 ± 0.25   0.25 ± 0.35        -0.006 ± 0.26
set1 avg(RMSD_a)     122 ± 969       366 ± 1532      0.61 ± 0.7         178 ± 1828
set3 avg(RMSD_a)     351 ± 4443      501 ± 2567      0.92 ± 0.49        149 ± 615

Overview of Found Properties

Through the analysis of the optimization space and the fitness landscape, we have revealed the following important properties:


Figure 4.3: The objective value in the process of the local search. Notice how it drops after 100 or 300 iterations of the local search if the local minimum, found later, is close. In the case of no close local minimum, it remains the same even for 1000 iterations.

Figure 4.4: The objective value in the process of the local search. In all four cases, we chose the run with the lowest locality that did not find a deep minimum. Notice the prominent ruggedness of the (a) avg(RMSD_a) landscapes as opposed to the quite smooth (b) R landscapes.


∙ the problem is continuous, only softly constrained, and nonlinear,

∙ the problem is not decomposable,

∙ there is little or no a priori knowledge about high-quality solutions,

∙ the locality changes over the landscape with the highest values close to local minima,

∙ there is only a weak connection between the fitness of a vector and the fitness of the local minimum it can get to,

∙ this connection strengthens after 100 local search iterations,

∙ the distribution of fitness values suggests plateaus around each deep minimum.

Now we will use these properties to justify the design decisions made in our novel algorithm.

4.4 Simplified Model of Computation

As we mentioned before, none of the state-of-the-art methods have offered arguments for their choice of the calibration method. The analysis of the optimization problem has shown that they often go against its properties and thus solve the problem ineffectively.

The least squares minimization searches only along one dimension of the multidimensional optimization space, thus neglecting many deep local minima. Modern heuristic methods disregard one of the basic properties of the problem. Due to the non-decomposability, the recombination in genetic and evolutionary algorithms cannot possibly contribute to a high-quality solution. These operators construct a new solution from the parent ones by recombining their (potentially good) solutions to sub-problems under the assumption of the problem's decomposability [173, p. 118]. However, this does not apply to the problem of atomic charges calibration.

The results obtained by [119, 36, 30] came only from the extensive use of the local optimization method; the genetic algorithm worked only

as an expensive diversification operator. The same holds for differential evolution [134], although its operator could work as a local search with a high mutation constant, constrained values, and a homogeneous dataset.

We have taken the properties of the optimization problem into account and devised a guided minimization method. Its algorithm completely omits the genetic and evolution steps from Algorithm 10 and thus follows a much simpler procedure, see Algorithm 12. First, we generate a random sample of vectors, much like a population in modern heuristics. We evaluate all of them and select a few of the best as the starting vectors. Then, we run a short local optimization with NEWUOA and select the most promising vector. This one undergoes a longer local optimization, yielding the result of the whole optimization.

Algorithm 12 Guided minimization algorithm (GDMIN)
 1: function find_parameters()
 2:     sample ← generate_random_sample()
 3:     ∀ x ∈ sample: F(x) ← evaluate(x)
 4:     ∀ x ∈ (subset ⊂ sample): x ← local_opt(x)
 5:     best ← find_the_best(sample)
 6:     best ← local_opt(best)
 7:     return best

 1: function evaluate(x)
 2:     ∀ molecule ∈ training_set: eem_charges ← eem(x)
 3:     F(x) ← compare(eem_charges, qm_charges)
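A compact sketch of this procedure in Python might look as follows; the Powell method again stands in for NEWUOA (which SciPy does not provide), and fitness, lower, and upper are assumed inputs rather than parts of the actual implementation.

import numpy as np
from scipy.stats import qmc
from scipy.optimize import minimize

def gdmin(fitness, lower, upper, sample_size=100, n_best=5,
          short_evals=100, long_evals=2000, seed=0):
    """Guided minimization: LHS sample, short try-out minimizations of the
    best vectors, then a long minimization of the most promising one."""
    dim = len(lower)
    sample = qmc.scale(qmc.LatinHypercube(d=dim, seed=seed).random(n=sample_size),
                       lower, upper)

    order = np.argsort([fitness(x) for x in sample])
    candidates = sample[order[:n_best]]

    # short try-out minimizations (Powell stands in for NEWUOA)
    short_runs = [minimize(fitness, x, method="Powell",
                           options={"maxfev": short_evals}) for x in candidates]
    best_start = min(short_runs, key=lambda r: r.fun).x

    # long minimization of the most promising vector
    result = minimize(fitness, best_start, method="Powell",
                      options={"maxfev": long_evals})
    return result.x, result.fun

The default budgets above mirror the configurations evaluated later in Section 4.5 (a sample of 100, a few selected vectors, 100 short and 2000 long iterations), but they are placeholders, not tuned values.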

This algorithm respects the limiting properties of the problem and takes advantage of the little knowledge we have about the local minima. It searches through the whole multi-dimensional space. It uses the constraints as intervals for the initial generation of vectors, but later on, it does not enforce them strictly. Our local optimization of choice, NEWUOA, is an unconstrained algorithm. Unlike the state-of-the-art methods, it does not rely on decomposability, a characteristic not present in the problem of atomic charges calibration. It does not construct new vectors by recombination; instead, it gradually modifies an existing solution.


The generation of a random sample and its evaluation make it possible to select the best vectors for further local minimization. This takes advantage of the slightly better chance of finding a deep minimum when the starting vector has a better fitness. Furthermore, the first, short minimization quickly shows which vectors are within the basin of attraction of a local minimum and which are not. The second, longer minimization takes advantage of the high locality in the landscape around a local minimum and should find one with high probability.

This simplified model of atomic charges calibration should run faster as it reduces the number of iterations in the main calibration loop (line 1 of Algorithm 11). That should result in a less memory- and time-consuming computation compared to the state-of-the-art methods. Furthermore, its straightforward structure makes it easier to implement, debug, tune, and routinely use, saving time and resources.

4.5 Accuracy and Speedup

In this section, we evaluate the gains that come from the simplified model of our optimization problem. We have extensively compared the current approaches and our method in [143], evaluating both the theoretical and practical speed-up and comparing the results of parametrizations with three different training sets. Here, we shortly recapitulate the most important findings; for detailed information, see [143].

4.5.1 Implementation

We implemented the state-of-the-art approaches (least squares minimization, differential evolution, genetic algorithm, and their modifications with local optimization), a few so far unused methods for a complete comparison (a solely local method, GA without a local method), and our novel GDMIN into NEEMP [166], a tool for the calculation, parametrization, and validation of EEM parameters implemented in C. As mentioned earlier, we took Powell's NEWUOA algorithm [160] implemented in FORTRAN [161] for the local minimizations. We applied Latin Hypercube Sampling implemented by [31] to generate the initial population/sample in a pseudo-random, unbiased way. For solving the systems of linear equations, we employed LAPACK implementations, and we naively parallelized with OpenMP.

The code will be available in the next version of NEEMP at http://www.fi.muni.cz/~xracek/neemp/.

4.5.2 Evaluation Methods

In our experiments in [143], we compared seven algorithms:

∙ NEWUOA: solely local optimization on one random vector, never used before,

∙ LR: least squares minimization over κ, as proposed by [123],

∙ DE: differential evolution without local optimization, as done by [134],

∙ DEMIN: differential evolution with local optimization applied on the result vector after evolution (variant DEMIN1), and on every trial vector (variant DEMIN2) and also on every vector from the initial population (variant DEMIN3), as developed also by us in [166],

∙ GA: genetic algorithm without local optimization, never used before,

∙ GAMIN: genetic algorithm with local optimization applied on result vector after genetic procedure (variant GAMIN1, as done by [119, 36]) and also on every parent in every iteration (variant GAMIN2, as done by [30]),

∙ GDMIN: the guided minimization method, as developed by us in [143].

For each algorithm, we have tried many combinations of its settings, such as the population size and the iteration count. We call a specific combination of settings a configuration. For each configuration, we ran four experiments with different random seeds to prevent the influence of a "lucky" random seed.

In all experiments, we applied avg(RMSD_a) as the objective function, as it performs better than R^2 on heterogeneous datasets and comparably on homogeneous datasets [166].


We evaluated these algorithms on three datasets differing in the variability of atomic and molecular types. Dataset set1, the same as the one used for the fitness landscape analysis in section 4.3, consisted of almost 2000 small organic molecules composed of five elements represented by eight atom types. Dataset set2, not used in our analysis before, consisted of approximately 4500 small organic molecules composed of nine elements represented by 15 atom types. And dataset set3, the same as the one used for the fitness landscape analysis in section 4.3, consisted of approximately 4500 organic and inorganic molecules, peptides, and organometals composed of nine elements represented by 15 atom types. We aimed to compare the accuracy and the relative speed-up of the algorithms.

4.5.3 Accuracy Evaluation

By the accuracy of a calibration method, we mean its capability to reliably produce high-quality parameters. By high-quality parameters, we mean those that produce empirical charges with a total R^2 > 0.9, a per-atom-type R_a > 0 for all atom types, a total RMSD < 0.1, and a per-atom-type RMSD_a < 0.2 for all atom types. We considered a method and its configuration reliable if it produced high-quality parameters in at least three out of four runs with different random seeds.

Solely local (NEWUOA) and solely global (LR, DE, GA) optimizations failed to provide high-quality solutions.
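Stated as code, these quality criteria amount to the following check (a sketch; eem, qm, and atom_types are assumed to be flat arrays over all atoms of the evaluated set).

import numpy as np

def is_high_quality(eem, qm, atom_types):
    """Check the criteria for high-quality parameters used in the evaluation."""
    eem, qm, atom_types = map(np.asarray, (eem, qm, atom_types))
    if np.corrcoef(eem, qm)[0, 1] ** 2 <= 0.9:            # total R^2 > 0.9
        return False
    if np.sqrt(np.mean((eem - qm) ** 2)) >= 0.1:          # total RMSD < 0.1
        return False
    for t in np.unique(atom_types):
        mask = atom_types == t
        r_a = np.corrcoef(eem[mask], qm[mask])[0, 1]      # per-atom-type R_a > 0
        rmsd_a = np.sqrt(np.mean((eem[mask] - qm[mask]) ** 2))
        if r_a <= 0 or rmsd_a >= 0.2:                     # per-atom-type RMSD_a < 0.2
            return False
    return True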

Table 4.3: Quality of Best Parameters Calculated by Guided Minimization, taken from [143]. The configuration describes the size of the population/number of selected vectors/number of iterations in the first, short minimization/number of iterations in the second, long minimization.

dataset   configuration     R^2     avg(R_a^2)        RMSD     avg(RMSD_a)
set1      100/1/100/500     0.970   0.789 ± 0.140     0.058    0.074 ± 0.026
set2      100/1/100/3000    0.959   0.674 ± 0.257     0.070    0.078 ± 0.033
set3      250/1/100/2000    0.970   0.741 ± 0.227     0.0653   0.0632 ± 0.035

The combination of the global search and local optimization (DEMIN, GAMIN, GDMIN) coped with the multidimensional optimization well even for heterogeneous datasets, and often reliably, without depending on a lucky random seed. The experiments have shown the redundancy of the evolution or genetic iterations; high-quality results occurred regardless of their number. The guided minimization method showed great performance, providing high-quality parameters reliably in experiments with population size 100 (250 for set3), selecting five (or even one) best vectors, minimizing them shortly for 100 iterations, choosing the best, and locally minimizing it for 2000 iterations. Table 4.3 provides details about the smallest reliable configurations and their results. DEMIN did not fall behind much; the differences were tiny. GAMIN produced the worst parameters of these three methods, although still satisfying the requirements.

Furthermore, we validated the best parameters found by GDMIN on non-training datasets.

Figure 4.5: Correlation of Ligandexpo’s QM Charges and EEM Charges Trained on Set3, taken from [143]


The empirical charges they determine agree excellently with the QM charges, with R^2 > 0.97 and RMSD ≈ 0.06. Figure 4.5 visualizes the validation between the empirical charges computed with parameters trained on set3 and the QM reference charges on the large heterogeneous dataset ligandexpo; details in [143].

The results have shown that guided minimization reaches accuracy similar to (DEMIN) or better than (LR, GAMIN) the state-of-the-art methods.

4.5.4 Theoretical Speedup

As we have shown in section 4.2, solving the linear equation system (LS) presents a bottleneck in the calibration procedure. By simplifying the optimization algorithm, the number of these solutions drops, as we show here. Taking one solution for one molecule as the unit of measure, we computed how many are needed by the different algorithms, see Table 4.4. The absolute numbers would be multiplied by the number of molecules in the training set; we omit this factor as it does not influence the ordering. The ordering of the methods' complexities depends on the relative ordering of their settings. Generally, the following holds:

∙ in all algorithms, the population size is smaller than the number of iterations, i. e., P < G,

∙ in GA and DE, the number of global iterations tends to be smaller than the number of local iterations, i. e., G < L,

∙ in GAMIN and DEMIN, the number of local iterations applied on the result vector, on offspring/trial vectors created in global iterations and on vectors from initial population decreases, i. e., L1 > L2 ≥ L3,

∙ in GDMIN, the number of starting vectors selected from the initial sample is much lower than the population size, M ≪ P.

This means the following ordering in the complexity of the scrutinized algorithms: LR < DE ≃ NEWUOA < GDMIN ≃ DEMIN1 < DEMIN2 < DEMIN3 < GA < GAMIN1 < GAMIN2.


Table 4.4: How Many Solutions of the Linear System (LS) Are Needed in Each Method? Taken from [143]. P is the size of the population, G is the number of global method iterations, L is the number of local method iterations, M is the number of vectors selected from the population to be minimized. k ∈ (0, 1) is the percentage of the population to be minimized (depending on the way of selection).

method    number of LS solutions
LR        interval size / step
NEWUOA    L
DE        P + G
GA        P + P·G
DEMIN1    P + G + L
DEMIN2    P + G·L2 + L1
DEMIN3    P·(1 + k·L3) + G·L2 + L1
GAMIN1    P + P·G + L
GAMIN2    P + G·(P + (P/2)·L2 + L1)
GDMIN     P + M·L2 + L1
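As a small worked example of the GDMIN row (P + M·L2 + L1), the smallest reliable configurations reported in Table 4.3 translate into the following per-molecule counts; the snippet below only restates that arithmetic.

# GDMIN needs P + M*L2 + L1 linear-system (LS) solutions per molecule (Table 4.4).
# The values come from the smallest reliable configurations in Table 4.3
# (population / selected vectors / short iterations / long iterations).
configs = {
    "set1": (100, 1, 100, 500),
    "set2": (100, 1, 100, 3000),
    "set3": (250, 1, 100, 2000),
}
for name, (P, M, L2, L1) in configs.items():
    print(f"{name}: {P + M * L2 + L1} LS solutions per molecule")
    # set1: 700, set2: 3200, set3: 2350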

4.5.5 Practical Speedup and Scalability

The computation times in our experiments corresponded with the theoretical ordering. They ranged from seconds (LR, DE) through minutes and tens of minutes (NEWUOA, DEMIN, GDMIN) to hours and days (GA, GAMIN). We ran all experiments in parallel with OpenMP on four cores of an Intel E5-2670 2.6 GHz. They all required just a few tens of megabytes of memory.

Of the methods that successfully found high-quality parameters (i. e., DEMIN, GAMIN, GDMIN), GDMIN and DEMIN are comparable; GAMIN is much more computationally demanding, as shown in Figure 4.6. The small difference between GDMIN and DEMIN is determined by the chosen settings of these methods. Especially DEMIN1 (local minimization only on the evolution result vector) gets close to the computation time of GDMIN. As we decrease the number of evolution iterations, because they do not contribute to the quality of the results, we come close to the computational performance of GDMIN.


Figure 4.6: Computation time [s] for all *MIN methods on set1, set2, and set3, taken from [143]. The dots denote the run of the minimal successful configuration, the bars the minimal reliable successful configuration.

It is useful to realize that DEMIN1 with no evolution iterations is basically GDMIN: we generate the population, choose the best vector, and run a local minimization on it.

Guided minimization runs faster than the state-of-the-art genetic or evolutionary algorithms. It is comparable to our previous method, DEMIN, but only thanks to the fact that the evolution iterations do not contribute to the result's quality and we can decrease their number even to 0.

4.6 Conclusion

We simplified the optimization algorithm used in atomic charges parametrization according to the findings from the analysis of the optimization space and the fitness landscape. We proposed a method that omits the redundant iterations of the state-of-the-art methods and thus

reduces the number of bottleneck function calls while keeping or even surpassing their accuracy. Moreover, the simpler structure of our method's code makes it easier to implement, tune, and routinely use. The most similar method in terms of accuracy and computational performance, DEMIN, developed by us in [166], gets near only because it also applies local search on the best vector from the population. The evolution iterations do not contribute to the accuracy, and their low number in our experiments results in a similar computation time.

Each line of the guided minimization method takes advantage of some property of the optimization problem or the fitness landscape. Therefore, it efficiently solves the atomic charges calibration problem with just the necessary computational effort. The approach of analyzing the properties of the problem in order to find a suitable method would be beneficial for many problems, not only in computational chemistry.

4.6.1 Future Work

This research can be furthered in two directions.

First, in the application of atomic charges calibration, we continue our work on parametrizations for more complicated training sets. Especially datasets with proteins and compounds with metals pose great difficulties for parametrization. The careful preparation of the training sets and the selection of the equation system is the work of our colleagues in computational chemistry. Among the possible improvements in the scope of the calibration method, we would like to try changing the objective function during the search. We have tried and failed to develop an objective function that combines the correlation and distance metrics and performs well. Changing the objective function during the search could work better.

And second, other problems in interdisciplinary fields exhibit similar properties in their search. A simplification led by the analysis of the optimization space can make the computation faster. An example of such a problem is the fitting of the SAXS curve, where a simple optimization method also surpasses more complicated methods in performance and accuracy [94].

5 Conclusions

In this thesis, we have addressed the problem of accelerating chemical simulations by modifying the model of their computation. By modeling the computation as a series of evaluations of a bottleneck function, we have shifted the focus from the demanding function to the loop. We have applied this methodology to three problems from computational chemistry: two dealing with the necessity for long timescales of molecular dynamics simulations, one dealing with the accuracy of the molecular mechanics model.

In the first problem we researched, we accelerated the demanding calculation of the interatomic potential in a long series of timesteps in a molecular dynamics simulation by parallelizing also the temporal domain. We applied the parallel-in-time method PFASST to molecular dynamics, the first such combination. Although we have not perfected the prototype implementation into ready-to-use software, we systematically analyzed all the issues that emerged. We supported our analysis with simulations of four molecular systems, one of them a peptide in water. To our knowledge, it was the first attempt at a parallel-in-time MD simulation of a biomolecular system. Macroscopic measures of the simulations (such as temperature) seem to correspond with those obtained by sequential-in-time simulations. However, the question of correctness and accuracy remains for future research.

In the second problem we researched, we accelerated the demanding calculation of the mean square distance between many molecular structures in a metadynamics simulation by approximating it thanks to the information from the previous iterations. The speed-up evaluated with our implementation in the standard metadynamics software Plumed almost reaches the maximal theoretical speed-up while the accuracy remains preserved. Thanks to our method, the overhead caused by enhanced sampling diminished.

In the third problem we researched, we accelerated the demanding comparison between empirical and quantum atomic charges in a loop over the iterations of the optimization method in atomic charges parametrization by omitting the redundant iterations. We rigorously analyzed the character of the optimization problem and the fitness landscape and used the findings to devise a method that is much

simpler, faster, and more accurate than the overcomplicated current approaches.

Our role as computer scientists in the research of these problems exceeded the position of programmers. We applied computer science methods and techniques to develop new methods based on a systematic analysis of the problem. Complex computational problems and their acceleration require more than high-quality coding for their efficient solution. They often include challenging issues in the scope of computer science that can contribute well beyond the aspect of the implementation. The successful application of our proposed methodology in all three cases shows the benefit of focusing on the loop instead of the bottleneck function. With the increasing number of computational resources and codes highly optimized thanks to previous efforts, the acceleration of a single function is beginning to hit its limits. Shifting the focus might bring a new viewpoint, a look beyond the known, and inspire the development of new methods and approximations that will lead to the acceleration of complex computational problems while preserving their accuracy.

Bibliography

[1] S. J. Aarseth, Gravitational N-body simulations. Cambridge Uni- versity Press, 2003, p. 413, isbn: 0521432723. doi: 10 . 1017 / CBO9780511535246. [2] M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, and E. Lindah, “Gromacs: High performance molecu- lar simulations through multi-level parallelism from laptops to supercomputers”, SoftwareX, vol. 1-2, pp. 19–25, 2015, issn: 23527110. doi: 10.1016/j.softx.2015.06.001. [3] A. Amadei, A. B. Linssen, and H. J. Berendsen, “Essential dy- namics of proteins.”, Proteins, vol. 17, no. 4, pp. 412–25, Dec. 1993, issn: 0887-3585. doi: 10.1002/prot.340170408. [4] Y. Andoh, N. Yoshii, K. Fujimoto, K. Mizutani, H. Kojima, A. Yamada, S. Okazaki, K. Kawaguchi, H. Nagao, K. Iwahashi, F. Mizutani, K. Minami, S. Ichikawa, H. Komatsu, S. Ishizuki, Y. Takeda, and M. Fukushima, “MODYLAS: A Highly Paral- lelized General-Purpose Molecular Dynamics Simulation Pro- gram for Large-Scale Systems with Long-Range Forces Calcu- lated by Fast Multipole Method (FMM) and Highly Scalable Fine-Grained New Parallel Processing Algorithms”, Journal of Chemical Theory and Computation, vol. 9, no. 7, pp. 3201–3209, Jul. 2013, issn: 1549-9618. doi: 10.1021/ct400203a. [5] P. F. A. L. Anet, “Dynamics of Eight-Membered Rings in the Cyclooctane Class”, Topics in Current Chemistry, vol. 45, pp. 169– 220, 1974. doi: 10.1007/3-540-06471-0_17. [6] A. Arnold, O. Lenz, S. Kesselheim, R. Weeber, F. Fahrenberger, D. Roehm, P. Košovan, and C. Holm, “ESPResSo 3.1: Molecu- lar dynamics software for coarse-grained models”, in Lecture Notes in Computational Science and Engineering, vol. 89 LNCSE, Springer, Berlin, Heidelberg, 2013, pp. 1–23, isbn: 9783642329784. doi: 10.1007/978- 3- 642- 32979- 1_1. arXiv: arXiv:1011. 1669v3. [7] M. Athar, M. Y. Lone, V. M. Khedkar, and P. C. Jha, “Pharma- cophore model prediction, 3D-QSAR and molecular docking studies on vinyl sulfones targeting Nrf2-mediated gene tran- scription intended for anti-Parkinson drug design”, Journal of


Biomolecular Structure and Dynamics, vol. 34, no. 6, pp. 1282– 1297, Jun. 2016, issn: 0739-1102. doi: 10.1080/07391102.2015. 1077343. [8] E. Aubanel, “Scheduling of tasks in the parareal algorithm”, , vol. 37, no. 3, pp. 172–182, Mar. 2011, issn: 01678191. doi: 10.1016/j.parco.2010.10.004. [9] C. Audouze, M. Massot, and S. Volz, “Symplectic multi-time step parareal algorithms applied to molecular dynamics”, SIAM Journal on Scientific Computing, pp. 1–18, Feb. 2009. [Online]. Available: https://hal.archives-ouvertes.fr/hal-00358459/. [10] R. F. Bader, “Atoms in molecules”, Accounts of Chemical Research, vol. 18, no. 1, pp. 9–15, 1985. doi: 10.1002/0470845015.caa012. [11] L. Baffico, S. Bernard, Y. Maday, G. Turinici, and G. Zérah, “Parallel-in-time molecular-dynamics simulations”, Physical Re- view E, vol. 66, 057701:1–057701:4, 2002. doi: 10.1103/PhysRevE. 66.057701. [12] G. Bal, “On the Convergence and the Stability of the Parareal Algorithm to solve Partial Differential Equations”, in Domain Decomposition Methods in Science and Engineering, Springer Berlin Heidelberg, 2005, ch. XI, pp. 425–432. doi: 10.1007/3- 540- 26825-1_43. [13] G. Bal and Y. Maday, “A "Parareal" Time Discretization for Non- Linear PDE’s with Application to the Pricing of an American Put”, in Recent Developments in Domain Decomposition Methods, vol. 23, 2002, pp. 189–202. doi: 10.1007/978-3-642-56118- 4_12. [14] M. Baram, Y. Atsmon-Raz, B. Ma, R. Nussinov, and Y. Miller, “Amylin-Aβ oligomers at atomic resolution using molecular dynamics simulations: a link between Type 2 diabetes and Alzheimer’s disease.”, Physical chemistry chemical physics : PCCP, vol. 18, no. 4, pp. 2330–8, 2016, issn: 1463-9084. doi: 10.1039/ c5cp03338a. [15] A. Barducci, M. Bonomi, and M. Parrinello, “Metadynamics”, Wiley Interdisciplinary Reviews: Computational Molecular Science, vol. 1, no. 5, pp. 826–843, Sep. 2011, issn: 17590876. doi: 10. 1002/wcms.31.


[16] J. Barnes and P.Hut, “A hierarchical O(N log N) force-calculation algorithm”, Nature, vol. 324, pp. 446–449, 1986. doi: 10.1038/ 324446a0. [17] L. Baweja, K. Balamurugan, V. Subramanian, and A. Dhawan, “Hydration patterns of graphene-based nanomaterials (GB- NMs) play a major role in the stability of a helical protein: a molecular dynamics simulation study.”, ACS Journal of Sur- faces and Colloids, vol. 29, no. 46, pp. 14 230–14 238, Nov. 2013, issn: 1520-5827. doi: 10.1021/la4033805. [Online]. Available: http://pubs.acs.org/doi/abs/10.1021/la4033805. [18] C. I. Bayly, P. Cieplak, W. Cornell, and P. A. Kollman, “A well- behaved electrostatic potential based method using charge re- straints for deriving atomic charges: the RESP model”, The Journal of Physical Chemistry, vol. 97, no. 40, pp. 10 269–10 280, Oct. 1993, issn: 0022-3654. doi: 10.1021/j100142a004. arXiv: 93/2091-10269{\$}04.00/0 [0022-3654]. [19] A. D. Becke, “Density-functional thermochemistry. III. The role of exact exchange”, The Journal of Chemical Physics, vol. 98, no. 7, p. 5648, 1993. doi: 10.1063/1.464913. [20] M. Bedez, Z. Belhachmi, O. Haeberlé, R. Greget, S. Moussaoui, J.-M. Bouteiller, and S. Bischoff, “A fully parallel in time and space algorithm for simulating the electrical activity of a neural tissue”, Journal of Neuroscience Methods, vol. 257, pp. 17–25, Jan. 2016, issn: 01650270. doi: 10.1016/j.jneumeth.2015.09.017. [21] R. C. Bernardi, M. C. Melo, and K. Schulten, “Enhanced sam- pling techniques in molecular dynamics simulations of bio- logical systems”, Biochemica et Biophysica Acta (BBA) - General Subjects, vol. 1850, no. 5, pp. 872–877, May 2015, issn: 03044165. doi: 10.1016/j.bbagen.2014.10.019. [22] B. Bischl, O. Mersmann, H. Trautmann, and M. Preuss, “Algo- rithm Selection Based on Exploratory Landscape Analysis and Cost-Sensitive Learning”, Proceedings of the fourteenth interna- tional conference on Genetic and evolutionary computation confer- ence - GECCO ’12, pp. 313–320, 2012. doi: 10.1145/2330163. 2330209. [23] L. S. Blackford, A. Petitet, R. Pozo, K. Remington, R. C. Whaley, J. Demmel, J. Dongarra, I. Duff, S. Hammarling, G. Henry, M. Heroux, L. Kaufman, and A. Lumsdaine, “An updated set of


basic linear algebra subprograms (BLAS)”, ACM Transactions on Mathematical Software, vol. 28, no. 2, pp. 135–151, Jun. 2002, issn: 00983500. doi: 10.1145/567806.567807. [24] M. Bolten, D. Moser, and R. Speck, “Asymptotic convergence of the parallel full approximation scheme in space and time for linear problems”, ArXiv, p. 1703.07120, Mar. 2017. arXiv: 1703.07120. [25] D. Branduardi, M. De Vivo, N. Rega, V. Barone, and A. Cavalli, “Methyl phosphate dianion hydrolysis in solution characterized by path collective variables coupled with DFT-based enhanced sampling simulations”, Journal of Chemical Theory and Compu- tation, vol. 7, no. 3, pp. 539–543, Mar. 2011, issn: 15499618. doi: 10.1021/ct100547a. [26] D. Branduardi, F. L. Gervasio, A. Cavalli, M. Recanatini, and M. Parrinello, “The role of the peripheral anionic site and cation- π interactions in the ligand penetration of the human AChE gorge”, Journal of the American Chemical Society, vol. 127, no. 25, pp. 9147–9155, 2005, issn: 00027863. doi: 10.1021/ja0512780. [27] D. Branduardi, F. L. Gervasio, and M. Parrinello, “From A to B in free energy space”, Journal of Chemical Physics, vol. 126, no. 5, p. 054 103, Feb. 2007, issn: 00219606. doi: 10.1063/1.2432340. [28] W. M. Brown, S. Martin, S. N. Pollock, E. A. Coutsias, and J. P. Watson, “Algorithmic dimensionality reduction for molecular structure analysis”, Journal of Chemical Physics, vol. 129, no. 6, p. 064 118, Aug. 2008, issn: 00219606. doi: 10.1063/1.2968610. [29] J. Bulin, “Large-scale time parallelization for molecular dynam- ics problems”, Royal Institute Of Technology, Stockholm, Stock- holm, Tech. Rep., 2013, pp. 1–76. [Online]. Available: http : //www.diva- portal.org/smash/record.jsf?pid=diva2: 651381. [30] P. Bultinck, R. Vanholme, P. L. A. Popelier, F. De Proft, and P. Geerlings, “High-speed calculation of AIM charges through the electronegativity equalization method”, Journal of Physical Chemistry A, vol. 108, no. 46, pp. 10 359–10 366, 2004. doi: 10. 1021/jp046928l. [31] J. Burkardt, LATIN_RANDOM, 2004. [Online]. Available: http: //people.sc.fsu.edu/~jburkardt/cpp_src/latin_random/ latin_random.html.


[32] G. Bussi, D. Donadio, and M. Parrinello, “Canonical sampling through velocity rescaling”, Journal of Chemical Physics, vol. 126, no. 1, p. 014 101, Jan. 2007, issn: 00219606. doi: 10 . 1063 / 1 . 2408420. arXiv: arXiv:0803.4060v1. [33] R. J. G. B. Campello, D. Moulavi, and J. Sander, “Density-Based Clustering Based on Hierarchical Density Estimates”, Advances in Knowledge Discovery and Data Mining, pp. 160–172, 2013, issn: 16113349, 03029743. doi: 10.1007/978-3-642-37456-2_14. arXiv: arXiv:1508.06655v1. [34] D. Case, T. Cheatham, T. Darden, H. Gohlke, R. Luo, K. Merz, A. Onufriev, C. Simmerling, B. Wang, and R. Woods, “The Amber biomolecular simulation programs.”, Journal of computational chemistry, vol. 26, no. 16, pp. 1668–1688, Dec. 2005, issn: 0192- 8651. doi: 10.1002/jcc.20290. [35] C. Cervetti, A. Rettori, M. G. Pini, A. Cornia, A. Repollés, F. Luis, M. Dressel, S. Rauschenbach, K. Kern, M. Burghard, and L. Bogani, “The classical and quantum dynamics of molecular spins on graphene”, Nature Materials, vol. 15, no. 2, pp. 164–168, Feb. 2015, issn: 1476-1122. doi: 10.1038/nmat4490. [Online]. Available: http://www.ncbi.nlm.nih.gov/pubmed/26641019. [36] J. Chaves, J. M. Barroso, P. Bultinck, and R. Carbó-Dorca, “To- ward an alternative hardness kernel matrix structure in the Electronegativity Equalization Method (EEM)”, Journal of Chem- ical Information and Modeling, vol. 46, no. 4, pp. 1657–1665, 2006. doi: 10.1021/ci050505e. [37] A. A. Chen and A. E. García, “High-resolution reversible fold- ing of hyperstable RNA tetraloops using molecular dynamics simulations.”, Proceedings of the National Academy of Sciences of the United States of America, vol. 110, no. 42, pp. 16 820–5, Oct. 2013, issn: 1091-6490. doi: 10.1073/pnas.1309392110. arXiv: arXiv:1408.1149. [38] Y. Chen, S. Kale, J. Weare, A. R. Dinner, and B. Roux, “Multi- ple Time-Step Dual-Hamiltonian Hybrid Molecular Dynamics - Monte Carlo Canonical Propagation Algorithm”, Journal of Chemical Theory and Computation, vol. 12, no. 4, pp. 1449–1458, Apr. 2016, issn: 15499626. doi: 10.1021/acs.jctc.5b00706. [39] W. Cornell, P. Cieplak, C. Bayly, I. Gould, K. Merz, D. Ferguson, D. Spellmeyer, T. Fox, J. Caldwell, and P. Kollman, “A Second

Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules”, Journal of the American Chemical Society, vol. 117, no. 19, pp. 5179–5197, May 1995, issn: 0002-7863. doi: 10.1021/ja00124a002. [40] E. A. Coutsias, C. Seok, and K. A. Dill, “Using quaternions to calculate RMSD”, Journal of Computational Chemistry, vol. 25, no. 15, pp. 1849–1857, Nov. 2004, issn: 01928651. doi: 10.1002/jcc. 20110. arXiv: arXiv:1011.1669v3. [41] C. J. Cramer, Essentials of Computational Chemistry: Theories and Models, 2. Wiley, 2004, vol. 42, pp. 334–342, isbn: 0470091819. doi: 10.1021/ci010445m. arXiv: physics/0509135v1 []. [42] R. Croce, D. Ruprecht, and R. Krause, “Parallel-in-space-and- time simulation of the three-dimensional, unsteady Navier- Stokes equations for incompressible flow”, Modeling, Simulation and Optimization of Complex Processes - HPSC 2012, pp. 13–23, 2014. doi: 10.1007/978-3-319-09063-4_2. [43] T. Darden, D. York, and L. Pedersen, “Particle Mesh Ewald: An N.log(N) method for Ewald sums in large systems”, Journal of Chemical Physics, vol. 99, pp. 10 089–10 092, 1993. doi: 10.1063/ 1.464397. [44] K. Datadien, “Investigation of H-center diffusion in sodium chloride by molecular dynamics simulation”, PhD thesis, 2013. [Online]. Available: http://essay.utwente.nl/63570/1/ BscReport_KarunDatadien_s0208663_openbaar.pdf. [45] J. D. Durrant, R. M. Bush, and R. E. Amaro, “Microsecond Molecular Dynamics Simulations of Influenza Neuraminidase Suggest a Mechanism for the Increased Virulence of Stalk- Deletion Mutants”, Journal of Physical Chemistry B, vol. 120, no. 33, pp. 8590–8599, Aug. 2016, issn: 15205207. doi: 10.1021/acs. jpcb.6b02655. [46] R. Elber, “Perspective: Computer simulations of long time dy- namics”, Journal of Chemical Physics, vol. 144, no. 6, p. 060 901, Feb. 2016, issn: 00219606. doi: 10.1063/1.4940794. [47] M. M. Elmazar, H. S. El-Abhar, M. F. Schaalan, and N. A. Farag, “Phytol/Phytanic Acid and Insulin Resistance: Potential Role of Phytanic Acid Proven by Docking Simulation and Modulation of Biochemical Alterations”, PLoS ONE, vol. 8, no. 1, F. Folli,

Ed., e45638, Jan. 2013, issn: 19326203. doi: 10.1371/journal. pone.0045638. [48] M. Emmett and M. Minion, “Toward an efficient parallel in time method for partial differential equations”, Communications in Applied Mathematics and Computational Science, vol. 7, no. 1, pp. 105–132, 2012. doi: 10.2140/camcos.2012.7.105. [49] S. Engblom, “Parallel in Time Simulation of Multiscale Stochas- tic Chemical Kinetics”, Multiscale Modeling & Simulation, vol. 8, no. 1, pp. 46–68, Jan. 2009, issn: 1540-3459. doi: 10.1137/ 080733723. [50] N. J. English, M. Lauricella, and S. Meloni, “Massively paral- lel molecular dynamics simulation of formation of clathrate- hydrate precursors at planar water-methane interfaces: Insights into heterogeneous nucleation”, Journal of Chemical Physics, vol. 140, no. 20, p. 204 714, May 2014, issn: 00219606. doi: 10.1063/ 1.4879777. [51] P. Ewald, “Die Berechnung optischer und elektrostatischer Git- terpotentiale”, Annalen der Physik, vol. 369, no. 3, pp. 253–287, 1921. doi: 10.1002/andp.19213690304. [52] J. Filipovič, J. Pazúriková, A. Křenek, and V. Spiwok, “Acceler- ated RMSD Calculation for Molecular Metadynamics”, in 30th European Simulation and Modelling Conference, J. Évora-Gómez and J. J. Hernández-Cabrera, Eds., Ghent, Belgium: Reprod- uct NV, 2016, pp. 278–280. [Online]. Available: https://www. eurosis.org/cms/index.php?q=taxonomy/term/58. [53] P. F. Fischer, F. Hecht, and Y. Maday, “A parareal in time semi- implicit approximation of the Navier-Stokes equations”, Do- main Decomposition Methods in Science and Engineering XI, vol. 40, pp. 1–7, 2005, issn: 14397358. doi: 10.1007/3-540-26825-1_44. [54] M. J. Gander, “50 years of Time Parallel Time Integration”, in Multiple Shooting and Time Domain Decomposition, T. Carraro, M. Geiger, S. Körkel, and R. Rannacher, Eds., Springer, 2015, pp. 69–113. doi: 10.1007/978-3-319-23321-5_3. [55] V. Gapsys and B. L. de Groot, “Optimal Superpositioning of Flexible Molecule Ensembles”, Biophysical Journal, vol. 104, no. 1, pp. 196–207, 2013, issn: 0006-3495. doi: 10.1016/j.bpj.2012. 11.003.

[56] I. Garrido, M. S. Espedal, and G. E. Fladmark, “A Convergent Algorithm for Time Parallelization Applied to Reservoir Simu- lation”, in Domain Decomposition Methods in Science and Engineer- ing, vol. 40, Berlin/Heidelberg: Springer-Verlag, 2005, pp. 469– 476, isbn: 3540225234. doi: 10.1007/3-540-26825-1_48. [57] F. J. Gaspar and C. Rodrigo, “Multigrid Waveform Relaxation for the Time-Fractional Heat Equation”, SIAM Journal on Sci- entific Computing, vol. 39, no. 4, A1201–A1224, Jan. 2017, issn: 1064-8275. doi: 10.1137/16M1090193. [58] C. Gear, “The automatic integration of ordinary differential equations”, Communications of the ACM, vol. 14, no. 3, pp. 176– 179, Mar. 1971, issn: 00010782. doi: 10.1145/362566.362571. [59] J. Glaser, T. D. Nguyen, J. A. Anderson, P. Lui, F. Spiga, J. A. Mil- lan, D. C. Morse, and S. C. Glotzer, “Strong scaling of general- purpose molecular dynamics simulations on GPUs”, Computer Physics Communications, vol. 192, pp. 97–107, Jul. 2015, issn: 00104655. doi: 10.1016/j.cpc.2015.02.028. arXiv: 1412.3387. [60] L. Greengard and V. Rokhlin, “A Fast Algorithm for Particle Simulations”, Journal of Computational Physics, vol. 73, pp. 325– 348, 1987. doi: 10.1016/0021-9991(87)90140-9. [61] J. M. A. Grime, J. F. Dama, B. K. Ganser-Pornillos, C. L. Wood- ward, G. J. Jensen, M. Yeager, and G. A. Voth, “Coarse-grained simulation reveals key features of HIV-1 capsid self-assembly”, Nature Communications, vol. 7, no. 11568, pp. 1–11, May 2016, issn: 2041-1723. doi: 10.1038/ncomms11568. [62] Gromacs, Blowing up, 2013. [Online]. Available: http://www. gromacs.org/Documentation/Terminology/Blowing_Up. [63] B. L. de Groot, D. M. van Aalten, A. Amadei, and H. J. Berend- sen, “The consistency of large concerted motions in proteins in molecular dynamics simulations.”, Biophysical journal, vol. 71, no. 4, pp. 1707–13, Oct. 1996, issn: 0006-3495. doi: 10.1016/ S0006-3495(96)79372-4. [64] J. Gu and P.E. Bourne, Structural bioinformatics. Wiley-Blackwell, 2009, p. 1035, isbn: 0470181052. [65] J. R. Gullingsrud, R. Braun, and K. Schulten, “Reconstruct- ing Potentials of Mean Force through Time Series Analysis of Steered Molecular Dynamics Simulations”, Journal of Compu-

tational Physics, vol. 151, no. 1, pp. 190–211, May 1999, issn: 00219991. doi: 10.1006/jcph.1999.6218. [66] V. P. Gupta, Principles and applications of quantum chemistry. Aca- demic Press, 2005, isbn: 9780128034781. [67] G. Gurrala, A. Dimitrovski, S. Pannala, S. Simunovic, and M. Starke, “Parareal in Time for Fast Power System Dynamic Sim- ulations”, IEEE Transactions on Power Systems, vol. 31, no. 3, pp. 1820–1830, May 2016, issn: 08858950. doi: 10.1109/TPWRS. 2015.2434833. [68] T. Halgren, “Merck Molecular Force Field. I. Basis, Form, Scope, Parametrization, and Performance of MMFF94”, Journal of Com- putational Chemistry, vol. 17, pp. 490–519, 1996. doi: 10.1002/ (SICI)1096-987X(199604)17:5/6<490::AID-JCC1>3.0.CO; 2-P. [69] N. Hansen, A. Auger, R. Ros, S. Finck, and P. Pošík, “Comparing results of 31 algorithms from the black-box optimization bench- marking BBOB-2009”, Proceedings of the 12th annual conference on Genetic and evolutionary computation (GECCO ’10), pp. 1689– 1696, 2010. doi: 10.1145/1830761.1830790. [70] E. Harder, W. Damm, J. Maple, C. Wu, M. Reboul, J. Y. Xiang, L. Wang, D. Lupyan, M. K. Dahlgren, J. L. Knight, J. W. Kaus, D. S. Cerutti, G. Krilov, W. L. Jorgensen, R. Abel, and R. A. Friesner, “OPLS3: A Force Field Providing Broad Coverage of Drug-like Small Molecules and Proteins”, Journal of Chemical Theory and Computation, vol. 12, no. 1, pp. 281–296, Jan. 2016, issn: 15499626. doi: 10.1021/acs.jctc.5b00864. [71] D. J. Hardy, Z. Wu, J. C. Phillips, J. E. Stone, R. D. Skeel, and K. Schulten, “Multilevel Summation Method for Electrostatic Force Evaluation”, Journal of Chemical Theory and Computation, vol. 11, no. 2, pp. 766–779, Dec. 2015, issn: 1549-9618. doi: 10. 1021/ct5009075. [72] W. J. Hehre, A Guide to Molecular Mechanics and Quantum Chemi- cal Calculations. Irvine, CA, USA: Wavefunction, 2003, pp. 1–796, isbn: 189066118X. [73] E. R. Henry, R. B. Best, and W. A. Eaton, “Comparing a simple theoretical model for protein folding with all-atom molecular dynamics simulations.”, Proceedings of the National Academy of

Sciences of the United States of America, vol. 110, no. 44, pp. 17 880– 5, 2013, issn: 1091-6490. doi: 10.1073/pnas.1317105110. [74] B. Hess, “P-LINCS: A Parallel Linear Constraint Solver for Molecular Simulation”, Journal of Chemical Theory and Compu- tation, vol. 4, no. 1, pp. 116–122, Jan. 2008, issn: 1549-9618. doi: 10.1021/ct700200b. [75] B. Hess, H. Bekker, H. Berendsen, and J. Fraaije, “LINCS: A linear constraint solver for molecular simulations”, Journal of Computational Chemistry, vol. 18, no. 12, pp. 1463–1472, Sep. 1997, issn: 0192-8651. doi: 10.1002/(SICI)1096-987X(199709)18: 12<1463::AID-JCC4>3.0.CO;2-H. [76] R. Hockney and J. Eastwood, Computer simulation using particles. Bristol, PA, USA: Taylor & Francis, Inc., 1988. [77] G. Horton, “The time-parallel Multigrid Method”, Communi- cations in Applied Numerical Methods, vol. 8, pp. 585–595, 1992. doi: 10.1002/cnm.1630080906. [78] J. Humeau, A. Liefooghe, E. Talbi, and S. Verel, “ParadisEO- MO: From fitness landscape analysis to efficient local search algorithms”, Journal of Heuristics, vol. 19, no. 6, pp. 881–915, Dec. 2013, issn: 13811231. doi: 10.1007/s10732-013-9228-8. [79] A. T. Iavarone and J. H. Parks, “Conformational change in un- solvated Trp-cage protein probed by fluorescence”, Journal of the American Chemical Society, vol. 127, no. 24, pp. 8606–8607, 2005, issn: 00027863. doi: 10.1021/ja051788u. [80] C. M. Ionescu, S. Geidl, R. Svobodová Vařeková, and J. Koča, “Rapid calculation of accurate atomic charges for proteins via the electronegativity equalization method”, Journal of Chemical Information and Modeling, vol. 53, no. 10, pp. 2548–2558, 2013. doi: 10.1021/ci400448n. [81] J. Jeffers, J. Reinders, and A. Sodani, Intel Xeon Phi Processor High Performance Programming. Elsevier, 2016, p. 632, isbn: 978- 0-12-809194-4. doi: http://dx.doi.org/10.1016/B978-0-12- 809194-4.00019-3. [82] F. Jensen, Introduction to computational chemistry, 2nd. Great Britain: John Wiley & Sons Ltd, 2007, pp. 1–620, isbn: 9780470011867. [83] A. Jewett, “Moltemplate Manual”, University of California, Santa Barbara, Tech. Rep., 2013, pp. 1–69.

[84] Z. Jinhua, T. Kleinöder, and J. Gasteiger, “Prediction of pKa values for aliphatic carboxylic acids and alcohols with empirical atomic charge descriptors”, Journal of Chemical Information and Modeling, vol. 46, no. 6, pp. 2256–2266, 2006, issn: 15499596. doi: 10.1021/ci060129d. [85] D. E. Jones, A. M. Lund, H. Ghandehari, and J. C. Facelli, “Molec- ular dynamics simulations in drug delivery research: Calcium chelation of G3.5 PAMAM dendrimers”, Cogent Chemistry, vol. 2, no. 1, J. Kongsted, Ed., Sep. 2016, issn: 2331-2009. doi: 10. 1080/23312009.2016.1229830. [86] R. K. Joshi, P. Carbone, F. C. Wang, V. G. Kravets, Y. Su, I. V. Grigorieva, H. A. Wu, A. K. Geim, and R. R. Nair, “Precise and ultrafast molecular sieving through graphene oxide mem- branes.”, Science (New York, N.Y.), vol. 343, no. 6172, pp. 752–4, 2014, issn: 1095-9203. doi: 10.1126/science.1245711. arXiv: 1401.3134. [87] J. Juraszek and P. G. Bolhuis, “Sampling the multiple folding mechanisms of Trp-cage in explicit solvent”, Proceedings of the National Academy of Sciences, vol. 103, no. 43, pp. 15 859–15 864, Oct. 2006, issn: 0027-8424. doi: 10.1073/pnas.0606692103. [88] W. Kabsch, “A solution for the best rotation to relate two sets of vectors”, Acta Crystallographica Section A, vol. 32, no. 5, pp. 922– 923, Sep. 1976, issn: 16005724. doi: 10.1107/S0567739476001873. arXiv: 05677394. [89] S. Kearsley, “On the orthogonal transformation used for struc- tural comparisons”, Acta Crystallographica Section A Foundations of Crystallography, vol. 45, no. 2, pp. 208–210, Feb. 1989, issn: 01087673. doi: 10.1107/S0108767388010128. [90] K. Kedem, L. P. Chew, and R. Elber, “Unit-vector RMS (URMS) as a tool to analyze molecular dynamics trajectories”, Proteins: Structure, Function and Genetics, vol. 37, no. 4, pp. 554–564, Dec. 1999, issn: 08873585. doi: 10.1002/(SICI)1097-0134(19991201) 37:4<554::AID-PROT6>3.0.CO;2-1. [91] A. Khan, “Scalable molecular dynamics simulation using FP- GAs and multicore processors”, PhD thesis, 2013, pp. 1–219. [92] P. Koehl, “Electrostatics calculations: latest methodological ad- vances.”, Current opinion in structural biology, vol. 16, no. 2,

pp. 142–151, Apr. 2006, issn: 0959-440X. doi: 10.1016/j.sbi. 2006.03.001. [93] A. Kreienbuehl, P.Benedusi, D. Ruprecht, and R. Krause, “Time- parallel gravitational collapse simulation”, Communications in Applied Mathematics and Computational Science, vol. 12, no. 1, pp. 109–128, May 2017, issn: 2157-5452. doi: 10.2140/camcos. 2017.12.109. [94] A. Křenek, T. Raček, and J. Pazúriková, “Direct SAXS Curve Fit- ting with an Ensable of Conformations”, in Molecular Machines, 2016. [95] A. Kumar and R. Purohit, “Use of Long Term Molecular Dy- namics Simulation in Predicting Cancer Associated SNPs”, PLoS Computational Biology, vol. 10, no. 4, A. D. MacKerell, Ed., e1003318, Apr. 2014, issn: 15537358. doi: 10.1371/journal. pcbi.1003318. [96] C. Kutzner, S. Páll, M. Fechner, A. Esztermann, and L. Bert, “Best bang for your buck : GPU nodes for GROMACS biomolec- ular simulations”, Journal of computational chemistry, vol. 36, no. 26, pp. 1–10, Oct. 2015, issn: 1096-987X. doi: 10 . 1002 / jcc . 24030. [97] A. Laio and F. Gervasio, “Metadynamics: a method to simu- late rare events and reconstruct the free energy in biophysics, chemistry and material science”, Reports on Progress in Physics, vol. 71, no. 12, 126601:1–126601:22, Dec. 2008, issn: 0034-4885. doi: 10.1088/0034-4885/71/12/126601. [98] A. Laio and M. Parrinello, “Escaping free-energy minima.”, Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 20, pp. 12 562–12 566, Oct. 2002, issn: 00278424. doi: 10.1073/pnas.202427399. arXiv: 0208352 []. [99] A. Laio, A. Rodriguez-Fortea, F. L. Gervasio, M. Ceccarelli, and M. Parrinello, “Assessing the accuracy of metadynamics”, Journal of Physical Chemistry B, vol. 109, no. 14, pp. 6714–6721, Apr. 2005, issn: 15206106. doi: 10.1021/jp045424k. [100] LAMMPS, “LAMMPS User Manual”, Sandia National Labora- tories, Tech. Rep., Mar. 2014, pp. 1–1284. [101] S. Larson, C. Snow, M. Shirts, and V. Pande, “Folding @ Home and Genome @ Home : Using distributed computing to tackle previously intractable problems in computational biology”,

Tech. Rep., 2002. [Online]. Available: http://arxiv.org/pdf/ 0901.0866v1.pdf. [102] D. Larsson, L. Liljas, and D. van der Spoel, “Virus capsid dis- solution studied by microsecond molecular dynamics simu- lations.”, PLoS computational biology, vol. 8, no. 5, e1002502:1– e1002502:8, Jan. 2012, issn: 1553-7358. doi: 10.1371/journal. pcbi.1002502. [103] A. Leach, Molecular modelling: principles and applications, 2nd. Dorchester: Pearson Education, 2001, pp. 1–773. [104] C. Lee and S. Ham, “Characterizing amyloid-beta protein mis- folding from molecular dynamics simulations with explicit wa- ter”, Journal of Computational Chemistry, vol. 32, no. 2, pp. 349– 355, Jan. 2011, issn: 1096-987X. doi: 10.1002/jcc.21628. [105] E. Lelarasmee, A. Ruehli, and A. Sangiovanni-Vincentelli, “The Waveform Relaxation Method for Time-Domain Analysis of Large Scale Integrated Circuits”, IEEE Transactions on Computer- Aided Design of Integrated Circuits and Systems, vol. 1, no. 3, pp. 131–145, Jul. 1982, issn: 0278-0070. doi: 10 . 1109 / TCAD . 1982.1270004. [106] E. Lewars, Computational Chemistry: Introduction to the Theory and Applications of Molecular and Quantum Mechanics, 2nd. Springer, 2010, pp. 1–680. [107] K. Lindorff-Larsen, P. Maragakis, S. Piana, and D. E. Shaw, “Picosecond to Millisecond Structural Dynamics in Human Ubiquitin”, Journal of Physical Chemistry B, vol. 120, no. 33, pp. 8313–8320, Aug. 2016, issn: 15205207. doi: 10.1021/acs. jpcb.6b02024. [108] J. Lions, Y. Maday, and G. Turinici, “Résolution d’EDP par un schéma en temps «pararéel »”, Comptes Rendus de l’Académie des Sciences - Series I - Mathematics, vol. 332, no. 7, pp. 661–668, Apr. 2001, issn: 07644442. doi: 10.1016/S0764-4442(00)01793-6. [109] A. Lodola, D. Branduardi, M. de Vivo, L. Capoferri, M. Mor, D. Piomelli, and A. Cavalli, “A catalytic mechanism for cysteine N- terminal nucleophile hydrolases, as revealed by free energy sim- ulations”, PLoS ONE, vol. 7, no. 2, M. H. Todd, Ed., e32397, Feb. 2012, issn: 19326203. doi: 10.1371/journal.pone.0032397.

[110] A. D. Mackerell, Empirical force fields for biological macromolecules: Overview and issues, Oct. 2004. doi: 10.1002/jcc.20082. [On- line]. Available: http://doi.wiley.com/10.1002/jcc.20082. [111] D. MacKerell, N. Banavali, and N. Foloppe, “Development and current status of the CHARMM force field for nucleic acids.”, Biopolymers, vol. 56, no. 4, pp. 257–265, 2001, issn: 0006-3525. doi: 10.1002/1097-0282(2000)56:4<257::AID-BIP10029>3. 0.CO;2-W. [112] Y. Maday, “The parareal in time algorithm”, Technical Report R08030, Université Pierre et Marie Curie, pp. 1–24, 2008. [113] Y. Maday and G. Turinici, “The Parareal in Time Iterative Solver: a Further Direction to Parallel Implementation”, in Domain Decomposition Methods in Science and Engineering, 2005, pp. 441– 448. doi: 10.1007/3-540-26825-1_45. [114] Y. Maday and G. Turinici, “Parallel in time algorithms for quan- tum control: Parareal time discretization scheme”, International Journal of Quantum Chemistry, vol. 93, no. 3, pp. 223–228, 2003, issn: 00207608. doi: 10.1002/qua.10554. [115] J. Makino, “GRAPE-8 – An Accelerator for Gravitational N - body Simulation with 20.5Gflops / W Performance”, in Proceed- ings of Supercomputing, 2012, 104:1–104:10, isbn: 9781467308069. doi: 10.1109/SC.2012.60. [116] G. Martínez-Rosell, T. Giorgino, M. J. Harvey, and G. de Fab- ritiis, “Drug Discovery and Molecular Dynamics: Methods, Applications and Perspective Beyond the Second Timescale”, Current Topics in Medicinal Chemistry, Apr. 2017, issn: 1568-0266. doi: 10.2174/1568026617666170414142549. [117] C. Matthews, “The error in the invariant measure of numerical discretization schemes for canonical sampling of molecular dynamics”, PhD thesis, University of Edinburgh, 2013, pp. 1– 164. [118] J. R. McClean, J. A. Parkhill, and A. Aspuru-Guzik, “Feynman’s clock, a new variational principle, and parallel-in-time quan- tum dynamics.”, Proceedings of the National Academy of Sciences of the United States of America, vol. 110, no. 41, E3901–9, Oct. 2013, issn: 1091-6490. doi: 10.1073/pnas.1308069110. arXiv: 1301.2326.

[119] G. Menegon, K. Shimizu, J. P. S. Farah, L. G. Dias, and H. Chaimovich, “Parameterization of the electronegativity equal- ization method based on the charge model 1”, Physical Chem- istry Chemical Physics, vol. 4, no. 24, pp. 5933–5936, Nov. 2002, issn: 14639076. doi: 10.1039/b206991a. [120] H. Merlitz and W. Wenzel, “Comparison of stochastic optimiza- tion methods for receptor-ligand docking”, Chemical Physics Letters, vol. 362, no. 3-4, pp. 271–277, Aug. 2002, issn: 00092614. doi: 10.1016/S0009-2614(02)01035-7. [121] O. Mersmann, B. Bischl, H. Trautmann, M. Preuss, C. Weihs, and G. Rudolph, “Exploratory Landscape Analysis”, Gecco- 2011: Proceedings of the 13th Annual Genetic and Evolutionary Com- putation Conference, pp. 829–836, 2011. doi: 10.1145/2001576. 2001690. [122] W. Miranker and W. Liniger, “Parallel methods for the numeri- cal integration of ordinary differential equations”, Mathematics of Computation, vol. 21, no. 99, pp. 303–320, 1967. doi: 10.1090/ S0025-5718-1967-0223106-8. [123] W. J. Mortier, S. K. Ghosh, and S. Shankar, “Electronegativity- equalization method for the calculation of atomic charges in molecules”, Journal of the American Chemical Society, vol. 108, no. 15, pp. 4315–4320, Jul. 1986. doi: 10.1021/ja00275a013. [124] R. S. Mulliken, “Electronic Population Analysis on LCAO-MO Molecular Wave Functions. IV. Bonding and Antibonding in LCAO and Valence-Bond Theories”, The Journal of Chemical Physics, vol. 23, no. 10, p. 1833, Oct. 1955, issn: 00219606. doi: 10.1063/1.1740588. [125] P. J. Needham, A. Bhuiyan, and R. C. Walker, Extension of the AMBER molecular dynamics software to Intel’s Many Integrated Core (MIC) architecture, Apr. 2016. doi: 10.1016/j.cpc.2015. 12.025. [126] J. W. Neidigh, R. M. Fesinmeyer, and N. H. Andersen, “Design- ing a 20-residue protein”, Nature Structural Biology, vol. 9, no. 6, pp. 425–430, Jun. 2002, issn: 10728368. doi: 10.1038/nsb798. [127] J. A. Nelder and R. Mead, “A Simplex Method for Function Minimization”, The Computer Journal, vol. 7, no. 4, pp. 308–313, Jan. 1965. doi: 10.1093/comjnl/7.4.308.

[128] A. S. Nielsen, “Feasibility study of the parareal algorithm”, master’s thesis, Technical University of Denmark, 2012, p. 128. [Online]. Available: http://www2.imm.dtu.dk/pubdb/views/ edoc_download.php/6482/pdf/imm6/482.pdf. [129] C. Niethammer, S. Becker, M. Bernreuther, M. Buchholz, W. Eckhardt, A. Heinecke, S. Werth, H. J. Bungartz, C. W. Glass, H. Hasse, J. Vrabec, and M. Horsch, “Ls1 mardyn: The massively parallel molecular dynamics code for large systems”, Journal of Chemical Theory and Computation, vol. 10, no. 10, pp. 4455– 4464, Oct. 2014, issn: 15499626. doi: 10.1021/ct500169q. arXiv: 1408.4599. [130] J. Nievergelt, “Parallel methods for integrating ordinary dif- ferential equations”, Communications of the ACM, vol. 7, no. 12, pp. 731–733, 1964. doi: 10.1145/355588.365137. [131] K. T. No, J. A. Grant, M. S. Jhon, and H. A. S. J, “Determlnatlon of Net Atomic Charges Using a Modified Partial Equallzation of Orbltal Electronegativlty Method . 2 . Application to Ionic and Aromatic Molecules as Models for Polypeptides”, The Journal of Physical Chemistry, vol. 6, no. 6, pp. 4732–4739, May 1990, issn: 0022-3654. doi: 10.1021/j100374a066. [132] F. Noé, “Beating the millisecond barrier in molecular dynamics simulations”, Biophysical Journal, vol. 108, no. 2, pp. 228–229, 2015, issn: 15420086. doi: 10.1016/j.bpj.2014.11.3477. [133] A. Onufriev, “Implicit Solvent Models in Molecular Dynamics Simulations: A Brief Overview”, in Annual Reports in Computa- tional Chemistry, vol. 4, 2008, pp. 125–137, isbn: 9780444532503. doi: 10.1016/S1574-1400(08)00007-8. [134] Y. Ouyang, F. Ye, and Y. Liang, “A modified electronegativity equalization method for fast and accurate calculation of atomic charges in large biological molecules”, Physical Chemistry Chemi- cal Physics, vol. 11, no. 29, p. 6082, 2009. doi: 10.1039/b821696g. [135] F. Palazzesi, M. K. Prakash, M. Bonomi, and A. Barducci, “Ac- curacy of current all-atom force-fields in modeling protein disordered states”, Journal of Chemical Theory and Computation, vol. 11, no. 1, pp. 2–7, Jan. 2015, issn: 15499626. doi: 10.1021/ ct500718s. [136] S. Páll, M. J. Abraham, C. Kutzner, B. Hess, and E. Lindahl, “Tackling exascale software challenges in molecular dynam-

ics simulations with GROMACS”, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8759, pp. 3–27, 2015, issn: 16113349. doi: 10.1007/978-3-319-15976-8_1. arXiv: 1506.00716. [137] A. C. Pan, T. M. Weinreich, Y. Shan, D. P. Scarpazza, and D. E. Shaw, “Assessing the accuracy of two enhanced sampling meth- ods using egfr kinase transition pathways: The influence of collective variable choice”, Journal of Chemical Theory and Com- putation, vol. 10, no. 7, pp. 2860–2865, Jul. 2014, issn: 15499626. doi: 10.1021/ct500223p. [138] D. Pan, Y. Niu, W. Xue, Q. Bai, H. Liu, and X. Yao, “Compu- tational study on the drug resistance mechanism of hepatitis C virus NS5B RNA-dependent RNA polymerase mutants to BMS-791325 by molecular dynamics simulation and binding free energy calculations”, Chemometrics and Intelligent Laboratory Systems, vol. 154, pp. 185–193, May 2016, issn: 18733239. doi: 10.1016/j.chemolab.2016.03.015. [139] H. Park, J. Lee, and S. Lee, “Critical assessment of the automated AutoDock as a new docking tool for virtual screening”, Proteins, vol. 65, no. 3, pp. 549–54, Nov. 2006. doi: 10.1002/prot.21183. [140] T. A. Pascal, N. Karasawa, and W. A. Goddard, “Quantum me- chanics based force field for carbon (QMFF-Cx) validated to reproduce the mechanical and thermodynamics properties of graphite”, Journal of Chemical Physics, vol. 133, no. 13, p. 134 114, Oct. 2010, issn: 00219606. doi: 10.1063/1.3456543. [141] M. Pasi, J. H. Maddocks, D. Beveridge, T. C. Bishop, D. A. Case, T. Cheatham, P. D. Dans, B. Jayaram, F. Lankas, C. Laughton, J. Mitchell, R. Osman, M. Orozco, A. Pérez, D. Petkevičiute, N. Spackova, J. Sponer, K. Zakrzewska, and R. Lavery, “µABC: A systematic microsecond molecular dynamics study of tetranu- cleotide sequence effects in B-DNA”, Nucleic Acids Research, vol. 42, no. 19, pp. 12 272–12 283, Oct. 2014, issn: 13624962. doi: 10.1093/nar/gku855. [142] W. Pazner and P.-O. Persson, “Stage-parallel fully implicit Runge–Kutta solvers for discontinuous Galerkin fluid simulations”, Journal of Computational Physics, vol. 335, pp. 700–717, Apr. 2017, issn: 00219991. doi: 10.1016/j.jcp.2017.01.050.

[143] J. Pazúriková, A. Křenek, and L. Matyska, “Guided optimiza- tion method for fast and accurate atomic charges computation”, in 30th European Simulation and Modelling Conference, J. Évora- Gómez and J. J. Hernández-Cabrera, Eds., Ghent, Belgium: Reproduct NV, 2016, pp. 267–274. [Online]. Available: https: //www.eurosis.org/cms/index.php?q=taxonomy/term/58. [144] J. Pazúriková, A. Křenek, V. Spiwok, and M. Šimková, “Reduc- ing the number of mean-square deviation calculations with floating close structure in metadynamics”, Journal of Chemical Physics, vol. 146, no. 11, p. 115 101, Mar. 2017, issn: 00219606. doi: 10.1063/1.4978296. [Online]. Available: http://aip. scitation.org/doi/10.1063/1.4978296. [145] J. Pazúriková and L. Matyska, “Convergence of Parareal Algo- rithm Applied on Molecular Dynamics Simulations”, in Pro- ceedings of MEMICS, H. Petr, D. Zdeněk, J. Jiří, K. Jan, K. Jan, P. Matula, and K. Pala, Eds., 2014, pp. 101–111. [146] J. Pazúriková, J. Oľha, A. Křenek, and V. Spiwok, “Acceleration of Mean Square Deviation Calculations with Floating Close Structure in Metadynamics Simulations”, Journal of Computa- tional Science, submitted, 2017. [147] PDB. [Online]. Available: http://www.pdb.org. [148] A. Perez, J. A. Morrone, C. Simmerling, and K. A. Dill, “Ad- vances in free-energy-based simulations of protein folding and ligand binding”, Current Opinion in Structural Biology, vol. 36, pp. 25–31, 2016, issn: 0959440X. doi: 10.1016/j.sbi.2015.12. 002. [149] D. Perez, E. D. Cubuk, A. Waterland, E. Kaxiras, and A. F. Voter, “Long-Time Dynamics through Parallel Trajectory Splicing”, Journal of Chemical Theory and Computation, vol. 12, no. 1, pp. 18– 28, Jan. 2016, issn: 15499626. doi: 10.1021/acs.jctc.5b00916. [150] J. R. Perilla, J. A. Hadden, B. C. Goh, C. G. Mayne, and K. Schulten, “All-Atom Molecular Dynamics of Virus Capsids as Drug Targets”, Journal of Physical Chemistry Letters, vol. 7, no. 10, pp. 1836–1844, May 2016, issn: 19487185. doi: 10.1021/acs. jpclett.6b00517. [151] D. Petrov and B. Zagrovic, “Are Current Atomistic Force Fields Accurate Enough to Study Proteins in Crowded Environments?”,

PLoS Computational Biology, vol. 10, no. 5, e1003638, May 2014, issn: 15537358. doi: 10.1371/journal.pcbi.1003638. [152] PFASST++. [Online]. Available: https://github.com/Parallel- in-Time/PFASST. [153] J. Phillips, R. Braun, W. Wang, J. Gumbart, E. Tajkhorshid, E. Villa, C. Chipot, R. Skeel, L. Kalé, and K. Schulten, “Scalable molecular dynamics with NAMD”, Journal of Computational Chemistry, vol. 26, no. 16, pp. 1781–1802, Dec. 2005. doi: 10. 1002/jcc.20289. [154] S. Piana, J. L. Klepeis, and D. E. Shaw, Assessing the accuracy of physical models used in protein-folding simulations: Quantitative evidence from long molecular dynamics simulations, Feb. 2014. doi: 10.1016/j.sbi.2013.12.006. arXiv: NIHMS150003. [155] E. Picard, “Sur l’application des méthodes d’approximations successives à l’étude de certaines équations différentielles or- dinaires”, Journal de Mathématiques Pures et Appliquées, vol. 9, pp. 217–272, 1893, issn: 0021-7874. [Online]. Available: https: //eudml.org/doc/234079. [156] L. Pierce, R. Salomon-Ferrer, C. de Oliveira, J. McCammon, and R. Walker, “Routine Access to Millisecond Time Scale Events with Accelerated Molecular Dynamics.”, Journal of chemical the- ory and computation, vol. 8, no. 9, pp. 2997–3002, Sep. 2012, issn: 1549-9626. doi: 10.1021/ct300284c. [157] E. Pitzer and M. Affenzeller, “A Comprehensive Survey on Fitness Landscape Analysis”, in Recent Advances in Intelligent Engineering Systems, Springer Berlin Heidelberg, 2012, pp. 161– 191, isbn: 978-3-642-23228-2. doi: 10.1007/978-3-642-23229- 9_8. [158] S. Plimpton, “Fast Parallel Algorithms for Short Range Molecu- lar Dynamics”, Journal of Computational Physics, vol. 117, pp. 1– 19, 1995. doi: 10.1006/jcph.1995.1039. [159] ——, LAMMPS web page, 2003. [Online]. Available: http:// lammps.sandia.gov. [160] M. Powell, “Least Frobenius norm updating of quadratic mod- els that satisfy interpolation conditions”, Mathematical Program- ming, vol. Series B 1, no. 1, pp. 183–215, May 2004. doi: 10.1007/ s10107-003-0490-7.

[161] ——, NEWUOA software, 2004. [Online]. Available: http://mat. uc.pt/~zhang/software.html#powell_software. [162] E. Proctor, F. Ding, and N. Dokholyan, “Discrete molecular dynamics”, Wiley Interdisciplinary Reviews: Computational Molec- ular Science, vol. 1, no. 1, pp. 80–92, Jan. 2011, issn: 17590876. doi: 10.1002/wcms.4. [163] S. Pronk, P.Larsson, I. Pouya, G. Bowman, I. Haque, K. Beauchamp, B. Hess, V. Pande, P. Kasson, and E. Lindahl, “Copernicus: a new paradigm for parallel adaptive molecular dynamics”, in Proceedings of Supercomputing, ser. SC ’11, New York, NY, USA: ACM, 2011, 60:1–60:10, isbn: 978-1-4503-0771-0. doi: 10.1145/ 2063384.2063465. [164] S. Pronk, I. Pouya, M. Lundborg, G. Rotskoff, B. Wesén, P. M. Kasson, and E. Lindahl, “Molecular Simulation Workflows as Parallel Algorithms: The Execution Engine of Copernicus, a Distributed High-Performance Computing Platform”, Journal of Chemical Theory and Computation, p. 150 512 121 147 006, 2015, issn: 1549-9618. doi: 10.1021/acs.jctc.5b00234. [165] D. Provasi, A. Negri, B. S. Coller, and M. Filizola, “Talin-driven inside-out activation mechanism of platelet αIIbβ3 integrin probed by multimicrosecond, all-atom molecular dynamics simulations”, Proteins: Structure, Function, and Bioinformatics, vol. 82, no. 12, pp. 3231–3240, Dec. 2014, issn: 08873585. doi: 10.1002/prot.24540. [166] T. Raček, J. Pazúriková, R. Svobodová Vařeková, S. Geidl, F. Falginella, V. Horský, V. Hejret, and J. Koča, “NEEMP - soft- ware for validation, accurate calculation and fast parametriza- tion of EEM charges”, Journal of Cheminformatics, vol. 8, no. 57, pp. 1–14, 2016. doi: 10.1186/s13321-016-0171-1. [Online]. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/ PMC5067907/. [167] A. Randles, “Modeling Cardiovascular Hemodynamics Using the Lattice Boltzmann Method on Massively Parallel Supercom- puters”, PhD thesis, Harvard University, 2013, pp. 1–245. [168] A. Randles and E. Kaxiras, “Parallel in time approximation of the lattice Boltzmann method for laminar flows”, Journal of Computational Physics, vol. 270, pp. 577–586, Aug. 2014, issn: 10902716. doi: 10.1016/j.jcp.2014.04.006.

[169] A. K. Rappé, C. J. Casewit, K. S. Colwell, W. A. Goddard, and W. M. Skiff, “UFF, a Full Periodic Table Force Field for Molecular Mechanics and Molecular Dynamics Simulations”, Journal of American Chemical Society, vol. 2, no. 114, pp. 10 024–10 035, 1992. doi: 10.1021/ja00051a040. [170] A. K. Rappe and W. A. Goddard, “Charge equilibration for molecular dynamics simulations”, The Journal of Physical Chem- istry, vol. 95, no. 8, pp. 3358–3363, Apr. 1991, issn: 0022-3654. doi: 10.1021/j100161a070. [171] A. E. Reed, R. B. Weinstock, and F. Weinhold, “Natural popu- lation analysis”, The Journal of Chemical Physics, vol. 83, no. 2, p. 735, 1985. doi: 10.1063/1.449486. [172] D. Richards, J. Glosli, B. Chan, M. Dorr, E. Draeger, J. Fattebert, W. Krauss, T. Spelce, F. Streitz, M. Surh, and J. Gunnels, “Be- yond homogeneous decomposition: scaling long-range forces on Massively Parallel Systems”, in Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, ser. SC ’09, New York, NY, USA: ACM, 2009, 60:1–60:12, isbn: 978-1-60558-744-8. doi: 10.1145/1654059.1654121. [173] F. Rothlauf, Design of Modern Heuristics: Principles and Applica- tions, ser. Natural Computing Series. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, isbn: 978-3-540-72961-7. doi: 10.1007/ 978-3-540-72962-4. [174] C. Sagui and T. Darden, “Multigrid methods for classical molec- ular dynamics simulations of biomolecules”, Journal of Chemical Physics, vol. 114, pp. 6578–6591, 2001. doi: 10.1063/1.1352646. [175] R. T. Sanderson, “An Interpretation of Bond Lengths and a Classification of Bonds”, Science, vol. 114, no. 2973, pp. 670–672, Dec. 1951, issn: 0036-8075. doi: 10.1126/science.114.2973. 670. [176] C. R. Schwantes, D. Shukla, and V. S. Pande, “Markov state mod- els and tICA reveal a nonnative folding nucleus in simulations of NuG2”, Biophysical Journal, vol. 110, no. 8, pp. 1716–1719, Apr. 2016, issn: 15420086. doi: 10.1016/j.bpj.2016.03.026. [177] D. E. Shaw, R. O. Dror, J. K. Salmon, J. P. Grossman, K. M. Mackenzie, J. A. Bank, C. Young, M. M. Deneroff, B. Batson, K. J. Bowers, E. Chow, M. P. Eastwood, D. J. Ierardi, J. L. Klepeis, J. S. Kuskin, R. H. Larson, K. Lindorff-Larsen, P. Maragakis, M. A.

Moraes, S. Piana, Y. Shan, and B. Towles, “Millisecond-scale molecular dynamics simulations on Anton”, in Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, ser. SC ’09, New York, NY, USA: ACM, 2009, 39:1– 39:11, isbn: 978-1-60558-744-8. doi: 10.1145/1654059.1654099. [178] D. E. Shaw, J. P. Grossman, J. A. Bank, B. Batson, J. A. Butts, J. C. Chao, M. M. Deneroff, R. O. Dror, A. Even, C. H. Fenton, A. Forte, J. Gagliardo, G. Gill, B. Greskamp, C. R. Ho, D. J. Ierardi, L. Iserovich, J. S. Kuskin, R. H. Larson, T. Layman, L. S. Lee, A. K. Lerer, C. Li, D. Killebrew, K. M. Mackenzie, S. Y. H. Mok, M. A. Moraes, R. Mueller, L. J. Nociolo, J. L. Peticolas, T. Quan, D. Ramot, J. K. Salmon, D. P. Scarpazza, U. Ben Schafer, N. Sid- dique, C. W. Snyder, J. Spengler, P. T. P. Tang, M. Theobald, H. Toma, B. Towles, B. Vitale, S. C. Wang, and C. Young, “Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer”, in International Conference for High Performance Computing, Network- ing, Storage and Analysis, SC, vol. 2015-Janua, IEEE, Nov. 2014, pp. 41–53, isbn: 978-1-4799-5500-8. doi: 10.1109/SC.2014.9. [179] D. Shaw, M. Deneroff, R. Dror, J. Kuskin, R. Larson, J. Salmon, C. Young, B. Batson, K. Bowers, J. Chao, M. Eastwood, J. Gagliardo, J. Grossman, R. Ho, D. Ierardi, I. Kolossváry, J. Klepeis, T. Lay- man, C. McLeavey, M. Moraes, R. Mueller, E. Priest, Y. Shan, J. Spengler, M. Theobald, B. Towles, and S. Wang, “Anton, a special-purpose machine for molecular dynamics simulation”, Proceedings of Annual International Symposium on Computer Ar- chitecture, vol. 35, no. 2, pp. 1–12, Jun. 2007, issn: 0163-5964. doi: 10.1145/1273440.1250664. [180] Y. Shibuta, S. Sakane, E. Miyoshi, S. Okita, T. Takaki, and M. Ohno, “Heterogeneity in homogeneous nucleation from billion- atom molecular dynamics simulation of solidification of pure metal”, Nature Communications, vol. 8, no. 1, p. 10, Apr. 2017, issn: 2041-1723. doi: 10.1038/s41467-017-00017-5. [181] D.-A. Silva, D. R. Weiss, F. Pardo Avila, L.-T. Da, M. Levitt, D. Wang, and X. Huang, “Millisecond dynamics of RNA poly- merase II translocation at atomic resolution.”, Proceedings of the National Academy of Sciences of the United States of America,

vol. 111, no. 21, pp. 7665–70, May 2014, issn: 1091-6490. doi: 10.1073/pnas.1315751111. [182] T. Simonson, G. Archontis, and M. Karplus, “Free energy sim- ulations come of age: Protein-ligand recognition”, Accounts of Chemical Research, vol. 35, no. 6, pp. 430–437, 2002, issn: 00014842. doi: 10.1021/ar010030m. [183] R. Skeel, I. Tezcan, and D. Hardy, “Multiple grid methods for classical molecular dynamics.”, Journal of computational chem- istry, vol. 23, no. 6, pp. 673–84, Apr. 2002, issn: 0192-8651. doi: 10.1002/jcc.10072. [184] J. C. Slater, “A simplification of the Hartree-Fock method”, Phys- ical Review, vol. 81, no. 3, pp. 385–390, Feb. 1951, issn: 0031899X. doi: 10.1103/PhysRev.81.385. [185] R. Speck, D. Ruprecht, R. Krause, M. Emmett, M. Minion, M. Winkel, and P. Gibbon, “A massively space-time parallel N- body solver”, in Proceedings of Supercomputing, 2012, 92:1–92:11. [186] R. Speck, D. Ruprecht, M. Emmett, M. Minion, M. Bolten, and R. Krause, “A multi-level spectral deferred correction method”, p. 2013, Jul. 2013. doi: 10.1007/s10543-014-0517-x. arXiv: 1307.1312. [187] R. Speck, D. Ruprecht, R. Krause, M. Emmett, M. Minion, M. Winkel, and P. Gibbon, “Integrating an N-body problem with SDC and PFASST”, in 21st International Conference on Domain Decomposition Methods, 2012. doi: 10.1007/978-3-319-05789- 7_61. [188] V. Spiwok and B. Králová, “Metadynamics in the conforma- tional space nonlinearly dimensionally reduced by Isomap”, Journal of Chemical Physics, vol. 135, no. 22, p. 224 504, Dec. 2011, issn: 00219606. doi: 10.1063/1.3660208. [189] V. Spiwok, Z. Sucur, and P. Hosek, Enhanced sampling tech- niques in biomolecular simulations, Nov. 2015. doi: 10.1016/j. biotechadv.2014.11.011. [190] G. Staff and E. Rønquist, “Stability of the parareal algorithm”, in Proceedings of International Domain Decomposition Conference, Springer, 2003, pp. 449–456. [Online]. Available: http://citeseerx. ist.psu.edu/viewdoc/summary?doi=10.1.1.153.6393.

[191] G. Staff, “Convergence and Stability of the Parareal algorithm: A numerical and theoretical investigation”, PhD thesis, Norwe- gian University of Science and Technology, Trondheim, Norway, 2003. [192] J. Stone, D. Hardy, I. Ufimtsev,and K. Schulten, “GPU-accelerated molecular modeling coming of age.”, Journal of Molecular Graph- ics & Modelling, vol. 29, no. 2, pp. 116–125, Sep. 2010, issn: 1873- 4243. doi: 10.1016/j.jmgm.2010.06.010. [193] Y. Sugita and Y. Okamoto, “Replica exchange molecular dy- namics method for protein folding”, Chemical Physics Letters, vol. 314, no. November, pp. 141–151, Jan. 1999, issn: 1064-3745. doi: 10.1016/S0009-2614(99)01123-9. [194] Y. Sun, S. K. Halgamuge, and M. Kirley, “On the Selection of Fitness Landscape Analysis Metrics for Continuous Optimiza- tion Problems”, in Information and Automation for Sustainability (ICIAfS), 2014 7th International Conference, Colombo, 2014, isbn: 9781479945986. doi: 10.1109/ICIAFS.2014.7069635. [195] G. Sutmann and B. Steffen, “A particle-particle particle-multigrid method for long-range interactions in molecular simulations”, Computer Physics Communications, vol. 169, no. 1-3, pp. 343–346, 2005. doi: 10.1016/j.cpc.2005.03.077. [196] W. Swope, H. Andersen, P. Berens, and K. Wilson, “A computer simulation method for the calculation of equilibrium constants for the formation of physical clusters of molecules: Applica- tion to small water clusters”, Journal of Chemical Physics, no. 76, pp. 637–648, 1982. doi: 10.1063/1.442716. [197] J. B. Tenenbaum, V. de Silva, and J. C. Langford, “A global geometric framework for nonlinear dimensionality reduction.”, Science (New York, N.Y.), vol. 290, no. 5500, pp. 2319–23, Dec. 2000, issn: 0036-8075. doi: 10.1126/science.290.5500.2319. [198] A. Tervo, T. Rönkkö, T. Nyrönen, and A. Poso, “BRUTUS: op- timization of a grid-based similarity function for rigid-body molecular superposition. 1. Alignment and virtual screening applications.”, Journal of Medicinal Chemistry, vol. 48, no. 12, pp. 4076–86, Jun. 2005. doi: 10.1021/jm049123a. [199] P.Tiwary and A. van de Walle, “A review of enhanced sampling approaches for accelerated molecular dynamics”, in Springer Se- ries in Materials Science, vol. 245, Springer International Publish-

ing, 2016, pp. 195–221, isbn: 978-3-319-33480-6. doi: 10.1007/ 978-3-319-33480-6_6. [200] G. Torrie and J. Valleau, “Nonphysical Sampling Distributions in Monte Carlo Free-Energy Estimation: Umbrella Sampling”, Journal of Computational Physics, vol. 23, pp. 187–199, 1977. doi: 10.1016/0021-9991(77)90121-8. [201] G. Tribello, M. Bonomi, D. Branduardi, C. Camilloni, and G. Bussi, “PLUMED 2: New feathers for an old bird”, Computer Physics Communications, vol. 185, no. 2, pp. 604–613, 2014, issn: 00104655. doi: 10.1016/j.cpc.2013.09.018. arXiv: 1310.0980. [202] D. M. F. Van Aalten, B. L. de Groot, J. B. C. Findlay, H. J. C. Berendsen, and A. Amadei, “A comparison of techniques for calculating protein essential dynamics.”, Journal of Computa- tional Chemistry, vol. 18, no. 2, pp. 169–181, Jan. 1997, issn: 0192- 8651. doi: 10.1002/(SICI)1096-987X(19970130)18:2<169:: AID-JCC3>3.0.CO;2-T. [203] S. Vandewalle and E. van de Velde, “Space-time concurrent multigrid waveform relaxation”, Annals of Numerical Mathemat- ics, vol. 1, no. 1-4, pp. 335–346, 1994. doi: 10.1007/BF01934186. [204] R. Vařeková, S. Geidl, C.-M. Ionescu, O. Skřehota, T. Bouchal, D. Sehnal, R. Abagyan, and J. Koča, “Predicting pKa values from EEM atomic charges”, Journal of Cheminformatics, vol. 5, no. 1, p. 18, 2013. doi: 10.1186/1758-2946-5-18. [205] H. Waisman and J. Fish, “A space – time multilevel method for molecular dynamics simulations”, Computational Methods in Applied Mechanics and Engineering, vol. 195, pp. 6542–6559, 2006. doi: 10.1016/j.cma.2006.02.006. [206] M. E. Wall, A. H. Van Benschoten, N. K. Sauter, P. D. Adams, J. S. Fraser, and T. C. Terwilliger, “Conformational dynamics of a crystalline protein from microsecond-scale molecular dynamics simulations and diffuse X-ray scattering”, Proceedings of the National Academy of Sciences, vol. 111, no. 50, pp. 17 887–17 892, 2014, issn: 0027-8424. doi: 10.1073/pnas.1416744111. [207] J. Wang, R. M. Wolf, J. W. Caldwell, P. A. Kollman, and D. A. Case, “Development and testing of a general Amber force field”, Journal of Computational Chemistry, vol. 25, no. 9, pp. 1157–1174, Jul. 2004, issn: 01928651. doi: 10.1002/jcc.20035. arXiv: z0024.

[208] Q. Wang, S. A. Gomez, P. J. Blonigan, A. L. Gregory, and E. Y. Qian, “Towards scalable parallel-in-time turbulent flow simu- lations”, Physics of Fluids, vol. 25, no. 11, p. 110 818, Nov. 2013, issn: 10706631. doi: 10.1063/1.4819390. arXiv: 1211.2437. [209] A. Warshel and M. Levitt, “Theoretical studies of enzymic re- actions: Dielectric, electrostatic and steric stabilization of the carbonium ion in the reaction of lysozyme”, Journal of Molecular Biology, vol. 103, no. 2, pp. 227–249, May 1976, issn: 00222836. doi: 10.1016/0022-2836(76)90311-9. [210] J. Watson, “An introduction to fitness landscape analysis and cost models for local search”, Handbook of Metaheuristics, pp. 1– 26, 2010. doi: 10.1007/978-1-4419-1665-5_20. [211] M. Winkel, R. Speck, H. Hübner, L. Arnold, R. Krause, and P. Gibbon, “A massively parallel, multi-disciplinary Barnes- Hut tree code for extreme-scale N-body simulations”, Computer Physics Communications, vol. 183, no. 4, pp. 880–889, Apr. 2012, issn: 00104655. doi: 10.1016/j.cpc.2011.12.013. [212] M. Winkel, R. Speck, and D. Ruprecht, “A high-order Boris integrator”, Journal of Computational Physics, vol. 295, pp. 456– 474, Aug. 2015, issn: 00219991. doi: 10.1016/j.jcp.2015.04. 022. arXiv: arXiv:1409.5677v1. [213] Z.-Z. Yang and C.-S. Wang, “AtomBond Electronegativity Equal- ization Method. 1. Calculation of the Charge Distribution in Large Molecules”, The Journal of Physical Chemistry A, vol. 101, no. 35, pp. 6315–6321, 1997, issn: 1089-5639. doi: 10 . 1021 / jp9711048. [214] J. Yin, A. T. Fenley, N. M. Henriksen, and M. K. Gilson, “Toward Improved Force-Field Accuracy through Sensitivity Analysis of Host-Guest Binding Thermodynamics”, Journal of Physical Chemistry B, vol. 119, no. 32, pp. 10 145–10 155, Aug. 2015, issn: 15205207. doi: 10.1021/acs.jpcb.5b04262. [215] Y. Yu, A. Srinivasan, and N. Chandra, “Scalable Time-Parallelization of Molecular Dynamics Simulations in Nano Mechanics”, in Conference on Parallel Processing, IEEE, 2006, pp. 119–126, isbn: 0-7695-2636-5. doi: 10.1109/ICPP.2006.64. [216] G. Zhao, J. Perilla, E. Yufenyuy, X. Meng, B. Chen, J. Ning, J. Ahn, A. Gronenborn, K. Schulten, C. Aiken, and P. Zhang, “Ma- ture HIV-1 capsid structure by cryo-electron microscopy and all-

atom molecular dynamics.”, Nature, vol. 497, no. 7451, pp. 643–646, May 2013, issn: 1476-4687. doi: 10.1038/nature12162.

A Author’s Publications

This appendix lists the publications I have co-authored. For each of them, I state its connection to the problems solved in this dissertation and describe my contribution.

[166] T. Raček, J. Pazúriková, R. Svobodová Vařeková, S. Geidl, F. Falginella, V. Horský, V. Hejret, and J. Koča, “NEEMP - software for validation, accurate calculation and fast parametrization of EEM charges”, Journal of Cheminformatics, vol. 8, no. 57, pp. 1–14, 2016. doi: 10.1186/s13321-016-0171-1. [Online]. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5067907/
Author’s contribution: 10%. The paper relates to the third problem of this dissertation. It presents a tool for the calculation, parametrization, and validation of partial atomic charges. I developed and evaluated DEMIN, a novel parametrization method that combines differential evolution with local minimization. I also contributed to the writing of the paper.
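To make the idea concrete, the following minimal Python/SciPy sketch shows the general pattern of coupling a global differential-evolution search with a subsequent local minimization, which is the combination DEMIN builds on. It is only an illustration under simplifying assumptions, not the NEEMP implementation: the objective function is a hypothetical stand-in for the real EEM parametrization error, and the bounds are arbitrary.

import numpy as np
from scipy.optimize import differential_evolution, minimize

def objective(params):
    # Hypothetical placeholder for the parametrization error; in the real
    # method this would measure the deviation of EEM charges from
    # reference quantum-mechanical charges.
    return float(np.sum(params ** 2))

bounds = [(-5.0, 5.0)] * 4   # toy parameter bounds
# Global stage: differential evolution, without the built-in final polish.
de_result = differential_evolution(objective, bounds, polish=False, seed=0)
# Local stage: refine the best candidate found by the global search.
result = minimize(objective, de_result.x, method="L-BFGS-B", bounds=bounds)
print(result.x, result.fun)

The two-stage structure, rather than the particular optimizers or the toy objective, is the point of the sketch.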

[143] J. Pazúriková, A. Křenek, and L. Matyska, “Guided optimization method for fast and accurate atomic charges computation”, in 30th European Simulation and Modelling Conference, J. Évora-Gómez and J. J. Hernández-Cabrera, Eds., Ghent, Belgium: Reproduct NV, 2016, pp. 267–274. [Online]. Available: https://www.eurosis.org/cms/index.php?q=taxonomy/term/58
Author’s contribution: 80%. The paper relates to the third problem of this dissertation. It thoroughly compares all state-of-the-art approaches for the parametrization of atomic charges. Moreover, it presents GDMIN, a novel, simplified method that matches or surpasses their accuracy while running faster. I developed the GDMIN method, implemented many of the state-of-the-art methods, conducted the experiments, and to a great extent wrote the paper.

[144] J. Pazúriková, A. Křenek, V. Spiwok, and M. Šimková, “Reducing the number of mean-square deviation calculations with floating close structure in metadynamics”, Journal of Chemical Physics, vol. 146, no. 11, p. 115 101, Mar. 2017, issn: 00219606. doi: 10.1063/1.4978296. [Online]. Available: http://aip.scitation.org/doi/10.1063/1.4978296
Author’s contribution: 50%. The paper relates to the second problem of this dissertation. It proposes the close structure method, an approximative way of calculating mean-square deviations between many molecular structures in successive time steps. I designed the method together with Aleš Křenek; the further development and implementation were done mostly by me. I contributed to the experiments and the evaluation of accuracy and also co-authored the paper.
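For context, the quantity whose repeated evaluation the close structure method reduces is the minimal root-mean-square deviation between two conformations after optimal superposition, usually computed with the Kabsch algorithm [88]. The NumPy sketch below shows this exact baseline calculation for two (N, 3) coordinate arrays of matching atoms; it illustrates the computation being saved, not the approximation proposed in the paper.

import numpy as np

def kabsch_rmsd(P, Q):
    # Minimal RMSD between two (N, 3) coordinate sets after removing
    # translation (centring) and rotation (Kabsch superposition).
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                             # 3x3 covariance matrix
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T # optimal rotation
    diff = P @ R.T - Q
    return np.sqrt((diff ** 2).sum() / len(P))

In metadynamics with many reference structures and many time steps, this superposition is evaluated in a loop, and it is the number of such evaluations that the paper reduces.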

[52] J. Filipovič, J. Pazúriková, A. Křenek, and V. Spiwok, “Accelerated RMSD Calculation for Molecular Metadynamics”, in 30th European Simulation and Modelling Conference, J. Évora-Gómez and J. J. Hernández-Cabrera, Eds., Ghent, Belgium: Reproduct NV, 2016, pp. 278–280. [Online]. Available: https://www.eurosis.org/cms/index.php?q=taxonomy/term/58
Author’s contribution: 20%. The paper relates to the second problem of this dissertation. It focuses on the GPU implementation of the close structure method presented in [144]. I provided the prototype implementation of the close structure method.

[145] J. Pazúriková and L. Matyska, “Convergence of Parareal Algorithm Applied on Molecular Dynamics Simulations”, in Proceedings of MEMICS, H. Petr, D. Zdeněk, J. Jiří, K. Jan, K. Jan, P. Matula, and K. Pala, Eds., 2014, pp. 101–111
Author’s contribution: 80%. The paper relates to the first problem of this dissertation. It analyses the conditions under which the parareal method converges when combined with molecular dynamics. I developed the method, implemented the prototype, and to a great extent wrote the paper.
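Because the parareal scheme itself is central to this paper, a minimal generic sketch may help. The Python code below implements the standard parareal iteration for a toy ODE du/dt = -u; the coarse and fine propagators are plain explicit-Euler integrators that differ only in step count and are stand-ins for the molecular dynamics integrators considered in the paper, so this is an illustration of the scheme, not of the paper's experimental setup.

import numpy as np

def f(u):
    return -u                                # toy ODE: du/dt = -u

def propagate(u, t0, t1, steps):
    # Explicit Euler over [t0, t1] with a fixed number of steps.
    dt = (t1 - t0) / steps
    for _ in range(steps):
        u = u + dt * f(u)
    return u

def parareal(u0, T, n_slices, n_iters, fine_steps=1000, coarse_steps=1):
    ts = np.linspace(0.0, T, n_slices + 1)
    U = np.empty(n_slices + 1)
    U[0] = u0
    # Initial guess: one sequential sweep with the cheap coarse propagator.
    for n in range(n_slices):
        U[n + 1] = propagate(U[n], ts[n], ts[n + 1], coarse_steps)
    for _ in range(n_iters):
        # Fine and coarse propagation of the current iterate; the fine runs
        # are independent across slices and are the part run in parallel.
        F_old = [propagate(U[n], ts[n], ts[n + 1], fine_steps) for n in range(n_slices)]
        G_old = [propagate(U[n], ts[n], ts[n + 1], coarse_steps) for n in range(n_slices)]
        # Sequential correction sweep.
        for n in range(n_slices):
            G_new = propagate(U[n], ts[n], ts[n + 1], coarse_steps)
            U[n + 1] = G_new + F_old[n] - G_old[n]
    return ts, U

ts, U = parareal(u0=1.0, T=2.0, n_slices=8, n_iters=3)
print(np.max(np.abs(U - np.exp(-ts))))       # error against the exact solution

The sketch also makes the practical constraint visible: the correction sweep must converge in a few iterations, otherwise the repeated coarse and fine runs cancel the gain from parallelizing over time slices.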

Submitted

Here I list the papers that have been submitted for publication and are currently under review.

[146] J. Pazúriková, J. Oľha, A. Křenek, and V. Spiwok, “Acceleration of Mean Square Deviation Calculations with Floating Close Structure in Metadynamics Simulations”, Journal of Computational Science, submitted, 2017
Author’s contribution: 80%. The paper relates to the second problem of this dissertation. It focuses on the implementation and performance of the method presented in [144]. We dealt with many implementation issues and tuned the implementation for production use. The performance analysis showed that the practical speed-up reaches the theoretical one. I implemented the code in PLUMED with help from Jaroslav Oľha, conducted all the experiments, and to a large extent wrote the paper.
