Mainthesis.Pdf (3.528Mb)

Tesis Doctoral/Doktorego Tesia Adapting Hybrid Monte Carlo methods for solving complex problems in Life and Materials sciences Autor/Egilea: Directora/Zuzendaria: Mario Fernández Pendás Prof. Elena Akhmatskaya 2018 Doctoral Thesis Adapting Hybrid Monte Carlo methods for solving complex problems in Life and Materials sciences Author: Advisor: Mario Fernández Pendás Prof. Elena Akhmatskaya 2018 This research was carried out at the Basque Center for Applied Mathematics (BCAM) within the Group Modelling and Simulation in Life and Materials Sciences. This research was sup- ported by MINECO under Grants BES-2014-06864, MTM2013-46553-C3-1-P and MTM2016- 76329-R (AEI/FEDER, EU) and also by the Basque Government through the BERC 2018- 2021 program and by the Spanish Ministry of Economy and Competitiveness MINECO: BCAM Severo Ochoa accreditation SEV-2013-0323. This work has been possible thanks to the support of the computing infrastructure of the i2BASQUE academic network, the tech- nical and human support provided by IZO-SGI SGIker of UPV/EHU and European funding (ERDF and ESF), the in-house BCAM-MSLMS group’s cluster Monako, and the BCAM’s cluster Hipatia. “Pasioa da hemen exigitzea zilegi den gutxieneko hori.” Berri Txarrak i Abstract Efficient sampling is the key to success of molecular simulation of complex physical systems. Still, a unique recipe for achieving this goal is unavailable. Hybrid Monte Carlo (HMC) is a promising sampling tool offering a smart, free of discretization errors, propagation in phase space, rigorous temperature control, and flexibility. However, its inability to provide dynamical information and its weakness in simulations of reasonably large systems do not allow HMC to become a sampler of choice in molecular simulation of complex systems. In this thesis, we show that performance of HMC can be dramatically improved by introducing in the method the splitting numerical integrators and importance sampling. We propose a novel splitting integration scheme called Adaptive Integration Approach or AIA, which leads to very promising improvements in accuracy and sampling in HMC simulations. Given a simulation problem and a time step, AIA automatically chooses the optimal scheme out of the family of two-stage splitting integrators. A system-specific integrator iden- tified by our approach is optimal in the sense that it provides the best conservation of energy for harmonic forces. The role of importance sampling on the performance of HMC is studied through the modified Hamiltonian Monte Carlo (MHMC) methods, sampling with respect to a modified or shadow Hamiltonian. The particular attention is paid to Generalized Shadow Hybrid Monte Carlo (GSHMC), introduced by Akhmatskaya and Reich in 2008. To improve the performance of MHMC in general and GSHMC in particular, we develop and test the new multi-stage splitting integrators, specially formulated for sampling with respect to modified Hamiltonians. The novel adaptive two-stage integration approach or MAIA, specifically derived for MHMC is presented. We also discuss in detail the adaptation of GSHMC to the NPT ensemble and provide the thorough analysis of its performance. Moreover, for the first time, we formulate GSHMC in the grand canonical ensemble. A general framework, useful for an extension of other Hybrid Monte Carlo methods to the grand canonical ensemble, is also provided. The software development is another fundamental part of the present work. The algorithms presented in this thesis are implemented in MultiHMC-GROMACS, an in-house version of the popular software package GROMACS. We explain the details of such implementation and give useful recommendations and hints for implementation of the new algorithms in other software packages. In summary, in this thesis, we propose the new numerical algorithms that are capable of improving the accuracy and sampling efficiency of molecular simulations with Hybrid Monte Carlo methods. We show that equipping the Hybrid Monte Carlo algorithm with extra fea- tures makes it even a “smarter” sampler and, no doubts, a strong competitor to the well- established molecular simulation techniques such as molecular dynamics (MD) and Monte Carlo. The up to 60 times increase in sampling efficiency of GSHMC over MD, due to the new algorithms in simulations of selected systems, supports such a belief. iii Summary The Hybrid Monte Carlo (HMC) method is a promising sampling tool offering a smart, free of discretization errors, propagation in phase space, rigorous temperature control, and flexibility. The HMC method appeared in the late eighties in the context of lattice field theories (Duane et al., 1987). A few years later, the HMC algorithm was extended to molecular simulations (Heermann, Nielaba, and Rovere, 1990) and then to condensed-matter systems (Mehlig, Heermann, and Forrest, 1992). The HMC method aims at combining the advantages of the molecular dynamics (MD) and Monte Carlo (MC) methods. MD allows for approximating the physical dynamics of the system while MC helps to explore the phase space more globally. In fact, HMC is a Metropolis-Hastings algorithm in which proposals are constructed using the NVE Hamiltonian flow of the system. The goal of HMC is to perform an efficient sampling in the canonical ensemble which ultimately allows for an accurate estimation of ensemble averages. We consider Hamiltonians as 1 H(q; p) = pT M −1p + U(q) A + B; (1) 2 ≡ 3D 3D where M is the diagonal mass matrix, and q R , p R are the positions and momenta, 2 2 1 T −1 respectively (D is a system’s dimension). We denote by A = 2 p M p the kinetic energy, and by B = U(q) the potential energy. 3D We are interested in sampling the variable q R that is distributed according to the 2 probability π(q). The target probability density function (p.d.f.) is written as π(q) exp ( βU(q)): / − The HMC method combines an MD global move with Monte Carlo sampling in the following way. For each Monte Carlo iteration: (i) the momenta are resampled from the Maxwell- 0 0 Boltzmann distribution ρP (p); (ii) a proposed new state (q ; p ) is generated by integrating the equations of motion with an integrator Ψ∆t;L; (iii) the preservation of the desired canonical distribution π(q; p) is ensured by a Metropolis test. Its acceptance probability can be calculated as: PA ((q; p) Ψ∆t;L(q; p)) = min 1; exp ( β∆H) ; ! f − g where ∆H = H (Ψ∆t;L(q; p)) H(q; p) − is the energy error associated to the integration scheme. A joint p.d.f. π(q; p) is defined as π(q; p) = π(q)ρP (p) exp ( βH(q; p)): (2) / − Therefore, HMC can be viewed as a method that samples points in phase space by means of a Markov Chain in which stochastic and dynamical transitions alternate. iv The complete resampling in (i) can be replaced with the partial momentum update as proposed in (Horowitz, 1991). The current momenta are mixed with an independent and identically distributed (i.i.d.) Gaussian noise u (0; β−1M) to obtain ∼ N p∗ = cos ' p + sin ' u (3) u∗ = sin ' p + cos ' u; − where ' (0; π=2] controls the amount of noise introduced. The angle ' also introduces extra control over2 the sampling efficiency of the method and may lead to the superior performance over HMC. The idea was formalized in the Generalized Hybrid Monte Carlo (GHMC) method (Kennedy and Pendleton, 2001). Meaning to be an improvement of both Monte Carlo and molecular dynamics, Hybrid Monte Carlo turned out to inherit two unfortunate drawbacks. Like Monte Carlo, it does not generate dynamic information, and its performance degrades with an increase of either the system size or the time step. Therefore, the goal of the thesis is to introduce the new algorithms for HMC which can potentially minimize these limitations. In order to enhance the performance of the HMC method, two main tools are considered: the splitting numerical integrators and the importance sampling technique. The efficiency and even the feasibility of molecular dynamics simulations depend crucially on the choice of a numerical integrator. As to the role of integrators in enhancing the performance of Hybrid Monte Carlo, it has been a subject of active research in recent years (McLachlan, 1995; Blanes, Casas, and Sanz-Serna, 2014; Chao et al., 2015; Campos and Sanz-Serna, 2017; Bou-Rabee and Sanz-Serna, 2017a). The velocity Verlet algorithm is cur- rently the method of choice; its algorithmic simplicity and optimal stability properties make it very difficult to beat. Splitting integrators offer the possibility of improving on Verlet, at least in some circumstances. Those integrators evaluate the forces more than once per step and, due to their simple kick-drift structure, may be implemented easily by modifying existing implementations of the Verlet scheme. The Hamilton equations of motion, with the notations in (1), can be written as dq −1 dp = pA(q; p) = M p; = qB(q; p) = qU(q): dt r dt −∇ −∇ These equations can be integrated in closed form and their solution flows at a time t are respectively given by A −1 (q(t); p(t)) = φt (q(0); p(0)); q(t) = q(0) + t M p(0); p(t) = p(0); (4) and B (q(t); p(t)) = φ (q(0); p(0)); q(t) = q(0); p(t) = p(0) t qU(q(0)): (5) t − r A B Here φt and φt denote the exact solution flows of the partial systems, i.e., the maps that asso- ciate the exact solution value (q(t); p(t)) with each initial condition (q(0); p(0)). Sometimes (4) might also be called a drift in the position and (5) a momentum kick. Given a time step ∆t, a velocity Verlet step corresponds to a transformation in phase space (q(t + ∆t); p(t + ∆t)) = ∆t(q(t); p(t)) that can be written as B A B ∆t = φ φ φ : ∆t=2 ◦ ∆t ◦ ∆t=2 v In this thesis, for HMC methods, we study in detail mainly two-stage splitting integrators, which are the splitting schemes that perform two force evaluations per time step: B A B A B ∆t = φ φ φ φ φ ; (6) b∆t ◦ ∆t=2 ◦ (1−2b)∆t ◦ ∆t=2 ◦ b∆t where b (0; 1=4] is a parameter of ∆t.

Load more