Sampling Techniques for Computational Statistical Physics
Benedict Leimkuhler¹ and Gabriel Stoltz²
¹Edinburgh University School of Mathematics, Edinburgh, Scotland, UK
²Université Paris-Est, CERMICS, Projet MICMAC, École des Ponts ParisTech – INRIA, Marne-la-Vallée, France

© Springer-Verlag Berlin Heidelberg 2015. B. Engquist (ed.), Encyclopedia of Applied and Computational Mathematics, DOI 10.1007/978-3-540-70529-1

Mathematics Subject Classification

82B05; 82-08; 65C05; 37M05

Short Definition

The computation of macroscopic properties, as predicted by the laws of statistical physics, requires sampling phase-space configurations distributed according to the probability measure at hand. Typically, approximations are obtained as time averages over trajectories of discrete dynamics, which can be shown to be ergodic in some cases. Arguably, the greatest interest is in sampling the canonical (constant-temperature) ensemble, although other distributions (isobaric, microcanonical, etc.) are also of interest. Focusing on the case of the canonical measure, three important types of methods can be distinguished: (1) Markov chain methods based on the Metropolis–Hastings algorithm; (2) discretizations of continuous stochastic differential equations which are appropriate modifications and/or limiting cases of the Hamiltonian dynamics; and (3) deterministic dynamics on an extended phase space.

Description

Applications of sampling methods arise most commonly in molecular dynamics and polymer modeling, but they are increasingly encountered in fluid dynamics and other areas. In this article, we focus on the treatment of systems of particles described by position and momentum vectors q and p, respectively, and modeled by a Hamiltonian energy H = H(q,p).

Macroscopic properties of materials are obtained, according to the laws of statistical physics, as the average of some function with respect to a probability measure μ describing the state of the system (see the entry Calculation of Ensemble Averages):

  E_μ(A) = ∫_E A(q,p) μ(dq dp).   (1)

In practice, averages such as (1) are obtained by generating, by an appropriate numerical method, a sequence of microscopic configurations (q^i, p^i)_{i≥0} such that

  lim_{n→+∞} (1/n) Σ_{i=0}^{n−1} A(q^i, p^i) = ∫_E A(q,p) μ(dq dp).   (2)

The Canonical Case

For simplicity, we consider the case of the canonical measure:

  μ(dq dp) = Z^{−1} e^{−βH(q,p)} dq dp,  Z = ∫_E e^{−βH(q,p)} dq dp,   (3)

where β = 1/(k_B T). Many sampling methods designed for the canonical ensemble can be extended or adapted to sample other ensembles.

If the Hamiltonian is separable (i.e., it is the sum of a quadratic kinetic energy and a potential energy), as is usually the case when Cartesian coordinates are used, the measure (3) has a tensorized form, and the components of the momenta are distributed according to independent Gaussian distributions. It is therefore straightforward to sample the kinetic part of the canonical measure.
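Since the kinetic part of the canonical measure is a product of independent Gaussians, it can be sampled directly. The following sketch (the particle number, inverse temperature, and unit masses are illustrative choices, not values from this entry) draws momenta componentwise and estimates the mean kinetic energy through an empirical average of the form (2):

```python
import numpy as np

def sample_momenta(rng, n_particles, beta, mass=1.0):
    """Draw momenta from the kinetic part of the canonical measure:
    each Cartesian component is an independent Gaussian of variance mass/beta."""
    return rng.normal(0.0, np.sqrt(mass / beta), size=(n_particles, 3))

rng = np.random.default_rng(0)
beta, N, n_samples = 2.0, 10, 20_000

# Empirical average of the kinetic energy A(p) = |p|^2 / (2 mass).
kin = np.empty(n_samples)
for i in range(n_samples):
    p = sample_momenta(rng, N, beta)
    kin[i] = 0.5 * np.sum(p**2)

# Equipartition: each of the 3N components contributes 1/(2 beta) on average,
# so the estimate should be close to 3N/(2 beta) = 7.5 for these parameters.
print(kin.mean())
```

In the separable case, such exact momentum samples are combined with one of the position-sampling strategies discussed below.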
The real difficulty consists in sampling positions distributed according to the canonical measure

  ν(dq) = Z^{−1} e^{−βV(q)} dq,  Z = ∫_D e^{−βV(q)} dq,   (4)

which is typically a high-dimensional distribution with many concentrated local modes. For this reason, many sampling methods focus on sampling the configurational part of the canonical measure.

Since most concepts needed for sampling purposes can be used either in the configuration space or in the phase space, the following notation will be used: the state of the system is denoted by x ∈ S ⊂ ℝ^d, which can be the position space q ∈ D (and then d = 3N), or the full phase space (q,p) ∈ E with d = 6N. The measure π(dx) is the canonical distribution to be sampled (ν in configuration space, μ in phase space).

General Classification

From a mathematical point of view, most sampling methods may be classified as (see [2]):
1. “Direct” probabilistic methods, such as the standard rejection method, which generate independent and identically distributed (i.i.d.) configurations
2. Markov chain techniques
3. Markovian stochastic dynamics
4. Purely deterministic methods on an extended phase space

Although the division described above is useful to bear in mind, there is a blurring of the lines between the different types of methods used in practice, with Markov chains being constructed from Hamiltonian dynamics, or degenerate diffusive processes being added to deterministic models to improve sampling efficiencies.

Direct probabilistic methods are typically based on a prior probability measure used to sample configurations, which are then accepted or rejected according to some criterion (as in the rejection method, for instance). Usually, a prior probability measure which is easy to sample should be used. However, due to the high dimensionality of the problem, it is extremely difficult to design a prior sufficiently close to the canonical distribution to achieve a reasonable acceptance rate. Direct probabilistic methods are therefore rarely used in practice.

Markov Chain Methods

Markov chain methods are mostly based on the Metropolis–Hastings algorithm [5, 13], which is a widely used method in molecular simulation. The prior required in direct probabilistic methods is replaced by a proposal move which generates a new configuration from a former one. This new configuration is then accepted or rejected according to a criterion ensuring that the correct measure is sampled. Here again, designing a relevant proposal move is the cornerstone of the method, and this proposal depends crucially on the model at hand.

The Metropolis–Hastings Algorithm

The Metropolis–Hastings algorithm generates a Markov chain (x^n)_{n≥0} of system configurations having the distribution of interest π(dx) as a stationary distribution. The invariant distribution has to be known only up to a multiplicative constant to perform this algorithm (which is the case for the canonical measure and its marginal in position). It consists in a two-step procedure, starting from a given initial condition x^0:
1. Propose a new state x̃^{n+1} from x^n according to the proposition kernel T(x^n, ·)
2. Accept the proposition with probability min(1, [π(x̃^{n+1}) T(x̃^{n+1}, x^n)] / [π(x^n) T(x^n, x̃^{n+1})]), and set in this case x^{n+1} = x̃^{n+1}; otherwise, set x^{n+1} = x^n

It is important to count a configuration several times when a proposal is rejected.

The original Metropolis algorithm was proposed in [13] and relied on symmetric proposals in the configuration space. It was later extended in [5] to allow for nonsymmetric propositions, which can bias proposals toward higher-probability regions with respect to the target distribution π. The algorithm is simple to interpret in the case of a symmetric proposition kernel on the configuration space (π(x) ∝ e^{−βV(q)} and T(q,q′) = T(q′,q)).
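The two-step procedure can be sketched for a symmetric Gaussian random-walk proposal on the configuration space; the one-dimensional harmonic potential, inverse temperature, and step size below are illustrative choices rather than anything prescribed by this entry:

```python
import numpy as np

def metropolis(V, beta, q0, n_steps, step, rng):
    """Metropolis chain for pi(q) proportional to exp(-beta V(q)) with the
    symmetric Gaussian random-walk proposal T(q, .) = N(q, step^2)."""
    q, chain = q0, np.empty(n_steps)
    for n in range(n_steps):
        q_prop = q + step * rng.normal()  # step 1: propose a new state
        # step 2: accept with probability min(1, exp(-beta (V(q') - V(q))))
        if rng.random() < np.exp(-beta * (V(q_prop) - V(q))):
            q = q_prop
        chain[n] = q  # a rejected proposal repeats the current state
    return chain

rng = np.random.default_rng(1)
# With V(q) = q^2/2 and beta = 1, the target is the standard Gaussian,
# so the chain mean and variance should approach 0 and 1.
chain = metropolis(lambda q: 0.5 * q**2, 1.0, 0.0, 50_000, 1.0, rng)
print(chain.mean(), chain.var())
```

Note that rejected steps store the current state again, which implements the rule that a configuration is counted several times when a proposal is rejected.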
The Metropolis–Hastings ratio is then simply

  r(q,q′) = exp(−β[V(q′) − V(q)]).

If the proposed move has a lower energy, it is always accepted, which allows the states of higher probability to be visited more frequently. On the other hand, transitions to less likely states of higher energy are not forbidden (but are accepted less often), which is important in order to observe transitions from one metastable region to another when these regions are separated by some energy barrier.

Properties of the Algorithm

The probability transition kernel of the Metropolis–Hastings chain reads

  P(x, dx′) = min(1, r(x,x′)) T(x, dx′) + (1 − α(x)) δ_x(dx′),   (5)

where α(x) ∈ [0,1] is the probability to accept a move starting from x (considering all possible propositions):

  α(x) = ∫_S min(1, r(x,y)) T(x, dy).

The first part of the transition kernel corresponds to the accepted transitions from x to x′, which occur with probability min(1, r(x,x′)), while the term (1 − α(x)) δ_x(dx′) encodes all the rejected steps. A simple computation shows that the Metropolis–Hastings transition kernel P is reversible with respect to π, namely, P(x, dx′) π(dx) = P(x′, dx) π(dx′).

The magnitude of the proposed moves is a key element in devising efficient algorithms. It is observed in practice that the optimal acceptance/rejection rate, in terms of the variance of the estimator (a mean of some observable over a trajectory, for example), is often around 0.5, ensuring some balance between:
• Large moves that decorrelate the iterates when they are accepted (hence reducing the correlations in the chain, which makes the convergence happen faster) but lead to high rejection rates (and thus degenerate samples, since the same position may be counted several times)
• Small moves that are rejected less often but do not decorrelate the iterates much

This trade-off between small and large proposal moves has been investigated rigorously in some simple cases in [16, 17], where optimal acceptance rates are obtained in a limiting regime.

Some Examples of Proposition Kernels

The simplest transition kernels are based on random walks. For instance, it is possible to modify the current configuration by a random perturbation applied to all particles. The problem with such symmetric proposals is that they may not be well suited to the target probability measure (creating very correlated successive configurations for small perturbation magnitudes, or very unlikely moves for large ones). Efficient nonsymmetric proposal moves are often based on discretizations of continuous stochastic dynamics which use a biasing term such as −∇V to ensure that the dynamics remains sufficiently close to the minima of the potential.

An interesting proposal relies on the Hamiltonian dynamics itself and consists in (1) sampling new momenta p^n according to the kinetic part of the canonical measure; (2) performing one or several steps of the Verlet scheme starting from the previous position q^n, obtaining a proposed configuration (q̃^{n+1}, p̃^{n+1}); and (3) computing r^n = exp[−β(H(q̃^{n+1}, p̃^{n+1}) − H(q^n, p^n))].
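Assuming the proposal is completed by the usual Metropolis acceptance step (accepting the proposed configuration with probability min(1, r^n)), this Hamiltonian proposal, often called hybrid Monte Carlo, can be sketched as follows; the harmonic test potential, unit masses, step size, and number of Verlet steps are illustrative assumptions:

```python
import numpy as np

def hmc_step(q, V, gradV, beta, dt, n_verlet, rng):
    """One Hamiltonian proposal for pi(q) proportional to exp(-beta V(q)),
    unit masses: (1) resample momenta from the Gaussian kinetic part,
    (2) integrate with the velocity Verlet scheme, (3) accept with
    probability min(1, r)."""
    p = rng.normal(0.0, np.sqrt(1.0 / beta), size=np.shape(q))   # (1)
    H_old = 0.5 * np.dot(p, p) + V(q)
    q_new, p_new = q, p
    for _ in range(n_verlet):                                    # (2)
        p_new = p_new - 0.5 * dt * gradV(q_new)
        q_new = q_new + dt * p_new
        p_new = p_new - 0.5 * dt * gradV(q_new)
    r = np.exp(-beta * (0.5 * np.dot(p_new, p_new) + V(q_new) - H_old))
    return q_new if rng.random() < r else q                      # (3)

rng = np.random.default_rng(2)
V = lambda q: 0.5 * np.dot(q, q)      # harmonic test potential
gradV = lambda q: q
q, samples = np.zeros(3), np.empty(20_000)
for n in range(20_000):
    q = hmc_step(q, V, gradV, beta=1.0, dt=0.2, n_verlet=5, rng=rng)
    samples[n] = q[0]
# Each marginal of the target is a standard Gaussian here, so the sample
# variance should be close to 1.
print(samples.var())
```

Because the Verlet scheme nearly conserves the energy, the ratio r^n is close to 1 and such proposals are accepted with high probability even for global moves.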