Enhancing Sampling in Computational Statistics
Doctoral Thesis

Enhancing Sampling in Computational Statistics Using Modified Hamiltonians

Author: Tijana Radivojević
Supervisors: Prof. Elena Akhmatskaya, Prof. Enrico Scalas
2016

This research was carried out at the Basque Center for Applied Mathematics (BCAM) within the group Modelling and Simulation in Life and Materials Sciences. The research was supported by the Spanish Ministry of Education, Culture and Sport (MECD) within the FPU-2012 program (Formación del Profesorado Universitario) under Grant FPU12/05209, by the Basque Government through the BERC 2014-2017 program, and by the Spanish Ministry of Economy and Competitiveness (MINECO): BCAM Severo Ochoa excellence accreditation SEV-2013-0323 and Grant "Retos en integración numérica: de las estructuras algebraicas a simulaciones Montecarlo" (MTM2013-46553-C3-1-P).

Abstract

The Hamiltonian Monte Carlo (HMC) method has been recognized as a powerful sampling tool in computational statistics. In this thesis, we show that the performance of HMC can be dramatically improved by replacing Hamiltonians in the Metropolis test with modified Hamiltonians, and a complete momentum update with a partial momentum refreshment. The resulting generalized HMC importance sampler, which we call Mix & Match Hamiltonian Monte Carlo (MMHMC), arose as an extension of the Generalized Shadow Hybrid Monte Carlo (GSHMC) method, previously proposed for molecular simulation. The MMHMC method adapts GSHMC specifically to computational statistics and enriches it with new essential features: (i) efficient algorithms for the computation of modified Hamiltonians; (ii) an implicit momentum update procedure; and (iii) two-stage splitting integration schemes specially derived for methods sampling with modified Hamiltonians. In addition, different optional strategies for momentum update and flipping are introduced, and algorithms for adaptive tuning of parameters and for efficient sampling of multimodal distributions are developed. MMHMC has been implemented in the in-house software package HaiCS (Hamiltonians in Computational Statistics), written in C, tested on popular statistical models, and compared in sampling efficiency with HMC, Generalized Hybrid Monte Carlo, Riemann Manifold Hamiltonian Monte Carlo, the Metropolis Adjusted Langevin Algorithm and Random Walk Metropolis-Hastings. The analysis of time-normalized effective sample size reveals the superiority of MMHMC over popular sampling techniques, especially in solving high-dimensional problems.

Summary

Both academia and industry have been witnessing an exponential accumulation of data which, if carefully analyzed, can potentially provide valuable insights about the underlying processes. The main feature of the resulting problems is uncertainty, which appears in many forms, either in the data or in the assumed data models, and poses great challenges. We therefore need more complex models and advanced analysis tools to deal with the size and complexity of data. The Bayesian approach offers a rigorous and consistent manner of dealing with uncertainty and provides a means of quantifying the uncertainty in our predictions. It also allows us to objectively discriminate between competing model hypotheses through the evaluation of Bayes factors (Gelman et al., 2003).
Nevertheless, the application of the Bayesian framework to complex problems runs into a computational bottleneck that needs to be addressed with efficient inference methods. Challenges arise, for example, in the evaluation of intractable quantities or in large-scale inverse problems linked to computationally demanding forward problems, and especially in high-dimensional settings. Theoretical and algorithmic developments in computational statistics have made it possible for scientists and practitioners to undertake more complex applications, but sampling from high-dimensional and complex distributions remains challenging.

Various methods are used in practice to address these difficulties, which technically reduce to the calculation of integrals at the core of a Bayesian inference procedure. These integrals appear either in evaluating an expected value over some complex posterior distribution or in determining the marginal likelihood of a distribution. For example, we are interested in computing an expected value (or some other moment) of a function f with respect to a distribution π,

\[
I = \int f(\theta)\,\pi(\theta)\,\mathrm{d}\theta. \tag{1}
\]

Deterministic methods aim to find analytically tractable approximations of the distribution π. We are interested, however, in Monte Carlo (stochastic) methods (Metropolis and Ulam, 1949), which sample from the desired distribution, are exact in the limit of an infinite number of samples, and can achieve an arbitrary level of accuracy by drawing as many samples as required. The integral (1) in this case is estimated as

\[
\hat{I} = \frac{1}{N}\sum_{n=1}^{N} f(\theta^{n}),
\]

where the random samples $\{\theta^{n}\}_{n=1}^{N}$ are drawn from $\pi(\theta)$.

Bayesian statistics has been revolutionized by ever-growing computational capacities and by the development of Markov chain Monte Carlo (MCMC) techniques (Brooks et al., 2011). In this thesis, we focus on an advanced MCMC methodology, namely Hamiltonian (or hybrid) Monte Carlo (HMC) (Duane et al., 1987; Neal, 2011), which has emerged as a good candidate for dealing with high-dimensional and complex problems. Interestingly, both MCMC and HMC were originally developed in statistical mechanics; however, they made a significant impact on the statistical community almost 35 and 10 years later, respectively.

The HMC method has proved to be a successful and valuable technique for a range of problems in computational statistics. It belongs to the class of auxiliary variable samplers, which define an augmented target distribution through the use of a Hamiltonian function. HMC incorporates the gradient information of the target π(θ) and can follow this gradient over considerable distances. This is achieved by generating Hamiltonian trajectories (integrating the Hamiltonian equations of motion), which are accepted or rejected based on the Metropolis test. As a result, HMC suppresses the random walk behavior typical of the Metropolis-Hastings Monte Carlo method (Metropolis et al., 1953; Hastings, 1970).

On the other hand, the performance of HMC deteriorates exponentially, in terms of acceptance rates, with respect to the system's size and the step size, due to errors introduced by numerical approximations (Izaguirre and Hampton, 2004). Many rejections induce high correlations between samples and reduce the efficiency of the estimator of (1). Thus, in systems with large numbers of parameters, or latent parameters, or when the data set of observations is very large, efficient sampling might require a substantial number of evaluations of the posterior distribution and its gradient.
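To make the above concrete, the following is a minimal, self-contained C sketch of a standard HMC transition for a toy target (a standard Gaussian, so the potential is U(θ) = |θ|²/2): a complete momentum refresh, one leapfrog trajectory, and the Metropolis test on the true Hamiltonian, with the chain then used to compute the estimator (1) for f(θ) = θ₁. The dimension, step size and trajectory length are illustrative choices, not values from the thesis, and this sketch is not the HaiCS implementation.

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define D     2      /* dimension (illustrative) */
#define EPS   0.1    /* leapfrog step size (illustrative) */
#define STEPS 20     /* leapfrog steps per trajectory (illustrative) */

/* Toy target pi(q) ~ exp(-U(q)) with U(q) = 0.5*|q|^2 (standard Gaussian). */
static double U(const double *q) {
    double s = 0.0;
    for (int i = 0; i < D; i++) s += q[i] * q[i];
    return 0.5 * s;
}
static void gradU(const double *q, double *g) {
    for (int i = 0; i < D; i++) g[i] = q[i];
}

/* Standard normal draw via the Box-Muller transform. */
static double randn(void) {
    double u1 = (rand() + 1.0) / (RAND_MAX + 2.0);
    double u2 = (rand() + 1.0) / (RAND_MAX + 2.0);
    return sqrt(-2.0 * log(u1)) * cos(6.283185307179586 * u2);
}

/* One HMC transition: complete momentum refresh, leapfrog integration of
   the Hamiltonian dynamics, Metropolis test on H(q,p) = U(q) + 0.5*|p|^2.
   Returns 1 if the proposal was accepted. */
static int hmc_step(double *q) {
    double p[D], qn[D], g[D], H0, H1;
    for (int i = 0; i < D; i++) { p[i] = randn(); qn[i] = q[i]; }
    H0 = U(qn);
    for (int i = 0; i < D; i++) H0 += 0.5 * p[i] * p[i];

    gradU(qn, g);
    for (int i = 0; i < D; i++) p[i] -= 0.5 * EPS * g[i];    /* half kick */
    for (int s = 0; s < STEPS; s++) {
        for (int i = 0; i < D; i++) qn[i] += EPS * p[i];     /* drift */
        gradU(qn, g);
        double h = (s == STEPS - 1) ? 0.5 * EPS : EPS;       /* final half kick */
        for (int i = 0; i < D; i++) p[i] -= h * g[i];
    }
    H1 = U(qn);
    for (int i = 0; i < D; i++) H1 += 0.5 * p[i] * p[i];

    if ((double)rand() / RAND_MAX < exp(H0 - H1)) {          /* Metropolis test */
        for (int i = 0; i < D; i++) q[i] = qn[i];
        return 1;
    }
    return 0;                                                /* rejection: keep q */
}

int main(void) {
    double q[D] = {0.0}, sum = 0.0;
    int N = 100000, acc = 0;
    for (int n = 0; n < N; n++) {
        acc += hmc_step(q);
        sum += q[0];              /* f(q) = q_1, accumulated for estimator (1) */
    }
    printf("I_hat = %g, acceptance rate = %g\n", sum / N, (double)acc / N);
    return 0;
}

For this Gaussian target the true value of I is 0. Note that a rejection simply repeats the current state in the average, which is precisely the source of the inefficiency discussed above when acceptance rates drop.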
This may be computationally too demanding for HMC. To maintain high acceptance rates for larger systems, one should either decrease the step size or use a higher-order numerical integrator, which is usually impractical for large systems. Ideally, one would like a sampler that increases acceptance rates, converges fast, improves sampling efficiency, and whose optimal simulation parameters are not difficult to determine.

In this thesis we develop, test and analyze an HMC-based methodology that enhances the sampling performance of the HMC method. We introduce a new approach, called Mix & Match Hamiltonian Monte Carlo (MMHMC), which arose as an extension of the Generalized Shadow Hybrid Monte Carlo (GSHMC) method by Akhmatskaya and Reich (2008). GSHMC was proposed for molecular simulation and has been published, patented and successfully tested on complex biological systems. Like GSHMC, the MMHMC method samples with modified Hamiltonians, but it enriches GSHMC with new essential features and adapts it specifically to computational statistics. To the best of our knowledge, this is the first time that a method sampling with modified Hamiltonians has been implemented and applied to Bayesian inference problems in computational statistics.

The MMHMC method can be defined as a generalized HMC importance sampler. It offers an update of momentum variables in a general form and samples with respect to a modified distribution that is determined through modified Hamiltonians. In particular, the method involves two major steps: the Partial Momentum Monte Carlo (PMMC) step and the Hamiltonian Dynamics Monte Carlo (HDMC) step. The partial momentum update adds random noise, controlled by an additional parameter, to the current momentum variable and accepts this update through a modified Metropolis test. As in HMC, in the HDMC step a proposal state is generated by integrating the Hamiltonian equations of motion and accepted according to the Metropolis test. The only difference between the HDMC step of MMHMC and the corresponding step in HMC is that the Metropolis test uses the modified Hamiltonian instead of the true Hamiltonian. This leads to higher acceptance rates for MMHMC, since symplectic numerical integrators preserve modified Hamiltonians to higher accuracy than the true Hamiltonian. Since sampling is performed with respect to the modified distribution, importance weights are taken into account when estimating the integral (1).

Within this thesis, we provide new formulations of modified Hamiltonians
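The two ingredients that distinguish MMHMC from plain HMC in the description above can be sketched in a few lines of C. The mixing parameter phi and the function interfaces below are illustrative assumptions (the thesis derives the exact modified Hamiltonians and acceptance tests); the weight formula follows generically from sampling a modified density proportional to exp(-H̃) while targeting exp(-H).

#include <math.h>
#include <stdlib.h>

/* Standard normal draw (Box-Muller), as in the previous sketch. */
double randn(void) {
    double u1 = (rand() + 1.0) / (RAND_MAX + 2.0);
    double u2 = (rand() + 1.0) / (RAND_MAX + 2.0);
    return sqrt(-2.0 * log(u1)) * cos(6.283185307179586 * u2);
}

/* PMMC-style partial momentum refreshment: mix the current momentum with
   fresh Gaussian noise through a parameter phi in (0,1]; phi = 1 recovers
   the complete refresh of standard HMC. In MMHMC the refreshed momentum is
   then accepted or rejected through a modified Metropolis test, omitted in
   this sketch. */
void partial_momentum_update(double *p, int d, double phi) {
    for (int i = 0; i < d; i++)
        p[i] = sqrt(1.0 - phi) * p[i] + sqrt(phi) * randn();
}

/* Importance-weighted form of the estimator (1): the chain samples the
   modified density ~ exp(-Htilde), so each draw is reweighted by
   w = exp(Htilde - H) to recover expectations under the true target. */
double weighted_estimate(const double *f, const double *H,
                         const double *Htilde, int N) {
    double num = 0.0, den = 0.0;
    for (int n = 0; n < N; n++) {
        double w = exp(Htilde[n] - H[n]);  /* unnormalized importance weight */
        num += w * f[n];
        den += w;
    }
    return num / den;
}

The closer the modified Hamiltonian tracks the true one, the closer the weights stay to 1, which keeps the variance of the importance-weighted estimator low while the HDMC step enjoys the high acceptance rates described above.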