Mamba.jl Documentation
Release 0.12.0
Brian J Smith
Oct 20, 2018

Contents

1 Overview
  1.1 Purpose
  1.2 Features
  1.3 Getting Started
2 Contents
  2.1 Introduction
  2.2 Tutorial
  2.3 MCMC Types
  2.4 Sampling Functions
  2.5 Examples
  2.6 Discussion
  2.7 Supplement
  2.8 References
  2.9 Indices
Bibliography

Version 0.12.0
Requires julia releases 1.0.x
Date Oct 20, 2018
Maintainer Brian J Smith ([email protected])
Contributors Benjamin Deonovic ([email protected]), Brian J Smith ([email protected]), and others
Web site https://github.com/brian-j-smith/Mamba.jl
License MIT

CHAPTER 1
Overview

1.1 Purpose

Mamba is an open platform for the implementation and application of MCMC methods to perform Bayesian analysis in julia. The package provides a framework for (1) specification of hierarchical models through stated relationships between data, parameters, and statistical distributions; (2) block-updating of parameters with samplers provided, defined by the user, or available from other packages; (3) execution of sampling schemes; and (4) posterior inference. It is intended to give users access to all levels of the design and implementation of MCMC simulators and, in particular, to aid in the development of new methods.

Several software options are available for MCMC sampling of Bayesian models. Individuals who are primarily interested in data analysis, unconcerned with the details of MCMC, and have models that can be fit in JAGS, Stan, or OpenBUGS are encouraged to use those programs. Mamba is intended for individuals who wish to have access to lower-level MCMC tools, are knowledgeable about MCMC methodologies, and have experience, or wish to gain experience, with their application. The package also provides stand-alone convergence diagnostics and posterior inference tools, which are essential for the analysis of MCMC output regardless of the software used to generate it.

1.2 Features

• An interactive and extensible interface.
• Support for a wide range of model and distributional specifications.
• An environment in which all interactions with the software are made through a single, interpreted programming language.
  – Any julia operator, function, type, or package can be used for model specification.
  – Custom distributions and samplers can be written in julia to extend the package.
• Directed acyclic graph representations of models.
• Arbitrary blocking of model parameters and designation of block-specific samplers.
• Samplers that can be used with the included simulation engine or apart from it, including
  – adaptive Metropolis within Gibbs and multivariate Metropolis,
  – approximate Bayesian computation,
  – binary,
  – Hamiltonian Monte Carlo (simple and No-U-Turn),
  – simplex, and
  – slice samplers.
• Automatic parallel execution of MCMC chains on multi-processor systems.
• Restarting of chains.
• Command-line access to all package functionality, including its simulation API.
• Convergence diagnostics: Gelman, Rubin, and Brooks; Geweke; Heidelberger and Welch; Raftery and Lewis.
• Posterior summaries: moments, quantiles, highest posterior density (HPD) intervals, cross-covariances, autocorrelations, Monte Carlo standard errors (MCSE), and effective sample sizes (ESS).
• Gadfly plotting: trace, density, running mean, and autocorrelation plots.
• Importing of sampler output saved in the CODA file format.
• Run-time performance on par with compiled MCMC software.

1.3 Getting Started

The following julia commands will install the package:

julia> using Pkg
julia> Pkg.add("Mamba")
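To give a flavor of the workflow outlined above, the following sketch specifies a small Bayesian linear regression, assigns a block-wise sampling scheme, and simulates three chains. It is a condensed, purely illustrative sketch in the spirit of the regression example developed in the Tutorial chapter: the data values, priors, initial values, and sampler settings are invented here for illustration only.

using Mamba

## Illustrative data: five observations and a design matrix with an intercept column
line = Dict{Symbol, Any}(
  :x => [1, 2, 3, 4, 5],
  :y => [1, 3, 3, 3, 5]
)
line[:xmat] = [ones(5) line[:x]]

## Model specification: nodes are declared with Stochastic and Logical
## constructors whose arguments are ordinary julia functions
model = Model(
  y = Stochastic(1,
    (mu, s2) -> MvNormal(mu, sqrt(s2)),
    false
  ),
  mu = Logical(1,
    (xmat, beta) -> xmat * beta,
    false
  ),
  beta = Stochastic(1,
    () -> MvNormal(2, sqrt(1000))
  ),
  s2 = Stochastic(
    () -> InverseGamma(0.001, 0.001)
  )
)

## Block-specific samplers: No-U-Turn sampling for beta, slice sampling for s2
scheme = [NUTS(:beta), Slice(:s2, 3.0)]
setsamplers!(model, scheme)

## Starting values for three chains
inits = [
  Dict{Symbol, Any}(
    :y => line[:y],
    :beta => rand(Normal(0, 1), 2),
    :s2 => rand(Gamma(1, 1))
  )
  for i in 1:3
]

## Run the simulation and summarize the posterior draws
sim = mcmc(model, line, inits, 10000, burnin=250, thin=2, chains=3)
describe(sim)

The object returned by mcmc holds the posterior draws and is the input to the summary, diagnostic, and plotting functions listed under Features, such as describe, gelmandiag, and plot.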
CHAPTER 2
Contents

2.1 Introduction

2.1.1 MCMC Software

Markov chain Monte Carlo (MCMC) methods are a class of algorithms for simulating autocorrelated draws from probability distributions [13][28][39][78]. They are widely used to obtain empirical estimates for, and make inference on, multidimensional distributions that often arise in Bayesian statistical modelling, computational physics, and computational biology. Because MCMC provides estimates of distributions of interest, and is not limited to point estimates and asymptotic standard errors, it facilitates a wide range of inferences and provides for more realistic prediction errors. An MCMC algorithm can be devised for any probability model. Implementations of the algorithms are computational in nature, with the resources needed to execute them directly related to the dimensionality of the associated problems. Rapid increases in computing power and the emergence of MCMC software have enabled models of increasing complexity to be fit. For these and other reasons, MCMC is regarded as one of the most important developments and most powerful tools in modern statistical computing.

Several software programs provide Bayesian modelling via MCMC. Programs range from those designed for general model fitting to those for specific models. WinBUGS, its open-source incarnation OpenBUGS, and the ‘BUGS’ clone Just Another Gibbs Sampler (JAGS) are among the most widely used programs for general model fitting [57][71]. These three provide similar programming syntaxes with which users can specify statistical models by simply stating relationships between data, parameters, and statistical distributions. Once a model is specified, the programs automatically formulate an MCMC sampling scheme to simulate parameter values from their posterior distribution. All of the aforementioned tasks can be accomplished with minimal programming and without specific knowledge of MCMC methodology. Users who are adept at both, and so inclined, can write software modules to add new distributions and samplers to OpenBUGS and JAGS [94][96]. General model fitting is also available with the MCMC procedure in the SAS/STAT® software [49]. Stan is a newer and similar-in-scope program worth noting for its accessible syntax and automatically tuned Hamiltonian Monte Carlo sampling scheme [98]. PyMC is a Python-based program that allows all modelling tasks to be accomplished in its native language, and gives users more hands-on access to model and sampling scheme specifications [70]. Programs like GRIMS [66] and LaplacesDemon [99] represent another class of general model-fitting programs. In their approaches, users work with the functional forms of (unnormalized) probability densities directly, rather than through a domain-specific modelling language (DSL), for model specification.

Examples of programs for specific models can be found in the R catalogue of packages. For instance, the arm package provides Bayesian inference for generalized linear, ordered logistic or probit, and mixed-effects regression models [34]; MCMCpack fits a wide range of models commonly encountered in the social and behavioral sciences [60]; and many other packages focused on specific classes of models are listed in the “Bayesian Inference” task view on the Comprehensive R Archive Network [69].

2.1.2 The Mamba Package

Mamba [88] is a julia [4] package designed for general Bayesian model fitting via MCMC. Like OpenBUGS and JAGS, it supports a wide range of model and distributional specifications, and provides a syntax for model specification. Unlike those two, and like PyMC, Mamba provides a unified environment in which all interactions with the software are made through a single, interpreted language. Any julia operator, function, type, or package can be used for model specification; and custom distributions and samplers can be written in julia to extend the package. Conversely, interactions with and extensions to OpenBUGS and JAGS can involve three different programming environments: R wrappers used to call the programs, their DSLs, and the underlying implementations in Component Pascal and C++. Advantages of a unified environment include more flexible model specification; tighter integration with supplied functions for convergence diagnostics and posterior inference; and faster development, testing, and debugging of extensions. Advantages of the BUGS DSLs include more concise model specification and facilitation of automated sampling scheme formulation. Indeed, sampling schemes must be selected manually in the initial release of Mamba. Nevertheless, Mamba holds other distinct advantages over existing offerings. In particular, it provides arbitrary blocking of model parameters and designation of block-specific samplers; samplers that can be used with the included simulation engine or apart from it; and command-line access to all package functionality, including its simulation API. Likewise, advantages of the julia language include its familiar syntax, focus on technical computing, and benchmarks showing it to be one or more orders of magnitude faster than R and Python [3]. Finally, the intended audience for Mamba includes individuals who are interested in programming in julia; who wish to have low-level access to model design and implementation; and who, in some cases, are able to derive full conditional distributions of model parameters.
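As a concrete illustration of the manual sampling scheme selection and arbitrary parameter blocking described above, the sketch below constructs three alternative schemes for two hypothetical model nodes, :beta (a coefficient vector) and :s2 (a variance), matching the node names used in the Overview sketch; any one of them could be attached to a model with setsamplers! before calling mcmc.

using Mamba

## Scheme 1: two blocks, with No-U-Turn sampling for beta and
## slice sampling (width 3.0) for s2
scheme1 = [NUTS(:beta), Slice(:s2, 3.0)]

## Scheme 2: adaptive Metropolis-within-Gibbs for beta in place of NUTS
scheme2 = [AMWG(:beta, 1.0), Slice(:s2, 3.0)]

## Scheme 3: a single block that updates beta and s2 jointly with one slice sampler
scheme3 = [Slice([:beta, :s2], 3.0)]

## A scheme is assigned to a previously defined Model with, e.g.,
## setsamplers!(model, scheme1)

Switching from one scheme to another requires no change to the model specification itself.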