Chapter 7 Assessing and Improving Convergence of the Markov Chain

Questions:
• Are we getting the right answer?
• Can we get the answer quicker?

7.1 Introduction

• MCMC sampling is powerful, but comes at a cost: sampling is dependent and checking convergence is not easy:
  ◦ Convergence theorems do not tell us when convergence will occur
  ◦ In this chapter: graphical and formal diagnostics to assess convergence
• Acceleration techniques to speed up the MCMC sampling procedure
• Data augmentation as the Bayesian generalization of the EM algorithm

7.2 Assessing convergence of a Markov chain

7.2.1 Definition of convergence for a Markov chain

Loose definition: with an increasing number of iterations (k → ∞), the distribution of θ^k, p_k(θ), converges to the target distribution p(θ | y).

In practice, convergence means:
• The histogram of θ^k remains the same along the chain
• Summary statistics of θ^k remain the same along the chain

Approaches

• Theoretical research
  ◦ Establishes conditions that ensure convergence, but in general these theoretical results cannot be used in practice
• Two types of practical procedures to check convergence:
  ◦ Checking stationarity: from which iteration (k0) onwards is the chain sampling from the posterior distribution (assessing the burn-in part of the Markov chain)?
  ◦ Checking accuracy: verify that the posterior summary measures are computed with the desired accuracy
• Most practical convergence diagnostics (graphical and formal) appeal to the stationarity property of a converged chain
• Often done: base the summary measures on the converged part of the chain

7.2.2 Checking convergence of the Markov chain

Here we look at:
• Convergence diagnostics implemented in
  ◦ WinBUGS
  ◦ CODA and BOA
• Graphical and formal tests
• Illustrations using the osteoporosis study (chain of size 1,500)

7.2.3 Graphical approaches to assess convergence

Trace plot:
• A simple exploration of the trace plot gives a good impression of the convergence
• Univariate trace plots (each parameter separately) or multivariate trace plots (evaluating parameters jointly) are useful
• Multivariate trace plot: monitor the log of the likelihood or the log of the posterior density:
  ◦ WinBUGS: automatically created variable deviance
  ◦ Bayesian SAS procedures: default variable LogPost
• Thick pen test: under stationarity the trace plot appears as a horizontal strip
• The trace plot also evaluates the mixing rate of the chain (see the CODA sketch below)

[Figure: univariate trace plots of (a) β1 and (b) σ² for the osteoporosis regression analysis, iterations 1–1,500]
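Such trace plots can also be produced outside WinBUGS with the CODA package in R. Below is a minimal sketch, assuming the sampled values of the osteoporosis chain have been collected in a hypothetical iterations-by-parameters matrix osteo.sim (the object and column names are illustrative only, e.g. read back from the CODA output files that WinBUGS writes for monitored nodes).

## Minimal sketch: trace plots with the CODA package in R.
## 'osteo.sim' is a hypothetical matrix with one row per iteration and one
## column per monitored quantity (e.g. beta1, sigma2, logpost).
library(coda)

osteo.chain <- mcmc(osteo.sim)   # declare the matrix as an MCMC chain

# Univariate trace plot for each monitored quantity; under stationarity each
# plot should look like a horizontal strip ("thick pen test")
traceplot(osteo.chain)

# Trace plot plus marginal posterior density for each monitored quantity
plot(osteo.chain)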
[Figure: multivariate trace plot of the log(posterior) for the osteoporosis regression analysis, iterations 1–1,500]

Autocorrelation plot:
• Autocorrelation of lag m (ρ_m): corr(θ^k, θ^(k+m)) (for k = 1, ...), estimated by
  ◦ the Pearson correlation
  ◦ a time-series approach
• Autocorrelation function (ACF): the function m ↦ ρ̂_m (m = 0, 1, ...)
• Autocorrelation plot: graphical representation of the ACF; it indicates
  ◦ the mixing rate
  ◦ the minimum number of iterations for the chain to 'forget' its starting position
• The autocorrelation plot is NOT a convergence diagnostic: except for the burn-in phase, the ACF plot does not change anymore with further sampling

[Figure: autocorrelation plots (lags 0–50) of β1, σ² and the log(posterior) for the osteoporosis regression analysis]

Running-mean plot:
• Running mean (ergodic mean) θ̄^k: mean of the sampled values up to iteration k
  ◦ Checks stationarity of p_k(θ) for k > k0 via the mean
  ◦ The initial variability of the running-mean plot is always relatively high
  ◦ It stabilizes with increasing k in case of stationarity

[Figure: running-mean plots of β1, σ² and the log(posterior) for the osteoporosis regression analysis, iterations 1–1,500]

Q-Q plot:
• Stationarity: the distribution of the chain is stable irrespective of the part considered
• Q-Q plot: 1st half of the chain (X-axis) versus 2nd half of the chain (Y-axis)
• Non-stationarity: the Q-Q graph deviates from the bisecting line

[Figure: Q-Q plots (first half versus second half of the chain) of β1, σ² and the log(posterior) for the osteoporosis regression analysis]

Cross-correlation plot:
• Cross-correlation of θ1 with θ2: correlation between θ1^k and θ2^k (k = 1, ..., n)
• Cross-correlation plot: scatterplot of θ1^k versus θ2^k
• Useful in case of convergence problems, to indicate which model parameters are strongly related and thus whether the model is overspecified

[Figure: cross-correlation (scatter) plots of β1 versus β0 and of β1 versus σ for the osteoporosis regression analysis]

7.2.5 Formal approaches to assess convergence

Here:
• Four formal convergence diagnostics:
  ◦ Geweke diagnostic: single chain
  ◦ Heidelberger-Welch diagnostic: single chain
  ◦ Raftery-Lewis diagnostic: single chain
  ◦ Brooks-Gelman-Rubin diagnostic: multiple chains
• Illustrations based on the osteoporosis study (chain of size 1,500)

Geweke diagnostic:
• Acts on a single chain (θ^k)_k (k = 1, ..., n)
• Checks only stationarity ⇒ looks for k0
• Compares the means of an early part (first 10%) and a late part (last 50%) of the chain using a t-like (Z) test
• In addition, there is a dynamic version of the test:
  ◦ Repeatedly cut off an initial x% of the chain
  ◦ Each time, compute the Geweke test
• The test cannot be done in WinBUGS, but should be done in R by accessing the sampled values of the chain produced by WinBUGS (see the sketch below)
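A minimal CODA sketch of the static and dynamic Geweke tests follows; it reuses the hypothetical osteo.chain object from the trace-plot sketch above, and the fractions mirror the settings discussed in the example below.

## Minimal sketch: Geweke diagnostic with CODA (assumes the hypothetical
## 'osteo.chain' mcmc object created earlier).
library(coda)

# Static test: Z-statistics comparing the mean of the first 10% of the chain
# with the mean of the last 50%
geweke.diag(osteo.chain, frac1 = 0.1, frac2 = 0.5)

# Same test with a longer early part (20%), as used for the osteoporosis data
geweke.diag(osteo.chain, frac1 = 0.2, frac2 = 0.5)

# Dynamic version: discard an increasing initial segment, recompute the test
# each time and plot the Z-scores; values outside [-1.96, 1.96] point to
# non-stationarity of the retained part
geweke.plot(osteo.chain, frac1 = 0.1, frac2 = 0.5, nbins = 20)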
Geweke diagnostic on osteoporosis regression analysis
• Based on a chain of 1,500 iterations
• R function geweke.diag (CODA)
• With the standard settings of 10% (1st part) and 50% (2nd part): numerical problems
• With the early part = 20%: Geweke diagnostic for β1: Z = 0.057 and for σ²: Z = 2.00
• Dynamic version of the Geweke diagnostic (with K = 20); results in the figure below:
  ◦ β1: the majority of the Z-values lie outside [−1.96, 1.96] ⇒ non-stationarity
  ◦ log(posterior): non-stationary
  ◦ σ²: all values except the first lie inside [−1.96, 1.96]

[Figure: dynamic Geweke diagnostic (Z-scores against the first iteration in the segment) for (a) β1, (b) σ² and (c) the log(posterior)]

BGR diagnostic:
• Version 1: Gelman and Rubin ANOVA diagnostic
• Version 2: Brooks-Gelman-Rubin interval diagnostic
• General term: Brooks-Gelman-Rubin (BGR) diagnostic
• Global and dynamic diagnostic
• Implementations in WinBUGS, CODA and BOA

For both versions:
• Take M widely dispersed starting points θ_m^0 (m = 1, ..., M)
• M parallel chains (θ_m^k)_k (m = 1, ..., M) are run for 2n iterations
• The first n iterations are discarded and regarded as burn-in

[Figure: trace plot of eight parallel chains (chains 1:8) for β1 in the osteoporosis regression analysis, iterations 1–1,500]

Version 1 (Gelman and Rubin ANOVA diagnostic):
• Within chains: chain mean θ̄_m = (1/n) Σ_{k=1}^{n} θ_m^k (m = 1, ..., M)
• Across chains: overall mean θ̄ = (1/M) Σ_{m=1}^{M} θ̄_m
• ANOVA idea:
  ◦ Within-chain variance: W = (1/M) Σ_{m=1}^{M} s_m², with s_m² = 1/(n−1) Σ_{k=1}^{n} (θ_m^k − θ̄_m)²
  ◦ Between-chain variance: B = n/(M−1) Σ_{m=1}^{M} (θ̄_m − θ̄)²
• Two estimates of var(θ^k | y):
  ◦ Under stationarity: both W and V̂ ≡ var̂(θ^k | y) = ((n−1)/n) W + (1/n) B are good estimates
  ◦ Under non-stationarity: W is too small and V̂ is too large
• Convergence diagnostic: R̂ = V̂/W, with values close to 1 compatible with convergence (see the CODA sketch below)
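A minimal CODA sketch of the BGR diagnostic follows, assuming hypothetical matrices osteo.sim1, osteo.sim2, osteo.sim3 of sampled values from parallel chains started at widely dispersed initial values; note that CODA reports the potential scale reduction factor, a square-rooted and degrees-of-freedom-corrected version of V̂/W.

## Minimal sketch: (Brooks-)Gelman-Rubin diagnostic with CODA, assuming
## hypothetical matrices 'osteo.sim1', 'osteo.sim2', 'osteo.sim3' of sampled
## values from parallel chains with widely dispersed starting points.
library(coda)

chains <- mcmc.list(mcmc(osteo.sim1), mcmc(osteo.sim2), mcmc(osteo.sim3))

# Global diagnostic: potential scale reduction factor per parameter;
# values close to 1 are compatible with convergence
gelman.diag(chains)

# Dynamic diagnostic: the same quantity computed on increasing portions of
# the chains, plotted against iteration number
gelman.plot(chains)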