
Linköping studies in science and technology. Dissertations. No. 1754

Accelerating Monte Carlo methods for Bayesian inference in dynamical models

Johan Dahlin

Cover illustration: A Markov chain generated by the Metropolis-Hastings algorithm with an autoregressive proposal on a manifold given by a parametric function.

This thesis was typeset using the LaTeX typesetting system originally developed by Leslie Lamport, based on TeX created by Donald Knuth. The text is set in Garamond and Cabin. The source code is set in Inconsolata. All plots are made using R (R Core Team, 2015) together with colors from the RColorBrewer package (Neuwirth, 2014). Most simulations are carried out in R and Python, with the exception of Papers F and H.

Linköping studies in science and technology. Dissertations. No. 1754

Accelerating Monte Carlo methods for Bayesian inference in dynamical models

Johan Dahlin

[email protected]
http://liu.johandahlin.com

Division of Automatic Control
Department of Electrical Engineering
Linköping University
SE-581 83 Linköping, Sweden

ISBN 978-91-7685-797-7 ISSN 0345-7524

Copyright (c) 2016 Johan Dahlin

Printed by LiU-Tryck, Linköping, Sweden 2016

This thesis is dedicated to my family!

Abstract

Making decisions and predictions from noisy observations are two important and challenging problems in many areas of society. Some examples of applications are recommendation systems for online shopping and streaming services, connecting genes with certain diseases and modelling climate change. In this thesis, we make use of Bayesian statistics to construct probabilistic models given prior information and historical data, which can be used for decision support and predictions. The main obstacle with this approach is that it often results in mathematical problems lacking analytical solutions. To cope with this, we make use of statistical simulation algorithms known as Monte Carlo methods to approximate the intractable solution. These methods enjoy well-understood statistical properties but are often computationally prohibitive to employ.

The main contribution of this thesis is the exploration of different strategies for accelerating inference methods based on sequential Monte Carlo (smc) and Markov chain Monte Carlo (mcmc). That is, strategies for reducing the computational effort while keeping or improving the accuracy. A major part of the thesis is devoted to proposing such strategies for the mcmc method known as the particle Metropolis-Hastings (pmh) algorithm. We investigate two strategies: (i) introducing estimates of the gradient and Hessian of the target to better tailor the algorithm to the problem and (ii) introducing a positive correlation between the point-wise estimates of the target.

Furthermore, we propose an algorithm based on the combination of smc and Gaussian process optimisation, which can provide reasonable estimates of the posterior but with a significant decrease in computational effort compared with pmh. Moreover, we explore the use of sparseness priors for approximate inference in over-parametrised mixed effects models and autoregressive processes. This can potentially be a practical strategy for inference in the big data era. Finally, we propose a general method for increasing the accuracy of the estimates in non-linear state space models by applying a designed input signal.


Popular science summary (Populärvetenskaplig sammanfattning)

Should the Riksbank raise or lower the repo rate at its next meeting in order to reach the inflation target? Which genes are associated with a certain disease? How can Netflix and Spotify know which films and which music I would like to watch and listen to next? These three problems are examples of questions where statistical models can be useful for providing guidance and support for decisions. Statistical models combine theoretical knowledge about, for example, the Swedish economic system with historical data to provide forecasts of future events. These forecasts can then be used to evaluate, for example, what would happen to inflation in Sweden if unemployment falls, or how the value of my pension savings changes when the Stockholm stock exchange crashes. Applications like these, and many others, make statistical models important in many parts of society.

One way to construct statistical models is to continuously update a model as more information is collected. This approach is known as Bayesian statistics and is particularly useful when one has good prior insights into the model, or access to only a small amount of historical data for building it. A drawback of Bayesian statistics is that the computations required to update the model with the new information are often very complicated. In such situations, one can instead simulate the outcomes of millions of variants of the model and compare these against the historical observations at hand. One can then average over the variants that gave the best results to obtain a final model. It can therefore sometimes take days or weeks to construct a model. The problem becomes particularly severe when using more advanced models that could give better forecasts but take too long to build.

In this thesis, we use a number of different strategies to facilitate or improve these simulations. For example, we propose taking more insights about the system into account, thereby reducing the number of variants of the model that need to be examined. We can thus rule out certain models from the start, since we have a good idea of roughly what a good model should look like. We can also modify the simulation so that it moves more easily between different types of models. In this way, the space of all possible models is explored more efficiently. We propose a number of combinations of and modifications to existing methods for speeding up the fitting of the model to the observations. We show that the computation time can in some cases be reduced from several days to about an hour. Hopefully, this will in the future make it possible to use more advanced models in practice, which in turn will result in better forecasts and decisions.


Acknowledgments

Science is a co-operative enterprise, spanning the generations. It's the passing of a torch from teacher to student to teacher. A community of minds reaching back from antiquity and forward to the stars.
– Neil deGrasse Tyson

This is my humble contribution to the collaboration that is science. This is my dent in the universe! However, I could not have reached this point and written this thesis without the support, encouragement and love from so many people over the years. We all have so much to be grateful for. We often do not have the opportunity to express this and often take things for granted. Therefore, please bear with me on the following pages in my attempt to express my gratitude to all the people that made this journey possible.

To do a phd means that you spend five years on the boundary of your comfort zone. Sometimes, you are on the inside of the boundary but often you are just (or even further) outside the boundary. The latter is an awesome place to be. There is nothing that develops you more than when you stretch the limits of what you think that you can achieve. However, staying at this place for a long time takes its toll, and this is one of the reasons (except of course learning how to do research) for having a guide and mentor along for the journey. In my case, I got the opportunity to travel along with my two amazing supervisors Thomas Schön and Fredrik Lindsten. These guys are really great supervisors and they have skilfully guided me along the way to obtain my phd. I am truly grateful for all the time, effort and energy that they have put into helping me develop as a researcher and as a person. Thomas has helped me a lot with the long-term perspective with strategy, planning, collaborations and research ideas. Fredrik has helped me with many good ideas, answering hundreds of questions regarding the intricate workings of algorithms and helping me iron out subtle mistakes in papers and reports. Thank you also for all the nice times together outside of work, especially all the running, restaurant visits and team day dinners at Thomas' place!

Along my journey, I crossed paths with Mattias Villani and Robert Kohn, who supported and guided me almost as if I was one of their own phd students. I am very grateful for our collaborations and the time, inspiration and knowledge you both have given me. A special thanks goes to Robert for the invitation to visit him at the unsw Business School in Sydney, Australia. The autumn that I spent there was truly a wonderful experience in terms of research as well as from a personal perspective. Thank you Robert for your amazing hospitality, your patience and encouragement.

I would like to thank all my co-authors during my time at Linköping University for some wonderful and fruitful collaborations: Christian Andersson Naesseth, Liang Dai, Daniel Hultqvist, Daniel Jönsson, Manon Kok, Robert Kohn, Joel Kronander, Fredrik Lindsten, Cristian Rojas, Jakob Roll, Thomas Schön, Andreas Svensson, Fredrik Svensson, Jonas Unger, Patricio Valenzuela, Mattias Villani, Johan Wågberg and Adrian Wills. Furthermore, many of these co-authors and Olof Sundin helped with proof-reading the thesis and contributed with suggestions to improve it. All remaining errors are entirely my own.

To be able to write a good thesis you require a good working environment. Svante Gunnarsson and Ninna Stensgård are two very important persons in this effort. Thank you for all your support and helpfulness in all matters to help create the best possible situation for

myself and for all the other phd students. Furthermore, I gratefully acknowledge the financial support from the projects Learning of complex dynamical systems (Contract number: 637-2014-466) and Probabilistic modeling of dynamical systems (Contract number: 621-2013-5524) and cadics, a Linnaeus Center, all funded by the Swedish Research Council. I would also like to acknowledge Dr. Henrik Tidefelt and Dr. Gustaf Hendeby for constructing and maintaining the LaTeX template in which this thesis is (partially) written.

Another aspect of the atmosphere at work is all my wonderful colleagues. My roommate from the early years Michael Roth was always there to discuss work and to keep my streak of perfectionism in check. My friendships with Jonas Linder and Manon Kok have also meant a lot to me. We joined the group at the same time and have spent many hours together both at work and during our spare time. Thank you for your positive attitudes and for all the fun times exploring Beijing, Cape Town, Vancouver, Varberg and France together. Furthermore, thank you Sina Khoshfetrat Pakazad for arranging all the nice bbqs and for always being up for discussing politics. It is also important to get some fresh air and see the sun during the long days at work. Hanna Nyqvist has been my loyal companion during our many lunch walks together. Oskar Ljungqvist recently joined the group but has quickly become a good friend. Thank you for all our lunch runs (jogs) together, for encouraging and inspiring my running and for all the nice discussions!

Furthermore, I would like to thank my remaining friends and colleagues at the group. Especially (without any specific ordering) Christian Andersson Naesseth, Christian Lyzell, Ylva Jung, Isak Nielsen, Daniel Petersson, Zoran Sjanic, Niklas Wahlström, Clas Veibäck and Emre Özkan for all the fun things that we have done together. This includes everything from wine trips in South Africa, wonderful food in France and ordering 30 dumplings each in Beijing to hitting some shallows on the open sea with a canoe, cross-country skiing in Chamonix and riding the local bus to the Great Wall of China. It has been great fun!

Along the years, I have also spent a fair amount of time at other research groups and with the phd students within them. A big thanks goes out to the Systems and Control division at Uppsala University, Automatic Control at kth, Economics at unsw and Statistics and Machine Learning at Linköping University. Thank you for your hospitality and for all our research discussions. I would especially like to thank Joel Kronander, Christian Larsson, Andreas Svensson, Soma Tayamon, Patricio Valenzuela and Johan Wågberg for all the good times travelling the world together.

Finally, my family and friends outside of work are a great source of support, inspiration and encouragement. My family is always there when needed with love and kindness as well as to help with all possible practical matters. My friends always provide refuge from work when the stress levels are high and the motivation falters. Thank you all for believing in me and supporting me even when (at times) I did not believe in myself. I hope we all can spend some more time together in the years to come. I love you all and you mean the world to me!

Linköping, March 21, 2016
Johan Dahlin

Contents

I Background

1 Introductory overview
  1.1 Examples of applications
    1.1.1 Reconstructing the temperature of pre-historic Earth
    1.1.2 Rendering photorealistic images
  1.2 Main contribution
  1.3 Thesis outline
  1.4 Publications

2 Bayesian modelling and inference
  2.1 Three examples of statistical data
  2.2 Parametric models
    2.2.1 Linear regression and generalised linear models
    2.2.2 Autoregressive models
    2.2.3 State space models
    2.2.4 Mixed effects models
  2.3 Computing the posterior and making decisions
  2.4 Non-parametric models
    2.4.1 Gaussian processes
    2.4.2 Dirichlet processes
  2.5 Outlook and extensions

3 Monte Carlo methods
  3.1 Empirical approximations
  3.2 Three strategies
    3.2.1 Independent Monte Carlo
    3.2.2 Sequential Monte Carlo
    3.2.3 Markov chain Monte Carlo
  3.3 Pseudo-marginal Metropolis-Hastings
  3.4 Outlook and extensions


4 Strategies for accelerating inference
  4.1 Increasing the amount of information in the data
  4.2 Approximating the model
  4.3 Improving the inference algorithm
  4.4 Outlook and extensions

5 Concluding remarks
  5.1 Summary of contributions
  5.2 Some trends and ideas for future work
  5.3 Source code and data

Notation

Bibliography

II Papers

A Getting started with PMH for inference in non-linear models
  1 Introductory overview
    1.1 State space models
    1.2 Bayesian inference
  2 Related software
  3 Overview of the PMH algorithm
    3.1 Constructing the Markov chain
    3.2 Approximating the acceptance probability
  4 Estimating the parameters in a linear Gaussian SSM
    4.1 Implementing the particle filter
    4.2 Numerical illustration of state inference
    4.3 Implementing particle Metropolis-Hastings
    4.4 Numerical illustration of parameter inference
  5 Application example: volatility estimation in OMXS30
  6 Improving the PMH algorithm
    6.1 Initialisation
    6.2 Diagnosing convergence
    6.3 Improving mixing
  7 Outlook and conclusions
    7.1 Improving the particle filter
    7.2 Improving particle Metropolis-Hastings
    7.3 Additional software implementations
  Bibliography

B PMH using gradient and Hessian information
  1 Introduction
  2 Particle Metropolis-Hastings
    2.1 MH sampling with unbiased likelihoods
    2.2 Constructing PMH1 and PMH2
    2.3 Properties of the PMH1 and PMH2 proposals
  3 Estimation of the likelihood, gradient, and Hessian
    3.1 Auxiliary particle filter
    3.2 Estimation of the likelihood
    3.3 Estimation of the gradient
    3.4 Estimation of the Hessian
    3.5 Regularisation of the estimate of the Hessian
    3.6 Resulting SMC algorithm
  4 Numerical illustrations
    4.1 Estimation of the log-likelihood and the gradient
    4.2 Burn-in and scale-invariance
    4.3 The mixing of the Markov chains at stationarity
    4.4 Parameter inference in a Poisson count model
    4.5 Robustness in the lag and step size
  5 Discussion and future work
  Bibliography

C Quasi-Newton particle Metropolis-Hastings
  1 Introduction
  2 Particle Metropolis-Hastings
  3 Proposal for parameters
    3.1 Zeroth and first-order proposals (PMH0/1)
    3.2 Second-order proposal (PMH2)
  4 Proposal for auxiliary variables
    4.1 SMC-ABC algorithm
    4.2 Estimation of the likelihood
    4.3 Estimation of the gradient of the log-posterior
  5 Numerical illustrations
    5.1 Linear Gaussian SSM
    5.2 Modelling the volatility in coffee futures
  6 Conclusions
  A Implementation details
  B Implementation details for quasi-Newton proposal
  C Additional results
  D α-stable distributions
  Bibliography

D Accelerating pmMH using correlated likelihood estimators
  1 Introduction
  2 Introducing correlation into the auxiliary variables
  3 Theoretical analysis
    3.1 Setting up the model
    3.2 Analysis by discretisation of the state space
    3.3 Tuning the CN proposal for the univariate case
  4 Numerical illustrations
    4.1 Gaussian IID model
    4.2 Stochastic volatility model with leverage
  5 Conclusions and future work
  A Implementation details
    A.1 Gaussian IID model
    A.2 Stochastic volatility model with leverage
  Bibliography

E Approximate Bayesian inference for models with intractable likelihoods
  1 Introduction
  2 An intuitive overview of GPO-ABC
  3 Estimating the log-posterior
    3.1 Particle filtering with approximate Bayesian computations
    3.2 The estimator and its statistical properties
  4 Constructing the surrogate of the log-posterior
    4.1 Gaussian process prior
    4.2 Acquisition function
  5 The GPO-ABC algorithm
    5.1 Initialisation and convergence criteria
    5.2 Extracting the Laplace approximation
    5.3 Convergence properties
  6 Numerical illustrations and real-world applications
    6.1 Stochastic volatility model with Gaussian log-returns
    6.2 Stochastic volatility model with α-stable log-returns
    6.3 Modelling price dependencies between oil futures
  7 Conclusions
  A Implementation details
  Bibliography

F Hierarchical Bayesian approaches for robust inference in ARX models
  1 Introduction
  2 Hierarchical Bayesian ARX models
    2.1 Student's t distributed innovations
    2.2 Model order
    2.3 Automatic relevance determination
  3 Markov chain Monte Carlo
  4 Posteriors and proposal distributions
    4.1 Model order
    4.2 ARX coefficients
    4.3 ARX coefficients
    4.4 Latent variance variables
    4.5 Innovation
    4.6 Innovation DOF
  5 Numerical illustrations
    5.1 Average model performance
    5.2 Robustness to outliers and missing data
    5.3 Real EEG data
  6 Conclusions and future work
  Bibliography

G Bayesian inference for mixed effects models with heterogeneity
  1 Introduction
  2 Bayesian mixture modelling
    2.1 Infinite mixture model using a Dirichlet process
    2.2 Finite mixture model
  3 Sampling from the posterior
    3.1 Prior distributions
    3.2 Gibbs sampling
    3.3 Conditional posteriors
  4 Numerical illustrations
    4.1 Mixture of Gaussians
    4.2 Mixed effects model with synthetic data
    4.3 Mixed effects model with sleep deprivation data
  5 Conclusions
  A Implementation details
    A.1 Mixture of Gaussians
    A.2 Mixed effects model with synthetic data
    A.3 Mixed effects model with sleep deprivation data
  Bibliography

H Robust input design for non-linear dynamical models
  1 Introduction
  2 Problem formulation
  3 Describing the set of stationary processes
  4 Estimation of the Fisher information matrix
    4.1 Estimating the score function
    4.2 The resulting estimator
  5 Final input design method
  6 Input signal generation
  7 Numerical illustrations
    7.1 Accuracy of information matrix estimation
    7.2 Input design for the LGSS model
    7.3 Input design for a non-linear model
  8 Conclusions
  Bibliography

Part I

Background

1 Introductory overview

Modelling of dynamical systems is an integral part of modern science. Two major applications are to describe some observed data and to make forecasts about the future behaviour of a system. The latter is an essential ingredient in making decisions from noisy observations in many areas such as business, economics, engineering and medicine. A standard approach for forecasting and decision making is to make use of probabilistic models (Ghahramani, 2015), which are created by combining some pre-defined model with observations from the true system. This approach is also known as data-driven modelling and is probably the most popular alternative for decision support today.

The probabilistic model is usually constructed by making use of statistical inference. One such framework is Bayesian statistics, which allows for sequentially updating the model as more observations arrive. Another benefit is that the uncertainty regarding the model can be included when making decisions. However, a major problem with Bayesian inference is that model updates, predictions and other computations often are posed as intractable integrals. Hence, these cannot be computed in closed form and approximations are required. A typical example is to compute the mean of the so-called posterior distribution π(x|y), which encodes our prior beliefs about some quantity x ∈ X and the information in some data denoted by y. The posterior mean can be computed by

\[
\mathbb{E}_{\pi}[x] = \int_{\mathsf{X}} x \, \pi(x \mid y) \, \mathrm{d}x,
\]

where the dimension of the problem is determined by the dimension of x. Hence, this can correspond to a high-dimensional integration problem, which is difficult to approximate using numerical methods such as quadrature. Instead, Monte Carlo methods or variational inference are often applied to approximate the update, by statistical simulation and analytical approximations, respectively. In this thesis,

we focus on the former family of methods, which is based on generating a large number of random scenarios or outcomes. Hence, Monte Carlo algorithms are often computationally intensive and can require days or weeks to run. This is especially a problem for dynamical models, and this thesis is devoted to decreasing the time required to implement and execute these algorithms while keeping their accuracy.
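To make this concrete, here is a minimal sketch (not taken from the thesis) of how a Monte Carlo method approximates the posterior mean above. The toy posterior is an invented Gaussian so that it can be sampled from directly; in realistic dynamical models, generating these samples is precisely the hard part.

```r
# Monte Carlo approximation of a posterior mean (illustrative sketch).
# We assume, purely for illustration, that pi(x | y) = N(x; 1, 0.5^2)
# and that we can draw samples from it directly.
set.seed(1)
n_samples <- 1e5
x <- rnorm(n_samples, mean = 1, sd = 0.5)   # draws from the toy posterior

# The empirical average approximates the intractable integral E_pi[x].
posterior_mean <- mean(x)

# The Monte Carlo error shrinks at the usual 1/sqrt(N) rate.
posterior_se <- sd(x) / sqrt(n_samples)
cat("estimate:", posterior_mean, "std. error:", posterior_se, "\n")
```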

1.1 Examples of applications

We begin by presenting a few applications where acceleration of Monte Carlo methods could be important. As previously discussed, these methods are essential for Bayesian inference in dynamical models, which themselves have applications in many different fields. One example is platooning, where trucks are driven as a group to reduce the air drag and therefore to increase fuel-efficiency. This is possible by using e.g., model predictive control (mpc; Mayne, 2014) to control the trucks to keep a certain distance, see Turri et al. (2015). Here, an accurate model is important as it is used to forecast future outcomes. Bayesian modelling can be useful in this setting as it can also take into account the uncertainty of the model when computing predictions.

In recommendation systems, probabilistic modelling is important to provide suggestions to the user, see Stern et al. (2009). Many online services such as Netflix, Spotify and Amazon are already using such systems to improve customer satisfaction and to increase sales. This problem is interesting because companies such as Amazon and Google have a massive collection of information at their disposal. However, the amount of information regarding a particular user can be quite limited, especially when the user is new to the service. Finding patterns connecting this user to other users on the site is therefore important to be able to pool the data and to provide good suggestions. It is also useful to take the dynamics into account as user preferences can change over time. This was one of the key insights incorporated into the winning algorithm in the Netflix Prize competition. The winning approach proposed by Koren (2009) was awarded one million us dollars by the company.

Climate change and global warming are two big challenges for humankind to solve during this century. Bayesian inference is useful in this setting to e.g., pool the output from different climate models together as discussed by Monteleoni et al. (2011). Again, the ability to take uncertainty into account is important in this setting as well, see Birks (1998). Most natural disasters are quite rare and modelling them is difficult due to the small amounts of data. Bayesian methods can therefore be useful in this setting to estimate probabilities of rare events such as wild fires, see Xue et al. (2012).

Probabilistic modelling is also useful in genomics to fight disease and other health problems, see Bush and Moore (2012) and Rasmussen et al. (2011). A major challenge in this field is to find patterns and structures connecting genes with e.g., cancer, diabetes and heart disease. The massive amount of information makes inference difficult as many sophisticated methods are computationally prohibitive. However, this type of analysis could be useful for personalised medicine and data-driven health care if the computational challenges can be overcome, see Raghupathi and Raghupathi (2014). Another interesting application in this field is to reconstruct the lineage of different species using phylogenetic trees, see Larget and Simon (1999) and Bouchard-Côté et al. (2012).

We continue by introducing two more applications of probabilistic modelling connected to Monte Carlo methods in the subsequent sections. Two further examples are introduced in Chapter 2 and these follow us throughout the introductory part of this thesis to illustrate important concepts. Finally, more real-world examples are presented in the papers included in Part ii of this thesis.

1.1.1 Reconstructing the temperature of pre-historic Earth

In palaeoclimatology, ice varve thickness data is an important source of information for recreating the historical mean temperature on the Earth, which is useful for studying global warming. In the upper part of Figure 1.1, we present the thickness of ice varves (layers of sediments that are deposited from melting glaciers) from Shumway and Stoffer (2011). The silt and sand that are accumulated during each year make up one varve, and changes in the varve thickness indicate temperature changes. That is, thick varves are the result of warm and wet weather, whereas the opposite holds for cold and dry weather. The data set contains the thickness of 634 ice varves formed at a location in Massachusetts, us between the years 9,883 and 9,250 bc.

We note that the mean and the variance of the thickness seem to vary during the period but the data is quite noisy. Therefore, we would like to smooth the data to be able to determine if the variations in the thickness are statistically significant. In the middle part of Figure 1.1, we make use of a non-parametric Bayesian regression model known as the Gaussian process (gp; Rasmussen and Williams, 2006), which is further discussed in Section 2.4.1. In the lower part of the same figure, we present the result from using a parametric state space model (ssm) to smooth the data. We introduce the ssm in Section 2.2.3 and show how to make use of Monte Carlo methods to estimate the parameters of the ssm in Sections 3.2.2 and 3.3.

In Figure 1.1, the resulting estimates of the mean thickness are presented as a solid line and the 95% confidence intervals are presented as the shaded areas. From this analysis, we conclude that there is no significant change in the mean thickness of the ice varves during this period. We also note that the uncertainty in the parametric model is smaller and that it better follows the data. The reason for this is that the gp model usually assumes homoscedastic variance, whereas the variance is allowed to change with time in the parametric model. However, the non-parametric model is simpler to estimate and it usually takes a couple of seconds on a modern computer. On the other hand, inference for the parameters in the ssm can take about an hour to complete. Therefore, there is a need to develop faster inference methods for non-linear ssms. However, there are other non-parametric models that do not assume homoscedasticity (Le et al., 2005) and can handle heavy-tailed observations by assuming Student's t noise (Shah et al., 2014).
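To give a feel for the non-parametric approach, the sketch below implements textbook gp regression with a squared-exponential kernel (following Rasmussen and Williams, 2006). The data, kernel and hyperparameters are invented for illustration; this is not the actual analysis of the varve series.

```r
# Gaussian process regression sketch (squared-exponential kernel).
# All data and hyperparameters are illustrative placeholders.
se_kernel <- function(x1, x2, ell = 1.0, sf = 1.0) {
  sf^2 * exp(-0.5 * outer(x1, x2, "-")^2 / ell^2)
}

set.seed(1)
x_obs <- sort(runif(30, 0, 10))
y_obs <- sin(x_obs) + rnorm(30, sd = 0.3)   # noisy toy observations
x_new <- seq(0, 10, length.out = 200)       # prediction grid
sn <- 0.3                                   # noise standard deviation

K   <- se_kernel(x_obs, x_obs) + sn^2 * diag(length(x_obs))
Ks  <- se_kernel(x_new, x_obs)
Kss <- se_kernel(x_new, x_new)

# Standard GP posterior mean and variance of the latent function.
post_mean <- Ks %*% solve(K, y_obs)
post_var  <- diag(Kss - Ks %*% solve(K, t(Ks)))

# 95% interval analogous to the shaded areas in Figure 1.1.
upper <- post_mean + 1.96 * sqrt(post_var)
lower <- post_mean - 1.96 * sqrt(post_var)
```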

1.1.2 Rendering photorealistic images

In computer graphics, an important problem is to design good methods for rendering objects into an existing scene. This is typically used for special effects in Hollywood films and for advertisement. A standard approach for this is the image-based lighting (ibl) method

Figure 1.1. Upper: the thickness of ice varves formed at a location in Massachusetts between the years 9,883 and 9,250 BC. A non-parametric model (middle) and a parametric model (lower) of the thickness, presented with the mean value (solid lines) and 95% confidence intervals (shaded areas). The dots indicate the original data.

(Debevec, 1998; Pharr and Humphreys, 2010), where a model of how light behaves is used in combination with information about the scene. The latter so-called environment map is often a panoramic image taken by a high dynamic range (hdr) camera, which can record much larger variations in brightness than a standard camera. This is required to capture all the different light sources within the scene.

Two concrete examples of renderings using the ibl method are presented in Figures 1.2 and 1.3, taken from Kronander et al. (2014a) and Unger et al. (2013). Note that we have rendered several photorealistic objects into the two scenes, such as a sphere, a helicopter and some furniture. In Figure 1.3, we present the scene before (left) and after (right) adding the rendered furniture. This is a real-world example from ikea catalogues, in which scenes are often rendered using this technique to decrease the cost. The alternative would be to build kitchens and similar environments customised for different countries. The ibl method instead allows for taking an image of a basic set-up, which then can be augmented by computer rendering to create different country-specific variations of the complete scene.

To be able to make use of ibl, we additionally require a geometric description of the objects to be rendered and the properties of the different materials in the objects. All of this information is then combined using the light transport equation (lte), which is a physical model, expressed as an integral, of how light rays propagate through space and reflect off surfaces. The lte model cannot be solved analytically, but since it is an integral it can be approximated using Monte Carlo methods.

A further complication is that there are infinitely many rays that bounce around in the scene before they hit the pixels in the image plane. As a result, it is computationally infeasible to simulate all the possible light rays. Instead, we need to find an approach that only simulates the ones that contribute the most to the brightness and colour of each pixel in the image plane. This is especially a problem when we would like to render a sequence of images. A possible solution could be to start from the solution for the previous frame and adapt it to the new frame. If the environment maps are similar, this could lead to a decrease in the total computational cost. Hence, strategies for accelerating Monte Carlo methods could be useful in this context to improve the rendering times and decrease the cost of special effects in films. For more information, see Kronander et al. (2014a) and Ghosh et al. (2006).
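As a caricature of how Monte Carlo enters the rendering problem, the sketch below estimates the outgoing radiance at a single point on a diffuse surface by sampling incoming directions uniformly over the hemisphere. The environment light and all constants are invented; a real renderer traces full paths and is vastly more involved.

```r
# Toy Monte Carlo estimate of outgoing radiance at one surface point:
#   L_o = integral over the hemisphere of L_i(w) * (rho / pi) * cos(theta) dw.
# This is a caricature of the LTE with a made-up environment light.
set.seed(1)
n_rays <- 1e5
rho <- 0.8                          # diffuse surface albedo

# Uniform sampling over the hemisphere: density 1 / (2 * pi) per solid angle,
# which makes cos(theta) uniformly distributed on [0, 1].
cos_theta <- runif(n_rays)
phi <- 2 * pi * runif(n_rays)       # azimuth (unused by this toy light)

# Invented environment map: a bright patch near the zenith plus ambient sky.
L_in <- 0.2 + 5 * exp(-20 * (1 - cos_theta))

# Importance-weighted average: divide the integrand by the sampling density.
L_out <- mean(L_in * (rho / pi) * cos_theta / (1 / (2 * pi)))
```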

1.2 Main contribution

We have now seen some specific examples of when dynamical models and Monte Carlo methods can be of use. As stated before, Monte Carlo methods are very useful and often act as enablers for solving otherwise intractable or infeasible problems. However, their main drawback is a large computational cost, which can limit their practical utility. Hence, accelerating Monte Carlo methods is an important endeavour with applications in many different domains. Many different strategies to attain this goal have been proposed in the literature. These include solving an approximate but simpler problem, utilising parallel hardware such as graphics processing units (gpus) and modifying the algorithms themselves. In this thesis, we focus on the first and the third approach by using a number of different strategies. This effort has resulted in:


Figure 1.2. A helicopter and sphere rendered using sequential IBL (Kronander et al., 2014a). The image is part of Kronander et al. (2014a) first published in the Proceedings of the 22nd European Signal Processing Conference (EUSIPCO 2014) in 2014, published by EURASIP.

Figure 1.3. Scene from Unger et al. (2013) before (left) and after (right) rendering by IBL. These images are unaltered reproductions of the originals in Unger et al. (2013) and are used under a CC BY-NC-SA licence (http://creativecommons.org/licenses/by-nc-sa/3.0/). The original work is available via Elsevier at http://dx.doi.org/10.1016/j.cag.2013.07.001.

• A number of new alternative versions of the particle Metropolis-Hastings (pmh) algorithm, where we incorporate gradient and Hessian information about the target into the proposal. This results in better behaviour during the burn-in, improved mixing of the Markov chain and simplified tuning of the algorithm for the user (Papers B and C).

• A method for introducing a positive correlation into the auxiliary variables generated in the pseudo-marginal Metropolis-Hastings (pmmh) algorithm for estimating the target. This results in around three times better mixing of the Markov chain for some models, with a similar decrease of the computational cost (Paper D).

• A method to perform approximate Bayesian inference in non-linear ssms, which is especially useful when the likelihood is intractable. The proposed method gives similar estimates compared with the pmh algorithm but can reduce the computational time from days to about an hour (Paper E).

• A pedagogical and self-contained introduction to the pmh algorithm with supporting software implementations in three different languages (Paper A).

• An evaluation of approximate inference in mixed effects models with a Bayesian mixture model for the heterogeneity of the random effects (Paper G).

• A method for input design in non-linear ssms (Paper H). The proposed method increases the accuracy of the parameter estimates by applying a carefully designed input signal to the system.

• An evaluation of two Bayesian arx models capable of dealing with outliers by modelling the observations as Student's t distributed. The proposed inference methods also include automatic model order selection (Paper F).

1.3 Thesis outline

The thesis consists of two parts. The first part introduces some background material regarding modelling of dynamical data and different approaches for inference. We also highlight some problems with existing approaches and propose a number of strategies to mitigate these. These strategies are applied and evaluated in the second part of the thesis in a collection of scientific contributions, both peer-reviewed papers and technical reports.

Paper A

Paper A of this thesis is an edited version of

J. Dahlin and T. B. Schön. Getting started with particle Metropolis-Hastings for inference in nonlinear models. Pre-print, 2015. arXiv:1511.01707v4.

Source code and data: https://github.com/compops/pmh-tutorial

Summary: We provide a gentle introduction to the pmh algorithm for parameter inference in non-linear ssms. Throughout this paper, we develop an implementation of the pmh algorithm in the statistical programming language R. We provide the reader with some intuition for how the algorithm operates and provide some solutions to numerical problems that might occur in practice. Furthermore, we make use of the implementation for parameter inference in models using real-world data and provide a small survey of the field.
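As a rough flavour of what the tutorial builds up, a random-walk pmh loop has the shape sketched below. This is not the paper's code: loglik_est and log_prior are hypothetical placeholders for a particle-filter log-likelihood estimator and the log-prior density.

```r
# Skeleton of random-walk particle Metropolis-Hastings (hedged sketch).
# 'loglik_est(theta)' is assumed to return a particle filter estimate of
# the log-likelihood; 'log_prior(theta)' returns the log-prior density.
pmh <- function(theta0, n_iter, step_size, loglik_est, log_prior) {
  theta <- matrix(0, n_iter, length(theta0))
  theta[1, ] <- theta0
  ll <- loglik_est(theta0)
  for (k in 2:n_iter) {
    # Gaussian random-walk proposal around the previous state.
    theta_prop <- theta[k - 1, ] + step_size * rnorm(length(theta0))
    ll_prop <- loglik_est(theta_prop)
    # Metropolis-Hastings acceptance (the symmetric proposal cancels).
    log_alpha <- ll_prop + log_prior(theta_prop) -
                 ll - log_prior(theta[k - 1, ])
    if (log(runif(1)) < log_alpha) {
      theta[k, ] <- theta_prop
      ll <- ll_prop
    } else {
      theta[k, ] <- theta[k - 1, ]
    }
  }
  theta
}
```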

Background and contribution: The idea for the paper originated from Thomas Schön during the spring of 2015. The main aim was to provide an overview of the pmh algorithm together with step-by-step instructions on how to implement it in some common programming languages. The paper was written during the autumn of 2015 and example code for MATLAB, R and Python is provided via GitHub. The author of this thesis wrote most of the paper, made all implementations in software and carried out all numerical experiments.

Paper B

Paper B of this thesis is an edited version of

J. Dahlin, F. Lindsten, and T. B. Schön. Particle Metropolis-Hastings using gradient and Hessian information. Statistics and Computing, 25(1):81–92, 2015b.

which is a development of the two earlier contributions:

J. Dahlin, F. Lindsten, and T. B. Schön. Second-order particle MCMC for Bayesian parameter inference. In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, August 2014a.

J. Dahlin, F. Lindsten, and T. B. Schön. Particle Metropolis Hastings using Langevin dynamics. In Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, Canada, May 2013a.

Source code and data: https://github.com/compops/pmh-stco2015

Summary: We develop an extension to the standard pmh algorithm, which incorporates information about the gradient and the Hessian of the log-posterior into the parameter proposal. This information can be extracted from the output generated by the particle filter when estimating the likelihood. The gradient information is used to add a drift in the proposal towards areas of high posterior probability. The Hessian information is useful for scaling the step lengths of the proposal to improve the exploration of non-isotropic posteriors. We provide numerical experiments indicating that the novel proposal makes the algorithm scale invariant, increases mixing and decreases the number of pilot runs.
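Schematically, and with notation simplified relative to the paper, both proposals are of the Langevin type

\[
\theta' \sim \mathcal{N}\!\left(\theta_k + \frac{\epsilon^2}{2}\, \Sigma_k\, \widehat{\mathcal{G}}(\theta_k),\; \epsilon^2\, \Sigma_k\right),
\]

where \(\widehat{\mathcal{G}}\) denotes the estimated gradient of the log-posterior, \(\epsilon\) is a step size, \(\Sigma_k = I\) for the first-order variant (pmh1) and \(\Sigma_k\) is the inverse of the (regularised) estimated Hessian for the second-order variant (pmh2).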

Background and contribution: The idea for the first paper originated from Fredrik Lindsten during the autumn of 2012. The paper was developed in three stages and resulted in a journal publication after an invitation to a special issue connected to new results presented at the workshop mcmski 2014. The first paper only made use of gradient information and was a proof of concept. Hessian information was added in the second paper and another particle smoother was used to decrease the computational cost. In the final paper, we introduced an improved method for estimating the gradient and Hessian together with an approach to handle cases when the estimate of the Hessian is not a valid covariance matrix. The author of this thesis wrote most parts of the conference papers and about half of the journal paper, made all implementations in software and carried out all numerical experiments. A similar idea was developed independently by Nemeth et al. (2014) during the same period.

Paper C

Paper C of this thesis is an edited version of

J. Dahlin, F. Lindsten, and T. B. Schön. Quasi-Newton particle Metropolis-Hastings. In Proceedings of the 17th IFAC Symposium on System Identification (SYSID), pages 981–986, Beijing, China, October 2015c.

Source code and data: https://github.com/compops/qpmh2-sysid2015

Summary: We develop the ideas from Paper B further by constructing an estimate of the Hessian directly from gradient information. This is useful as it often is difficult to obtain positive semi-definite estimates of the Hessian using particle smoothing. This problem can be mitigated by increasing the number of particles, but this also increases the computational cost. Instead, we propose to construct a local approximation of the Hessian using ideas from quasi-Newton optimisation, which often results in a positive semi-definite estimate. The novel approach only requires estimates of the gradient, which usually are more accurate than estimates of the Hessian. We make use of the algorithm for inference in a challenging class of models known as ssms with intractable likelihoods. The results indicate that the proposed algorithm can in some cases increase the mixing by a factor of four when the gradients can be accurately estimated.

Background and contribution: The idea for the paper originated from the author of this thesis during the spring of 2014, when preparing an example in the presentation for the defence of his Licentiate's thesis. The proposed algorithm is an attempt to increase the mixing of pmh when the likelihood is estimated using particle filtering with approximate Bayesian computations (abc). It was later used as a comparison for the approximate method proposed in Paper E. The author of this thesis wrote most of the paper, made all implementations in software and carried out all numerical experiments.

Paper D

Paper D of this thesis is an edited version of

J. Dahlin, F. Lindsten, J. Kronander, and T. B. Schön. Accelerating pseudo-marginal Metropolis-Hastings by correlating auxiliary variables. Pre-print, 2015a. arXiv:1512.05483v1.

Source code and data: https://github.com/compops/pmmh-correlated2015

Summary: The standard formulation of the pmmh algorithm makes use of independent estimators for the value of the target distribution. However, in theory we can increase the acceptance rate of the algorithm by introducing a positive correlation between two consecutive target estimates. We explore this idea by introducing a Crank-Nicolson proposal for the random variables which are used to construct the estimator of the target. We provide some numerical experiments indicating that this small change in the pmmh algorithm can increase mixing and allow for a decrease in the number of particles. The typical increase in mixing means that the number of iterations can be decreased to a third compared with using non-correlated random variables. Furthermore, we can often also decrease the number of random variables in the estimator, which results in a further decrease of the computational cost.
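The mechanical change is small: rather than drawing fresh auxiliary variables at every iteration, they are proposed with a Crank-Nicolson step. A minimal sketch of just that step (the function name and the value of rho are illustrative):

```r
# Crank-Nicolson proposal for the standard Gaussian auxiliary variables u
# used inside the likelihood estimator. A rho close to 1 induces a strong
# correlation between consecutive likelihood estimates.
propose_aux <- function(u, rho = 0.99) {
  rho * u + sqrt(1 - rho^2) * rnorm(length(u))
}
# If u ~ N(0, I), then propose_aux(u) is marginally N(0, I) as well, so the
# distribution of the auxiliary variables is preserved.
```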

Background and contribution: The original idea for the paper originated from discussions between the author of this thesis and Joel Kronander during the summer of 2015. The idea was then extended and refined during discussions with Fredrik Lindsten during the autumn of 2015. The author of this thesis wrote about half of the paper, made all implementations in software and carried out all numerical experiments. A similar idea was developed independently by Deligiannidis et al. (2015), published on the pre-print library arXiv one day before our own paper.

Paper E

Paper E of this thesis is an edited version of

J. Dahlin, M. Villani, and T. B. Schön. Efficient approximate Bayesian inference for models with intractable likelihoods. Pre-print, 2015d. arXiv:1506.06975v1.

which is the development of the two earlier contributions:

J. Dahlin and F. Lindsten. Particle filter-based Gaussian process optimisation for parameter inference. In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, August 2014.

J. Dahlin, T. B. Schön, and M. Villani. Approximate inference in state space models with intractable likelihoods using Gaussian process optimisation. Technical Report LiTH-ISY-R-3075, Department of Electrical Engineering, Linköping University, Linköping, Sweden, April 2014b.

Source code and data: https://github.com/compops/gpo-abc2015

Summary: We propose a method for approximate Bayesian inference in ssms with intractable likelihoods. The posterior in this type of models can be approximated point-wise using abc. However, the resulting sequential Monte Carlo algorithm with abc (smc-abc) for approximating the likelihood in ssms often requires more particles than the standard smc implementation to achieve reasonable accuracy. To decrease the resulting large computational cost, we propose a combination of smc-abc and Gaussian process optimisation (gpo) to estimate the parameters by maximising a surrogate function mimicking the posterior distribution. We provide numerical experiments indicating that the constructed surrogate function is similar to the true posterior around the mode and results in similar parameter estimates. Furthermore, the use of gpo can decrease the computational cost by one or two orders of magnitude compared with the pmh algorithm.
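The surrogate idea can be conveyed in a few lines on a one-dimensional grid: fit a gp to a handful of noisy log-posterior evaluations and choose the next evaluation point by expected improvement. Everything below (kernel, hyperparameters, the stand-in objective log_post_est) is invented for illustration and far simpler than the gpo-abc algorithm in the paper.

```r
# Hedged sketch of Gaussian process optimisation (GPO) on a grid.
se_kernel <- function(x1, x2, ell = 0.5) {
  exp(-0.5 * outer(x1, x2, "-")^2 / ell^2)
}

log_post_est <- function(theta) -(theta - 1)^2 + rnorm(1, sd = 0.1)  # stand-in

theta_eval <- c(-2, 0, 2)                   # initial design points
y_eval <- sapply(theta_eval, log_post_est)  # noisy evaluations
grid <- seq(-3, 3, length.out = 300)

for (iter in 1:10) {
  K  <- se_kernel(theta_eval, theta_eval) + 0.1^2 * diag(length(theta_eval))
  Ks <- se_kernel(grid, theta_eval)
  mu <- as.vector(Ks %*% solve(K, y_eval))             # surrogate mean
  s2 <- pmax(1 - diag(Ks %*% solve(K, t(Ks))), 1e-10)  # surrogate variance

  # Expected improvement (for maximisation) as the acquisition function.
  z  <- (mu - max(y_eval)) / sqrt(s2)
  ei <- (mu - max(y_eval)) * pnorm(z) + sqrt(s2) * dnorm(z)

  theta_next <- grid[which.max(ei)]
  theta_eval <- c(theta_eval, theta_next)
  y_eval <- c(y_eval, log_post_est(theta_next))
}
theta_hat <- theta_eval[which.max(y_eval)]  # rough estimate of the mode
```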

Background and contribution: The original idea was proposed by Fredrik Lindsten during the summer of 2013. In the first paper, we combined gpo with a standard particle filter for maximum likelihood estimation in ssms. During the fall of 2013, the author of this thesis attended a course in Bayesian inference given by Mattias Villani. The idea of making use of gpo in combination with abc was born during this course and resulted in a technical report as part of the course project. This report was reworked and extended twice to its current form during the spring of 2015. The author of this thesis wrote most parts of the papers, made all implementations in software and carried out all numerical experiments. A similar idea was developed independently by Gutmann and Corander (2015) during the same period.

Paper F

Paper F of this thesis is an edited version of

J. Dahlin, F. Lindsten, T. B. Schön, and A. Wills. Hierarchical Bayesian ARX models for robust inference. In Proceedings of the 16th IFAC Symposium on System Identification (SYSID), Brussels, Belgium, July 2012b.

Source code and data: https://github.com/compops/rjmcmc-sysid2012

Summary: Gaussian innovations are the typical choice in most arx models, but using other distributions such as the Student's t could be useful. We demonstrate that this choice of distribution for the innovations provides an increased robustness to data anomalies, such as outliers and missing observations. We consider these models in a Bayesian setting and perform inference using numerical procedures based on Markov chain Monte Carlo (mcmc) methods. These models include automatic order determination by two alternative methods, based on a parametric model order and a sparseness prior, respectively. The methods and the advantage of our choice of innovations are illustrated in three numerical studies using both simulated data and real eeg data.
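To see why heavy tails matter, the toy simulation below (an invented example, not the paper's setup) drives a first-order autoregression with Gaussian and with Student's t innovations. The t-driven series occasionally produces large spikes, mimicking the outliers that the proposed models are designed to absorb.

```r
# Toy comparison of Gaussian and Student's t innovations in an AR(1) process.
set.seed(1)
n <- 500
a <- 0.8                          # AR coefficient

simulate_ar1 <- function(a, e) {
  y <- numeric(length(e))
  for (t in 2:length(e)) y[t] <- a * y[t - 1] + e[t]
  y
}

y_gauss   <- simulate_ar1(a, rnorm(n))       # Gaussian innovations
y_student <- simulate_ar1(a, rt(n, df = 3))  # heavy-tailed t innovations

# The t-driven series exhibits occasional large excursions (outliers) that a
# Gaussian noise model would regard as essentially impossible.
```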

Background and contribution: The original idea was proposed by Fredrik Lindsten during the autumn of 2011. It was the first project undertaken by the author of this thesis during his PhD studies. The author of this thesis wrote the latter half of the paper, made some implementations in software and carried out most of the numerical experiments. The eeg data was kindly provided by Eline Borch Petersen and Thomas Lunner at Eriksholm Research Centre, Oticon A/S, Denmark.

Paper G

Paper G of this thesis is an edited version of

J. Dahlin, R. Kohn, and T. B. Schön. Bayesian inference for mixed effects models with heterogeneity. Technical Report LiTH-ISY-R-3091, Department of Electrical Engineering, Linköping University, Linköping, Sweden, March 2016.

Source code and data: https://github.com/compops/panel-dpm2016

Summary: Mixture modelling is an important problem in many scientific fields. In this paper, we are interested in modelling panel data, i.e., a few sequential observations gathered from many individuals. This type of data set provides little information about a specific individual, and the main challenge is to pool information from similar individuals to obtain accurate estimates of the parameters of the model. We compare two different approaches to pooling the individuals, using a Dirichlet process mixture (dpm) and a finite mixture model with a sparseness prior. In this setting, we can see the latter approach as an approximation of the dpm, which results in simpler and sometimes more efficient inference algorithms. We conclude via numerical experiments that the posteriors obtained from the two approaches are very similar. Therefore, the approximate model can be beneficial for inference in big data problems.

Background and contribution: The idea of the paper originated from discussions between the author of this thesis and Robert Kohn during the autumn of 2014. Some preliminary work was carried out during the author's PreDoc at the University of New South Wales Business School in Sydney, Australia. The present paper is the result of work during the spring of 2016. The author of this thesis wrote most of the paper, made all implementations in software and carried out all numerical experiments.

Paper H

Paper H of this thesis is an edited version of

P. E. Valenzuela, J. Dahlin, C. R. Rojas, and T. B. Schön. On robust input design for nonlinear dynamical models. Automatica, 2016a. (provisionally accepted). which is a development of the earlier contribution

P. E. Valenzuela, J. Dahlin, C. R. Rojas, and T. B. Schön. A graph/particle- based method for design in nonlinear systems. In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, August 2014.

Summary: Input design is an important sub-field of system identification. Its main aim is to determine an input that maximises a scalar function of the Fisher information matrix. In this work, we make use of graph theory to create a model for the input signal based on a convex combination of different basis inputs. The resulting input signal is given as the solution to an optimisation problem, which depends on estimates of the Fisher information matrix for each basis input. We develop a particle smoothing technique to obtain these estimates in a more efficient and accurate manner than previously. Finally, we present numerical illustrations indicating that the use of the designed input decreases the uncertainty in the estimates and improves the convergence speed of the expectation maximisation algorithm.

Background and contribution: The idea of the first paper originated from discussions between the authors of these papers during the spring of 2013. The main aim was to combine recent developments in particle smoothing with input design. This idea was implemented in the first paper as a proof of concept. It was later extended in a second paper with a robust formulation and a better estimator of the Fisher information matrix. The author of this thesis wrote parts of the sections regarding particle methods in the two papers, made all particle-based implementations in software and carried out most of the experiments.

1.4 Publications

Published work of relevance to this thesis is listed below in reverse chronological order. Items marked with ⋆ are included in Part ii.

P. E. Valenzuela, J. Dahlin, C. R. Rojas, and T. B. Schön. Particle-based Gaus- sian process optimization for input design in nonlinear dynamical models. Pre-print, 2016b. arXiv:1603.05445v1.

⋆ J. Dahlin, R. Kohn, and T. B. Schön. Bayesian inference for mixed effects models with heterogeneity. Technical Report LiTH-ISY-R-3091, Department of Electrical Engineering, Linköping University, Linköping, Sweden, March 2016.

⋆ P. E. Valenzuela, J. Dahlin, C. R. Rojas, and T. B. Schön. On robust input design for nonlinear dynamical models. Automatica, 2016a. (provisionally accepted).

⋆ J. Dahlin and T. B. Schön. Getting started with particle Metropolis-Hastings for inference in nonlinear models. Pre-print, 2015. arXiv:1511.01707v4.

⋆ J. Dahlin, F. Lindsten, J. Kronander, and T. B. Schön. Accelerating pseudo-marginal Metropolis-Hastings by correlating auxiliary variables. Pre-print, 2015a. arXiv:1512.05483v1.

A. Svensson, J. Dahlin, and T. B. Schön. Marginalizing Gaussian process hyperparameters using sequential Monte Carlo. In Proceedings of the 6th IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), Cancun, Mexico, December 2015.

⋆ J. Dahlin, F. Lindsten, and T. B. Schön. Quasi-Newton particle Metropolis-Hastings. In Proceedings of the 17th IFAC Symposium on System Identification (SYSID), pages 981–986, Beijing, China, October 2015c.

M. Kok, J. Dahlin, T. B. Schön, and A. Wills. Newton-based maximum likelihood estimation in nonlinear state space models. In Proceedings of the 17th IFAC Symposium on System Identification (SYSID), pages 398–403, Beijing, China, October 2015.

T. B. Schön, F. Lindsten, J. Dahlin, J. Wågberg, C. A. Naesseth, A. Svensson, and L. Dai. Sequential Monte Carlo methods for system identification. In Proceedings of the 17th IFAC Symposium on System Identification (SYSID), pages 775–786, Beijing, China, October 2015.

⋆ J. Dahlin, M. Villani, and T. B. Schön. Efficient approximate Bayesian inference for models with intractable likelihoods. Pre-print, 2015d. arXiv:1506.06975v1.

⋆ J. Dahlin, F. Lindsten, and T. B. Schön. Particle Metropolis-Hastings using gradient and Hessian information. Statistics and Computing, 25(1):81–92, 2015b.

P. E. Valenzuela, J. Dahlin, C. R. Rojas, and T. B. Schön. A graph/particle-based method for experiment design in nonlinear systems. In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, August 2014.

J. Dahlin and F. Lindsten. Particle filter-based Gaussian process optimisation for parameter inference. In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, August 2014.

J. Dahlin, F. Lindsten, and T. B. Schön. Second-order particle MCMC for Bayesian parameter inference. In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, August 2014a.

J. Dahlin. Sequential Monte Carlo for inference in nonlinear state space models. Licentiate's thesis no. 1652, Linköping University, May 2014.

J. Dahlin, F. Lindsten, and T. B. Schön. Particle Metropolis Hastings using Langevin dynamics. In Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, Canada, May 2013a.

⋆ J. Dahlin, F. Lindsten, T. B. Schön, and A. Wills. Hierarchical Bayesian ARX models for robust inference. In Proceedings of the 16th IFAC Symposium on System Identification (SYSID), Brussels, Belgium, July 2012b.

Other publications not included in the thesis are:

J. Kronander, J. Dahlin, D. Jönsson, M. Kok, T. B. Schön, and J. Unger. Real-time video based lighting using GPU raytracing. In Proceedings of the 2014 European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, September 2014a.

J. Kronander, T. B. Schön, and J. Dahlin. Backward sequential Monte Carlo for marginal smoothing. In Proceedings of the 2014 IEEE Statistical Signal Processing Workshop (SSP), Gold Coast, Australia, July 2014b.

D. Hultqvist, J. Roll, F. Svensson, J. Dahlin, and T. B. Schön. Detection and positioning of overtaking vehicles using 1D optical flow. In Proceedings of the IEEE Intelligent Vehicles (IV) Symposium, Dearborn, USA, June 2014.

J. Dahlin and P. Svenson. Ensemble approaches for improving community detection methods. Pre-print, 2013. arXiv:1309.0242v1.

J. Dahlin, F. Lindsten, and T. B. Schön. Inference in Gaussian models with missing data using equalisation maximisation. Pre-print, 2013b. arXiv:1308.4601v1.

J. Dahlin, F. Johansson, L. Kaati, C. Mårtensson, and P. Svenson. A method for community detection in uncertain networks. In Proceedings of International Symposium on Foundation of Open Source Intelligence and Security Informatics 2012, Istanbul, Turkey, August 2012a.

J. Dahlin and P. Svenson. A method for community detection in uncertain networks. In Proceedings of 2011 European Intelligence and Security Informatics Conference, Athens, Greece, August 2011.

2 Bayesian modelling and inference

Bayesian modelling and inference is a popular and growing tool in statistics, machine learning and data mining. It is one of the two dominating perspectives used in probabilistic modelling and has certain interesting features for handling over-fitting, prior information and uncertainty, which can be useful in applications. Bayesian statistics has its origins in the work of the English reverend Thomas Bayes [1701-1761]. The first known use of Bayesian inference is discussed in Bayes (1764) for the Bernoulli model with what is now known as a uniform prior. However, the general formulation and many important theorems are due to the French mathematician Pierre-Simon Laplace [1749-1827], who proposed the well-known theorem named after Bayes in Laplace (1886). As a consequence, Bayesian inference is also known as Laplacian statistics or inverse probability.

However, the popularity of Bayesian inference faded during the early 20th century when the English statistician Ronald Fisher [1890-1962] proposed the frequentist paradigm for statistical inference. This view is based on the optimisation of the likelihood function, which was first proposed in Fisher (1922). The resulting method is known as maximum likelihood and can be carried out in closed form for many interesting applications. This is in contrast with Bayesian inference, which often is analytically intractable and requires approximations to compute estimates. This is perhaps the reason that Bayesian statistics took the back seat in statistics for some time.

This changed with the advent of the electronic computer in the 1940s and onwards. Computational methods known as statistical simulation started to be applied to approximate the estimates obtained by Bayesian inference. Monte Carlo methods emerged as a good alternative for solving integration problems in high dimensions. These methods eventually found their use in Bayesian inference during the 1980s, as most problems in this paradigm are posed as problems of computing integrals.


Figure 2.1. The basic inference process.

This chapter provides an overview of Bayesian modelling and inference with the aim to introduce the statistical inference process following the steps presented in Figure 2.1. The first step in any inference procedure is to collect data from the system of interest, e.g., a machine, the weather, the stock market or a human body. To describe the data, we require a statistical model, which usually depends on a number of unknown parameters. In the last step, we make use of the data to infer the parameters of interest. After the model has been inferred, we can make use of it to make forecasts or decisions that take the uncertainty into account. Furthermore, we introduce two examples of applications in this chapter, which are analysed throughout the introductory part of this thesis. We also give an overview of non-parametric methods, where the number of parameters is infinite or grows with the number of observations. This property makes these models more flexible compared with the aforementioned parametric models. We conclude this chapter by providing the reader with an outlook and references for further study.

2.1 Three examples of statistical data

The first step in constructing a probabilistic model of a phenomenon is to collect data from the system, individual or some other source. In this section, we discuss three different types of data: (i) cross-sectional, (ii) time series and (iii) panel/longitudinal. We also present two data sets that are later used to exemplify modelling and inference using different approaches.

Cross-sectional data

In cross-sectional data, we obtain a collection of observations $y = \{y_i\}_{i=1}^n$ from $n$ different sources/individuals. Furthermore, we often assume that these observations are independent of each other and that they are recorded at the same time or that the observations are independent of time. Three examples of cross-sectional data are: (i) the heights of students in a classroom, (ii) the monthly wages at a company and (iii) the chemical factors that influence the quality of wine. These observations are typically recorded together with additional information which is assumed to be able to explain the outcome. In the wage example, we would like to take into account the age, the educational background and the number of years that each person has worked for the company. We usually refer to the observation $y_i$ as the dependent variable and the additional information as the independent or explaining variables. The independent variables are denoted by $x_i = \{x_{ij}\}_{j=1}^p$, where $p$ is the number of different attributes recorded for each observation $y_i$. A typical question that the statistician would like to answer is which independent variables influence the observation and by how much. We return to modelling this type of data using a regression model in Section 2.2.1.

Time series data

In time series data, we obtain multiple sequential observations $\{y_t\}_{t=1}^T$ from a single source or individual. We typically assume that the observations are dependent and that the correlation increases when the observations are closer (in time) to each other. The main objective is therefore to capture this correlation using a statistical model. Three typical examples of time series data are: (i) the ice varve thickness from Section 1.1.1, (ii) the price of coffee beans in Papers C and F and (iii) the blood sugar level of a patient. Another type of time series data is presented in Example 2.1.

We often assume that the current observation can be explained by previous observations. However, we can also add independent variables as in the cross-sectional case. For the blood sugar example, it can be useful to also take into account the amount of physical exercise, what the patient has eaten and if he/she is a diabetic when trying to forecast the future blood sugar level. We refer to these variables as the exogenous variables and denote them by $u_t = \{u_{tj}\}_{j=1}^p$. A typical problem that the statistician would like to solve is to determine the value of $y_{t+m}$ given $\{y_t, u_t\}_{t=1}^T$ for $m > 0$. That is, to make a so-called m-step prediction of the observation given all the previous observations and independent variables available at the present. We return to modelling this type of data using two different time series models in Sections 2.2.2 and 2.2.3.

Example 2.1: How does unemployment affect inflation? Consider the scenario that the Swedish parliament has launched a big campaign against unemployment. The unemployment rate is expected to decrease rapidly during the coming 24 months. At the same time the Swedish Riksbank (central bank) is worried that this might increase the inflation rate above its two percent target. They would like us to analyse this scenario by providing them with a forecast to determine if any action is required.

The reasoning of the Riksbank is based on the Phillips curve hypothesis proposed by Phillips (1958). It states that there is an inverse relationship between unemployment and inflation rates in the short run. That is, a rapid decrease in unemployment tends to correlate with an increased rate of inflation. The intuition for this is that it is difficult for companies to attract workers if the unemployment rate is too low. As a consequence, the employees gain bargaining power, which results in increased wages and therefore increased inflation as well. The opposite occurs when the unemployment rate is high, as it is easy for companies to recruit new workers. Therefore, no wage increases are required to attract new workers.

Furthermore, the Phillips curve assumes that there exists an equilibrium point in the unemployment known as the non-accelerating inflation rate of unemployment (nairu) or the natural rate of unemployment. The reason for a non-zero nairu is the matching problem on the labour market. That is, not all individuals can take any available position due to e.g., geographical or educational constraints. The nairu determines if the inflation rate increases or decreases given the current unemployment rate. The inflation increases if the unemployment rate is smaller than the nairu and vice versa. Estimating the nairu is therefore important for making predictions of the future inflation.

Figure 2.2. The inflation rate (upper) and unemployment rate (lower) for Sweden during the period January, 1987 to December, 2015. The purple areas indicate the financial crises of 1991-1992 and 2008-2010. The data is obtained from Statistiska centralbyrån (SCB).

In Figure 2.2, we present the unemployment and inflation rates in Sweden between January, 1987 and December, 2015. The data is obtained from Statistiska centralbyrån (scb), which is the governmental agency responsible for collecting statistics in Sweden (see http://www.scb.se/en_/ for more information). We note that the inflation rate changed rapidly during the financial crises in 1991-1992 and 2008-2010 and at the same time the unemployment rate increased. This suggests that a negative correlation between these two variables could exist. We would like to determine the support for this claim using a probabilistic model, which is also required to make the forecast asked for by the Riksbank. We return to this data set in Example 2.3 on page 27.

Panel and longitudinal data

Panel data (also known as longitudinal data) can be seen as a combination of time series and cross-sectional data. In this setting, we typically obtain a data set $y = \{\{y_{it}\}_{i=1}^n\}_{t=1}^T$ for individual $i$ at time $t$ with $T \ll n$. We assume that the observations are independent between individuals but correlated between observations of the same individual. For each observation $y_{it}$ of individual $i$ at time $t$, we usually record $p$ independent variables denoted by $\{x_{ijt}\}$ for $j = 1, \ldots, p$.

One example of panel data is socio-economic studies such as the Sozio-oekonomisches Panel (soep; see http://www.diw.de/en/soep for more information), where a selection of German households have been interviewed annually since 1984. Topics included in the annual interviews are economical factors such as employment and earnings as well as social factors such as family composition, health and general life satisfaction. Analysis of such data is important to e.g., investigate how household incomes correlate with university attendance. This can be useful to guide interventions and policy decisions considered by the government.

Two other applications of panel data were already discussed in Chapter 1: (i) genome-wide association studies and (ii) recommendation systems. In application (i), scientists are making use of rapid scanners to search for markers connected with a particular disease in a dna sample, see Bush and Moore (2012), Zhang et al. (2010) and Yang et al. (2014). This information is useful for diagnosis, treatment and prevention of e.g., cancer, diabetes and heart disease. In application (ii), the dynamics of users' preferences can be seen as panel data, see Condliff et al. (1999) and Stern et al. (2009). We return to discussing how to model panel data in Section 2.2.4.

Example 2.2: Voting behaviour in the US Supreme court
In February 2016, the conservative justice Antonin Scalia of the us Supreme Court died and a replacement is therefore required. The us President Barack Obama is faced with a dilemma: to either appoint a new justice with the same ideological leaning or one who is more liberal. The us Supreme Court is made up of nine justices, which are nominated by the President and approved by the Senate. The nomination is an important political decision as each justice often serves for the remainder of his/her life or until he/she resigns. The political view of each justice can therefore influence rulings during many years. How will appointing a more liberal judge affect the rulings of the court?

We consider the data set provided by Spaeth et al. (2016) of T = 171 non-unanimous rulings from the terms between 2010 and 2015. At the time, the Supreme Court justices were: Kagan, Sotomayor, Alito, Roberts, Breyer, Ginsburg, Thomas, Kennedy and Scalia. The vote of each justice is categorised as either liberal (1) or conservative (0) depending on the topic at hand. The data set is presented in Figure 2.3 for each justice and case, respectively.

Figure 2.3. The rulings in T = 171 non-unanimous cases in the US Supreme Court during the terms between 2010 and 2015. Liberal votes are indicated by coloured fields and conservative by white for each justice and case.

We would like to model the relative level of conservatism/liberalism between the justices. A quick look at the data seems to indicate that (i) Kagan, Sotomayor, Breyer and Ginsburg seem to be more liberal and (ii) Alito, Thomas and Scalia seem to be more conservative. However, we would like to verify this using a probabilistic model. This model can later be used to simulate votes by each justice to estimate the ideological leaning of the court. We return to this data set in Example 2.4 on page 28.

2.2 Parametric models

The next step in probabilistic modelling after the data has been collected is to choose a suitable model structure. In this section, we present a few different structures aimed at modelling the three different kinds of statistical data discussed in the previous section. All of the models presented are members of the family of parametric models. Hence, we assume that they are specified by a finite number of parameters denoted by $\theta \in \Theta \subset \mathbb{R}^p$, where $\Theta$ denotes the parameter space, which we typically assume to be (a subset of) the $p$-dimensional real space. The choice of model structure is often difficult and greatly influences the predictions and decisions made from the combination of the data and the model. Hence, model choice is an important problem in statistics but it is not discussed at any length in this thesis. Two approaches are likelihood ratio tests and Bayes factors, see Casella and Berger (2001) and Robert (2007) for more information.

2.2.1 Linear regression and generalised linear models

Generalised linear models (glms) are the work-horse of statistical modelling of cross-sectional data. This is a type of regression model, where we would like to infer the observation (dependent) variable given the independent variables. The latter are typically referred to as the regressors in this model. The basic glm is given by the linear regression model, which can be expressed by
$$y_i = \beta_0 + \sum_{j=1}^p \beta_j x_{ij} + \sigma e_i, \qquad (2.1)$$
where it is typically assumed that the errors are independent and distributed according to the standard Gaussian distribution, i.e., $e_i \sim \mathcal{N}(0, 1)$. Note that this implies that the noise variance is constant for all the observations, i.e., the errors are homoscedastic. Furthermore, we assume that the regressors are linearly independent of each other and uncorrelated with the error, i.e., $\mathbb{E}[x_{ij} e_i] = 0$ for every $i$ and $j$.

In this model, the parameters are given by $\theta = \{\beta_{0:p}, \sigma\}$, where $\beta_0 \in \mathbb{R}$ determines the so-called intercept (or bias) of the model. The standard deviation of the noise is determined by $\sigma > 0$. The remaining parameters $\beta_{1:p} \in \mathbb{R}^p$ determine the linear relationship between the regressors and the observation. A standard method for estimating $\theta$ from data is to make use of the least squares (ls) approach. The main objective in ls is to minimise the squared prediction error, i.e.,

$$\widehat{\beta}^{\text{ls}} \triangleq \operatorname*{argmin}_{\beta \in \Theta} \, \| y - X\beta \|_2^2, \qquad (2.2)$$
where $\|\cdot\|_2$ denotes the $L_2$-norm. Here, we introduce
$$y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \qquad X = \begin{bmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1p} \\ 1 & x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix}, \qquad \beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix}.$$
The solution to (2.2) can be computed using a closed-form expression (Casella and Berger, 2001) given by
$$\widehat{\beta}^{\text{ls}} = \big( X^\top X \big)^{-1} X^\top y, \qquad (2.3)$$
which we refer to as the normal equations. The noise variance $\sigma^2$ can be estimated directly using the standard sample estimator by
$$s^2 = \widehat{\sigma}^2 = \frac{1}{n - p - 1} \sum_{i=1}^n \big( y_i - X_i \widehat{\beta}^{\text{ls}} \big)^2, \qquad (2.4)$$
where $X_i$ denotes row $i$ of the matrix $X$. It is possible to show that the ls estimator is the best linear un-biased estimator (blue) when the assumptions of the error term and the independence between the error and the regressors are fulfilled. This result is known as the Gauss-Markov theorem (Casella and Berger, 2001; Lehmann and Casella, 1998) and it holds for other types of linear models as well.
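To make the estimators concrete, the following Python sketch computes the normal equations (2.3) and the sample variance estimator (2.4) on simulated data. The data-generating parameters are illustrative assumptions for this example only.

```python
import numpy as np

np.random.seed(0)
n, p = 100, 3
X = np.column_stack([np.ones(n), np.random.randn(n, p)])   # intercept + p regressors
beta_true = np.array([1.0, 0.5, -2.0, 0.0])                # assumed parameters
sigma = 0.8
y = X @ beta_true + sigma * np.random.randn(n)

# The normal equations (2.3), solved as a linear system.
beta_ls = np.linalg.solve(X.T @ X, X.T @ y)

# The sample estimator of the noise variance (2.4).
s2 = np.sum((y - X @ beta_ls) ** 2) / (n - p - 1)
print(beta_ls, s2)
```

In practice, a QR factorisation (e.g., numpy.linalg.lstsq) is usually preferred over forming $X^\top X$ explicitly, as it is numerically more stable.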

An alternative to the least squares formulation (2.2) is the elastic net (Zou and Hastie, 2005; Hastie et al., 2009). The resulting estimator is given by
$$\widehat{\beta}^{\text{elastic}} \triangleq \operatorname*{argmin}_{\beta \in \Theta} \, \| y - X\beta \|_2^2 + \lambda_1 \| \beta \|_1 + \lambda_2 \| \beta \|_2^2, \qquad (2.5)$$
where $\|\cdot\|_1$ denotes the $L_1$-norm and $\lambda_1, \lambda_2 > 0$ denote tuning parameters. This type of loss function is sometimes referred to as regularised least squares (rls) and is particularly useful for the case $p > n$, i.e., when we have more parameters than observations. We can recover two important special cases from (2.5): (i) $\ell_1$-rls when $\lambda_2 = 0$ and (ii) $\ell_2$-rls when $\lambda_1 = 0$.

The advantage of this alternative formulation is that it penalises the inclusion of regressors that do not contribute to explaining the observations. That is, these regression coefficients are shrunk towards zero and therefore the regressors are practically removed from the model. This is a form of model selection, which as previously mentioned is a challenging issue. Another popular name for (i) is the lasso (Tibshirani, 1996) and it has the property of shrinking regression coefficients to be exactly zero. The drawback with the lasso is bad performance when the regressors exhibit multicollinearity, i.e., linear dependence between some regressors. In this case, a better alternative is (ii), also known as ridge regression, introduced in statistics by Hoerl and Kennard (1970). The drawback with this regularisation is that it only shrinks the coefficients towards zero and does not remove them completely. We return to the use of regularisation for automatic model order selection in Chapter 4.
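As an illustration of case (ii), the sketch below computes the closed-form ridge solution $(X^\top X + \lambda_2 I_p)^{-1} X^\top y$ in a $p > n$ setting. The data and the value of $\lambda_2$ are assumptions made for the example, and the intercept is left out for brevity.

```python
import numpy as np

np.random.seed(1)
n, p = 20, 50                           # more parameters than observations
X = np.random.randn(n, p)
y = X[:, 0] - 2.0 * X[:, 1] + 0.1 * np.random.randn(n)

lambda2 = 1.0
beta_ridge = np.linalg.solve(X.T @ X + lambda2 * np.eye(p), X.T @ y)

# The penalty shrinks all coefficients towards zero; note that none of them
# become exactly zero, in contrast with the lasso.
print(np.round(beta_ridge[:5], 3))
```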

In the linear regression model, we assume that the response $y_i$ is a continuous random variable. However, many other types of responses can be found in applications. Some examples are: (a) binary response (success/failure), (b) binomial response (no. successes in $M$ attempts) or (c) count response (no. occurrences during some time period). The glm is a useful approach to model these kinds of observations. This is done by transforming the linear predictor $\eta_i = X_i \beta$ with a so-called link function $h$ such that $\mathbb{E}[y_i] = h^{-1}(\eta_i)$. For example in (a), we can use the logistic function or the Gaussian cumulative distribution function (cdf) to map the linear predictor onto the unit interval $(0, 1)$.
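A small sketch of the two inverse links mentioned for case (a); the design matrix and coefficients are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

np.random.seed(2)
X = np.column_stack([np.ones(25), np.random.randn(25)])
beta = np.array([-0.5, 1.5])            # assumed coefficients
eta = X @ beta                          # linear predictor

p_logit = 1.0 / (1.0 + np.exp(-eta))    # logistic inverse link
p_probit = norm.cdf(eta)                # Gaussian cdf (probit) inverse link
y = np.random.binomial(1, p_logit)      # simulated binary responses
print(y[:10], np.round(p_logit[:3], 2), np.round(p_probit[:3], 2))
```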

2.2.2 Autoregressive models

There are a large number of different models for time series data. The simplest is probably the autoregressive process of order $p$, denoted ar(p), see e.g., Brockwell and Davis (2002). This model can be expressed using densities by
$$y_t \mid y_{t-p:t-1} \sim \mathcal{N}\left( y_t \, ; \; \mu + \sum_{k=1}^p \phi_k \big( y_{t-k} - \mu \big), \; \sigma^2 \right), \qquad (2.6)$$
where $y_t$ denotes the observation at time $t$. The model is specified by the parameters $\theta = \{\mu, \phi_{1:p}, \sigma\}$ and the noise is assumed to be independent and Gaussian. The latter can be relaxed to account for outliers by assuming Student's t distributed noise, which is considered in Paper F. Note that we make use of densities to define the ar process, which is slightly different from the equation form of the ls model. However, it is possible to rewrite the ar process on the difference form corresponding to the ls model (2.1) and vice versa.

In the ar model, the model order $p \in \mathbb{N}$ determines the number of past observations included in the model. Therefore, $p$ together with $\phi$ determines the persistence and correlation structure of the process. The mean of the observations is determined by $\mu \in \mathbb{R}$ and the standard deviation of the noise is determined by $\sigma > 0$. We require that all the roots of the characteristic polynomial
$$q^p - \sum_{k=1}^p \phi_k q^{p-k} = 0$$
lie within the unit circle to obtain a stable ar(p) process, i.e., one that does not diverge to infinity when $T$ increases. Here, $q$ denotes the back shift (lag) operator.
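As an illustration, the sketch below simulates an ar(2) process according to (2.6). The parameter values are assumptions chosen such that the roots of the characteristic polynomial lie inside the unit circle.

```python
import numpy as np

np.random.seed(3)
T, mu, sigma = 500, 1.0, 0.5
phi = np.array([0.5, 0.2])              # assumed stable coefficients

y = np.full(T, mu)
for t in range(2, T):
    # mu + phi_1 (y_{t-1} - mu) + phi_2 (y_{t-2} - mu) + noise
    y[t] = mu + phi @ (y[t-2:t][::-1] - mu) + sigma * np.random.randn()
```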

Given the order $p$, we can reformulate the problem of estimating $\theta$ in (2.6) when $\mu = 0$ as a ls problem using the observations $y_{1:T}$. The estimates are obtained by rewriting the model to obtain

$$y = \begin{bmatrix} y_{p+1} \\ y_{p+2} \\ \vdots \\ y_T \end{bmatrix}, \qquad X = \begin{bmatrix} y_p & y_{p-1} & \cdots & y_1 \\ y_{p+1} & y_p & \cdots & y_2 \\ \vdots & \vdots & \ddots & \vdots \\ y_{T-1} & y_{T-2} & \cdots & y_{T-p} \end{bmatrix}, \qquad \phi = \begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_p \end{bmatrix}.$$
In this case, the ls estimate corresponds to the maximum likelihood estimate. However, it is also possible to estimate the parameters in a Bayesian setting. Furthermore, we can apply regularisation for selecting the model order using a loss function similar to (2.5). We investigate this in Paper F for ar with exogenous inputs (arx) models, where the known input $\{u_t\}_{t=1}^T$ (possibly lagged) is included in (2.6).
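A minimal sketch of this ls formulation: it stacks lagged observations into $X$ as in the display above and solves the normal equations. The simulated data and true parameters are illustrative assumptions.

```python
import numpy as np

np.random.seed(4)
T, p = 2000, 2
phi_true = np.array([0.5, 0.2])
y = np.zeros(T)
for t in range(p, T):
    y[t] = phi_true @ y[t-p:t][::-1] + 0.5 * np.random.randn()

# Row t of X contains y_{t-1}, ..., y_{t-p}.
X = np.column_stack([y[p - k:T - k] for k in range(1, p + 1)])
phi_ls = np.linalg.solve(X.T @ X, X.T @ y[p:])
print(np.round(phi_ls, 3))   # should be close to phi_true
```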

2.2.3 State space models

In the ar model, we obtain direct observations of the quantity of interest $y_{1:T}$. In some cases, we cannot directly observe the cause of the observation as it is a function or random variable depending on some latent variables. A standard model for time series data using latent variables is the ssm, which is also known as the hidden Markov model (hmm). This type of model is used e.g., in statistics (Brockwell and Davis, 2002; Langrock, 2011), control (Ljung, 1999), econometrics (Durbin and Koopman, 2012) and finance (Tsay, 2005). An ssm with latent states $x_{0:T}$ and observations $y_{1:T}$ can be expressed as

$$x_0 \sim \mu_\theta(x_0), \qquad x_t \mid x_{t-1} \sim f_\theta(x_t \mid x_{t-1}), \qquad y_t \mid x_t \sim g_\theta(y_t \mid x_t), \qquad (2.7)$$
where $\theta$ denotes the parameters of the model. Here, we assume that the model can be described by probability density functions (pdfs) denoted $\mu_\theta$, $f_\theta$ and $g_\theta$.

We say that the ssm is fully dominated (by the Lebesgue measure) when we can write the model on the form in (2.7). That is, when we can find a density for each of the Markov kernels in the model. In practice, this can often be done when the states and observations are real-valued and the state or observation equations are not deterministic. However, the methods presented in this thesis can be applied even when the densities are degenerate (states are deterministic) and when the states/observations are integers. The reason for adopting the density formulation is to keep the notation simple and avoid the measure-theoretic formulation of stochastic processes.

The parameters of interest in the ssm are the latent states $x_{0:T}$ and the parameters of the densities $\theta$. We refer to the problem of estimating the former as the state inference problem and the latter as the parameter inference problem. For this type of model, we cannot form a simple optimisation problem as for the ls or ar models, as the states are not directly observed. Instead, we require more advanced maximum likelihood or Bayesian inference methods to solve these two inference problems jointly. We return to this in Section 2.3.

Example 2.3: How does unemployment affect inflation? (cont. from p. 19)
We consider the Phillips curve with non-linear dynamics and rational expectations introduced by Zhou (2013) to model the Swedish data and to make forecasts. Let $y_t$ and $u_t$ denote the inflation rate and unemployment rate (in percent) at time $t$. Furthermore, let $x_t$ denote the nairu (the equilibrium point in the unemployment rate), which changes with time depending on previous rates of inflation and unemployment. We can write a slightly modified version of this model as an ssm given by
$$x_0 \sim \mathcal{N}(x_0; 2, 4), \qquad (2.8a)$$
$$x_t \mid x_{t-1} \sim \mathcal{N}\big( x_t \, ; \; \phi x_{t-1} + \mu(x_{t-1}), \; \sigma_v^2(x_{t-1}, u_{t-1}) \big), \qquad (2.8b)$$
$$y_t \mid x_t \sim \mathcal{N}\big( y_t \, ; \; y_{t-1} + \beta ( u_t - x_t ), \; \sigma_e^2 \big). \qquad (2.8c)$$
We introduce the mean function of the state process and its standard deviation given by
$$\mu(x_{t-1}) \triangleq \alpha \big[ 1 + \exp( - x_{t-1} ) \big]^{-1}, \qquad \sigma_v(x_{t-1}, u_{t-1}) \triangleq \big[ 1 + \exp\big( - | u_{t-1} - x_{t-1} | \big) \big]^{-1}.$$
The parameters of this Phillips curve model are $\theta = \{\alpha, \beta, \phi, \sigma_e\}$. Here, $\alpha \in \mathbb{R}$ and $\phi \in (-1, 1)$ determine the mean and persistence of the state, respectively. The inflation rate is determined by $\beta \in \mathbb{R}$ and $\sigma_e > 0$. The parameter $\beta$ is of great interest to us as its sign determines the correlation between the inflation and the unemployment gap (the difference between the unemployment rate and the state). The Phillips curve hypothesis suggests that this parameter is negative.

We continue by analysing the mean and the variance of the state process to obtain some insight into the model and its dynamics. Note that the term $\mu(x_{t-1})$ makes the state process mean-reverting. That is, the average value of the process is given by $\mu(x_{t-1})$ and therefore the process occasionally reverts to this value. Furthermore, $\mu(x_{t-1})$ can vary between
$$\mu \to \alpha, \;\; \text{when } x_{t-1} \gg 0, \qquad \mu \to \frac{\alpha}{2}, \;\; \text{when } x_{t-1} \approx 0.$$

That is, if xt grows large, then the long-term mean of the process also grows, making it difficult for xt to decrease again at a later stage. Hence, the state is sticky and if the unemployment rate is larger than the state of the process (the nairu) for a long time, then the mean of the latter grows. An explanation for this effect is that companies tend to streamline their organisations at the same time as employees are laid off, which can increase the matching problem. For the noise standard deviation, we have

$$\sigma_v \to 0.5, \;\; \text{when } | u_{t-1} - x_{t-1} | \to 0, \qquad \sigma_v \to 1.0, \;\; \text{when } | u_{t-1} - x_{t-1} | \to \infty,$$
so the process noise decreases as the unemployment rate $u_{t-1}$ approaches the nairu $x_{t-1}$. The reason for this is that the nairu (according to some economists) can be seen as the equilibrium state of the economy. Therefore, the inflation rate does not change if the unemployment rate is close to the latent state. However, it can increase when the unemployment rate is lower than the nairu, i.e., if $u_t - x_t < 0$. This imposes the condition that $\beta < 0$ for the negative correlation between inflation and unemployment to exist. We return to this model in Example 3.3 on page 54.
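To make the dynamics concrete, the following sketch simulates the Phillips curve model (2.8). The parameter values and the unemployment trajectory are illustrative assumptions, not estimates from the Swedish data.

```python
import numpy as np

np.random.seed(5)
T = 200
alpha, beta, phi, sigma_e = 0.5, -0.2, 0.8, 0.3     # assumed parameters
u = 6.0 + np.cumsum(0.1 * np.random.randn(T))       # assumed unemployment rate

x = np.zeros(T)
y = np.zeros(T)
x[0] = 2.0 + 2.0 * np.random.randn()                # x_0 ~ N(2, 4)
for t in range(1, T):
    mu_x = alpha / (1.0 + np.exp(-x[t-1]))                   # mean function
    sigma_v = 1.0 / (1.0 + np.exp(-abs(u[t-1] - x[t-1])))    # noise scale
    x[t] = phi * x[t-1] + mu_x + sigma_v * np.random.randn()         # (2.8b)
    y[t] = y[t-1] + beta * (u[t] - x[t]) + sigma_e * np.random.randn()  # (2.8c)
```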

2.2.4 Mixed effect models

In many panel data applications, we would like to separate the mean population behaviour and the individual deviations from this mean. This can be done using a mixed effects model, where the population behaviour is captured using so-called fixed effects and individual deviations are captured using random effects. A mixed effects model can be expressed by

$$y_{it} = \alpha_t x_{it} + \beta_i z_{it} + e_{it}, \qquad (2.9)$$
where $e_{it}$ denotes some error term, e.g., an iid Gaussian random variable or an ar(1) process. Here, $\alpha_t \in \mathbb{R}^d$ denotes the time-dependent fixed effects and $\beta_i \in \mathbb{R}^p$ denotes the individual-dependent random effects. The design matrices $x_{it}$ and $z_{it}$ for the two effects contain the intercept and the relevant regressors. For a general introduction to mixed effects models, see Fitzmaurice et al. (2008) and Greene (2008). A common assumption for the individual random effects is

$$\beta_i \sim \mathcal{N}\big( \beta_i \, ; \; \beta_0, \Sigma_{\beta_0} \big),$$
for some mean vector $\beta_0 \in \mathbb{R}^p$ and covariance matrix $\Sigma_{\beta_0} \in \mathbb{R}^{p \times p}$. However, this can be restrictive in some applications when the distribution of the random effects is multi-modal. An alternative approach that we consider in Paper G is therefore to replace this assumption with a mixture of Gaussians. This generalisation allows for so-called heterogeneity in the individual random effects.

The mixed effects model differs from the linear regression model as some of its parameters vary between individuals and some can vary over time. Inference in mixed effects models is therefore more complicated than for the linear regression model (2.1). Finally, it is also possible to make use of a model similar to (2.9) when the observations are binary or integer-valued. The resulting class of models is known as the generalised linear mixed model (glmm), which is based on the same type of link functions as the glm. glmms are common in many different applications ranging from marketing and econometrics to health and medicine, see e.g., Baltagi (2008) and Fitzmaurice et al. (2008).

Example 2.4: Voting behaviour in the US Supreme court (cont. from p. 21)

Let the observation $y_{it} = 1$ denote a liberal vote and $y_{it} = 0$ denote a conservative vote of judge $i = 1, \ldots, n$ in case $t = 1, \ldots, T$. We model the votes using an item response model (irm; Fox, 2010), which is similar to a glmm with a probit link function, given by

$$u_{it} = \begin{bmatrix} \alpha_{t,1} & \alpha_{t,2} \end{bmatrix} \begin{bmatrix} -1 \\ x_i \end{bmatrix} + e_{it}, \qquad y_{it} = \begin{cases} 1, & u_{it} > 0, \\ 0, & u_{it} \le 0, \end{cases}$$
where $\alpha_{t,1} \in \mathbb{R}$ and $\alpha_{t,2} \in \mathbb{R}$ denote the difficulty and discrimination parameters of case $t$, respectively. Here, $e_{it}$ denotes an independent standard Gaussian random variable. The quantity $x_i \in \mathbb{R}$ captures the relative liberal/conservative score for each justice.

Note that the score $x_i$ and the so-called utility $u_{it}$ are unknown latent variables in this model. The task is therefore to reconstruct these latent variables from the observations. However, this is difficult as we do not know the parameters $\alpha_t$ of the model. We return to this model in Example 3.8 on page 63.
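As an illustration, the sketch below simulates votes from the item response model above. The case parameters and justice scores are randomly drawn assumptions, whereas in the actual analysis they are inferred from the data.

```python
import numpy as np

np.random.seed(6)
n, T = 9, 171
# Assumed difficulty (column 0) and discrimination (column 1) per case.
alpha = np.column_stack([np.random.randn(T), np.abs(np.random.randn(T))])
x = np.random.randn(n)                  # assumed liberal/conservative scores

# u_it = -alpha_{t,1} + alpha_{t,2} x_i + e_it with standard Gaussian e_it.
u = -alpha[:, 0][None, :] + alpha[:, 1][None, :] * x[:, None] \
    + np.random.randn(n, T)
y = (u > 0).astype(int)                 # 1 = liberal, 0 = conservative
```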

2.3 Computing the posterior and making decisions

We have discussed the first two steps of the inference process in some detail. In this section, we introduce Bayesian inference for determining the unknown parameter $\theta$ from the data. The basic premise in Bayesian inference is that we treat the unknown parameter as a random variable. This parameter is assumed to be distributed according to some density denoted by $p(\theta)$. This object is known as the prior distribution of the parameter and encodes our subjective beliefs before looking at the data. From this prior distribution, we would like to compute the posterior distribution $p(\theta \mid y)$, where $y$ denotes some data. This procedure is known as the prior-posterior update and is carried out using Bayes' theorem given by
$$p(\theta \mid y) = \frac{p(y \mid \theta) \, p(\theta)}{p(y)} \propto p_\theta(y) \, p(\theta). \qquad (2.10)$$
Here, $p(y \mid \theta) \triangleq p_\theta(y)$ denotes the likelihood, which is a function of the data and summarises all the information about $\theta$ available in the observations. This is known as the likelihood principle in statistics. The posterior is therefore a combination of our prior beliefs about $\theta$ and the information about the parameter that is available in the observations. After computing the posterior, we can extract point estimates of $\theta$ and their uncertainties.

In maximum likelihood inference, it is assumed that the parameter $\theta$ is fixed and that the uncertainty comes from the data. The point estimate of $\theta$ is obtained by maximising the likelihood function $p_\theta(y)$. Moreover, it is possible to prove that this procedure gives the desired result in the limit of infinitely many observations (Lehmann and Casella, 1998), i.e., that the estimate equals the true parameter. In Bayesian statistics, we do not rely on asymptotic results as for the maximum likelihood estimator. However, it is known that the maximum likelihood estimator is the best un-biased estimator in many cases and can be difficult to beat in terms of statistical accuracy. Note that this does not necessarily hold true when the number of observations is small.

Prior distributions

A major decision for the user of Bayesian statistics is the choice of $p(\theta)$. As previously mentioned, the prior distribution is often determined by expert knowledge about the current setting. A common type of prior is the conjugate prior, i.e., when the prior and posterior are given by the same type of distribution. This is a convenient choice for carrying out inference as the prior-posterior update amounts to recomputing some sufficient statistics. Conjugate priors can be found for many members of the exponential family of distributions when the data is iid, see Robert (2007) and Bishop (2006).

In many other cases, we cannot make use of conjugacy for selecting the prior. Instead, parametric distributions such as the Gaussian, Gamma and similar are used as prior distributions. This often means that approximations are required to compute the posterior distribution. Two other popular alternatives for prior distributions are improper priors and non-informative priors. In the former, we cannot normalise the prior and express it as a density. However, it is possible in some cases to obtain a valid posterior distribution, which integrates to one. A popular example is the uniform distribution over the positive real numbers, which forces the posterior to only have probability mass on this interval. This is useful to encode stability properties in ssms and other dynamical models. Non-informative priors try to introduce as small an amount of prior information as possible. However, the construction of such priors is difficult and just applying flat priors in order to encode ignorance can have unforeseen effects on the posterior. For more information regarding non-informative priors, see Robert (2007) and Gelman (1996).
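As a small illustration of conjugacy, the following sketch carries out the Beta-Bernoulli prior-posterior update, where the uniform prior corresponds to Beta(1, 1) and the update indeed amounts to recomputing the sufficient statistics (the numbers of successes and failures). The observations are assumed for the example.

```python
from scipy.stats import beta

y = [1, 0, 1, 1, 0, 1, 1, 1]                 # assumed Bernoulli observations
a0, b0 = 1.0, 1.0                            # uniform prior, Beta(1, 1)
a_post = a0 + sum(y)                         # prior + no. successes
b_post = b0 + len(y) - sum(y)                # prior + no. failures

posterior = beta(a_post, b_post)
print(posterior.mean(), posterior.interval(0.95))
```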

The choice of prior can greatly influence the shape and properties of the posterior distribution, especially when the amount of information in the data is small. It is therefore advisable to make use of a couple of different prior distributions and compare the resulting posteriors. This approach is advocated by Spiegelhalter (2004). Another approach is posterior predictive checks, where data is simulated from the posterior and compared to the actual observations. This can be done in-sample (on the data used for estimation) or out-of-sample (on fresh data not used in the estimation). This is similar to cross-validation, which is useful for model order selection and model validation in e.g., machine learning and system identification. For more information about posterior predictive checks, see Gelman et al. (1996) and Gelman et al. (2013).

Finally, note that one of the strengths of Bayesian inference comes from the prior. For example, we can make use of prior knowledge to narrow the possible range for the parameter. This could be helpful in settings when the amount of data is limited or when we have identifiability issues. Furthermore, priors can be used to promote smoothness and sparsity in the parameter (vector), see Remark 2.5.

Remark 2.5 (Prior distributions for promoting smoothness and sparsity). Prior distributions are particularly useful in promoting: (i) smoothness and (ii) sparsity in the parameter posterior. Smoothness is an interesting property when modelling cross-sectional or time series data, as the posterior estimates should vary slowly and smoothly between nearby data points. An example of this is the gp regression model introduced in Section 1.1.1 for modelling the thickness of ice varves. We return to gp regression models and their applications in Section 2.4. rls was introduced in the form of the elastic net (2.5) in Section 2.2.1 by adding two terms to the loss function. A more intuitive approach to rls is to view it as Bayesian linear regression with specific choices of prior distributions. Two choices for promoting sparsity are given by

$$p(\beta_j) = \mathcal{L}\big( \beta_j \, ; \; 0, \sigma \big), \qquad p(\beta_j) = \mathcal{N}\big( \beta_j \, ; \; 0, \sigma^2 \big),$$

for $j = 1, \ldots, p$. Here, $\mathcal{L}(\beta_j; 0, \sigma)$ and $\mathcal{N}(\beta_j; 0, \sigma^2)$ denote the zero-mean Laplace and Gaussian distributions with scale $\sigma > 0$, respectively. These two choices of priors correspond to $\ell_1$-rls (lasso) and to $\ell_2$-rls (ridge regression), respectively. As previously mentioned, the primary benefit of these priors is that they shrink regression coefficients towards zero and therefore automatically select the most important regressors. We return to the use of sparseness priors for model order selection in Section 4.2 and in Papers F and G.

The likelihood

As previously mentioned, the likelihood contains all the information in the observations about the parameter. The form of the likelihood is determined by the model of the data.

For example, if the data is assumed iid, then the likelihood is given by
$$p_\theta(y_{1:n}) = \prod_{i=1}^n p_\theta(y_i),$$
where e.g., $p_\theta(y_i) = \mathcal{N}(y_i; \mu, \sigma^2)$ if the data has a Gaussian distribution. We can write similar expressions for the linear regression model and the ar(p) process.

Example 2.6: Voting behaviour in the US Supreme court (cont. from p. 28)
We have that the observations are iid Bernoulli under the model. Hence, we can express the likelihood by

$$p(y \mid \theta) = \prod_{i=1}^n \prod_{t=1}^T p_{it}^{y_{it}} \big( 1 - p_{it} \big)^{1 - y_{it}},$$
where the success probability $p_{it}$ is given by
$$p_{it} = \Phi\left( \begin{bmatrix} \alpha_{t,1} & \alpha_{t,2} \end{bmatrix} \begin{bmatrix} -1 \\ x_i \end{bmatrix} \right),$$
where $\Phi(\cdot)$ denotes the standard Gaussian cumulative distribution function (cdf). Using the likelihood, we can compute the posterior if we assume a prior distribution for each of the parameters $\{\alpha_{1:T}, u_{1:n,1:T}, x_{1:n}\}$. Here, we limit ourselves to computing the conditional posterior for the liberal/conservative score using the prior $x_i \sim \mathcal{N}(x_i; 0, 1)$. From Bayes' theorem (2.10), we obtain directly that
$$p(x_i \mid y, \alpha, u) \propto \mathcal{N}(x_i; 0, 1) \prod_{t=1}^T \mathcal{N}\big( u_{it} \, ; \; -\alpha_{t,1} + \alpha_{t,2} x_i, \; 1 \big),$$
which is the Gaussian prior for $x_i$ multiplied with a Gaussian likelihood. Here, we have discarded all terms not depending on $x_i$. It is possible to rewrite the posterior as a Gaussian distribution with updated sufficient statistics. This is a result of the conjugacy between the likelihood and the prior, see Robert (2007) and Gelman et al. (2013). The calculation is done by completing the square in the exponent of the Gaussian density. The resulting conditional posterior can be written as
$$p(x_i \mid y, \alpha, u) = \mathcal{N}\left( x_i \, ; \; \Sigma_{\text{post}} \sum_{t=1}^T \alpha_{t,2} \big( u_{it} + \alpha_{t,1} \big), \; \Sigma_{\text{post}} \right), \qquad \Sigma_{\text{post}}^{-1} = 1 + \sum_{t=1}^T \alpha_{t,2}^2.$$
It is possible to also compute conditional posteriors for $u$ and $\alpha$, see Albert (1992). This is done in the next part of this example. It turns out that we can sample from the posterior using Monte Carlo by iteratively sampling from each of the three conditional posteriors given the remaining parameters.

We return to this model in Example 3.8 on page 63.
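The conditional posterior derived above is straightforward to sample from. The sketch below draws one $x_i$ given assumed values of $\alpha$ and the utilities $u$ for a single justice, i.e., one step of the Gibbs-type sampler mentioned in the text.

```python
import numpy as np

np.random.seed(7)
T = 171
# Assumed case parameters and utilities for justice i (placeholders only).
alpha = np.column_stack([np.random.randn(T), np.abs(np.random.randn(T))])
u_i = np.random.randn(T)

prec_post = 1.0 + np.sum(alpha[:, 1] ** 2)       # Sigma_post^{-1}
mean_post = np.sum(alpha[:, 1] * (u_i + alpha[:, 0])) / prec_post

# One draw from the Gaussian conditional posterior of x_i.
x_i = mean_post + np.sqrt(1.0 / prec_post) * np.random.randn()
```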

For the ssm, we can express the likelihood by using the decomposition
$$p_\theta(y_{1:T}) = p_\theta(y_1) \prod_{t=2}^T p_\theta(y_t \mid y_{1:t-1}), \qquad (2.11)$$
where $p_\theta(y_t \mid y_{1:t-1})$ denotes the so-called predictive likelihood. We can express the predictive likelihood as the marginalisation given by

$$p_\theta(y_t \mid y_{1:t-1}) = \int_{\mathcal{X}^2} g_\theta(y_t \mid x_t) \, f_\theta(x_t \mid x_{t-1}) \, p_\theta(x_{t-1} \mid y_{1:t-1}) \, \mathrm{d}x_{t-1:t}, \qquad (2.12)$$
which follows from the Markov property of the ssm. However, we cannot evaluate this integral in closed form for most ssms, as the latent states and thereby $p_\theta(x_{t-1} \mid y_{1:t-1})$ are unknown. This can be done in two special cases: (a) when the state space is finite (when the state only assumes a finite collection of values) and (b) when the ssm is linear and Gaussian as discussed in Remark 2.7. Otherwise, the computation of the likelihood is analytically intractable and approximations are required.

Remark 2.7 (Bayesian state inference in a linear Gaussian SSM). In this remark, we make use of the properties of the Gaussian distribution to solve the state inference problem exactly for the linear Gaussian state space (lgss) model. We can express a scalar version of an lgss model by
$$x_t \mid x_{t-1} \sim \mathcal{N}\big( x_t \, ; \; \mu + \phi ( x_{t-1} - \mu ), \; \sigma_v^2 \big), \qquad y_t \mid x_t \sim \mathcal{N}\big( y_t \, ; \; x_t, \; \sigma_e^2 \big), \qquad (2.13)$$
where the parameters are denoted by $\theta = \{\mu, \phi, \sigma_v, \sigma_e\}$. From (2.12), we know that the filtering distribution $\pi_t(x_t) \triangleq p_\theta(x_t \mid y_{1:t})$ is required to compute the likelihood. We can compute $\pi_t(x_t)$ using the Bayesian filtering recursion (Anderson and Moore, 2005) given by

$$\pi_t(x_t) = \frac{g_\theta(y_t \mid x_t)}{p_\theta(y_t \mid y_{1:t-1})} \int_{\mathcal{X}} f_\theta(x_t \mid x_{t-1}) \, \pi_{t-1}(x_{t-1}) \, \mathrm{d}x_{t-1}, \qquad (2.14)$$
for $0 < t \le T$. From the structure of the lgss model, we assume that the prior distribution can be denoted by $\pi_{t-1}(x_{t-1}) = \mathcal{N}\big( x_{t-1} \, ; \; \widehat{x}_{t-1|t-1}, \; P_{t-1|t-1} \big)$. Here, $\widehat{x}_{t-1|t-1}$ and $P_{t-1|t-1}$ denote the filtered state estimate and its covariance, both at time $t-1$.

We can then solve (2.14) by using the properties of the Gaussian distribution. The solution is a recursion known as the Kalman filter (Kalman, 1960; Kailath et al., 2000). This is an iterative approach with two steps: (i) the simulation step computes the predicted state estimate and its covariance and (ii) the correction step computes the filtered state estimate and its covariance. The simulation step corresponds to simulating the system one time step forward according to the state process, which during iteration $t$ consists of

$$\widehat{x}_{t|t-1} = \mu + \phi \big( \widehat{x}_{t-1|t-1} - \mu \big), \qquad P_{t|t-1} = \phi^2 P_{t-1|t-1} + \sigma_v^2.$$
In the correction step, we compare the predicted state with the observation and correct the state estimate accordingly by

$$\widehat{x}_{t|t} = \widehat{x}_{t|t-1} + K_t \big( y_t - \widehat{x}_{t|t-1} \big), \qquad P_{t|t} = P_{t|t-1} - K_t P_{t|t-1},$$
where $K_t = P_{t|t-1} \big( P_{t|t-1} + \sigma_e^2 \big)^{-1}$ denotes the so-called Kalman gain. Here, we introduce the predicted state estimate and the predicted covariance denoted by $\widehat{x}_{t|t-1}$ and $P_{t|t-1}$, respectively.
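Putting the two steps together, the following sketch runs the scalar Kalman filter for (2.13) on simulated data and also accumulates the log-likelihood from the predictive densities, anticipating the expression below. All parameter values are illustrative assumptions.

```python
import numpy as np

np.random.seed(8)
T, mu, phi, sigma_v, sigma_e = 250, 0.2, 0.8, 1.0, 0.5
x = mu + sigma_v * np.random.randn()
y = np.zeros(T)
for t in range(T):                               # simulate from the model
    x = mu + phi * (x - mu) + sigma_v * np.random.randn()
    y[t] = x + sigma_e * np.random.randn()

x_hat, P = mu, sigma_v**2 / (1.0 - phi**2)       # stationary initialisation
log_like = 0.0
for t in range(T):
    # Simulation (prediction) step.
    x_pred = mu + phi * (x_hat - mu)
    P_pred = phi**2 * P + sigma_v**2
    # Predictive likelihood contribution of y_t.
    S = P_pred + sigma_e**2
    log_like += -0.5 * (np.log(2.0 * np.pi * S) + (y[t] - x_pred)**2 / S)
    # Correction step with Kalman gain K.
    K = P_pred / S
    x_hat = x_pred + K * (y[t] - x_pred)
    P = P_pred - K * P_pred
print(log_like)
```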

Finally, the posteriors for the filtered and predicted states are given by $x_{t|t} \sim \mathcal{N}\big( x_{t|t} \, ; \; \widehat{x}_{t|t}, \; P_{t|t} \big)$ and $x_{t|t-1} \sim \mathcal{N}\big( x_{t|t-1} \, ; \; \widehat{x}_{t|t-1}, \; P_{t|t-1} \big)$, respectively. After a run of the Kalman filter, we can compute the likelihood of the lgss model given the parameters $\theta$ by
$$p_\theta(y_{1:T}) = \prod_{t=1}^T \mathcal{N}\big( y_t \, ; \; \widehat{x}_{t|t-1}, \; P_{t|t-1} + \sigma_e^2 \big),$$
which follows from (2.12). The mean and variance of the initial state are typically assumed to be known, e.g., $\widehat{x}_{1|0} = \mu$ and $P_{1|0} = \sigma_v^2 \big( 1 - \phi^2 \big)^{-1}$ in this particular model.

We continue by presenting some useful quantities connected with the likelihood, which are made use of in many of the papers included in this thesis. The first quantity is known as the score function and it is defined as the gradient of the log-likelihood given by
$$\mathcal{S}(\theta') = \nabla \log p_\theta(y) \big|_{\theta = \theta'}. \qquad (2.15)$$
The score function has a natural interpretation as the slope of the log-likelihood. Hence, the score function is zero when evaluated at the true parameter vector, $\mathcal{S}(\theta^\star) = 0$. However, this is not necessarily true when the number of observations is finite.

The second quantity is known as the observed information matrix and it is defined by the negative Hessian of the log-likelihood given by
$$\mathcal{J}(\theta') = -\nabla^2 \log p_\theta(y) \big|_{\theta = \theta'}. \qquad (2.16)$$
The observed information matrix can be seen as a measure of the total amount of information available regarding $\theta$ in the data. That is, if the data is informative, the resulting information matrix is large (according to some measure). Also, the information matrix can geometrically be seen as the negative curvature of the log-likelihood. As such, we expect it to be positive definite (pd) at the maximum likelihood parameter estimate (c.f. the second-derivative test in basic calculus). Moreover, there exists a limiting behaviour for the observed information matrix, which tends to the so-called expected information matrix as the number of data points approaches infinity. This quantity (also known as the Fisher information matrix) is defined as the expected value of the observed information matrix given by
$$\mathcal{I}(\theta') = -\mathbb{E}_y \Big[ \nabla^2 \log p_\theta(y) \big|_{\theta = \theta'} \Big] = \mathbb{E}_y \Big[ \big( \nabla \log p_\theta(y) \big|_{\theta = \theta'} \big)^2 \Big], \qquad (2.17)$$
where the expectation is evaluated with respect to the data. Note that the expected information matrix is independent of the data realisation, whereas the observed information is dependent on the realisation. The expected information matrix is pd for all values of $\theta$ as it, according to (2.17), can be seen as the variance of the score function.

Point estimates

The Bayesian parameter inference problem is completely described by (2.10) and everything known about $\theta$ is encoded in the posterior. However, we are sometimes interested in computing point estimates of the parameter vector. This is done by applying statistical decision theory to make decisions about what information from the posterior to take into account in the point estimate.

Loss function   Definition                                              Bayes point estimator
Linear          $L(\theta, \delta) = |\theta - \delta|$                 Posterior median
Quadratic       $L(\theta, \delta) = (\theta - \delta)^2$               Posterior mean
0-1             $L(\theta, \delta) = \mathbb{I}[\theta \neq \delta]$    Posterior mode / maximum a-posteriori (map)

Table 2.1. Different loss functions and the resulting Bayes point estimators.

Consider a loss function $L : \Theta \times \Theta \to \mathbb{R}_+$, which takes the parameter and its estimate as inputs and returns a real-valued positive loss. The expected posterior loss (or posterior risk) is given by
$$\rho\big( p(\theta), \delta(y) \big) = \int_\Theta L\big( \theta, \delta(y) \big) \, p(\theta \mid y) \, \mathrm{d}\theta,$$
where $\delta(y)$ denotes the decision of the parameter estimate given the data. The Bayes estimator is defined as the minimising argument of the expected posterior loss,
$$\delta^\star(y) = \operatorname*{argmin}_{\delta(y) \in \Theta} \, \rho\big( p(\theta), \delta(y) \big).$$

Remark 2.8 (Some common loss functions). In Table 2.1, we present three different Bayes estimators resulting from different choices of the loss function. For example, when selecting the quadratic loss function, we have

$$\operatorname*{argmin}_{\delta(y) \in \Theta} \int_\Theta \big( \theta - \widehat{\theta} \big)^2 \, p(\theta \mid y) \, \mathrm{d}\theta.$$

Expansion and differentiation of the integral with respect to $\widehat{\theta}$ gives
$$\frac{\partial}{\partial \widehat{\theta}} \left[ \widehat{\theta}^2 - 2 \widehat{\theta} \int_\Theta \theta \, p(\theta \mid y) \, \mathrm{d}\theta + \int_\Theta \theta^2 \, p(\theta \mid y) \, \mathrm{d}\theta \right] = 0,$$
where the derivative is set to zero to obtain the optimum. Hence, we obtain

$$2 \widehat{\theta} - 2 \int_\Theta \theta \, p(\theta \mid y) \, \mathrm{d}\theta = 0,$$
where the solution is given by

$$\widehat{\theta} = \mathbb{E}_y[\theta] = \int_\Theta \theta \, p(\theta \mid y) \, \mathrm{d}\theta. \qquad (2.18)$$
That is, the Bayes estimator is given by the posterior mean for the quadratic loss function. Similar calculations can be done for other loss functions. Furthermore, selecting the 0-1 loss function together with uniform priors recovers the maximum likelihood estimator. Finally, we note that other more complicated loss functions can be of interest in practice. For example, it could be important to limit the number of estimates smaller than the true value or the number of false positives.

The computation in (2.18) is a common integration problem in Bayesian inference. It turns out that most problems in Bayesian inference correspond to intractable integration problems. On the other hand, maximum likelihood inference often corresponds to solving convex or non-convex optimisation problems.
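This is where Monte Carlo enters: given samples from the posterior, the integral (2.18) is approximated by a sample average. In the sketch below the posterior is an assumed Gaussian so that the exact answer is known; in realistic models the samples would instead come from e.g., mcmc.

```python
import numpy as np

np.random.seed(9)
theta_samples = 1.5 + 0.3 * np.random.randn(10000)   # assumed posterior draws

posterior_mean = np.mean(theta_samples)      # Bayes estimator, quadratic loss
posterior_median = np.median(theta_samples)  # Bayes estimator, linear loss
print(posterior_mean, posterior_median)      # both close to 1.5 here
```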

Asymptotic properties

The statistical properties of the Bayes estimator depend in general on the choice of prior. It is therefore challenging to state anything general regarding the properties of the parameter estimates when $n$ (or $T$) is finite. However, the Bernstein-von Mises theorem (Van der Vaart, 2000) states that, under some mild regularity conditions, the influence of the prior distribution diminishes as the amount of information about $\theta$ increases. Note that this only occurs when the amount of informative observations increases. Moreover, the posterior distribution concentrates to a Gaussian distribution centred around the true parameters, i.e., the asymptotic maximum likelihood estimate. Therefore, the Bayes estimator enjoys the same strong asymptotic properties as the maximum likelihood estimator.

As a consequence, the Bayes estimator is consistent, asymptotically Gaussian and efficient under some regularity conditions. These conditions include that the parameter space is compact and that the likelihood, score function and information matrix exist and are well-behaved, see Lehmann and Casella (1998) and Casella and Berger (2001). An estimator is said to be consistent if $\widehat{\theta} \xrightarrow{\text{a.s.}} \theta^\star$ when $n \to \infty$, where $\theta^\star$ denotes the true parameters. That is, the estimate almost surely (with probability one) converges to the true value of the parameter in the limit of infinite data. Furthermore, as the estimator is asymptotically Gaussian, the error in the estimate satisfies a central limit theorem (clt) given by
$$\sqrt{n} \, \big( \widehat{\theta} - \theta^\star \big) \xrightarrow{d} \mathcal{N}\big( 0, \; \mathcal{I}^{-1}(\theta^\star) \big), \qquad (2.19)$$
when $n \to \infty$. This follows from a second-order Taylor expansion of the log-likelihood around $\theta^\star$. Note that the expected information matrix $\mathcal{I}(\theta)$ determines the asymptotic accuracy of the estimate.

Lastly, we say that an estimator is efficient if it attains the Cramér-Rao lower bound, which means that no other un-biased estimator has a lower mean square error (mse). That is, the maximum likelihood estimator is the best un-biased estimator in the mse-sense. This last property is appealing and one might be tempted to say that this estimator is the best choice for parameter inference. However, this result is only asymptotically valid. Therefore, other estimators (e.g., Bayes estimators) could have better properties in the finite sample regime.

Finally, there can exist biased estimators with a smaller mse than the maximum likelihood estimator. This is the result of the so-called bias-variance trade-off. One example that we already encountered is the rls in Remark 2.5, where the estimates are biased but sometimes enjoy a much smaller variance compared with the ls solution.

2.4 Non-parametric models

Bayesian non-parametrics (bnps; Hjort et al., 2010) is an active research field in machine learning and computational statistics. The models introduced in Section 2.2 are all parametric, i.e., the number of parameters $p$ does not grow with the number of observations. In non-parametric models, we assume that $p$ grows with the number of observations, which gives more flexibility to the model as more observations are recorded.

Another perspective on bnps is that they are infinite stochastic processes. Their construction can be carried out by invoking the Kolmogorov extension theorem (Billingsley, 2012, p. 517). This theorem states that any finite subset of an infinite-dimensional process is distributed according to the marginal of that process. Hence, we can find an infinite stochastic process by fixing a process in a finite subset of points and applying the extension theorem. Conjugate priors are often used to carry out the prior-posterior update. See Orbanz (2009) for how to construct bnps by starting from finite-dimensional marginals.

In this section, we briefly discuss two useful bnp models that we make use of in Papers E and F. The first model is the gp (Rasmussen and Williams, 2006), which is useful for regression and classification. We already encountered the gp in the motivating example connected with Figure 1.1. There are many more applications where gps are useful for modelling, e.g., airline delays (Hensman et al., 2013) and human faces (Titsias and Lawrence, 2010). The second model is known as a Dirichlet process (dp; Ferguson, 1973, 1974), which is useful for clustering and to model probability measures (distributions). An overview of the use of dps and other bnps is provided by Fox (2009). dps are employed to model piece-wise affine systems by Wågberg et al. (2015) with applications in automatic control. In economics, Burda and Harding (2013) have applied dps to model the heterogeneity in the effects of research and development in companies. Finally, this type of model is also used for detecting haplotypes in genomics as discussed by Xing et al. (2007). A haplotype is a unit of genetic information and inferring these from data is important to understand genetic variations in populations of individuals.

2.4.1 Gaussian processes gps have their origins in methods (Cressie, 1993; Matheron, 1963) from spatial statis- tics, where they are used to construct elevation maps from measurements. Mathematically, a realisation of a gp is an infinite long vector of real-valued random variables. Hence, we can see this vector as a function and this is why gps can be used as priors over function spaces. Formally, a gp is an infinite-dimensional Gaussian distribution, where any finite subset of points is jointly distributed according to a Gaussian distribution. We denote a gp by GP m, κ , where m x and κ denote the mean function and the covariance function (kernel), respectively.( ) These( two) functions fully specifies the gp and can be defined by m x = E f x , (2.20a) ( ) [ ( )] 0 0 0 > κ x, x = E f x − m x f x − m x , (2.20b) ( ) ( ) ( ) ( ) ( ) f   g for some function f : Rp → R that we would like to model by the gp. Both m and κ are considered to be prior choices and encode our prior beliefs about the data in terms of e.g., 2.4 Non-parametric models 37 trends, cycles and smoothness. The mean function specifies the average value of the process and the covariance function specifies the correlation between (nearby) samples.

In the left half of Figure 2.4, we present a realisation from a gp prior. Furthermore, we indicate two pairs of points by dotted lines. In the right half of the same figure, we present the covariance function corresponding to the points. In the green case, the covariance function has a high correlation and therefore the probable range of values for x2 given x1 is quite narrow. In the orange case, the distance between the points is larger and the correlation in the covariance function is therefore smaller. This illustrates the connection between realisations of the gp and the choice of covariance function.

In Figure 2.5, we present three realisations from two different gp priors. Here, we make use of the squared exponential (se) covariance function and the Matérn 3/2 covariance function, see Rasmussen and Williams (2006) for details. The former encodes the assumption that the function has an infinite number of continuous derivatives. The latter only assumes one continuous derivative and therefore the realisations are less smooth. We also vary the length scale $l$, which encodes assumptions on the rate of change of the underlying function. Typically, $l$ is treated as a hyper-parameter, which is either estimated from the data or determined by the user.

gps are useful for non-parametric/non-linear regression, where no particular functional form is assumed for the regression function $f(\cdot)$. A non-parametric regression model can be written as

$$y_i = f(x_i) + \sigma_e e_i, \qquad (2.21)$$
with $e_i$ being a standard Gaussian random variable and $\sigma_e > 0$ denoting the noise standard deviation. To utilise the gp, we assume the prior distribution
$$f \sim \mathcal{GP}(m, \kappa), \qquad (2.22)$$
for the regression function. Hence, we have that both the prior (2.22) and the data likelihood (2.21) are Gaussian. We can compute the posterior distribution by Bayes' theorem using the conjugate property given some data $\mathcal{D} = \{x, y\} = \{x_i, y_i\}_{i=1}^n$. The resulting predictive distribution evaluated at some test point $x_\star$ is given by a Gaussian distribution with an updated mean and covariance function computed by
$$f(x_\star) \mid \mathcal{D} \sim \mathcal{N}\big( x_\star \, ; \; \mu_f(x_\star \mid \mathcal{D}), \; \sigma_f^2(x_\star \mid \mathcal{D}) \big), \qquad (2.23a)$$
$$\mu_f(x_\star \mid \mathcal{D}) = \kappa_\star^\top \big[ \kappa(x, x) + \sigma_e^2 I_n \big]^{-1} y, \qquad (2.23b)$$
$$\sigma_f^2(x_\star \mid \mathcal{D}) = \kappa(x_\star, x_\star) - \kappa_\star^\top \big[ \kappa(x, x) + \sigma_e^2 I_n \big]^{-1} \kappa_\star + \sigma_e^2. \qquad (2.23c)$$
Here, we introduce $\kappa_\star = \kappa(x_\star, x)$ to denote the covariance between the test value and the sampling points. An example of a predictive gp posterior was given in Figure 1.1, where the mean $\mu_f$ is plotted as a solid line. The confidence intervals are computed by using the variance $\sigma_f^2$ in the predictive posterior. We return to using gps to model the posterior distribution of an ssm in Chapter 4 and in Paper E.
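A minimal sketch of these predictive equations with an se covariance function; the data, length scale and noise level are illustrative assumptions.

```python
import numpy as np

def kernel(a, b, ell=1.0):
    # Squared exponential covariance between 1-D input vectors a and b.
    return np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

np.random.seed(10)
x = np.random.uniform(-4.0, 4.0, 20)
sigma_e = 0.2
y = np.sin(x) + sigma_e * np.random.randn(20)        # assumed data
x_star = np.linspace(-4.0, 4.0, 100)                 # test points

K = kernel(x, x) + sigma_e**2 * np.eye(20)
k_star = kernel(x_star, x)                           # covariances to test points

mu_f = k_star @ np.linalg.solve(K, y)                                    # (2.23b)
var_f = kernel(x_star, x_star).diagonal() \
        - np.sum(k_star * np.linalg.solve(K, k_star.T).T, axis=1) \
        + sigma_e**2                                                     # (2.23c)
```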


Figure 2.4. A realisation from a GP prior with the SE covariance function for two pairs of points indicated by dotted lines. f( x ) f( x )


Figure 2.5. Realisations from a GP prior with the SE covariance function (purple) and the Matérn 3/2 covariance function (magenta) for two different length scales l (l = 1 and l = 3).

2.4.2 Dirichlet processes

A realisation $G$ of a dp is a random discrete probability measure in the form of an empirical distribution, i.e.,
$$G(\mathrm{d}\theta) = \sum_{k=1}^\infty w_k \, \delta_{\theta_k}(\mathrm{d}\theta). \qquad (2.24)$$
Here, the weights $\{w_k\}_{k=1}^\infty$ and locations $\{\theta_k\}_{k=1}^\infty$ are random variables. Furthermore, we have that $\sum_{k=1}^\infty w_k = 1$ with probability 1, which is why $G$ can be interpreted as a probability measure. Let $\mathsf{DP}(\alpha, G_0)$ denote a dp with concentration parameter $\alpha > 0$ and base measure $G_0$. We say that $G$ is distributed according to a dp if all of its marginal distributions are Dirichlet distributed. This was proved by Ferguson (1973) and is in analogy with the Gaussian marginals required for the gp. Hence, if $G_0$ is a probability measure on the space $(\Omega, \mathcal{F})$, we have that
$$\big( G(A_1), G(A_2), \ldots, G(A_N) \big) \sim \mathcal{D}\big( \alpha G_0(A_1), \alpha G_0(A_2), \ldots, \alpha G_0(A_N) \big), \qquad (2.25)$$
for any finite (measurable) partition $A_{1:N}$ of $\Omega$. Here, $\mathcal{D}(\alpha)$ denotes the Dirichlet distribution with concentration parameter $\alpha > 0$.

Note that the expected value of G is the base measure and therefore G has the same support as G0. Moreover, G is discrete with probability one even if the base measure is continuous. In Figure 2.6, we present two realisations from a dp using α = 1 (green) and α = 10 (orange) and the standard Gaussian as G0. We note that a larger concentration parameter results in more similar weights, where we can almost guess the underlying base measure. Conversely, most probability mass is allocated to a small number of components when the concentration parameter is small.

We can recover these properties analytically by studying the predictive distribution of a dp. Assume that we obtain some data generated from the model given by

G ∼ DP(α, G_0), θ_i | G ∼ G,

for i = 1, 2, …. The predictive distribution is given by the marginalisation

p(θ_⋆ | θ_{1:n}) = ∫ G(θ_⋆) p(G | θ_{1:n}) dG,

which is possible to carry out in closed form. The result is a Pólya urn scheme discussed by Blackwell and MacQueen (1973), which can be expressed mathematically by

θ_⋆ | θ_{1:n} ∼ α/(α + n) G_0 + 1/(α + n) Σ_{i=1}^n n_i δ_{θ_i}. (2.26)

Here, n_i denotes the number of parameters that are identical to θ_i, i.e.,

n_i = Σ_{j=1}^n I[θ_i = θ_j],

where I[A] denotes the indicator function.

This Pólya urn scheme has an interesting and rather amusing interpretation known as the Chinese restaurant process (crp). In this interpretation, we see each parameter θ_i as a guest arriving at a Chinese restaurant. The first guest chooses a random dish from the menu and sits down at some table. The second guest can either select a new random dish from the menu or join the first guest at his/her table and have the same dish. This continues on forever and the probability of joining an existing table is proportional to the number of guests n_i already sitting at that table. Hence, we can conclude from the crp and (2.26) that the dp is a discrete process with a non-zero probability of ties. That is, guests tend to cluster around the existing dishes in the restaurant. Furthermore, we are more likely to sample from the base measure (choose a new dish) if α ≫ n, which means that the predictive posterior concentrates around the base measure. If α ≪ n, we often sample from the existing parameters, which gives many ties and a strong clustering behaviour. This corresponds to a few samples obtaining most of the probability mass, as seen in Figure 2.6.

A third alternative view of the dp is to consider it as the result of a stick-breaking process (sbp; Sethuraman, 1994). This is useful for generating realisations from a dp by using the empirical distribution in (2.24). The weights and locations can be generated by an sbp, i.e.,

w_k = V_k ∏_{i=1}^{k-1} (1 − V_i), V_k ∼ B(1, α), θ_k ∼ G_0,

where B(a, b) denotes the Beta distribution with shape parameters a > 0 and b > 0. The name sbp comes from the fact that w_k can be seen as a piece of a stick of unit length. The product represents the length of the remaining stick at iteration k and V_k denotes the fraction that is broken off the stick. In the left part of Figure 2.7, we present an illustration of the sbp. At each iteration, we break off a proportion of the remaining stick (green) given by V_k. We collect the resulting pieces and combine them with samples from the base measure to obtain the random probability distribution presented in the right part of the figure.

We can create a hierarchical model to utilise the discreteness of the dp for clustering. This results in a dp mixture (dpm; Antoniak, 1974), which can be expressed as

G ∼ DP(α, G_0), θ_k ∼ G, x_k ∼ p(· | θ_k), (2.27)

for k = 1, …, n. That is, we generate data from a distribution which is parametrised by random parameters drawn from the random probability distribution generated by a dp. In practice, we make use of a parametric distribution for the data and obtain a clustering model as some x_k share the same parameters. We return to the use of dpms for modelling the heterogeneity of the individual random effects in a mixed effects model in Paper G.
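A realisation such as those in Figure 2.6 can be simulated directly from the sbp by truncating the infinite sum in (2.24). The sketch below, in Python, uses a standard Gaussian base measure as in the figure; the truncation level K and the variable names are our own illustrative choices.

import numpy as np

def sample_dp(alpha, K=500, rng=np.random.default_rng(0)):
    # Truncated stick-breaking construction of a DP realisation (2.24).
    V = rng.beta(1.0, alpha, size=K)                 # stick fractions V_k ~ B(1, alpha)
    w = V * np.cumprod(np.concatenate(([1.0], 1.0 - V[:-1])))  # w_k = V_k prod(1 - V_i)
    theta = rng.standard_normal(K)                   # locations from the base measure G_0
    return w, theta

# A larger alpha spreads the mass over more atoms, cf. Figure 2.6.
w1, theta1 = sample_dp(alpha=1.0)
w10, theta10 = sample_dp(alpha=10.0)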

2.5 Outlook and extensions

In this chapter, we have presented a few popular parametric models for different types of data. Regression models are discussed in many different textbooks and the interested reader is referred to Hastie et al. (2009) and McCullagh and Nelder (1989) for more information. Time series models such as ar and autoregressive moving average (arma) models with extensions are thoroughly introduced by Tsay (2005), Shumway and Stoffer (2011) and Brockwell and Davis (2002).


Figure 2.6. Realisations from a DP prior with a standard Gaussian as the base measure and the concentration parameter α = 1 (green) and α = 10 (orange).

Figure 2.7. Left: illustration of the stick-breaking process in which a proportion V_k of the remaining stick (green) is broken off at each iteration. Right: illustration of the resulting realisation of the DP.

For more information about ssms and interesting extensions, see Douc et al. (2014), Cappé et al. (2005) and Ljung (1999). Kalman filtering is an important topic for lgss models and a book-length treatment is provided by Kailath et al. (2000). Graphical models are another large family of models, of which ssms are a specific instance, see Paper A. This type of model is useful for modelling everything from images to text documents. A good introduction to this subject is provided in the book by Koller and Friedman (2009) and in Chapter 8 of the book by Bishop (2006). A few more examples of models are presented in the papers included in this thesis. For example, we make use of so-called copula models (Nelsen, 2007) in Section 6.3 (page 253) of Paper E. Finally, more information regarding Bayesian inference is found in the books by Robert (2007) and Gelman et al. (2013). The statistical properties of Bayes estimators are further discussed in Lehmann and Casella (1998) and Berger (1985).

3 Monte Carlo methods

A common problem when making use of Bayesian inference for many models of interest is analytical intractability. From Chapter 2, we know that this can be the result of an intractable likelihood or due to the fact that the prior-posterior update cannot be carried out in closed form. In these situations, we have to resort to approximations, which are usually based on variational inference or statistical simulation. In this thesis, we focus on the latter approach by using Monte Carlo methods. This family of methods makes use of random sampling for integration, optimisation or to sample from some complicated probability distribution. In Chapter 2, we noted that many problems in Bayesian inference can be expressed as integrals and therefore Monte Carlo methods are useful.

As discussed by Eckhardt (1987), Monte Carlo methods were first introduced by the Polish-American mathematician Stanislaw Ulam [1909-1984] in cooperation with the Hungarian-American mathematician John von Neumann [1903-1957] at the Los Alamos Scientific Laboratory in 1946. The first application was to simulate neutron transport in the shielding material used for nuclear weapons research. These methods were quickly disseminated into physics and chemistry to simulate complicated phenomena.

More elaborate Monte Carlo methods based on the use of Markov chains were later proposed by Metropolis et al. (1953) and extended by Hastings (1970). The resulting algorithm is known as the Metropolis-Hastings (mh) algorithm and is a member of the larger family of Markov chain Monte Carlo (mcmc) methods. Another important mcmc method is known as Gibbs sampling and was proposed by Geman and Geman (1984). In the late 1980s, Monte Carlo methods became a common tool to approximate the posterior distribution for many interesting problems in statistics. Ever since, they have been an important enabler for Bayesian inference and are usually taught in most courses on the subject.

In the beginning of the 1990s, sequential versions of Monte Carlo algorithms were proposed by Stewart and McCarty (1992), Gordon et al. (1993) and Kitagawa (1996). These methods are

usually referred to as particle filters or sequential Monte Carlo (smc) methods. A useful combination of mcmc and smc was proposed by Beaumont (2003) based on a heuristic argument. The algorithm was later formalised and analysed by Andrieu and Roberts (2009) and Andrieu et al. (2010). The resulting algorithms are known as pseudo-marginal and particle mcmc algorithms.

In this chapter, we present a number of Monte Carlo methods together with their properties and applications. The main aim of the chapter is to provide the reader with an understanding of the opportunities and problems that are connected with each algorithm. In Chapter 4, we outline some strategies to mitigate these problems, which are applied in the papers included in this thesis.

We begin this chapter by introducing standard Monte Carlo based on independent samples. Moreover, we discuss smc, mcmc and the pseudo-marginal Metropolis-Hastings (pmmh) algorithm for sampling from more complicated models. Finally, we provide the reader with an outlook and references for further study.

3.1 Empirical approximations

Monte Carlo methods are a collection of statistical simulation methods based on sampling. They are particularly useful for approximating high-dimensional integration problems. For example, a common problem in Bayesian inference is to compute the expected value of some integrable test function ϕ : X → R given by

π[ϕ] ≜ E_π[ϕ(x)] = ∫_X ϕ(x) π(x) dx, (3.1)

where π(x) denotes a (normalised) target distribution. In the basic vanilla formulation of Monte Carlo methods, we assume that we can simulate iid particles (or samples) from the target distribution. However, we do not need to be able to evaluate the target point-wise. In what follows, we encounter Monte Carlo methods which require point-wise evaluation of the target but do not require the ability to simulate from it directly.

The first step in computing a Monte Carlo estimate of (3.1) is to form an empirical approximation of the target distribution given by

π̂_mc^N(dx) = (1/N) Σ_{i=1}^N δ_{x^{(i)}}(dx), (3.2)

using the particles {x^{(i)}}_{i=1}^N generated from the target distribution π. Here, δ_{x'}(dx) denotes the Dirac distribution placed at x = x'. The vanilla estimator follows from the second step by inserting (3.2) into (3.1) to obtain

π̂_mc^N[ϕ] ≜ ∫_X ϕ(x) π̂_mc^N(dx) = (1/N) Σ_{i=1}^N ϕ(x^{(i)}), (3.3)

which follows from the properties of the Dirac distribution.
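As a minimal illustration of (3.3) in Python (the standard Gaussian target and the test function ϕ(x) = x² are our own toy choices, not an example from the thesis):

import numpy as np

rng = np.random.default_rng(1)
N = 100000
x = rng.standard_normal(N)       # iid particles from the target pi = N(0, 1)
estimate = np.mean(x**2)         # vanilla Monte Carlo estimator (3.3)
print(estimate)                  # close to pi[phi] = 1; the error shrinks as 1/sqrt(N)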

The main advantage of Monte Carlo methods over some of their alternatives is their solid statistical properties. The estimator is un-biased and strongly consistent by the strong law of large numbers (slln), i.e.,

π̂_mc^N[ϕ] →^{a.s.} π[ϕ],

when N → ∞. Moreover, it is possible to construct a central limit theorem (clt) for the vanilla Monte Carlo estimator given by

√N (π̂_mc^N[ϕ] − π[ϕ]) →^d N(0, σ_mc²), σ_mc² ≜ V_π[ϕ] < ∞,

when N → ∞ and ϕ(x) has a finite second moment. Hence, we see that the Monte Carlo estimator is asymptotically un-biased with Gaussian errors. Furthermore, the variance of the error decreases as 1/N independently of the dimension of the problem. This is one of the main advantages of Monte Carlo methods compared with common numerical integration methods based on quadratures, see e.g., Stoer and Bulirsch (1993).

3.2 Three sampling strategies

The main difficulty with applying vanilla Monte Carlo to many interesting problems is generating good samples from the target distribution. In this section, we present three different approaches for generating samples or approximating (3.3) directly using: (i) independent sampling, (ii) sequential sampling and (iii) Markov chain sampling.

3.2.1 Independent Monte Carlo

All Monte Carlo methods rely on generating random numbers. In practice, we cannot generate truly random numbers using computers. Instead, we make use of pseudo-random numbers, which are constructed to pass many statistical tests for randomness. In the following, we refer to pseudo-random numbers simply as random numbers. A linear congruential generator can be applied to generate uniform random numbers by

x^{(i)} = (a x^{(i-1)} + b) mod m,

for i = 1, 2, …. Here, a, b and m are integers (usually large) determined by the specific generator. For example, the programming language Python makes use of the Mersenne Twister (Matsumoto and Nishimura, 1998), which has a period of 2^19937 − 1. In the left part of Figure 3.1, we present random samples generated by the Mersenne Twister on the unit square, i.e., u^{(i)} ∼ U[0, 1]². We note that the samples do not fill the square evenly but seem to concentrate in certain areas. This is a drawback with pseudo-random numbers and can result in slow convergence rates when the dimension of the problem increases.

There is an alternative method for generating random samples called quasi-random number generators. In the right part of Figure 3.1, we present the output from one such algorithm based on Sobol sequences (Sobol, 1967). The main benefit with this type of sequence is that it fills the space more evenly and this can improve convergence in many applications. The main drawback is that the analysis of estimators based on Sobol sequences is challenging due to their deterministic nature, see Owen (2013). Quasi-random numbers are useful in quantitative finance (Glasserman, 2004; Niederreiter, 2010) and for sequential Monte Carlo sampling, see Section 3.4.
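A linear congruential generator is simple enough to sketch in a few lines of Python. The constants below are the classic Lehmer/Park-Miller parameters, chosen by us for illustration and not prescribed by the thesis.

def lcg(seed, n, a=16807, b=0, m=2**31 - 1):
    # x_i = (a * x_{i-1} + b) mod m, scaled to uniform numbers on [0, 1).
    x = seed
    u = []
    for _ in range(n):
        x = (a * x + b) % m
        u.append(x / m)
    return u

print(lcg(seed=42, n=5))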

Quantile transformation

For many distributions, we can obtain random samples given uniform random numbers by using the inverse cdf method. We illustrate this by sampling from the exponential distribution with rate λ > 0, for which the cdf is given by

G(x) = P(X ≤ x) = 1 − exp(−λx),

when x ≥ 0 and zero otherwise. We can directly compute the inverse cdf as

G^{-1}(p) = −log(1 − p) / λ,

which is known as the quantile function evaluated at p ∈ (0, 1). Hence, we can obtain an exponentially distributed random number by

x^{(i)} = −log(u^{(i)}) / λ,

where u^{(i)} denotes a uniform random number. We present an illustration of the inverse cdf method in the left part of Figure 3.2. Here, we can directly obtain the random sample 1.9 from the uniform random variable 0.86. It is also possible to sample from an empirical cdf constructed from some data, which is presented in the right part of the same figure. This is the basis of a useful approach to non-parametric statistics known as the bootstrap method (Efron, 1979; Davison and Hinkley, 1997). The main benefit with the bootstrap is that it does not rely on asymptotics to compute confidence intervals and to carry out tests. We return to schemes like the bootstrap in Section 3.2.2.
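A minimal sketch of the inverse cdf method for the exponential distribution (in Python; the rate value and names are our own illustrative choices):

import numpy as np

rng = np.random.default_rng(3)
lam = 2.0                            # rate parameter lambda
u = rng.uniform(size=100000)         # uniform random numbers on (0, 1)
x = -np.log(u) / lam                 # quantile transformation; 1 - u ~ u in distribution
print(x.mean())                      # close to 1 / lambda = 0.5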

Importance sampling

One approach to approximating (3.3) using independent samples is importance sampling (Marshall, 1956). This algorithm makes use of a proposal distribution to simulate from the target, which is useful when direct simulation from the target is difficult or impossible. The proposal distribution q(x) is usually selected to be simple to simulate from and to allow for cheap point-wise evaluations. The discrepancy between the target and the proposal is then compensated for by an importance weight. The main idea is to rewrite (3.1) as

π[ϕ] = ∫_X ϕ(x) π(x) dx = ∫_X ϕ(x) [π(x) / q(x)] q(x) dx = q[wϕ],

where w(x) ≜ π(x)/q(x) denotes the importance weight. Note that we now need to evaluate the target point-wise, cf. vanilla Monte Carlo. We can form an estimator analogous to (3.3) by writing

π̂_is^N[ϕ] ≜ (1/N) Σ_{i=1}^N w^{(i)} ϕ(x^{(i)}), (3.4)


Figure 3.1. Pseudo-random samples from U[0, 1]² generated using the Mersenne Twister (left) and Sobol sequences (right).


Figure 3.2. Illustration of the quantile transformation method to generate random variables. A uniform variable is generated corresponding to P(x), for which x is determined by quantile transformation (dotted lines).

where x^{(i)} ∼ q(x) and w^{(i)} ≜ w(x^{(i)}). In Figure 3.3, we present two different cases where importance sampling can be useful. In the left plot, we consider sampling from the entire target using a similar Gaussian proposal which is simple to sample from. The difference between the two distributions is indicated by the purple area. The main dissimilarity lies in the right tail, which falls off more slowly for the target than for the proposal. In the right plot, we are interested in computing, e.g., the probability of obtaining large values under the target. Therefore, we construct a proposal that focuses on the upper tail behaviour. This is useful in many applications where extreme values are important, e.g., in survival models, hydrology and quantitative finance, see McNeil et al. (2010) and Embrechts et al. (1997).

For importance sampling to work, the support of q(x) has to contain the support of ϕ(x)π(x), i.e., supp(ϕπ) ⊂ supp(q). In that case, the estimator (3.4) inherits all of the properties of the vanilla Monte Carlo estimator, i.e., it is un-biased, consistent and asymptotically Gaussian. However, the asymptotic variance can be computed by

σ_is² = ∫_X [ϕ(x)π(x)]² / q(x) dx − π[ϕ]² = ∫_X [ϕ(x)π(x) − π[ϕ] q(x)]² / q(x) dx.

A good choice of q(x) should therefore minimise this expression, i.e., a proposal proportional to ϕ(x)π(x). However, in practice this is often difficult due to the requirement that the proposal should be simple to simulate from and to evaluate point-wise.

In the following, we consider importance sampling from un-normalised target distributions γ(x). In this case, we can write the target as π(x) = γ(x) Z^{-1}, where the normalisation constant Z is unknown. However, we can make use of importance sampling in this setting as well. The main difference is that the weights are un-normalised and computed by

w̃(x) = γ(x) / q(x).

The resulting estimator is given by

π̂_snis^N[ϕ] ≜ Σ_{i=1}^N w^{(i)} ϕ(x^{(i)}), w^{(i)} ≜ w̃^{(i)} / Σ_{j=1}^N w̃^{(j)}, (3.5)

which is known as the self-normalised importance sampling (snis) estimator. The estimator (3.5) is strongly consistent and asymptotically Gaussian, see Owen (2013) for details. However, it is biased for finite N and the bias decreases as O(N^{-1}).
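A small sketch of self-normalised importance sampling in Python; the un-normalised Gaussian target and the Student's t proposal are our own illustrative choices, not an example from the thesis.

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
N = 100000
x = stats.t.rvs(df=3, size=N, random_state=rng)   # proposal: Student's t with 3 dof
log_gamma = -0.5 * x**2                           # un-normalised Gaussian target
log_w = log_gamma - stats.t.logpdf(x, df=3)       # un-normalised log-weights
w = np.exp(log_w - log_w.max())                   # stabilised exponentiation
w /= w.sum()                                      # self-normalised weights (3.5)
print(np.sum(w * x**2))                           # estimate of pi[x^2], close to 1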

3.2.2 Sequential Monte Carlo

In some applications, we would like to compute the expectation of a test function with respect to a sequence of probability distributions {π_t(x_{0:t})}_{t=0}^T, where x_t ∈ X. As for importance sampling, we assume that the target can be written as π_t(x_{0:t}) = γ_t(x_{0:t}) Z_t^{-1}, where γ_t(x_{0:t}) denotes the un-normalised target and Z_t denotes the normalisation constant. Here, we assume that it is possible to evaluate the un-normalised target point-wise but that the normalisation constant can be unknown, cf. snis. An illustration of the sequence of targets is given in Figure 3.4.


Figure 3.3. Illustration of importance sampling of the target (green) using a proposal (orange). The difference between the target and proposal densities is indicated by the purple area. Two cases are considered: sampling the entire distribution (left) and its upper tail (right).

Figure 3.4. An illustration of the evolution of the target distribution π_t(x_{0:t}) over time.

This set-up is useful in online settings where observations arrive sequentially or when the number of observations is large. Note that the online setting can also be artificially introduced by so-called tempering methods, where a sequential target can be obtained from a static one. A simple annealing scheme is often used for this and it consists of setting π_t(x) = π(x)^{φ_t}, where φ_t is a parameter varying from zero to one as t increases.

A powerful approach for taking advantage of the sequential structure in certain target distributions is to make use of smc samplers (Del Moral et al., 2006). These methods can be seen as an extension of importance sampling algorithms, where the proposal is constructed sequentially. That is, we assume that the proposal can be expressed by

q_t(x_{0:t}) = q_{t-1}(x_{0:t-1}) q_t(x_t | x_{t-1}) = q_0(x_0) ∏_{r=1}^t q_r(x_r | x_{r-1}).

Hence, we can apply importance sampling to first sample x_0 and then sequentially propose samples conditioned on the previous ones, i.e., x_r ∼ q_r(x_r | x_{0:r-1}) for r = 1, …, t. The resulting importance weights are also computed sequentially by

w_t(x_{0:t}) = γ_t(x_{0:t}) / q_t(x_{0:t}) = [γ_{t-1}(x_{0:t-1}) / q_{t-1}(x_{0:t-1})] · γ_t(x_{0:t}) / [γ_{t-1}(x_{0:t-1}) q_t(x_t | x_{0:t-1})].

This set-up is more efficient than standard importance sampling when x is a high-dimensional vector. This is due to the aforementioned problems with constructing good proposals for importance sampling algorithms. The benefit with smc algorithms is that each proposal only extends the state from one time step to the next, which simplifies the construction of the proposal.

The resulting algorithm is known as sequential importance sampling (sis). The main drawback with this method is that the variance of the estimates increases rapidly with t. This is the result of particle depletion, where eventually only a single particle (or sample) carries all the importance weight. Instead, we can introduce a resampling step to mitigate this effect and only keep the particles with large weights. This is the major development in the first papers about particle filtering, also known as sequential importance sampling with resampling (sir). Later, the sir algorithm was generalised to the smc algorithm, which can make use of more elaborate proposal distributions.

The sir algorithm is based on carrying out three steps during each iteration: (i) resampling, (ii) propagation and (iii) weighting. The output from each iteration is a particle system given by the particles (samples) {x_{0:t}^{(i)}}_{i=1}^N and their corresponding self-normalised importance weights {w_t^{(i)}}_{i=1}^N. From this system, we can construct an empirical approximation of π_t(x_{0:t}) by

π̂_{t,smc}^N(dx_{0:t}) = Σ_{i=1}^N w_t^{(i)} δ_{x_{0:t}^{(i)}}(dx_{0:t}), (3.6)

together with an estimator analogous to (3.4) given by

π̂_{t,smc}^N[ϕ] ≜ Σ_{i=1}^N w_t^{(i)} ϕ(x_{0:t}^{(i)}), (3.7)

where w_t^{(i)} ≜ w_t(x_{0:t}^{(i)}). We proceed by briefly presenting each step and refer the interested reader to Section 3 (page 123) in Paper A, Doucet and Johansen (2011) and Del Moral et al. (2006) for further details.

(Resampling) The resampling step multiplies particles with large importance weights and discards particles with small weights. This is done in a stochastic manner to focus the attention of the sir algorithm on the relevant part of the state space, i.e., on locations with high probability under the target distribution. This operation is carried out by sampling ancestor indices denoted a_t^{(i)} for each of the particles. Here, a_t^{(i)} is interpreted as the index of the particle at time t − 1 from which particle i at time t originates. This can be expressed as simulating from a multinomial distribution with probabilities given by

P(a_t^{(i)} = j) = w_{t-1}^{(j)}, j = 1, …, N, (3.8)

for i = 1, …, N. This operation is known as multinomial resampling as it corresponds to sampling from the distribution with the same name. It can also be seen as sampling from the empirical cdf in the right part of Figure 3.2, generated from the normalised particle weights.

In Figure 3.5, we present an illustration of the effect of resampling and the meaning of the ancestor indices. The green line indicates the surviving particle genealogy during a run of the sir algorithm. We see that the genealogy collapses into a single trajectory at time t = 8. This corresponds to a single surviving particle and therefore only one sample in the empirical approximation of the target. This is known as particle degeneracy, which means that estimates of expected values of test functions with respect to x_{0:t} can suffer from a large variance when t is large. However, expectations with respect to only x_t can be estimated using many samples, which results in a lower variance. The particle degeneracy problem is a result of the fact that resampling discards particles with a non-zero probability at every iteration of the algorithm.

To mitigate this problem, we can use alternative resampling schemes or particle smoothers as discussed in Remark 3.2. Better alternatives to multinomial resampling are based on stratified sampling from the cdf of the weights. In this thesis, we make use of systematic resampling, which exhibits good properties in many applications. The interested reader is referred to Douc and Cappé (2005), Hol et al. (2006) and Murray et al. (2015) for more information and examples of resampling algorithms.
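As an illustration of one such scheme, the sketch below (in Python; our own implementation, not code from the thesis) performs systematic resampling by drawing a single uniform offset and taking N evenly spaced points through the cdf of the normalised weights:

import numpy as np

def systematic_resampling(w, rng=np.random.default_rng(7)):
    # Returns N ancestor indices using one uniform draw and an evenly spaced grid.
    N = len(w)
    u = (rng.uniform() + np.arange(N)) / N     # evenly spaced points on (0, 1)
    cdf = np.cumsum(w)
    return np.searchsorted(cdf, u)             # invert the empirical cdf of the weights

w = np.array([0.1, 0.6, 0.1, 0.2])
print(systematic_resampling(w))                # indices concentrate on particle 1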

(Propagation) In the propagation step, we simulate the particle system from time t − 1 one step forward in time to obtain the particle system at time t. Each particle is propagated using the proposal distribution by

x_t^{(i)} ∼ q_t(x_t | x_{t-1}^{(a_t^{(i)})}), x_{0:t}^{(i)} ≜ {x_{0:t-1}^{(a_t^{(i)})}, x_t^{(i)}}, (3.9)

for i = 1, …, N. There are many different choices for q_t(·) and tailoring them to the problem at hand is important to achieve efficient algorithms. We return to this problem in Section 3.4 and in Section 6.3 (page 142) of Paper A.

(Weighting) Each particle is assigned an importance weight computed by a weight function defined by

W_θ(x_t, x_{0:t-1}) ≜ γ_t(x_{0:t}) / [γ_{t-1}(x_{0:t-1}) q_t(x_t | x_{0:t-1})] · w_{t-1}.

The resulting un-normalised and normalised weights are computed by

w̃_t^{(i)} = W_θ(x_t^{(i)}, x_{0:t-1}^{(a_t^{(i)})}), w_t^{(i)} = w̃_t^{(i)} / Σ_{j=1}^N w̃_t^{(j)}, (3.10)

for i = 1, …, N. The weights account for the discrepancy between the proposal and the target distribution, in analogy with importance sampling. Furthermore, it is possible to estimate the unknown normalisation constant Z of the target by making use of the un-normalised importance weights w̃_t^{(i)}, see Del Moral et al. (2006) for details.

In Kronander et al. (2014a) and Svensson et al. (2015), we make use of smc for rendering animations in computer graphics and for marginalising hyperparameters in gp priors, respectively. smc algorithms have also been proposed for inference in mixture models (Fearnhead, 2004; Ulker et al., 2010), ssms (see Remark 3.1) and graphical models (Naesseth et al., 2014).

Remark 3.1 (Particle filtering). The filtering solution to the state inference problem for the lgss model is given by the recursion in Example 2.7 on page 32. Although this recursion is intractable for most ssms, it can be approximated using smc methods. We refer to the resulting algorithm as the particle filter (Gordon et al., 1993), which is an integral component in most of the papers included in this thesis. The particle filter is the smc algorithm targeting the filtering distribution in an ssm, i.e., π_t(x_{0:t}) = p_θ(x_{0:t} | y_{1:t}). We can compute the mean of this distribution at time t by using ϕ(x_t) = x_t in (3.7),

x̂_{t|t}^N ≜ π̂_{t,smc}^N[x_t] = ∫_{X^{t+1}} x_t p̂_θ^N(x_{0:t} | y_{1:t}) dx_{0:t} = Σ_{i=1}^N w_t^{(i)} x_t^{(i)}, (3.11)

where {{x_t^{(i)}, w_t^{(i)}}_{i=1}^N}_{t=0}^T denotes the particle system generated by the particle filter.

The main problem is that we cannot evaluate p_θ(x_{0:t} | y_{1:t}) directly due to an unknown normalisation factor. Instead, we let the smc algorithm target γ_t(x_{0:t}) = f_θ(x_t | x_{t-1}) g_θ(y_t | x_t) γ_{t-1}(x_{0:t-1}), which follows from the Bayesian filtering recursions (2.14). A simple choice for the proposal in this setting is given by

q_t(x_t | x_{0:t-1}) = f_θ(x_t | x_{t-1}), W_θ(x_t, x_{0:t-1}) = g_θ(y_t | x_t),

where the weight function follows directly from the target and the choice of proposal by (3.10). That is, we use the state dynamics as the proposal and the density of the observations as the weight function. We refer to this version of the algorithm as the bootstrap particle filter (bpf).


Figure 3.5. The particle genealogy generated by the resampling step in the SMC algorithm. The green line/dots are the particles that survive the repeated resampling steps. The discarded particles are presented as grey dots.


Figure 3.6. The estimated NAIRU (left) together with the estimated unemployment gap (right) for Sweden during the period January 1987 to December 2015. The estimates obtained by a bPF (Example 3.3) are indicated by purple/magenta and the estimates from the PMH algorithm (Example 3.13) are indicated by green/brown.

The bpf can be applied to estimate the likelihood p_θ(y_{1:T}), which is useful for parameter inference in ssms. The predictive likelihood (2.12) can be approximated using the particle system by

p̂_θ^N(y_t | y_{1:t-1}) = ∫_X g_θ(y_t | x_t) p̂_θ^N(x_t | y_{1:t-1}) dx_t = (1/N) Σ_{i=1}^N w̃_t^{(i)}.

The estimator of the likelihood follows from the decomposition in (2.11) and is given by

p̂_θ^N(y_{1:T}) = ∏_{t=0}^T (1/N) Σ_{i=1}^N w̃_t^{(i)}. (3.12)

We refer the interested reader to Section 4.1 (page 128) in Paper A and Doucet and Johansen (2011) for more information about the bpf and other particle filtering algorithms.
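To summarise the bpf, the following sketch in Python combines multinomial resampling (3.8), propagation through the state dynamics (3.9), weighting by the observation density (3.10) and the likelihood estimator (3.12). The linear Gaussian toy model, the parameter values and all variable names are our own choices and not taken from the thesis.

import numpy as np
from scipy import stats

def bootstrap_pf(y, phi=0.7, sigma_v=0.5, sigma_e=0.3, N=500,
                 rng=np.random.default_rng(11)):
    # Bootstrap particle filter for the toy ssm:
    #   x_t = phi * x_{t-1} + v_t,  v_t ~ N(0, sigma_v^2)
    #   y_t = x_t + e_t,            e_t ~ N(0, sigma_e^2)
    T = len(y)
    x = rng.normal(0.0, sigma_v, N)                         # initial particles
    w = np.full(N, 1.0 / N)
    log_like = 0.0
    filt_mean = np.zeros(T)
    for t in range(T):
        idx = rng.choice(N, size=N, p=w)                    # multinomial resampling (3.8)
        x = phi * x[idx] + sigma_v * rng.standard_normal(N) # propagate via f_theta (3.9)
        logw = stats.norm.logpdf(y[t], loc=x, scale=sigma_e)  # weight via g_theta
        c = logw.max()
        w_un = np.exp(logw - c)
        log_like += c + np.log(w_un.mean())                 # accumulates the log of (3.12)
        w = w_un / w_un.sum()                               # self-normalised weights (3.10)
        filt_mean[t] = np.sum(w * x)                        # filtered mean, cf. (3.11)
    return filt_mean, log_like

y = np.array([0.1, -0.3, 0.5, 0.2, -0.1])
means, ll = bootstrap_pf(y)
print(means, ll)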

Remark 3.2 (Particle smoothing). In Remark 2.7, we introduced the Bayesian filtering recursions to sequentially compute the marginal filtering distribution π_t(x_t) ≜ p_θ(x_t | y_{1:t}). We already know that the smc algorithm approximating π_t(x_t) is known as the particle filter. A related estimation problem is to approximate the marginal smoothing distribution π_T(x_t) ≜ p_θ(x_t | y_{1:T}) for some t ∈ [0, T]. The resulting smc algorithms are known as particle smoothers, which usually operate by extending the forward pass in the particle filter with a backward update. This enables the algorithm to make use of both past and future observations, which reduces the problem with particle degeneracy.

The simplest particle smoother is to make use of the bpf to approximate the joint smoothing distribution π_T(x_{1:T}) ≜ p_θ(x_{1:T} | y_{1:T}). The main problem with this approach is that the path degeneracy problem limits the accuracy of the estimate. Another similar approach is to make use of the fixed-lag (fl) particle smoother (Kitagawa and Sato, 2001), which is based on using the bpf to estimate the fixed-lag smoothing distribution π_t(x_{t-∆:t}) ≜ p_θ(x_{t-∆:t} | y_{1:t}) for some ∆ > 0. In this thesis, we make frequent use of this smoother as it has a low computational cost and a reasonable accuracy.

Some of the alternatives to fl particle smoothing are based on approximations of the Bayesian smoothing recursion (Anderson and Moore, 2005), in which the aforementioned backward pass is added. Two popular examples from this family of algorithms are the forward filter backward smoother (ffbsm; Doucet et al., 2000) and the forward filtering backward simulator (ffbsi; Godsill et al., 2004). For an extensive survey of particle smoothers, see Lindsten and Schön (2013).

Example 3.3: How does unemployment affect inflation? (cont. from p. 27)
We employ the bpf from Remark 3.1 to estimate the nairu using data from Sweden. This corresponds to the proposal and weight function given by

q_t(x_t | x_{0:t-1}) = N(x_t; φ x_{t-1} + μ(x_{t-1}), σ_v²(x_{t-1}, u_{t-1})),
W_θ(x_t, x_{0:t-1}) = N(y_t; y_{t-1} + β(u_t − x_t), σ_e²),

with θ = {φ, α, β, σ_e} = {0.76, 0.43, 0.01, 0.28} and N = 100 particles. The resulting estimate of the nairu is presented in the upper part of Figure 3.6 together with the unemployment gap (the difference between the unemployment rate and the nairu). We note that the unemployment gap is mostly positive during this period. The nairu seems to be constant at two percent and therefore the unemployment rate needs to decrease considerably for the inflation to increase.

Furthermore, we perform 100 Monte Carlo simulations (independent runs on the same data) to estimate the log-likelihood of the data for this model. We record the mean estimate and its standard deviation while varying the number of particles N between 50 and 1,000.

                     N = 50   N = 100   N = 250   N = 500   N = 1,000
Mean                 -45.90    -45.89    -45.88    -45.88      -45.89
Standard deviation     0.15      0.11      0.06      0.05        0.03

Table 3.1. The mean and standard deviation of the log-likelihood estimates in the Phillips curve model computed using the bPF while varying the number of particles N.

The results are presented in Table 3.1, where we note that the mean is essentially the same and the standard deviation decreases when N increases. This is an example of the general properties of the log-likelihood estimator discussed in Remark 3.4. We return to this model in Example 3.6 on page 57.

Statistical properties

The analysis of smc algorithms is rather complicated compared to standard Monte Carlo estimators, as the generated particle system does not consist of independent samples due to the resampling step. However, there are many strong results regarding non-asymptotic stability and asymptotic properties, see the book-length treatments by Del Moral (2013) and Del Moral (2004). Here, we only provide a short overview of the results summarised by Crisan and Doucet (2002) and Doucet and Johansen (2011). In the asymptotic setting, it is possible to show that the empirical distribution (3.6) and the estimator in (3.7) are strongly consistent, i.e.,

lim_{N→∞} π̂_{t,smc}^N = π_t, π̂_{t,smc}^N[ϕ] →^{a.s.} π_t[ϕ], (3.13)

where the first property holds almost surely (a.s.) and the second property holds for any integrable test function ϕ when N → ∞. However, note that π̂_{t,smc}^N[ϕ] is in general biased for finite N, see Remark 3.4 for an important exception. It is also possible to derive a clt for (3.7) given by

√N (π̂_{t,smc}^N[ϕ] − π_t[ϕ]) →^d N(0, σ_smc²),

when using multinomial resampling, where σ_smc² denotes the asymptotic variance computed in Del Moral et al. (2006).

Furthermore, assume that the function ϕ(x) is bounded for all x ∈ X (a rather restrictive assumption, as it is not satisfied by the function ϕ(x) = x used to compute the estimate of the filtered state x̂_{t|t} in Remark 3.1) together with some additional assumptions stated by Crisan and Doucet (2002). It then follows that the mse of the estimator (when we make use of the bpf with multinomial resampling) can be upper bounded for any N ≥ 1 by

E[(π̂_{t,smc}^N[ϕ] − π_t[ϕ])²] ≤ C_t ‖ϕ‖² / N, (3.14)

where ‖·‖ denotes the supremum norm. Here, C_t denotes a function that possibly depends on t but is independent of N. Note that (3.14) implies that the smc algorithm is stable, i.e., that the variance of the estimator does not blow up as t increases for any N ≥ 1. Intuitively, this could be a problem as the smc algorithm makes approximations based on approximations. It turns out that the resampling step takes care of this and prevents a rapid accumulation of errors. For more stability results, see Chopin (2004) and Whiteley (2013). It is possible to relax the assumptions that ϕ(x) is bounded and that only the bpf is used. The resulting upper bounds have a similar structure to (3.14), but with different functions replacing the constant C_t. It is also possible to establish uniform convergence of the estimator if the ssm is fast mixing, i.e., forgets its past fast enough, see Crisan and Doucet (2002) for details.

Remark 3.4 (Particle filtering (cont. from p. 52)). It turns out that the likelihood estimator (3.12) based on the bpf is un-biased for any N ≥ 1, cf. the state estimate in (3.13). Furthermore, under some mixing assumptions for the ssm, the error of the estimate satisfies a clt given by

√N (p̂_θ^N(y_{1:T}) − p_θ(y_{1:T})) →^d N(0, σ_L²), (3.15)

for some asymptotic variance σ_L², see Proposition 9.4.1 in Del Moral (2004) or Pitt et al. (2012). Note that the estimator of the log-likelihood is biased for finite N, but strongly consistent and asymptotically Gaussian. This follows from the second-order delta method (Casella and Berger, 2001).

Estimating additive functionals

In this section, we consider the use of smc algorithms to estimate the expected value of an additive functional. This type of functional can be expressed as

S_θ(x_{0:T}) = Σ_{t=1}^T ξ_{θ,t}(x_{t-1:t}), (3.16)

which means that a function that depends on the entire state trajectory can be decomposed into a sum of functionals. Here, ξ_{θ,t}(x_{t-1:t}) denotes some general functional that depends on only two consecutive states of the trajectory. This type of additive functional occurs frequently in ssms when computing functions that depend on the densities f_θ(x_{t+1} | x_t) and g_θ(y_t | x_t), due to the Markov property. The resulting expectation can be expressed by

π_T[S_θ] = Σ_{t=1}^T ∫_{X²} ξ_{θ,t}(x_{t-1:t}) p_θ(x_{t-1:t} | y_{1:T}) dx_{t-1:t}, (3.17)

where p_θ(x_{t-1:t} | y_{1:T}) denotes the two-step smoothing distribution, which is analytically intractable for a general ssm. However, it can be estimated by a particle filter or smoother, see Remark 3.2 and Poyiadjis et al. (2011). In Remark 3.5, we show how to make use of (3.17) to estimate the score function and observed information matrix for an ssm. The same set-up can also be used in the expectation maximisation (em; Dempster et al., 1977; McLachlan and Krishnan, 2008) algorithm, as discussed by Del Moral et al. (2010).

Remark 3.5 (Estimating the score function and information matrix for an ssm). The score function (2.15) for an ssm can be estimated using the Fisher identity (Cappé et al., 2005) given by

S(θ') = ∫ ∇ log p_θ(x_{0:T}, y_{1:T})|_{θ=θ'} p_{θ'}(x_{0:T} | y_{1:T}) dx_{0:T},

where log p_θ(x_{0:T}, y_{1:T}) denotes the complete data log-likelihood given by

log p_θ(x_{0:T}, y_{1:T}) = log μ(x_0) + Σ_{t=1}^T [log f_θ(x_t | x_{t-1}) + log g_θ(y_t | x_t)]. (3.18)

This results in the additive functional

ξ_{θ',t}(x_{t-1:t}) = ∇ log f_θ(x_t | x_{t-1})|_{θ=θ'} + ∇ log g_θ(y_t | x_t)|_{θ=θ'}, (3.19)

corresponding to part of the gradient of (3.18) evaluated at θ = θ'. The observed information matrix (2.16) can be estimated using the Louis identity (Cappé et al., 2005) given by

J(θ') = {S(θ')}² − {∇² p_θ(y_{1:T})|_{θ=θ'}} {p_{θ'}(y_{1:T})}^{-1}, (3.20)

where the second term can be expressed as

{∇² p_θ(y_{1:T})|_{θ=θ'}} {p_{θ'}(y_{1:T})}^{-1} = ∫ {∇ log p_θ(x_{0:T}, y_{1:T})|_{θ=θ'}}² p_{θ'}(x_{0:T} | y_{1:T}) dx_{0:T}
+ ∫ ∇² log p_θ(x_{0:T}, y_{1:T})|_{θ=θ'} p_{θ'}(x_{0:T} | y_{1:T}) dx_{0:T}.

Note that the first term in (3.20) is the square of the score function, which can be estimated by (3.19). The term E[∇² log p_θ(x_{0:T}, y_{1:T}) | y_{1:T}] can be estimated using an analogous additive functional. However, this cannot be done for the remaining term, as it consists of terms with the structure ξ_{θ,t}(x_{t-1:t}) ξ_{θ,s}(x_{s-1:s}) for all s, t ∈ {1, …, T}. We return to this problem and how to estimate these two identities in Section 3 (page 162) of Paper B.

Example 3.6: How does unemployment affect inflation? (cont. from p. 54)
From the previous examples, we know how to estimate the log-likelihood and the gradient of the log-posterior. It is possible to make use of this information in a gradient ascent or quasi-Newton algorithm for parameter inference by maximising the log-likelihood/posterior. Unfortunately, this can be difficult as the log-likelihood and the gradient can be quite noisy. In Figure 3.7, we present estimates of these two quantities obtained using a bpf with N = 50 particles while varying φ. The gradient of the log-posterior is estimated using a fl particle smoother, which makes use of the particle system generated by the bpf, see Remark 3.2. We note that the log-likelihood estimate is quite noisy for this model and the optimum around 0.75 is indistinguishable from its surroundings. However, the gradient estimate is less noisy and it is zero at 0.81, which indicates the map estimate of φ. Here, it could be possible to make use of a direct optimisation method for estimating φ based on the gradient estimate, see Kok et al. (2015) and Poyiadjis et al. (2011). In other models, the noise can be even larger and increasing N could make inference computationally prohibitive. We return to discussing other remedies for this problem in Chapter 4. We return to this model in Example 3.13 on page 70.


Figure 3.7. The estimates of the log-likelihood (upper) and the gradient of log p(θ | y) with respect to φ (lower) in the Phillips curve model, computed by the bPF and FL particle smoothing when varying φ. The dotted vertical lines indicate the maximum likelihood estimate (upper) and MAP estimate (lower) and the dotted horizontal line indicates the value zero for the gradient of the log-posterior.

3.2.3 Markov chain Monte Carlo

Another approach for sampling from complicated target distributions is to make use of mcmc algorithms. The main idea is to construct a Markov chain with the target as its stationary distribution. An advantage of mcmc compared with e.g. importance sampling is that it is often easier to find a good proposal distribution, since it is possible to use the current state of the Markov chain to make a more informed proposal for the next state.

To explain what this means, we need to introduce some essential concepts of Markov chains. More general introductions to the subject are found in Meyn and Tweedie (2009) and Billingsley (2012). Introductions to Markov chains for mcmc are available in Tierney (1994) and Robert and Casella (2004).

A sequence of random variables {x_k}_{k=0}^K is a Markov chain if

P[x_k ∈ A | x_{0:k-1}] = P[x_k ∈ A | x_{k-1}] = ∫_A R(x_{k-1}, dx_k) = ∫_A R(x_{k-1}, x_k) dx_k,

where x_k ∈ X denotes the state of the Markov chain at time k and A denotes some (measurable) subset of X. Here, the Markov kernel R : X × X → [0, 1] assigns a probability to any (measurable) subset given the current state x_{k-1}.

In this thesis, we assume that R(x_{k-1}, dx_k) admits a density, denoted R(x_{k-1}, x_k), with respect to the Lebesgue measure. However, all results presented in the following carry over to the case when the Markov kernel is not absolutely continuous, e.g., when the kernel is a combination of a density and a Dirac distribution.

An important property of the Markov chain is the concept of an invariant distribution. We say that a distribution µ is invariant to the Markov chain if

μ(x_k) = ∫_X μ(x_{k-1}) R(x_{k-1}, x_k) dx_{k-1}, (3.21)

which we write in short as μ = μR. The concept of invariance means that x_k ∼ μ for all future values of k if x_{k-1} ∼ μ. Hence, if the above holds for any k, then the Markov chain has entered its stationary regime. Note that we make use of R as an operator or transform in (3.21). Hence, we can see the Markov kernel as a mapping between the distribution of x_{k-1} and the distribution of x_k.

The invariance property is important as we are interested in sampling from a target distribution using a Markov chain with a specific stationary distribution. The remaining question is how to construct R such that the stationary distribution matches the target π. It turns out that a sufficient condition for the chain to have a stationary distribution μ is

μ(x_{k-1}) R(x_{k-1}, x_k) = μ(x_k) R(x_k, x_{k-1}), for any x_{k-1}, x_k ∈ X. (3.22)

This condition is known as detailed balance and it ensures that the Markov chain is reversible. That is, the statistics of the Markov chain are the same if the direction of time is reversed. To confirm this, we can integrate both sides to recover the invariance property in (3.21), i.e.,

∫_X μ(x_{k-1}) R(x_{k-1}, x_k) dx_{k-1} = ∫_X μ(x_k) R(x_k, x_{k-1}) dx_{k-1}
= μ(x_k) ∫_X R(x_k, x_{k-1}) dx_{k-1}
= μ(x_k),

where the invariance property μR = μ is recovered. Here, detailed balance (3.22) gives the first step and the property ∫_X R(x_k, x_{k-1}) dx_{k-1} = 1 is used in the third step.

Another important concept in Markov chain theory is ergodicity, which is required later to establish certain statistical properties of mcmc algorithms. Intuitively, an ergodic Markov chain can visit any (interesting) part of the state space at any point in time. This is necessary as we would like to make use of the Markov chain to sample from the target. Hence, it must be able to visit all parts of the target with non-zero probability mass to be able to explore it fully. We might end up with problems if the Markov chain is not ergodic, as it can then get stuck in certain parts of the state space and therefore not revisit all areas of the target with some regularity.

It is possible to show that if the Markov chain is ergodic, then it also has a unique invariant distribution satisfying

μ = lim_{k→∞} μ_0 R^k,

for (almost) any initial distribution μ_0. Hence, we have a unique limiting distribution and it is the stationary or invariant distribution of the Markov chain governed by the kernel R. The requirements for ergodicity are irreducibility, aperiodicity and positive recurrence. We say that a chain is (strongly) irreducible if

∫_A R(x_{k-1}, x_k) dx_k > 0,

for any x_{k-1} ∈ X and any (measurable) subset A. Hence, the Markov chain can reach any part of the state space in one step. Aperiodicity roughly means that the Markov chain does not get stuck in cycles where it returns to a set in a number of steps with a certain period. Finally, a Markov chain is positive recurrent if the expected number of visits to any subset of the state space is infinite. The opposite case is known as transience, where the chain stops returning to a certain set after some finite number of steps. We return to what these properties mean in practice when introducing some specific examples of mcmc algorithms.

The remaining question is how to construct the Markov kernel R in practice such that it has some sought target distribution π as its stationary distribution. That is, an ergodic kernel R that fulfils detailed balance for π. We present two different approaches: the mh algorithm and Gibbs sampling. After this, we return to discussing how to make use of the Markov chain generated by these approaches to estimate expectations of test functions with respect to some target.

Metropolis-Hastings

We can construct a Markov chain to sample from some target distribution π(x) = γ(x) Z^{-1} using the mh algorithm (Metropolis et al., 1953; Hastings, 1970). Here, we remind the reader that γ denotes the un-normalised target distribution and Z denotes the often unknown normalisation constant.

Remark 3.7 (Bayesian parameter inference using the mh algorithm). In many applications, we have the parameters θ as the state and the parameter posterior π(θ) = p(θ | y) given by Bayes' theorem (2.10) as the target. In this setting, the normalisation constant is the marginal likelihood p(y). Furthermore, we have the un-normalised target γ(θ) = p(y | θ) p(θ), where p(y | θ) and p(θ) denote the likelihood and the prior distribution, respectively.

The mh algorithm consists of an iterative scheme in which we propose a new state of the Markov chain, called the candidate state x', from some proposal distribution q(x' | x_{k-1}) admitting a density with respect to the Lebesgue measure. The candidate state is then accepted or rejected according to some acceptance probability. In summary, we carry out the following operations during iteration k:

a) Sample the proposal x' ∼ q(x' | x_{k-1}).

b) Set the next state

x_k = x' with probability α(x_{k-1}, x'),
x_k = x_{k-1} with probability 1 − α(x_{k-1}, x').

These steps are repeated until we obtain K samples from π, denoted {x_k}_{k=1}^K. The acceptance probability is given by

α(x_{k-1}, x') = 1 ∧ [π(x') q(x_{k-1} | x')] / [π(x_{k-1}) q(x' | x_{k-1})] = 1 ∧ [γ(x') q(x_{k-1} | x')] / [γ(x_{k-1}) q(x' | x_{k-1})], (3.23)

where a ∧ b ≜ min(a, b), assuming the proposal can be expressed as a density. Note that the normalisation constant Z cancels and only point-wise evaluation of the un-normalised target is required for computing the acceptance probability.

Typically, the Markov chain constructed by the mh algorithm explores the posterior distribution by local moves, thus exploiting the previously accepted state. Hence, it focuses its attention on areas of the state space in which the posterior assigns a relatively large probability mass. We can see this from (3.23): a proposed state x' is always accepted (neglecting the influence of the proposal) if it results in a larger value of the target compared with x_{k-1}. Furthermore, we accept a proposed state with some probability if it results in a small decrease of the target compared with the previous iteration. As a result, the mh sampler allows the Markov chain both to find and to explore areas of high posterior probability. Hence, the mh algorithm can possibly escape local extrema if the target is multi-modal, which is a problem for many local optimisation algorithms used in numerical maximum likelihood inference. It is also easier to construct a good proposal for a high-dimensional target using local moves, as in the mh algorithm, compared with the importance sampling algorithm.
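A compact sketch of the mh algorithm with a Gaussian random walk proposal (in Python; the un-normalised bimodal target and the step size are our own illustrative choices, not an example from the thesis):

import numpy as np

rng = np.random.default_rng(13)

def log_gamma(x):
    # Un-normalised bimodal target: mixture of N(-2, 1) and N(2, 1).
    return np.logaddexp(-0.5 * (x + 2.0)**2, -0.5 * (x - 2.0)**2)

K, step = 10000, 1.0
chain = np.zeros(K)
x = 0.0
for k in range(K):
    xp = x + step * rng.standard_normal()               # Gaussian random walk proposal
    log_alpha = min(0.0, log_gamma(xp) - log_gamma(x))  # (3.23); symmetric proposal cancels
    if np.log(rng.uniform()) < log_alpha:
        x = xp                                          # accept the candidate state
    chain[k] = x                                        # otherwise keep the previous state

print(chain[2000:].mean())                              # estimate after discarding burn-in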

In the mh algorithm, we can express the Markov kernel by

R(x_{k-1}, dx_k) = α(x_{k-1}, x_k) q(x_k | x_{k-1}) dx_k + [1 − ∫_X α(x_{k-1}, z) q(z | x_{k-1}) dz] δ_{x_{k-1}}(dx_k), (3.24)

which cannot be expressed as a density due to the fact that the Dirac distribution is not absolutely continuous. Hence, we write the kernel and the proposal as measures in this case. This kernel satisfies the detailed balance condition (3.22), which can be verified by

π(x_{k-1}) q(x_k | x_{k-1}) α(x_{k-1}, x_k) = π(x_{k-1}) q(x_k | x_{k-1}) [1 ∧ (π(x_k) q(x_{k-1} | x_k)) / (π(x_{k-1}) q(x_k | x_{k-1}))]
= π(x_{k-1}) q(x_k | x_{k-1}) ∧ π(x_k) q(x_{k-1} | x_k)
= π(x_k) q(x_{k-1} | x_k) α(x_k, x_{k-1}),

by using the definition of the acceptance probability (3.23). This results in the target distribution being the stationary distribution of the Markov chain. Finally, the Markov chain generated by the mh algorithm is ergodic if the proposal q(x' | x) > 0 for all x', x ∈ X. That is, the probability of reaching any set of the state space in one or a few steps is non-zero. Further details are provided by Tierney (1994) and Robert and Casella (2004).

The performance of the mh algorithm depends on the choice of the proposal and the resulting acceptance rate. Two common choices of proposals are the independent Gaussian proposal and the Gaussian random walk given by

q(x' | x) = q(x') = N(x'; μ, Σ), q(x' | x) = N(x'; x, Σ),

for some mean vector μ and covariance matrix Σ. Note that the independent proposal cannot exploit the previously accepted state, which can be problematic if the proposal does not match the target well. If the match is good, it is always possible to instead use an importance sampling algorithm, which is probably the better choice. The Gaussian random walk is the typical choice in applications, but a version based on the Student's t distribution is also common.

The choice of Σ determines the performance of the mh algorithm. If it is too small, the algorithm tends to accept proposed steps, but as each step is small this results in a large autocorrelation in the chain. Moreover, the probability of accepting a large step is also low, which results in a large autocorrelation as well. In the mcmc literature, this is referred to as bad mixing of the Markov chain, see Figure 5 (page 136) in Paper A for an illustration of good and bad mixing. A good choice of the proposal is therefore crucial to obtain a small autocorrelation in the Markov chain and (as we shall see) accurate estimates of the target.

Gibbs sampling

Another approach to constructing a Markov kernel to sample the target distribution of interest is to use Gibbs sampling (Geman and Geman, 1984). A major difference with the mh algorithm is that all draws from the proposal distribution are accepted by the algorithm. However, this does not mean that Gibbs sampling is superior to other types of mcmc algorithms. It is actually possible to show that Gibbs sampling is a special case of the mh algorithm, see Robert and Casella (2004, p. 381). Gibbs sampling can only be implemented for models that admit conditional distributions with a special structure. It can also get stuck if some states are highly correlated with each other. The main advantage is that no tuning is required by the user, which makes implementation easier.

In Gibbs sampling, we sample the full conditional distribution for each element of the state vector while keeping the other elements fixed to their current value. Hence, we repeat the following step during iteration k:

x_{k,i} ∼ π_i(x_i | x_{k,1}, …, x_{k,i-1}, x_{k-1,i+1}, …, x_{k-1,p}),

for each element of the state i = 1, …, p. Here, we denote the full conditional distribution for x_i under the target distribution π(x) by π_i(x_i | ·). The resulting Markov kernel is given by

R(x_{k-1}, x_k) = ∏_{i=1}^p π_i(x_{k,i} | x_{k,1}, …, x_{k,i-1}, x_{k-1,i+1}, …, x_{k-1,p}). (3.25)

It is possible to show (Robert and Casella, 2004, p. 345) that π(x) = π(x_1, x_2, …, x_p) is the invariant distribution of the Markov chain governed by (3.25). Furthermore, the kernel generates an ergodic chain if the density π(x) satisfies the positivity condition. That is, if π(x) > 0 for all x_i such that π_i(x_i) > 0 for i = 1, 2, …, p. Here, we denote the marginal density of x_i under π(x) by π_i(x_i). In summary, we have that the Gibbs sampler generates samples from the target distribution in its stationary regime.
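As a minimal sketch of Gibbs sampling (in Python; the bivariate Gaussian target with correlation ρ is our own toy example), each component is drawn from its full conditional in turn:

import numpy as np

rng = np.random.default_rng(17)
rho, K = 0.8, 10000           # correlation of the bivariate Gaussian target
s = np.sqrt(1.0 - rho**2)     # conditional standard deviation
chain = np.zeros((K, 2))
x1, x2 = 0.0, 0.0
for k in range(K):
    x1 = rho * x2 + s * rng.standard_normal()   # x1 | x2 ~ N(rho * x2, 1 - rho^2)
    x2 = rho * x1 + s * rng.standard_normal()   # x2 | x1 ~ N(rho * x1, 1 - rho^2)
    chain[k] = x1, x2

print(np.corrcoef(chain[1000:].T)[0, 1])        # close to rho = 0.8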

Compared with the mh algorithm, the full conditionals depend on the problem at hand and Gibbs sampling cannot be used for all models. It is common to make use of conjugate priors to obtain a posterior π(x) with the necessary conditionals. In Example 3.8, we show how to apply Gibbs sampling for inference in the model regarding the ideological leanings of us Supreme Court justices.

In practice, blocking and partial collapsing are used to increase performance. Blocking means that we sample more than one parameter at the same time, e.g.,

(x_{k,1}, x_{k,2}) ∼ π_{1,2}(x_{k,1}, x_{k,2} | x_{k-1,3}), x_{k,3} ∼ π_3(x_{k,3} | x_{k,1}, x_{k,2}).

An example of partial collapsing is the update

x_{k,1} ∼ π_1(x_{k,1} | x_{k-1,2}), x_{k,2} ∼ π_2(x_{k,2} | x_{k,1}), x_{k,3} ∼ π_3(x_{k,3} | x_{k,1}, x_{k,2}).

Both approaches can result in a significant increase in performance. However, the design is problem specific and care needs to be taken to ensure that the sampler converges to the desired target.

Example 3.8: Voting behaviour in the US Supreme court (cont. from p. 31) We follow Martin et al. (2011) and assume the following priors

α_t ∼ N(α_t; 0_2, 0.25 I_2), x_i ∼ N(x_i; 0, 1),

for each justice i = 1, …, n and case t = 1, …, T. From these choices of priors, we can compute the conditional posteriors as presented in Example 2.6, see Albert (1992) for the

Statistical properties

In this section, we summarise some of the statistical results that mcmc algorithms rely upon and briefly mention their underlying assumptions. For more information about the properties of mcmc algorithms in general, see e.g., Tierney (1994), Robert and Casella (2004) and Meyn and Tweedie (2009). A natural estimator of $\pi[\varphi]$ for any integrable test function $\varphi$ using the Markov chain generated by an mcmc algorithm is given by

$$\widehat{\pi}^{K}_{\text{mcmc}}[\varphi] = \frac{1}{K} \sum_{k=1}^{K} \varphi(x_k), \tag{3.26}$$

where $x_k$ denotes the state of the Markov chain at time step $k$ with $\pi$ as its stationary distribution. Intuitively, this means that the time spent by the Markov chain in a particular region is proportional to the probability mass allocated to that region. Hence, we can map the distribution of probability mass in a target distribution by observing how the Markov chain spends its time in the state space, e.g., using a histogram. A practical problem is that the estimator (3.26) only holds for samples from the Markov chain when it is in stationarity. That is, when the distribution of the Markov chain is the limiting or stationary distribution, which usually is the target distribution of interest. In practice, it is difficult to assert when this occurs and often the Markov chain does not reach stationarity for many iterations.

However, it is common to run the Markov chain during a burn-in (or warm-up) phase and discard the samples from this period. After the burn-in, we assume that all the samples are from the target distribution. Note that there are a number of convergence tests and similar diagnostics available. More information is found in Section 6 (page 142) of Paper A and in Robert and Casella (2009, p. 242).

By the ergodic theorem (Tierney, 1994; Robert and Casella, 2004), we know that the estimator (3.26) is strongly consistent, i.e.,

$$\widehat{\pi}^{K}_{\text{mcmc}}[\varphi] \xrightarrow{\text{a.s.}} \pi[\varphi],$$

when $K \to \infty$. Note that this property does not follow directly from the strong law of large numbers (slln), as the samples obtained from the target are not iid; the states of the Markov chain are correlated.

Furthermore, it is possible to form a clt for the estimator under some additional assumptions, see Jones (2004). Usually, we assume that the Markov chain is uniformly ergodic, i.e.,

$$\big\| R^k(x_0, \cdot) - \pi \big\|_{\text{tv}} < C \rho^{-k},$$

for some $C < \infty$, $\rho > 1$ and any $x_0 \in \mathsf{X}$. Here, $\|\cdot\|_{\text{tv}}$ denotes the total variation norm given by

$$\| \mu - \pi \|_{\text{tv}} \triangleq \frac{1}{2} \int_{\mathsf{X}} \big| \mu(x) - \pi(x) \big| \, dx,$$

when both densities are dominated by the Lebesgue measure. This means that for any starting point $x_0$, we converge to the stationary distribution at a geometric rate. This is a stronger condition than geometric convergence, where the same convergence holds for most $x_0 \in \mathsf{X}$. Given uniform/geometric convergence, we can find a clt given by

$$\sqrt{K} \Big( \widehat{\pi}^{K}_{\text{mcmc}}[\varphi] - \pi[\varphi] \Big) \xrightarrow{d} \mathcal{N}\big(0, \sigma^2_{\text{mcmc}}\big),$$

when $K \to \infty$. Here, $\sigma^2_{\text{mcmc}}$ denotes the variance of the estimator given by

$$\sigma^2_{\text{mcmc}} = \mathbb{V}\Big[ \widehat{\pi}^{K}_{\text{mcmc}}[\varphi] \Big] = \pi\big[ \widetilde{\varphi}^2(x_0) \big] + 2 \sum_{k=1}^{\infty} \pi\big[ \widetilde{\varphi}(x_0)\, \widetilde{\varphi}(x_k) \big],$$

where we introduce $\widetilde{\varphi}(x) \triangleq \varphi(x) - \pi[\varphi(x)]$ for brevity. We can rewrite this as

$$\sigma^2_{\text{mcmc}} = \pi\big[ \widetilde{\varphi}^2(x_0) \big] \cdot \text{iact}(x_{1:K}),$$

where the integrated autocorrelation time (iact) is given by

$$\text{iact}(x_{1:K}) \triangleq 1 + 2 \sum_{k=1}^{\infty} \rho_k(x_{1:K}).$$

Figure 3.8. The estimates of the marginal posteriors for $x_{1:9}$ in the IRM for the Supreme Court justice data, which represent the liberal/conservative score for each justice (one panel per justice: Scalia, Kennedy, Thomas, Ginsburg, Breyer, Roberts, Alito, Sotomayor and Kagan).

Figure 3.9. Estimates of the parameter posterior distribution for $\phi$ (green), $\alpha$ (orange), $\beta$ (purple) and $\sigma_e$ (magenta) in the Phillips curve model using the Swedish data. The prior distributions are indicated by grey curves and the posterior means by dotted lines.

Here, $\rho_k(x)$ denotes the $k$-lag autocorrelation function (acf) of $\varphi(x_k)$. In practice, we have to estimate the iact by approximating it using the sample estimates, see Section 6.3 (page 142) in Paper A for more information. The iact can be interpreted as the number of iterations between each independent sample and it is connected with the mixing property. Hence, a small value of the iact means that the Markov chain mixes well, nearby samples are almost uncorrelated and the asymptotic variance is small. Note that an importance sampling algorithm makes use of independently proposed particles and its iact is therefore one. However, the non-equal weights in the importance sampling estimator result in other inefficiencies.
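In code, a simple truncated estimator of the iact from a stored chain can be written as follows (a sketch; the truncation lag max_lag is a tuning choice, not something prescribed by the thesis):

```python
import numpy as np

def estimate_iact(chain, max_lag=100):
    """Estimate iact = 1 + 2 * sum_k rho_k by truncating the sum of
    sample autocorrelations at max_lag."""
    x = np.asarray(chain, dtype=float) - np.mean(chain)
    denom = np.dot(x, x)
    acf = [np.dot(x[:-k], x[k:]) / denom for k in range(1, max_lag + 1)]
    return 1.0 + 2.0 * np.sum(acf)
```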

Remark 3.9 (Optimal Gaussian random walk proposal for the mh algorithm). It is possible to compute the optimal Gaussian random walk proposal for a Gaussian target distribution by analytically minimising the iact. The resulting proposal (Roberts et al., 1997) is given by

$$q(x' | x) = \mathcal{N}\left(x'; x, \frac{2.38^2}{p} \widehat{\Sigma}\right), \tag{3.27}$$

where $\widehat{\Sigma}$ denotes an estimate of the posterior covariance and $p$ denotes the number of parameters in the model. The covariance is often unknown but it can be estimated using pilot runs. However, this is problematic as its estimation essentially means that we already have a Markov chain with good mixing.

It is also possible to calculate the resulting acceptance probability, which is 0.234 in the case of a Gaussian target when using the optimal proposal. The performance of (3.27) can be poor if the target deviates from a Gaussian distribution, e.g., if the target is more heavy-tailed than a Gaussian or is multi-modal. In such cases, it is possible to find better proposals. However, by the Bernstein-von-Mises theorem from Section 2.3, the target concentrates asymptotically to a Gaussian as the amount of data increases, and the Gaussian assumption is then valid.
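As a small illustration of the rule of thumb in (3.27), the proposal covariance can be formed from pilot-run samples as follows (a sketch; pilot_samples is assumed to come from some preliminary chain):

```python
import numpy as np

def rw_proposal_covariance(pilot_samples):
    """Scale the empirical covariance of pilot-run samples by 2.38^2 / p,
    following the optimal Gaussian random walk rule of thumb (3.27)."""
    pilot = np.atleast_2d(np.asarray(pilot_samples, dtype=float))
    p = pilot.shape[1]
    return (2.38 ** 2 / p) * np.cov(pilot, rowvar=False)

# Hypothetical usage, for a p-dimensional parameter (p >= 2):
# rng = np.random.default_rng(0)
# x_candidate = rng.multivariate_normal(x_current,
#                                       rw_proposal_covariance(pilot_samples))
```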

3.3 Pseudo-marginal Metropolis-Hastings

In Section 3.2.3, we illustrated how to make use of Markov chains to sample from some complicated target distribution denoted by $\pi(x)$. In the mh algorithm, it is required that we can evaluate $\pi(x)$ or its un-normalised version $\gamma(x)$ point-wise in the computation of the acceptance probability (3.23). However, we cannot implement the mh algorithm when this is not possible.

Instead, we can often make use of importance sampling or smc to obtain un-biased but noisy estimates of the target. These estimators are exploited by Beaumont (2003) to approximate the acceptance probability in the mh algorithm by replacing the unknown target distribution with a non-negative and un-biased estimate. We refer to the resulting algorithm as a pseudo-marginal mh (pmmh) algorithm. It is proved by Andrieu and Roberts (2009) that this is a valid approach and that the pmmh algorithm has similar convergence properties as its exact counterpart. For example, the distribution of the Markov chain converges to the sought target distribution.

Note that the assumption of non-negativity is important in what follows to obtain a so-called exact approximation of the mh algorithm. That is, an algorithm that returns samples from the sought posterior. As previously mentioned, smc algorithms can provide one such valid estimator, but finding such estimators is difficult in general. Assume that we have an un-biased estimator of some quantity $\alpha \in \mathbb{R}$; then there does not exist any algorithm that can provide an un-biased estimator of $h(\alpha) \in \mathbb{R}_+$ for a non-constant function $h : \mathbb{R} \to \mathbb{R}_+$. This fundamental result was recently established by Jacob and Thiery (2015). The implication of this result is that exact approximations of algorithms can be difficult to obtain using approximate Bayesian computations (abc). As a consequence, we might have to sacrifice un-biasedness to obtain inference algorithms with computational tractability. We return to discussing this problem in Section 4.2.

Remark 3.10 (Bayesian parameter inference in non-linear SSMs). From Chapter 2, we know that the likelihood is intractable for a non-linear ssm. Therefore, we cannot evaluate the target distribution or its un-normalised version as $p(y|\theta)$ is unknown. The key point with pmmh is that the un-biased estimate from the bPF discussed in Remark 3.1 can be used as a plug-in estimator for $p(y|\theta)$.

One approach to show the validity of the pmmh algorithm is to consider it to be a standard mh algorithm targeting an extended target distribution. To see this, we assume that there exists a non-negative and un-biased estimator of the un-normalised target $\gamma(x)$ such that

$$\mathbb{E}_u\Big[\widehat{\gamma}^N(x|u)\Big] = \int_{\mathsf{U}} \widehat{\gamma}^N(x|u)\, m(u)\, du = \gamma(x), \tag{3.28}$$

where $u \in \mathsf{U}$ denotes the multivariate random samples, with density $m(u)$ with respect to the Lebesgue measure, used to construct this estimator. Remember that the normalisation constant $Z$ cancels in the acceptance probability for the mh algorithm. Hence, it does not matter that we can only estimate the value of the un-normalised target in this setting.

Remark 3.11 (The random samples u). In the pmmh algorithm, the variable $u$ contains the random variables used to estimate the target. These correspond to samples from the proposal distribution when an importance sampler is applied to estimate the target. In the smc algorithm, the random variables are used in the resampling step and for propagation. Hence, we can see an smc algorithm such as the bpf as a deterministic algorithm given $u$. In this case, we refer to the resulting algorithm as the particle mh (pmh) algorithm. It was first introduced by Fernández-Villaverde and Rubio-Ramírez (2007) and later analysed by Andrieu et al. (2010).

Furthermore, we assume a proposal distribution for $x$ and $u$ such that

$$q(x', u' | x, u) = q_x(x' | u, x)\, q_u(u' | u). \tag{3.29}$$

Note that the random variables $u$ can be used when proposing the candidate state $x'$. We will find this useful in Chapter 4, when we propose extensions of the pmmh algorithm to improve mixing.

Finally, we can introduce the mh algorithm operating on the extended space $\mathsf{X} \times \mathsf{U}$ with the extended target given by

$$\gamma(x, u) = \widehat{\gamma}^N(x|u)\, m(u). \tag{3.30}$$

As a result, we can recover the original target by marginalisation, thanks to the un-biasedness property in (3.28). Note that the resulting mh algorithm with the target (3.30) and proposal (3.29) has an acceptance probability given by

$$\alpha(x, u, x', u') = 1 \wedge \frac{\widehat{\gamma}^N(x'|u')\, q(x, u \mid x', u')}{\widehat{\gamma}^N(x|u)\, q(x', u' \mid x, u)}. \tag{3.31}$$

This acceptance probability is the same as for the mh algorithm, but with the target replaced by an un-biased estimator of it and with an extended proposal. Note that this is not a formal proof of validity; the interested reader is referred to Andrieu and Roberts (2009) for a more detailed presentation. To recapitulate, the pmmh algorithm consists of an iterative scheme in which we propose $x'$ and $u'$ using (3.29) and accept with the probability given by (3.31). Hence, we carry out the following operations during iteration $k$:

a) Sample the state proposal $x' \sim q_x(x' \mid x_{k-1}, u_{k-1})$.

b) Sample the auxiliary variable proposal $u' \sim q_u(u' \mid u_{k-1})$.

c) Compute the estimate of the un-normalised target $\widehat{\gamma}^N(x' \mid u')$.

d) Set the next state

$$\{x_k, u_k\} = \begin{cases} \{x', u'\} & \text{with probability } \alpha(x_{k-1}, u_{k-1}, x', u'), \\ \{x_{k-1}, u_{k-1}\} & \text{with probability } 1 - \alpha(x_{k-1}, u_{k-1}, x', u'). \end{cases}$$

These steps are repeated until we obtain $K$ samples from $\pi(x, u)$, denoted by $\{x_k, u_k\}_{k=1}^{K}$. The statistical properties of the pmmh algorithm are similar to those of the mh algorithm. However, the performance of pmmh often depends on the number of samples $N$ used to compute the point-wise estimates of the un-normalised target. If $N$ is too small, the variance of the estimates is large and the Markov chain therefore often gets stuck, resulting in a low acceptance rate. We also know that $N$ is connected to the computational cost of estimating the target.
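Steps (a)-(d) translate directly into code. The sketch below (ours) uses a symmetric Gaussian random walk for $x$ and an independent proposal for $u$, in which case the acceptance ratio (3.31) reduces to the ratio of the target estimates; log_gamma_hat is an assumed user-supplied function, e.g., a particle filter log-likelihood driven by the fixed random numbers in $u$.

```python
import numpy as np

def pmmh(log_gamma_hat, x0, n_u, K, step=0.1, seed=0):
    """Pseudo-marginal MH following steps (a)-(d) above.

    log_gamma_hat(x, u) must return an estimate of log gamma(x) built
    from the standard Gaussian variables in u. With a symmetric random
    walk for x and an independent proposal for u, the acceptance ratio
    (3.31) reduces to gamma_hat(x'|u') / gamma_hat(x|u).
    """
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    u = rng.standard_normal(n_u)
    ll = log_gamma_hat(x, u)
    chain = np.zeros((K, x.size))
    for k in range(K):
        x_prop = x + step * rng.standard_normal(x.size)   # step (a)
        u_prop = rng.standard_normal(n_u)                 # step (b)
        ll_prop = log_gamma_hat(x_prop, u_prop)           # step (c)
        if np.log(rng.uniform()) < ll_prop - ll:          # step (d): accept
            x, u, ll = x_prop, u_prop, ll_prop
        chain[k] = x
    return chain
```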

Remark 3.12 (Optimal Gaussian random walk proposals and selecting N in the pmh algorithm). There is a trade-off between the number of pmmh/pmh iterations and the computational complexity of the estimator of the target. The overall aim is to minimise the total computational cost of the algorithm. This problem is analysed for the pmh algorithm by Doucet et al. (2015) and Pitt et al. (2012). Their conclusion is to select $N$ such that the variance of the log-target estimates is between 1.0 and 1.7, depending on the efficiency of the proposal for $\theta$. Furthermore, it is possible to construct optimal Gaussian random walk proposals (minimising the iact for a Gaussian target) for the pmh algorithm in analogy with the mh algorithm. This is investigated by Sherlock et al. (2015) and the resulting proposal is given by

$$q(x' | x) = \mathcal{N}\left(x'; x, \frac{2.562^2}{p} \widehat{\Sigma}\right), \tag{3.32}$$

where $\widehat{\Sigma}$ again denotes an estimate of the posterior covariance and $p$ denotes the dimension of the parameter vector. The resulting acceptance probability is 0.07 in the case of a Gaussian target when using the optimal proposal. Compared with the mh algorithm, this acceptance rate is much smaller, which is due to the noise in the likelihood estimator.

Example 3.13: How does unemployment affect inflation? (cont. from p. 57)

We are now interested in estimating the parameter posterior $\pi(\theta)$ where $\theta = \{\phi, \alpha, \beta, \sigma_e\}$. To do this, we need to augment the model with prior distributions for each parameter, where we choose

$$\phi \sim \mathcal{TN}_{(-1,1)}\big(\phi; 0.8, 0.1^2\big), \qquad \alpha \sim \mathcal{N}\big(\alpha; 0.5, 0.2^2\big),$$
$$\beta \sim \mathcal{N}\big(\beta; 0, 0.1^2\big), \qquad \sigma_e \sim \mathcal{G}\big(\sigma_e; 2, 4\big),$$

where $\mathcal{G}(a, b)$ denotes the Gamma distribution with expected value $a/b$. To sample from the posterior, we employ the pmh algorithm with the bPF to provide estimates of the likelihood. We make use of $N = 1{,}000$ particles in the bpf, $K = 15{,}000$ iterations in the pmh algorithm and discard the first $5{,}000$ iterations as burn-in. We select an independent proposal for $u$ given by $u \sim \mathcal{N}(0, 1)$ and the random walk proposal in (3.32) for $\theta$. The posterior covariance is estimated using a pilot run and it is given by

$$\widehat{\Sigma} = 10^{-3} \cdot \begin{pmatrix} 7 & 2 & 1 & 0 \\ 2 & 48 & 3 & -1 \\ 1 & 3 & 1 & 0 \\ 0 & -1 & 0 & 0.05 \end{pmatrix}.$$

In Figure 3.9, we present the marginal parameter posteriors for the four parameters in the model. We note that the Phillips curve hypothesis is not supported by this data set, as there seems to be little support for $\beta$ being negative. Remember that the sign of $\beta$ determines the correlation between the inflation and unemployment rates. The Phillips curve hypothesis states that this correlation should be negative, which implies a negative value for $\beta$.

Furthermore, it is possible to make use of the pmh algorithm to estimate the nairu by averaging over the state of the Markov chain. This means that we can compute the posterior of the latent state with the parameters marginalised out. Hence, we take into account the uncertainty in the parameter posterior when estimating the state. This is different from the approach in Example 3.3 on page 54, where the parameters are fixed to a single value. We present the resulting estimate of the nairu using pmh in the lower part of Figure 3.6 on page 53. Note that the estimates from the bPF and the pmh algorithms are similar. We return to this model in Example 4.1 on page 83.

3.4 Outlook and extensions

We conclude this chapter with an outlook and some possible extensions to the methods that we have introduced. As discussed in the beginning of the chapter, there are two main approaches to approximate intractable posterior distributions. We have presented the approach based on statistical simulation in the form of Monte Carlo methods. However, variational inference can be useful in applications where speed is important. This approach is often based on approximating the posterior by a Gaussian (or some other member of the exponential family of distributions) and makes use of simple updates in analogy with conjugate priors. Good general introductions to these methods are provided in the books by Bishop (2006) and Murphy (2012). A recent survey aimed at statisticians is provided by Blei et al. (2016).

The main difficulty in using importance sampling is to find a suitable proposal distribution to sample from. This is especially difficult in high-dimensional problems, where smc and mcmc can be better alternatives. However, there are interesting methods in the literature to adapt and/or combine several proposal distributions in importance sampling, see Cornuet et al. (2011) and Veach and Guibas (1995). This idea can also be used in smc algorithms as proposed by Kronander and Schön (2014). Another approach to construct better proposals for smc algorithms is introduced by Naesseth et al. (2015). Population Monte Carlo makes use of similar ideas and can be an interesting alternative, see Cappé et al. (2004). It is also possible to make use of smc algorithms for parameter inference in ssms and other interesting models. One such algorithm is the smc2 algorithm proposed by Chopin et al. (2013). Finally, quasi-random numbers can be useful in both importance sampling and smc to decrease the variance of the estimates, see Gerber and Chopin (2015).

Adaptive algorithms have also been developed for the mh and pmh algorithms, see Andrieu and Thoms (2008) and Peters et al. (2010). These approaches are often based on the rule-of-thumb for selecting the optimal proposal and compute the unknown covariance matrix on-the-fly. It is also possible to make use of information regarding the gradient and Hessian of the log-posterior to tune the proposal. This approach was proposed by Girolami and Calderhead (2011) for the mh algorithm and we return to it in the context of the pmh algorithm in Chapter 4.

Finally, there are a large number of additional mcmc algorithms which are not discussed in this thesis. Hamiltonian mcmc (hmc; Duane et al., 1987) is an interesting approach to sample from high-dimensional targets using simulated Hamiltonian dynamics, see Neal (2010). A big challenge is to create a pseudo-marginal version of this algorithm for parameter inference in latent variable models such as the ssm. Sequential mcmc has been discussed by e.g., Brockwell et al. (2010) as an alternative to pmh. Slice sampling, introduced by Neal (2003), is also an interesting and popular alternative. A pseudo-marginal version of slice sampling was recently proposed by Murray and Graham (2015). Finally, mcmc algorithms can also be useful for optimisation and a pseudo-marginal scheme for this is proposed by Finke (2015).

4 Strategies for accelerating inference

We know from the previous chapters that Monte Carlo methods are useful for enabling Bayesian inference in many interesting models. A drawback with these approaches is that a large number of samples from the posterior is usually required to obtain reasonable accuracy. This is often not a problem for simpler Monte Carlo methods, where samples can be generated efficiently. However, for more complicated models it can take considerable time to generate a sample.

There are two main approaches to mitigate this problem: (i) decreasing the computational cost of generating a sample or (ii) decreasing the required number of samples by making better use of them. We explore both strategies in this chapter to accelerate inference, i.e., decrease the total time for the inference, for some problems. We later show how these strategies are employed in the papers included in this thesis. Another aspect for the user is the time and complexity of implementing an inference algorithm for a new model. It can therefore be beneficial to apply simpler methods that are quicker to implement and tune than more elaborate algorithms, which may be more efficient but take a longer time to get up and running. The total time for the entire inference can therefore be shorter in the former case than in the latter, even if the advanced algorithm is more efficient.

The statistical procedure of fitting a model to data consists of three steps, as discussed in the beginning of Chapter 2: collecting data, selecting a model and inferring the parameters of the model given the data. In this chapter, we propose some strategies for accelerating inference in each of these three steps. That is, we start with how to generate and collect the data, then how to change the model and lastly how to change the inference algorithm to make inference faster and easier.


4.1 Increasing the amount of information in the data

The first approach to accelerate inference is to make the data more informative about the parameters. If this is accomplished, we can obtain accurate estimates of the parameters using a smaller number of observations $T$. In the smc algorithm, this results in a linear decrease of the computational cost (actually quadratic, since the number of particles $N$ typically needs to scale linearly with $T$, which results in a computational cost proportional to $NT \propto T^2$) that also carries over to the pmh algorithm and similar methods. In Paper H, we present results that illustrate how more informative data can actually increase the convergence rate of the em algorithm.

A common approach to make data more informative is the use of an input signal which excites the system. In the field of input design, the main objective is to construct this input $u = \{u_t\}_{t=1}^T$ such that the observations $y = \{y_t\}_{t=1}^T$ are as informative as possible. The amount of information in the observations $y$ is described by the Fisher information matrix $\mathcal{I}(\theta^\star)$, see (2.17), which is the curvature of the log-likelihood at the true parameter $\theta^\star$. A larger Fisher information matrix also results in a smaller asymptotic variance of the Bayes and maximum likelihood estimators, see (2.19). To quantify the size of this matrix, we usually employ the logarithm of its determinant, denoted by $h(\mathcal{I}(\theta^\star)) = \log \det \mathcal{I}(\theta^\star)$. A suitable input can then be computed by

$$u^\star = u(\alpha^\star), \qquad \alpha^\star = \operatorname*{argmax}_{\alpha} \; h\big(\mathcal{I}(\theta^\star, u(\alpha))\big),$$

for some family of inputs $u(\alpha)$ parametrised by $\alpha$ and where $\mathcal{I}(\theta^\star, u)$ denotes the Fisher information matrix when the input $u$ is applied.

We encounter two main problems with this approach for finding $u^\star$, namely: (i) how do we parametrise a good input $u(\alpha)$ for a specific model and (ii) how do we compute/estimate $\mathcal{I}(\theta^\star, u)$. We revisit these problems in Paper H. For problem (i), we make use of graph theory to construct a number of basis inputs. The optimal input signal is formed by finding the convex combination of these basis inputs, with weighting $\alpha$, such that the size of the Fisher information matrix is maximised; a sketch of this step is given below. For problem (ii), the Fisher information matrix is estimated for each basis input by using smc methods and the Fisher identity introduced in Remark 3.5. The major challenge is to obtain accurate estimates of $\mathcal{I}(\theta^\star, u)$ at a reasonable computational cost. We propose to accomplish this by adapting an estimator introduced by Segal and Weinstein (1989) for the Kalman filter to the smc framework. However, the Louis identity from Remark 3.5 can also be used. The main problem with this alternative is that the estimates of the information matrix often are negative definite. The estimates from the approach proposed by Segal and Weinstein (1989) are always pd.
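As referenced above, a minimal sketch of the convex weighting step is given below, assuming the Fisher information matrix of each basis input has already been estimated (the function and variable names are ours, not from Paper H):

```python
import numpy as np
from scipy.optimize import minimize

def optimal_input_weights(fisher_matrices):
    """Maximise log det of a convex combination of estimated Fisher
    information matrices (one per basis input) over the simplex."""
    fims = [np.asarray(f, dtype=float) for f in fisher_matrices]
    m = len(fims)

    def neg_log_det(alpha):
        combined = sum(a * f for a, f in zip(alpha, fims))
        sign, logdet = np.linalg.slogdet(combined)
        return np.inf if sign <= 0 else -logdet

    res = minimize(
        neg_log_det,
        x0=np.full(m, 1.0 / m),                  # start from uniform weights
        bounds=[(0.0, 1.0)] * m,
        constraints=[{"type": "eq", "fun": lambda a: np.sum(a) - 1.0}],
        method="SLSQP",
    )
    return res.x
```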

4.2 Approximating the model

The second approach that we consider to accelerate inference is to approximate the model and then make use of standard algorithms for inference. Note that the approximation of the model typically results in a bias in the estimates. However, it could potentially have a large impact on the computational cost and can enable inference in otherwise intractable models. Here, we discuss using: (a) sparseness priors for model order selection, (b) approximate Bayesian computations for ssms with intractable likelihoods and (c) a surrogate function to approximate the log-likelihood and/or log-posterior.

Over-parametrised models with sparseness priors

Model order selection is an important problem in statistics. For example, it is a challenging task to determine a suitable model order $p$ of an ar process (2.6) to best fit a given data set. In Bayesian inference, we can introduce the model order as a parameter to be inferred together with the parameters of the model. This type of inference can be carried out by the reversible-jump Metropolis-Hastings (rjmh) algorithm introduced by Green (1995). However, it can be challenging to find proposal distributions for $p$ and $\theta$ that result in reasonable mixing. Furthermore, the implementation can be challenging even for simple models. See Paper F for a concrete example where we infer $p$ for an arx process using rjmh.

An alternative approach is to make use of an over-parametrised model and employ sparseness priors to penalise using more parameters than supported by the data. This is similar to the use of regularisation in linear regression as discussed in Remark 2.5. The surprising result is that this can provide consistent estimates in some cases, as proven by Rousseau and Mengersen (2011). That is, the posterior distributions of the parameters and the model order are asymptotically the same as when using an exact approach such as the rjmh algorithm. The main benefit is that sparsity in some cases can be induced by using conjugate priors. This enables the use of Gibbs sampling, which can be simpler to implement compared with the rjmh algorithm and can also result in better mixing. Hence, we can say that this accelerates inference in some types of models where the model order is unknown.

A concrete example of this is again model order selection in arx models as discussed in Paper F. Here, we make use of ard priors to induce sparsity. This corresponds to a hierarchical model where a zero-mean Gaussian prior is selected for the ar coefficients, i.e., $\phi_k \sim \mathcal{N}(0, \sigma_0^2)$. Moreover, we assume that $\sigma_0^2 \sim \mathcal{IG}(a_0, b_0)$, where the shape $a_0 > 0$ and scale $b_0 > 0$ are so-called hyperparameters. This is similar to the $\ell_2$-rls discussed in Remark 2.5 on page 30, with the major difference that the prior variance is not fixed but estimated from the data.

Another example is mixture models to capture heterogeneity in the individual random effects in panel data. In the mixed effects model (2.9), we assumed that the random effects were distributed according to a Gaussian distribution with some unknown mean and variance. In some applications, it would be interesting to allow for multi-modal distributions of the random effects, where each mode captures the behaviour of a certain subgroup of the individuals. Two illustrations of this are presented in Figure 4.1, where the distributions clearly indicate that there are a number of clusters of random effects in the data.

One possible approach to model the distributions of the random effects is by using a Dirichlet process mixture (dpm), see (2.27). However, inference in this type of model can be challenging both from an implementation perspective and because of poor mixing, see Hastie et al. (2015). An alternative proposed by Ishwaran and Zarepour (2002) is to make use of an over-parametrised Gaussian finite mixture model and put a sparseness prior on the number of components. The resulting inference algorithm is easier to implement and performs well in many problems, see e.g., Chapter 22 in Gelman et al. (2013).
We make use of this approach and compare it with using dpm in Paper G. The main objective is to compare the posterior estimates to validate whether the findings in Rousseau and Mengersen (2011) carry over to this setting.

Approximate Bayesian computations

For some models, it is not possible to approximate the posterior in this manner, as the target cannot be evaluated point-wise. This could be because the likelihood cannot be expressed in closed form or because its computational cost is high. In an ssm, this problem corresponds to that it is not possible to evaluate $g_\theta(y_t | x_t)$ point-wise. For a bpf, this means that the weights cannot be computed. An example of an ssm with an intractable likelihood is when the observations are modelled using an $\alpha$-stable distribution (Nolan, 2003). This is a popular choice in finance as discussed in Papers C and E.

It turns out that we can reformulate ssms with intractable likelihoods by introducing a small perturbation, which allows us to apply a standard particle filter. This approach is part of a family of methods known as abc (Marin et al., 2012). For the particle filter, we can construct an smc-abc algorithm (Jasra et al., 2012) to estimate the log-likelihood. In this algorithm, we assume that the observations are perturbed by

$$y_t^\star = y_t + \epsilon z_t,$$

where $\epsilon \geq 0$ denotes the tolerance parameter and $z_t$ denotes a standard Gaussian random variable. Hence, we make use of $y^\star = \{y_t^\star\}_{t=1}^T$ as the observations from the model and $\{x_t, y_t\}_{t=1}^T$ as the latent states. As a result, it is only required that we can generate samples from $g_\theta(y_t | x_t)$, not evaluate it point-wise, which is usually less restrictive. The weighting function for the resulting abc version of the bpf algorithm is given by

$$w_t^{(i)} = \mathcal{N}\big(y_t; y_t^{(i),\star}, \epsilon^2\big),$$

where $y_t^{(i),\star}$ denotes a sample from $g_\theta(\cdot \,|\, x_t^{(i)})$; a sketch of this weighting step is given below. Note that we recover the original model when $\epsilon \to 0$, see Dean and Singh (2011). In practice, it is required that $\epsilon > 0$ to balance the computational cost against the accuracy of the estimate.

We can employ standard parameter inference methods like the pmh algorithm to estimate the parameters of the approximate model implied by the smc-abc algorithm. The main problem with this is that the variance of the log-likelihood estimates often is larger than for a standard smc algorithm. This can result in poor mixing in the pmh algorithm. We return to this in Section 4.3 and in Section 5.2 (page 197) of Paper C.

Note that using abc can in itself lead to an acceleration of the inference when the likelihood is computationally prohibitive to evaluate. This could be a problem when the number of observations is large, as evaluating the likelihood at every iteration of an mh algorithm could then be problematic. We return to discussing Bayesian methods for big data in Section 5.2.
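As referenced above, a sketch of the abc weighting step for one time step of the bpf (our notation; simulate_obs stands in for a sampler from $g_\theta(\cdot | x)$):

```python
import numpy as np

def abc_weights(y_t, simulate_obs, particles, epsilon, rng):
    """Normalised weights for the ABC version of the bPF at one time step.

    simulate_obs(x, rng) draws a pseudo-observation from g_theta(. | x);
    the weight is the Gaussian density N(y_t; y_star, epsilon^2), so
    g_theta itself is never evaluated point-wise.
    """
    y_star = np.array([simulate_obs(x, rng) for x in particles])
    log_w = -0.5 * ((y_t - y_star) / epsilon) ** 2 \
            - 0.5 * np.log(2.0 * np.pi * epsilon ** 2)
    w = np.exp(log_w - log_w.max())  # shift for numerical stability
    return w / w.sum()
```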

Building a surrogate of the posterior From Example 3.6, we know that the estimates of the log-target and its gradients can be obtained by particle methods. However, they can be quite noisy when N is small and 4.2 Approximating the model 77 d ensity d ensity 0. 0 0. 1 0. 2 0. 3 0. 4 0. 5 0. 0 0. 1 0. 2 0. 3 0. 4 0. 5

-6 -4 -2 0 2 4 6 -6 -4 -2 0 2 4 6

β β

Figure 4.1. Illustration of two multi-modular distributions of the individual random effects βi in a mixed effects model (2.9). - 44 - 44 pred ictiv e posterior pred ictiv e posterior - 48 - 47 - 46 - 45 - 48 - 47 - 46 - 45

0.6 0.7 0.8 0.9 1.0 0.6 0.7 0.8 0.9 1. 0

φ φ

Figure 4.2. The predictive GP posterior for the log-likelihood estimates from the bPF in Ex- ample 3.6 using 10 (left) and (20) random values of φ. The dashed lines indicate the true log-likelihood and the shaded areas indicate the 95% credibility intervals. 78 4 Strategies for accelerating inference increasing the number of particles also increases the computational cost of the particle method. When the noise variance is small, it is possible to make direct use of these estimates for computing the maximum likelihood estimate or the maximum a posteriori estimate of θ. For example, by using finite difference approaches, such as in the simultaneous perturbation and ( spsa; Spall, 1987, 1998) algorithm. The gradient estimate can be utilised in a standard gradient ascent algorithm (Poyiadjis et al., 2011; Doucet et al., 2013) and in a Newton algorithm together with an estimate of Hessian (Kok et al., 2015).

Another approach is to make use of gp regression to build a surrogate function of the log-posterior. The surrogate should be smooth and cheap to evaluate; both requirements are fulfilled by the gp predictive posterior. In Figure 4.2, we present an example of this by revisiting the log-likelihood estimates generated in Example 3.6. We randomly select 10 and 20 samples to create two surrogate functions using gp regression. We note that the predictive mean differs slightly from the true log-likelihood (dashed line) around the parameter estimate. Hence, optimising the mean function will give a similar result to the mode of the posterior estimates in Figure 3.9.

After the surrogate has been computed, it is easy to make use of a quasi-Newton algorithm to find the parameter estimates by optimising the predictive mean function. However, the predictive covariance function is also informative as it enables us to construct confidence intervals. This information can be used to decide at which parameters the log-posterior should be sampled next. In Figure 4.2, it would be beneficial to decrease the predictive covariance around $\phi = 0.85$, as a peak in the log-likelihood could be hiding there. This is the idea behind the Bayesian optimisation (bo; Mockus et al., 1978) algorithm, which is especially designed for optimising noisy and expensive objective functions. In this thesis, we refer to the bo algorithm as the Gaussian process optimisation (gpo) algorithm, as the surrogate function is a gp predictive posterior. General introductions to bo are given by Snoek et al. (2012), Lizotte (2008), Boyle (2007) and Osborne (2010).
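As an illustration, a surrogate of the kind shown in Figure 4.2 can be constructed with an off-the-shelf gp library; the sketch below uses scikit-learn (our choice, not the implementation used in the papers) to fit a gp to noisy log-likelihood estimates and optimise its predictive mean over a grid. The kernel hyperparameters are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

def fit_surrogate(phi_values, loglik_estimates):
    """Fit a GP surrogate to noisy log-likelihood estimates (e.g., from a bPF)."""
    kernel = Matern(length_scale=0.1, nu=2.5) + WhiteKernel(noise_level=1.0)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gp.fit(np.asarray(phi_values).reshape(-1, 1), np.asarray(loglik_estimates))
    return gp

def maximise_predictive_mean(gp, grid):
    """Optimise the predictive mean over a grid; the predictive standard
    deviation indicates where more log-likelihood samples would help."""
    mean, std = gp.predict(grid.reshape(-1, 1), return_std=True)
    return grid[np.argmax(mean)], mean, std
```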

We make use of gpo for parameter estimation in ssms using maximum likelihood in Dahlin and Lindsten (2014), by combining it with the bpf algorithm for estimating the log-likelihood. The results are encouraging and the convergence is faster than for the spsa algorithm in terms of the number of log-likelihood estimates. In Paper E, we revisit the problem for ssms with intractable likelihoods. In this setting, we can make use of smc-abc to compute estimates of the log-likelihood. However, the smc-abc algorithm often requires a larger value of $N$ compared with the standard smc algorithm to obtain reasonable accuracy in the log-likelihood estimates. This results in poorly mixing and computationally costly inference algorithms based on the pmh algorithm, which is discussed in Paper C.

Instead, we make use of the gpo algorithm in combination with smc-abc to estimate a Laplace approximation (Gelman et al., 2013) of the log-posterior. This can be seen as an approximate Bayesian inference algorithm or be applied as a method to find a suitable initialisation and parameter proposal for the pmh algorithm. This allows us to decrease the number of required estimates of the log-likelihood from about 10,000 to 350 compared with pmh in one example, which corresponds to decreasing the computational time from days to half an hour. Note that the use of a gp as the surrogate incurs a small computational overhead. However, the main computational cost in both the gpo and the pmh algorithms is incurred by the smc algorithm applied for estimating the log-target.

4.3 Improving the inference algorithm

The third approach to accelerate inference is to make changes to the inference algorithms. Here, we discuss two alterations to the pmmh/pmh algorithm based on: (a) inducing correlation in the estimates of the target and (b) tailoring the parameter proposal to the target distribution.

Correlating auxiliary variables

Sometimes it is useful to extend the model with auxiliary parameters to make inference easier. One such approach is the pseudo-marginal algorithms presented in Section 3.3. In these algorithms, we can compute estimates of the target by introducing the auxiliary variables $u$. This makes it possible to use the mh algorithm to sample from target distributions which we cannot evaluate point-wise but can estimate using e.g., importance sampling.

If we revisit Example 3.13, we made use of an independent proposal for $u$, which basically means that the particle filters are independent. From Remark 3.4, we know that the likelihood estimates from the particle filter are noisy, with a variance that decreases as $N$ increases. We can therefore increase $N$ to decrease the variance in the likelihood and thus increase the mixing in the pmh algorithm (up to the mixing of the optimal algorithm). However, this makes each iteration of the pmh algorithm slower as the computational cost increases. An interesting question is then whether it is possible to correlate the errors in the likelihood estimates to increase the mixing without increasing $N$.

One approach for this is to realise that the particle filter is a deterministic algorithm given the random variables $u$, which are used for resampling and propagating the particles. We can thus make the likelihood estimates positively correlated by inducing a correlation in $u$. A random walk proposal for $u$ is not a good choice, as millions of random variables are typically required in the particle filter; a random walk would not explore the space $\mathsf{U}$ efficiently. Instead, we propose in Paper D to make use of a Crank-Nicolson (cn; Beskos et al., 2008; Cotter et al., 2013; Hairer et al., 2014) proposal for $u$. A cn proposal is a specific ar(1) process (2.6) given by

$$q(u' | u) = \mathcal{N}\left(u'; \sqrt{1 - \sigma_u^2}\, u, \sigma_u^2 I_{N_u}\right),$$

where $N_u$ denotes the number of elements in $u$ and $\sigma_u > 0$ denotes a parameter determined by the user. Note that for the bpf algorithm, we have that $N_u = (N + 1)T$ when using $N$ particles and systematic resampling in a scalar lgss model.

In Figure 4.3, we present the correlation in the likelihood estimates in two consecutive iterations of the pmmh algorithm, keeping $\theta$ fixed. Here, we use $T = 10$ samples from an lgss model (2.13) with $\theta = \{0.0, 0.5, 1.0, 0.1\}$, while varying $N$ in the particle filter and $\sigma_u$ on the unit interval. We note that the correlation decreases from one to zero as we change the value of $\sigma_u$. In the cn proposal, we obtain the independent proposal for $u$ by setting $\sigma_u = 1$. Hence, it is possible to control the correlation in the likelihood estimates by $\sigma_u$.

We apply the cn proposal for $u$ in the pmmh algorithm in Paper D and investigate the impact on the iact. We present numerical simulations that indicate a three-fold increase in

Figure 4.3. The correlation between two consecutive log-likelihood estimates from Example 3.1 with N particles and random variables sampled from the CN proposal when varying σu.

Figure 4.4. Four random walks on the Riemann manifold induced by the standard Gaussian IID model with n = 100 samples.

the mixing, i.e., a corresponding decrease in the iact, compared with using an independent proposal for $u$. Note that only small changes in the implementation are required to switch from an independent proposal for $u$ to the cn proposal. It is especially simple when all random variables can be generated using quantile transformations, see Section 3.2.1. This small change allows us to decrease $N$ significantly, which results in an acceleration of the algorithm in terms of computational cost; a sketch of the proposal is given below. However, the drawback is that $u'$ and $u$ are stored in memory at all times, which can be a limiting factor when $N$ is large.
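As referenced above, the cn proposal is essentially a one-line change compared with sampling $u$ independently; a minimal sketch:

```python
import numpy as np

def crank_nicolson_step(u, sigma_u, rng):
    """Propose u' ~ N(sqrt(1 - sigma_u^2) * u, sigma_u^2 * I).

    sigma_u = 1 recovers the independent proposal; smaller values of
    sigma_u give more positively correlated likelihood estimates. The
    proposal leaves the standard Gaussian density m(u) invariant.
    """
    return np.sqrt(1.0 - sigma_u ** 2) * u + sigma_u * rng.standard_normal(u.shape)
```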

Furthermore, Deligiannidis et al. (2015) present encouraging theoretical results indicating that it is possible to select $N \propto T^{\alpha}$ for $\alpha < 1$. They provide a numerical experiment illustrating that only a handful of particles is required to infer the parameters of an lgss model given tens of thousands of observations. This is an impressive result and leads to a speed-up of 100 times. However, when $T$ is that large, the Bernstein-von-Mises theorem (see Section 2.3) kicks in and alternative methods are probably even faster. Also, they make use of a fully-adapted particle filter, which is a particle filter using the optimal proposal for the particles and the optimal weighting function, see Paper A.

We make use of a bpf for estimating the target in Paper D. In that case, we do not obtain such a radical improvement in the efficiency of the algorithm. However, we still increase the mixing by a factor of about three and outperform the standard pmmh algorithm using an independent proposal for $u$.

Tailoring the proposal to the target geometry

In Section 3.2.3, we introduced the mh algorithm and presented two choices of proposals, namely the independent and random walk proposals. Both are known as blind proposals as they do not take any information about the target into account when proposing the candidate state $x'$. The convergence to the sought target distribution is instead ensured by the accept/reject mechanism. An interesting question is therefore how to construct a proposal that takes e.g., the gradient of the log-target into account. Here, we discuss one such approach based on continuous-time diffusion processes.

An interesting class of (continuous-time) stochastic processes is the Itô diffusions governed by the stochastic differential equation (sde) given by

$$dX_t = b(X_t)\, dt + \sigma(X_t)\, dB_t, \qquad X_0 = x_0,$$

In practice, it is common to make a discretisation of the Langevin diffusion using a first- order Euler approximation to obtain 2 ! 0 0  2 q x x = N x ; x + ∇ log π x ,  Ip , (4.1) ( | ) 2 ( ) for some discretisation length  > 0. We see from the Langevin proposal (4.1) that the gradient of the log-target acts like a drift towards areas of high posterior probability. This could in theory be useful for exploring the target distribution more efficiently. Note that the gradient of the log-posterior is given by the gradient of the log-prior and the score function (2.15). The resulting algorithm from combining this proposal with the mh algorithm is known as the Metropolis adjusted Langevin algorithm ( mala; Roberts and Stramer, 2003). It is possible to show that the Markov kernel in the mala algorithm is (geometrically) ergodic if the tails of the target are heavier than for a Gaussian distribution and lighter than for an exponential distribution, see Roberts and Tweedie (1996). A possible extension of (4.1) is to include a matrix Σ x to scale the gradient and determine the variance of the proposal. This results in the proposal( ) given by 2 ! 0 0  − − q x x = N x ; x + Σ 1 x ∇ log π x ,  2Σ 1 x , (4.2) ( | ) 2 ( ) ( ) ( ) where Σ x can be selected as any positive semi-definite matrix or function. An interesting choice proposed( ) by Girolami and Calderhead (2011) is to make use of Σ x = −∇2 log π z , ( ) ( ) z=x  which corresponds to a parameter dependent matrix. We refer to the mh algorithm with this proposal as the (simplified) manifold mala ( mmala) algorithm. This choice of Σ x corresponds to sum of the negative Hessian of the log-prior and the observed information( ) matrix (2.16). The main benefit of the mmala algorithm is that the gradients are scaled by the curvature of the log-posterior. This is useful as the proposal takes larger steps when the Markov chain is far from the target mode and smaller steps as it gets closer. In Figure 4.4, we present four different Markov chains governed by (4.2). Here, we make use of a standard Gaussian iid model as the target and would like to infer the mean µ and the standard deviation σ from n = 100 samples. We see that the Markov chain behaves as expected with large step lengths far from the mode, which decreases as the chain approaches the true parameters µ, σ = 0, 1 . { } { } The Markov chain governed by (4.2) can be seen as a random walk on a Riemann manifold, see Girolami and Calderhead (2011) and Livingstone and Girolami (2014). They highlight that there are other choices for Σ x that could be useful in practice. The Langevin proposal can also be seen as a Laplace approximation( ) of the posterior as discussed by Robert and Casella (2004) and in Paper B. A third interpretation proposed in Paper C is that mala and mmala corresponds to noisy gradient ascent and noisy Newton algorithms, respectively. The major potential benefit with using the mala and the mmala is that they can increase the mixing in the Markov chain. In Section 3.2.3, we showed that the mixing is connected with the asymptotic variance of the estimate of the target distribution. Increasing the mix- 4.3 Improving the inference algorithm 83 ing can therefore have a positive impact on the computational time of the algorithm and accelerate inference. This as it is possible to decrease K and still obtain a similar accuracy in the estimate. Less iterations of the mh algorithm leads to a decreased computational cost. 
Another benefit of the mmala algorithm is that it requires less tuning than a standard random walk proposal, as the user only needs to tune the step length $\epsilon$ and not an entire covariance matrix. This is a decrease from $p^2$ tuning parameters to 1 for the proposal, which potentially decreases the amount of work required from the user.

In Papers B and C, we propose particle versions of the mala and mmala algorithms to increase the mixing compared to the standard pmh algorithm from Section 3.3. The main challenge is to obtain estimates of the intractable gradient and Hessian of the log-target distribution. We employ fl particle smoothing from Remark 3.2 together with the results from Remark 3.5 to estimate these quantities. The primary benefit is that this type of smoother does not increase the computational complexity of the algorithm and gives reasonable accuracy. The latter is important for the performance of the proposed algorithm, as analysed by Nemeth et al. (2014).
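A sketch of one mala iteration implementing (4.1), including the correction for the asymmetric proposal in the acceptance probability (function names are ours):

```python
import numpy as np

def mala_step(x, log_target, grad_log_target, eps, rng):
    """One Metropolis adjusted Langevin step using the proposal (4.1)."""
    def log_q(b, a):
        # log N(b; a + eps^2/2 * grad(a), eps^2 I), constants dropped.
        drift = a + 0.5 * eps ** 2 * grad_log_target(a)
        return -0.5 * np.sum((b - drift) ** 2) / eps ** 2

    x_prop = x + 0.5 * eps ** 2 * grad_log_target(x) \
             + eps * rng.standard_normal(x.shape)
    log_alpha = (log_target(x_prop) - log_target(x)
                 + log_q(x, x_prop) - log_q(x_prop, x))
    return x_prop if np.log(rng.uniform()) < log_alpha else x
```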

Example 4.1: How does unemployment affect inflation? (cont. from p. 70)

We refer to the particle version of mmala as the pmh algorithm of second order (pmh2), since it makes use of gradient and Hessian information regarding the target. The basic algorithm for pmh2 is presented in Paper B. qpmh2 is presented in Paper C and makes use of quasi-Newton techniques (Nocedal and Wright, 2006) to estimate the Hessian. We return to the problem of estimating the parameter $\beta$ in the Phillips curve model to compare the pmh0 and the qpmh2 algorithms. Remember that the pmh0 algorithm makes use of a random walk proposal such as (3.32), which does not include information about the gradient and Hessian of the target.

Furthermore, we make use of a post-processing of the Markov chain known as zero-variance (zv; Mira et al., 2013; Papamarkou et al., 2014) to decrease the variance of the estimates. The zv approach is based on the idea of control variates in vanilla Monte Carlo sampling, see Robert and Casella (2004). We can apply zv to decrease the variance of the estimate of the posterior mean by

$$\widetilde{\beta}_k = \beta_k + \frac{\rho}{2}\, G(\beta_k), \qquad \rho = -\mathbb{V}\big[G(\beta_{1:K})\big]^{-1}\, \mathbb{C}\big[\beta_{1:K}, G(\beta_{1:K})\big],$$

where $G(\beta_k) = \nabla \log p(\theta | y)\big|_{\beta = \beta_k}$ denotes the gradient of the log-posterior evaluated at $\beta_k$. Hence, we can compute the mean of $\widetilde{\beta}_{1:K}$ to estimate the posterior mean, which has a smaller variance compared with making use of $\beta_{1:K}$ directly.

In Figure 4.5, we present the trace plot, acf and posterior estimates for: qpmh2, qpmh2 with zv (qpmh2-zv) and the pmh0 algorithm from Example 3.13. Note that the mixing is better for the two algorithms based on qpmh2, which can be seen in both the trace and the acf. Note the spike in the acf at lag 100, which is due to the memory of the quasi-Newton update for the Hessian, see Paper C. Moreover, note that the variance of the posterior estimate from qpmh2-zv is smaller than that of the same estimate from qpmh2. Hence, the zv post-processing of the Markov chain seems to have a beneficial effect. We return to this model in Example 5.1 on page 87.
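Given the stored chain and gradients, the zv correction above amounts to a covariance computation; a sketch for a scalar parameter (our notation, following the formulas in Example 4.1):

```python
import numpy as np

def zv_posterior_mean(beta, grad_log_post):
    """Zero-variance estimate of the posterior mean of a scalar parameter.

    beta holds the Markov chain and grad_log_post the gradient G of the
    log-posterior at each iterate; the corrected samples are
    beta_k + (rho / 2) * G(beta_k) with rho = -Var(G)^{-1} * Cov(beta, G).
    """
    beta = np.asarray(beta, dtype=float)
    g = np.asarray(grad_log_post, dtype=float)
    rho = -np.cov(beta, g)[0, 1] / np.var(g, ddof=1)
    return np.mean(beta + 0.5 * rho * g)
```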


Figure 4.5. The trace, ACF and parameter posterior estimate for β in the Phillips curve model using the Swedish data. The results are presented for: qPMH2 (green), qPMH2 with the ZV estimator (brown) and PMH0 (grey). Dotted lines indicate confidence intervals and estimates of posterior means. The grey curves indicate the prior distribution.

4.4 Outlook and extensions

There are many other potential approaches for accelerating inference in these algorithms. Some of them are discussed in Chapter 5. To conclude this chapter, we would like to take the opportunity to discuss some more specific ideas for accelerating the pmh algorithm. Here, we would like to highlight three different interesting approaches connected with the material in this chapter.

The first is to make use of surrogate modelling of the target distribution, which allows for cheaper computations of the acceptance probability. This idea is proposed by Meeds and Welling (2014), who make use of the predictive gp posterior, similar to gpo, for constructing the surrogate. An interesting direction for future work is therefore to investigate the combination of the gpo and pmmh algorithms further for accelerating inference by decreasing the computational cost of each iteration.

The second approach is to alter the test function $\varphi$ to decrease the variance of the resulting estimates. In vanilla Monte Carlo, this approach is known as control variates and is a useful method for variance reduction that uses a known analytical approximation of the target. In the pmh algorithm, we can make use of zv (Mira et al., 2013; Papamarkou et al., 2014) to achieve the same objective, as in Example 4.1. In principle, it is straightforward to directly apply zv approaches in any of the pmh algorithms proposed in Papers B-D to further accelerate the inference, especially as zv methods require estimates of the gradient of the log-posterior, which are already computed by the aforementioned algorithms.

The third approach is to relax the requirement of detailed balance, see e.g., Diaconis et al. (2000) for a related reference.

5 Concluding remarks

We conclude the introductory part by discussing the two examples introduced in Chapter 2 one last time. We also summarise the contributions of the thesis in more technical terms connected to the concepts and issues discussed in the previous chapters. Furthermore, we give a summary of interesting trends and areas for future work in accelerating Bayesian inference. Finally, we discuss reproducible research and open source code in connection with this thesis and the papers included in it.

Example 5.1: How does unemployment affect inflation? (cont. from p. 70)

We are now ready to make predictions of the future inflation rate given the model and a change in the unemployment rate. In the left part of Figure 5.1, we present the future expected change in the unemployment rate as stated in this fictional scenario by the Swedish parliament. We assume that this forecast is correct and it acts as the input to our Phillips curve model from Example 3.13. To make the forecast, we draw random samples from the parameter posterior estimates and simulate the system forward in time using the ssm. The result is presented in the right part of Figure 5.1, where the purple line indicates the predictive mean. We see that the predicted mean of the inflation rate approaches the two percent target during this period but that no action needs to be taken at the moment. However, the uncertainty (purple area) is large and better models are required for long-term forecasting. In practice, the Riksbank makes use of elaborate ssms known as dynamic stochastic general equilibrium (dsge) models (Adolfson et al., 2007a, 2013, 2007b) to forecast the future inflation, unemployment, gdp and other macro-economic variables.

We conclude this example by noting that there is no strong negative correlation between the unemployment rate and the inflation rate in Sweden during the period between 1987 and 2015. We have also illustrated one simple approach for predicting future levels of inflation.


Example 5.2: Voting behaviour in the US Supreme court (cont. from p. 63)

We make use of the estimated model from Example 3.8 to investigate how the ideological leaning of the court would change if justice Scalia is replaced by a justice with: (i) the same ideological leaning or (ii) a slightly more liberal leaning. The latter choice is simulated by subtracting one from all the samples from $p(x_1 | y)$ corresponding to the liberal/conservative score for justice Scalia. We make use of the model and the posteriors from Example 3.8 to simulate the number of liberal votes in 10,000 cases. We sample the parameter for each case $\alpha_t$ from the aggregated posterior samples from the inference in Example 3.8.

In Figure 5.2, we present the distributions of the number of liberal votes for the two scenarios. The shaded area approximates the probability of a liberal outcome from the ruling of the court. We conclude that the probability of a liberal majority changes from 0.42 to 0.47 when replacing Scalia with a more liberal justice. Hence, the court is almost balanced if the President chooses alternative (ii), or slightly conservative if the President chooses to nominate according to alternative (i).

It is possible to repeat this procedure sequentially for each year to investigate how the ideological leaning changes over time for each justice and for the entire Court. This can be done by applying Kalman filtering from Remark 2.7 for the scores $\{x_i\}_{i=1}^{9}$, see Martin and Quinn (2002) and Martin and Quinn (2007) for more information.

5.1 Summary of contributions

The contributions of this thesis strive to accelerate Bayesian inference in a number of different model classes. The acceleration can take the form of simpler implementation, of being able to decrease the number of iterations or generated samples/particles while keeping the accuracy of the estimates, or of increasing the information about the parameters in the data. Most of these improvements are the result of methodological contributions to a number of different algorithms.

In Paper A, we provide the reader with a gentle introduction to the pmh algorithm with plenty of references for further study. Source code is also provided to encourage the reader to use pmh for his/her own research problems.

We propose particle versions of the mala and mmala in Paper B, referred to as the pmh1 and the pmh2 algorithms, respectively. The main challenge in these algorithms is to obtain accurate estimates of the gradient and the Hessian of the log-posterior. We propose to make use of fl particle smoothing from Remark 3.2 together with regularisation approaches from the optimisation literature to solve this. Furthermore, we propose a quasi-Newton scheme for estimating the Hessian in Paper C using only gradient information. This is useful in models where the Hessian estimates are corrupted by large amounts of noise. From numerical experiments in Papers B and C, we conclude that adding gradient and Hessian information can increase the mixing, shorten the burn-in phase and make the proposal scale-invariant.

We also investigate the use of correlated target estimates in the pmmh algorithm in Paper D. A considerable increase in mixing can be obtained by changing the independent proposal for the random variables $u$, used to estimate the target, to an autoregressive proposal, which sometimes only amounts to changing a few lines of code.

2015 2016 2017 2018 2015 2016 2017 2018

dat e dat e

Figure 5.1. The fictional change in the unemployment rate during 24 months (left) and the resulting predicted inflation rate (right) from the Philips curve model. The predictive mean and 95% confidence intervals are indicated by the purple line and area, respectively. The dashed lines indicate the start of the forecasts. 0. 8 0. 8 d ensity d ensity 0. 0 0. 2 0. 4 0. 6 0. 0 0. 2 0. 4 0. 6

02468 10 02468 10

no. l iber al v otes no. l iber al v otes
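To make this concrete, the following is a minimal sketch (with hypothetical dimensions, not the exact code from Paper D) of the change, assuming the auxiliary variables u used to estimate the target are standard Gaussian. The Crank-Nicolson-type update leaves their N(0, I) distribution invariant, and sigma_u controls the correlation; sigma_u = 1 recovers the independent proposal.

import numpy as np

def propose_u(u, sigma_u, rng):
    # Independent proposal (the standard pseudo-marginal choice):
    #   return rng.standard_normal(u.shape)
    # Autoregressive proposal of the kind used in Paper D:
    #   u' = sqrt(1 - sigma_u^2) * u + sigma_u * eps, eps ~ N(0, I),
    # which preserves the N(0, I) distribution of u.
    return np.sqrt(1.0 - sigma_u**2) * u + sigma_u * rng.standard_normal(u.shape)

rng = np.random.default_rng(0)
u = rng.standard_normal(1000)        # auxiliary variables from the last iteration
u_new = propose_u(u, sigma_u=0.1, rng=rng)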

The main drawback with mcmc algorithms is that their computational cost can be quite large. This is especially a problem for pmh algorithms, which can take several days to execute. It is therefore interesting to investigate alternative approximate methods to carry out Bayesian parameter inference in ssms. In Dahlin and Lindsten (2014) and Paper E, we propose a new algorithm based on the combination of particle filtering and gpo for maximum likelihood and approximate Bayesian inference in ssms. We demonstrate that the resulting algorithm gives accurate results and can lead to a considerable speed-up compared with alternative optimisation methods and the pmh algorithm.

In Paper F, we investigate the use of arx models with Student's t innovations to handle outliers and missing data. Furthermore, we make use of the rjmh algorithm for automatically selecting the model order. This approach is compared with making use of an over-parametrised arx model with a sparseness prior on the ar parameters. Inference in the latter model can be carried out using Gibbs sampling. In the numerical experiments, the two approaches give similar results in terms of predictions. Good performance is also demonstrated for predicting real-world eeg data.

In Paper G, we make use of sparseness priors to simplify inference in dpm models applied for modelling random effects in panel data models. We investigate the findings by Rousseau and Mengersen (2011) and show that similar posterior estimates can be obtained by switching the dpm model to an over-parametrised finite mixture model with a sparseness prior on the mixture weights. This allows for simpler implementations of inference algorithms in such models and can increase the mixing.

Finally, in Paper H, we propose a novel approach for input design in non-linear ssms. The optimal input is obtained as the solution to a convex optimisation problem over basis inputs computed using graph theory. The corresponding cost function depends on the Fisher information matrix for each basis input. Therefore, we propose a novel approach for estimating this quantity based on particle smoothing. We demonstrate that the input signal generated by the proposed method can increase the accuracy and convergence rate of the expectation maximisation algorithm applied for parameter inference.

5.2 Some trends and ideas for future work

In this section, we discuss some trends and ideas for future work to accelerate Bayesian inference. Further ideas and discussions are provided in the conclusions for each of the papers in Part II of this thesis.

Efficient implementations

Most of the contributions in this thesis are based on improving existing algorithms. Therefore, we have not discussed how to write efficient implementations of them to decrease the actual computational time. The main challenge with efficient implementation is typically that great care needs to be taken to optimise e.g., memory management. For instance, the programming language Julia (Bezanson et al., 2014) has been designed with this in mind. Its main advantages are the excellent computational performance together with a simple high-level syntax that resembles programming languages such as R (R Core Team, 2015) or Python. This could make efficient implementation of Monte Carlo algorithms easier in the future and lead to accelerated inference.

A related trend is to provide a general solver, which can be applied to many different types of problems. Two examples of this are stan (Stan Development Team, 2015) and libbi (Murray, 2013), where the user can define a model and the program then takes care of solving the inference problem using hmc, mcmc and/or smc. This makes inference easier for the user, who does not need to implement advanced algorithms and tune them on his/her own. Stan is already used extensively in many interesting applications, e.g., in the recent detection of gravitational waves by Abbott et al. (2016). Similar software based on probabilistic programming is also becoming more popular as a general tool for carrying out inference, see Wood et al. (2014) and Mansinghka et al. (2014).

Finally, gpu implementations and cloud computing can also be useful for accelerating smc and mcmc algorithms by carrying out the computations in a parallel manner, see e.g., Beam et al. (2014), Neiswanger et al. (2014), Henriksen et al. (2012) and Murray et al. (2015).

Better particle smoothing and Bayesian online methods

Many of the algorithms presented in this thesis are based on smc methods for estimating the log-likelihood or its gradient and Hessian. The efficient implementation of these methods depends on good proposal distributions. More work is required in this area to improve smc methods for high-dimensional targets, see e.g., Naesseth et al. (2015), Rebeschini and van Handel (2015) and Beskos et al. (2014).

Another approach is to improve the estimators for the gradient and the Hessian of the log-posterior. This is especially important for the Hessian, as it should be positive semi-definite (psd). A problem when using the Louis identity as in Example 3.5 and Paper B is that the resulting estimate is often not psd and therefore not a valid covariance matrix. Some alternative estimators are discussed in Papers C and G. However, more work is required to make pmh1/2 useful for a larger class of ssms. One common regularisation of this kind is sketched at the end of this subsection.

Finally, online Bayesian inference is an important problem that currently lacks a satisfactory solution. Offline inference is provided by the mcmc algorithm, which can be applied to a wide range of problems. However, it is difficult to make use of mcmc algorithms when the target distribution changes between iterations. A natural solution is the use of particle filtering and smc algorithms as proposed by Carvalho et al. (2010), Storvik (2002) and Fearnhead (2002). However, such algorithms can suffer from particle depletion/degeneracy, which results in estimators with a large variance.
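As an illustration of the regularisation issue, the sketch below shows one simple approach (eigenvalue mirroring and flooring; a generic stand-in, not the specific estimators analysed in Papers B and C) that maps a noisy symmetric Hessian estimate into a valid positive-definite matrix, which can then be inverted and used as a proposal covariance.

import numpy as np

def make_pd(H, eps=1e-6):
    # Symmetrise, then mirror negative eigenvalues and floor small ones,
    # so that the reconstructed matrix is positive definite.
    H = 0.5 * (H + H.T)
    eigval, eigvec = np.linalg.eigh(H)
    eigval = np.maximum(np.abs(eigval), eps)
    return (eigvec * eigval) @ eigvec.T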

Scalable Bayesian inference

The amount of data generated by people, machines and sensors increases drastically every year. Therefore, some like to refer to the current age as the era of big data. This presents new challenges for Bayesian inference algorithms to handle both the large amount of data and its many different forms. A drawback with the mh algorithm and its pseudo-marginal version is that the acceptance probability depends on the likelihood of the data. If the number of observations is large, the estimation or computation of the likelihood can be computationally prohibitive, which limits the feasibility of inference. A large number of alterations to the mh algorithm have recently been proposed to mitigate this problem. Two surveys of these recent efforts are given by Bardenet et al. (2015) and Angelino et al. (2016). One promising approach based on sub-sampling the data is proposed by Quiroz et al. (2016), which makes use of the correlated pseudo-marginal mh algorithm introduced in Paper D.

However, from the Bernstein-von Mises theorem, we know that the posterior asymptotically concentrates to a Gaussian under some regularity conditions. It could therefore be more fruitful to make use of this in models for which the number of parameters is much smaller than the number of observations. We can then create a Laplace approximation around the mode based on the output from some optimisation algorithm targeting the log-posterior. Stochastic gradient methods have been proposed for solving this problem, where the stochasticity comes from computing the gradients using a sub-sample of the observations. This can be combined with mcmc approaches to estimate the posterior distribution, as proposed by Welling and Teh (2011) and Ahn et al. (2012), or variational inference, as discussed by Hoffman et al. (2012).
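As an illustration of the former idea, the following is a minimal sketch of stochastic gradient Langevin dynamics (Welling and Teh, 2011) for a toy model y_i ~ N(theta, 1) with a N(0, 10) prior; the model, step size and batch size are assumptions made for the example. The mini-batch log-likelihood gradient is rescaled by N/n and the injected noise variance matches the step size.

import numpy as np

rng = np.random.default_rng(2)
N, n, theta_true = 10_000, 100, 1.5
y = theta_true + rng.standard_normal(N)        # simulated observations

def grad_logpost(theta, batch):
    grad_prior = -theta / 10.0                  # gradient of log N(0, 10) prior
    grad_lik = (N / n) * np.sum(batch - theta)  # rescaled mini-batch gradient
    return grad_prior + grad_lik

theta, eps, samples = 0.0, 1e-4, []
for k in range(5_000):
    batch = y[rng.integers(0, N, size=n)]       # random mini-batch
    theta += 0.5 * eps * grad_logpost(theta, batch) \
             + np.sqrt(eps) * rng.standard_normal()
    samples.append(theta)                       # approximate posterior draws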

Probabilistic numerics

The gpo algorithm is an optimisation algorithm based on a surrogate function, which often is the gp predictive posterior. It turns out that this type of surrogate modelling is useful for many other applications as well. This is the basis of the new field called probabilistic numerics, where similar approaches are used for solving differential equations, differentiation and integration. The main benefits are that these methods make use of uncertainty and often require fewer samples from the function of interest. The former means that the approximate value of an integral is given by a probability distribution and not a point estimate. An example of the latter is the gpo algorithm, which requires fewer evaluations of the objective function than some other similar optimisation algorithms. More information about these methods is available from the homepage: http://www.probabilistic-numerics.org/ as well as in Hennig (2013), Briol et al. (2015), Osborne et al. (2012), Osborne (2010) and Boyle (2007).
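To illustrate the surrogate idea, the sketch below fits a gp with a squared-exponential kernel (assumed hyperparameters) to a few noisy evaluations of a stand-in objective and selects new evaluation points with an upper confidence bound rule over a grid. This conveys the flavour of gpo but is not the specific surrogate or acquisition rule used in Paper E.

import numpy as np

def kernel(a, b, ell=0.5, sf=1.0):
    # Squared-exponential covariance between two sets of 1-D inputs.
    return sf**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

def gp_posterior(x, y, xs, sn=0.1):
    # GP predictive mean and variance at the test inputs xs.
    K = kernel(x, x) + sn**2 * np.eye(len(x))
    Ks = kernel(x, xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = kernel(xs, xs).diagonal() - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.maximum(var, 0.0)

objective = lambda t: -(t - 1.0)**2                # stand-in (noisy) objective
rng = np.random.default_rng(3)
x = rng.uniform(-2.0, 3.0, size=3)                 # initial design points
y = objective(x) + 0.1 * rng.standard_normal(3)
grid = np.linspace(-2.0, 3.0, 200)
for _ in range(10):                                # sequential design loop
    mu, var = gp_posterior(x, y, grid)
    xn = grid[np.argmax(mu + 2.0 * np.sqrt(var))]  # upper confidence bound
    x = np.append(x, xn)
    y = np.append(y, objective(xn) + 0.1 * rng.standard_normal())
print("approximate maximiser:", x[np.argmax(y)])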

Reproducible research

Finally, we would like to take the opportunity to discuss the promise and necessity of reproducible research (Claerbout and Karrenbach, 1992). Before the advent of the computer, science was typically divided into theoretical and experimental fields, as discussed by Vandewalle et al. (2009). In both fields, it is important to describe the method used to derive a proof or the experimental set-up used to study a certain phenomenon. A third branch of science, based on computer experiments, has developed rapidly during the last decades. However, the requirements on documenting the algorithms, their settings and data are not yet as strict as for the other two branches. This is a major problem, since it often is difficult or almost impossible to reproduce the results in peer-reviewed papers based on computer experiments. Hence, the fundamental principle of reproducibility in science is violated, as new results cannot easily be verified by independent colleagues in the research community.

Naturally, it is not possible to provide implementations of the proposed algorithms for all different programming languages. However, just making the code and the data available is a big step forward in not only encouraging verification but also encouraging further development of promising algorithms. Making the code available also makes it easier for your readers to understand your work and to suggest improvements to it. In our view, it should be as natural to provide the source code and data as a supplement to a paper as it is to provide a detailed proof of a theorem. It is only then that reviewers and colleagues really know what happens inside the magical box, which sometimes produces impressive plots and tables. Hence, not making the source code available is problematic and not really science, as all details should be revealed.

There are also many other benefits in making the source code freely available to other researchers. One major benefit is that this practice puts pressure on commenting and documenting the code for your own sake. This is useful for decreasing the number of bugs and mistakes. Moreover, IPython notebooks (Pérez and Granger, 2007) and knitr (Xie, 2014) are excellent approaches to keep track of your own progress and write comments on how e.g., specific parameters of the algorithms were selected. This is helpful when a revision is due for a journal paper or when a new member joins the research group. Finally, if the code is not filed and documented, all the details of the algorithm are lost when e.g., a PhD student graduates and leaves the research group. There is much more to read about reproducible research in e.g., Markowetz (2015), Fomel and Claerbout (2009), Vandewalle et al. (2009) and Donoho (2010).

5.3 Source code and data

Source code written in Python and R for recreating the examples in Part I is available from GitHub: https://github.com/compops/phd-thesis. Furthermore, source code for recreating some of the numerical illustrations from the papers included in the thesis is also available from http://code.johandahlin.com. See the README.md file in each repository for dependencies, instructions and implementation details. The source code and data are provided under various open source licenses with no guaranteed support and no responsibility for their use and function. Note that some data have to be downloaded from other sites due to licensing issues.

Notation

Probability

−→a.s.       Almost sure convergence.
−→d          Convergence in distribution.
−→p          Convergence in probability.
P, E, V, C   Probability, expectation, variance and covariance operators.
∼            Sampled from or distributed according to.
π[ϕ]         The expected value of ϕ under the distribution π.

Statistical distributions

δx0(dx)        Dirac distribution (measure) located at x = x0.
A(α, β, γ, η)  α-stable distribution with stability α, skewness β, scale γ and location η.
B(p)           Bernoulli distribution with success probability p.
D(α)           Dirichlet distribution with concentration parameter α.
DP(α, G0)      Dirichlet process with concentration parameter α and base measure G0.
N(µ, σ²)       Gaussian (normal) distribution with mean µ and variance σ².
GP(µ, κ)       Gaussian process with mean function µ and covariance function (kernel) κ.
G(α, β)        Gamma distribution with rate α and shape β.
IG(α, β)       Inverse Gamma distribution with rate α and shape β.
M(n, p)        Multinomial distribution with n trials and probability p.
P(λ)           Poisson distribution with mean λ.
U(a, b)        Uniform distribution on the interval [a, b].

Operators and other symbols

0p             Zero p-vector.
Id             d × d identity matrix.
≜              Definition.
diag(v)        Diagonal matrix with the vector v on the diagonal.
∇f(x)          Gradient of f(x).
∇²f(x)         Hessian of f(x).
I              Indicator function.
det(A), |A|    Matrix determinant of A.
A⁻¹, A⊤        Matrix inverse/transpose of A.
tr(A)          Matrix trace of A.
v⊗2 = vv⊤      Outer product of the vector v.
an:m           Sequence {an, an+1, . . . , am−1, am}, for m > n.
sign(x)        Sign of x.
supp(f)        Support of the function f, {x : f(x) > 0}.

Statistical quantities

G(θ)       Gradient of log-posterior/target evaluated at θ.
I(θ)       Expected information matrix evaluated at θ.
H(θ)       Negative Hessian of log-posterior/target evaluated at θ.
J(θ)       Observed information matrix evaluated at θ.
θ̂          Parameter estimate.
p(θ | y)   Parameter posterior distribution.
p(θ)       Parameter prior distribution.
θ          Parameter vector, θ ∈ Θ ⊆ Rd.
S(θ)       Score function evaluated at θ.

Algorithmic quantities

at(i)              Ancestor of particle i at time t.
R(xk−1, xk)        Markov kernel.
Z                  Normalisation constant.
xt(i)              Particle i at time t.
qt(xt | x0:t−1)    Particle proposal kernel.
Wθ(xt, xt−1)       Particle weighting function.
q(θ)               Proposal distribution.
xk                 State of the Markov chain at iteration k.
π(θ)               Target distribution.
γ(θ)               Unnormalised target distribution.
wt(i), w̄t(i)       Un- and normalised weight of particle i at time t.

Abbreviations

a.s.      Almost surely (with probability 1).
abc       Approximate Bayesian computations.
acf       Autocorrelation function.
ar(p)     Autoregressive process of order p.
ard       Automatic relevance determination.
arx(p)    Autoregressive exogenous process of order p.
bo        Bayesian optimisation.
bpf       Bootstrap particle filter.
cdf       Cumulative distribution function.
clt       Central limit theorem.
dp(m)     Dirichlet process (mixture).
fapf      Fully-adapted particle filter.
ffbsm     Forward-filtering backward-smoothing.
ffbsi     Forward-filtering backward-simulation.
fl        Fixed-lag (particle smoother).
gp(o)     Gaussian process (optimisation).
gpu       Graphical processing unit.
hmm       Hidden Markov model.
iact      Integrated autocorrelation time.
iid       Independent and identically distributed.
(sn)is    (Self-normalised) importance sampling.
kde       Kernel density estimate/estimator.
lgss      Linear Gaussian state space.
(p)mcmc   (Particle) Markov chain Monte Carlo.
(rj)mh    (Reversible-jump) Metropolis-Hastings.
ml        Maximum likelihood.
mle       Maximum likelihood estimator.
mse       Mean square error.
pd        Positive definite.
pdf       Probability density function.
pmf       Probability mass function.
pf        Particle filter.
pmmh      Pseudo-marginal Metropolis-Hastings.
pmh       Particle Metropolis-Hastings.
pmh0      Marginal particle Metropolis-Hastings.
pmh1      pmh using first-order information.
pmh2      pmh using first- and second-order information.
qpmh2     Quasi-Newton particle Metropolis-Hastings.

Abbreviations (cont.)

ps        Particle smoother.
rjmh      Reversible jump Metropolis-Hastings.
rls       Regularised least squares.
rts       Rauch-Tung-Striebel.
sbp       Stick-breaking process.
sis       Sequential importance sampling.
sir       Sequential importance sampling and resampling.
slln      Strong law of large numbers.
smc       Sequential Monte Carlo.
spsa      Simultaneous perturbation stochastic approximation.
ssm       State space model.

Bibliography

B. P. Abbott, R. Abbott, T. D. Abbott, and Others. The rate of binary black hole mergers inferred from advanced LIGO observations surrounding GW150914. Pre-print, 2016. arXiv:1602.03842v1.
M. Adolfson, S. Laséen, J. Lindé, and M. Villani. RAMSES – a new general equilibrium model for monetary policy analysis. Sveriges Riksbank Economic Review, 2, 2007a.
M. Adolfson, S. Laséen, J. Lindé, and M. Villani. Bayesian estimation of an open economy DSGE model with incomplete pass-through. Journal of International Economics, 72(2):481–511, 2007b.
M. Adolfson, S. Laséen, L. Christiano, M. Trabandt, and K. Walentin. RAMSES II – model description. Sveriges Riksbank Occasional Paper Series, 12, 2013.
S. Ahn, A. Korattikara, and M. Welling. Bayesian posterior sampling via stochastic gradient Fisher scoring. In Proceedings of the 29th International Conference on Machine Learning (ICML), pages 1591–1598, Edinburgh, Scotland, July 2012.
J. H. Albert. Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational and Behavioral Statistics, 17(3):251–269, 1992.
B. D. O. Anderson and J. B. Moore. Optimal filtering. Courier Publications, 2005.
C. Andrieu and G. O. Roberts. The pseudo-marginal approach for efficient Monte Carlo computations. The Annals of Statistics, 37(2):697–725, 2009.
C. Andrieu and J. Thoms. A tutorial on adaptive MCMC. Statistics and Computing, 18(4):343–373, 2008.
C. Andrieu, A. Doucet, and R. Holenstein. Particle Markov chain Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(3):269–342, 2010.
E. Angelino, M. J. Johnson, and R. P. Adams. Patterns of scalable Bayesian inference. Pre-print, 2016. arXiv:1602.05221v1.
C. E. Antoniak. Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. The Annals of Statistics, 2(6):1152–1174, 1974.


B. H. Baltagi. Econometric analysis of panel data. John Wiley & Sons, 2008.
R. Bardenet, A. Doucet, and C. Holmes. On Markov chain Monte Carlo methods for tall data. Pre-print, 2015. arXiv:1505.02827v1.
T. Bayes. An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London, 53:370–418, 1764.
A. L. Beam, S. K. Ghosh, and J. Doyle. Fast Hamiltonian Monte Carlo using GPU computing. Pre-print, 2014. arXiv:1402.4089v1.
M. A. Beaumont. Estimation of population growth or decline in genetically monitored populations. Genetics, 164(3):1139–1160, 2003.
J. O. Berger. Statistical decision theory and Bayesian analysis. Springer Verlag, 1985.
A. Beskos, G. Roberts, A. Stuart, and J. Voss. MCMC methods for diffusion bridges. Stochastics and Dynamics, 8(03):319–350, 2008.
A. Beskos, D. Crisan, A. Jasra, K. Kamatani, and Y. Zhou. A stable particle filter in high-dimensions. Pre-print, 2014. arXiv:1412.3501v1.
J. Bezanson, A. Edelman, S. Karpinski, and V. B. Shah. Julia: A fresh approach to numerical computing. Pre-print, 2014. arXiv:1411.1607v4.
P. Billingsley. Probability and measure. Wiley series in probability and statistics. John Wiley & Sons, 3 edition, 2012.
H. J. B. Birks. Numerical tools in palaeolimnology – progress, potentialities, and problems. Journal of Paleolimnology, 20(4):307–332, 1998.
C. M. Bishop. Pattern recognition and machine learning. Springer Verlag, New York, USA, 2006.
D. Blackwell and J. B. MacQueen. Ferguson distributions via Pólya urn schemes. The Annals of Statistics, 1(2):353–355, 1973.
D. M. Blei, A. Kucukelbir, and J. D. McAuliffe. Variational inference: A review for statisticians. Pre-print, 2016. arXiv:1601.00670v1.
A. Bouchard-Côté, S. Sankararaman, and M. I. Jordan. Phylogenetic inference via sequential Monte Carlo. Systematic Biology, 2012.
P. Boyle. Gaussian processes for regression and optimisation. PhD thesis, Victoria University of Wellington, 2007.
F-X. Briol, C. J. Oates, M. Girolami, M. A. Osborne, and D. Sejdinovic. Probabilistic integration. Pre-print, 2015. arXiv:1510.00933v1.
A. Brockwell, P. Del Moral, and A. Doucet. Sequentially interacting Markov chain Monte Carlo methods. The Annals of Statistics, 38(6):3387–3411, 2010.
P. J. Brockwell and R. A. Davis. Introduction to time series and forecasting. Springer Verlag, 2002.

M. Burda and M. Harding. Panel probit with flexible correlated effects: quantifying technology spillovers in the presence of latent heterogeneity. Journal of Applied Econometrics, 28(6):956–981, 2013.
W. S. Bush and J. H. Moore. Genome-wide association studies. PLoS Computational Biology, 8(12):e1002822, 2012.
O. Cappé, A. Guillin, J-M. Marin, and C. P. Robert. Population Monte Carlo. Journal of Computational and Graphical Statistics, 13(4), 2004.
O. Cappé, E. Moulines, and T. Rydén. Inference in hidden Markov models. Springer Verlag, 2005.
C. M. Carvalho, M. S. Johannes, H. F. Lopes, and N. G. Polson. Particle learning and smoothing. Statistical Science, 25(1):88–106, 2010.
G. Casella and R. L. Berger. Statistical Inference. Duxbury Press, 2 edition, 2001.
N. Chopin. Central limit theorem for sequential Monte Carlo methods and its application to Bayesian inference. The Annals of Statistics, 32(6):2385–2411, 2004.
N. Chopin, P. E. Jacob, and O. Papaspiliopoulos. SMC2: an efficient algorithm for sequential analysis of state space models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75(3):397–426, 2013.
J. Claerbout and M. Karrenbach. Electronic documents give reproducible research a new meaning. In Proceedings of the 62nd Annual International Meeting of the Society of Exploration Geophysics, pages 601–604, New Orleans, USA, October 1992.
M. K. Condliff, D. D. Lewis, D. Madigan, and C. Posse. Bayesian mixed-effects models for recommender systems. In Proceedings of ACM SIGIR'99 Workshop on Recommender Systems, Berkeley, USA, August 1999.
J-M. Cornuet, J-M. Marin, A. Mira, and C. P. Robert. Adaptive multiple importance sampling. Pre-print, 2011. arXiv:0907.1254v5.
S. L. Cotter, G. O. Roberts, A. M. Stuart, and D. White. MCMC methods for functions: modifying old algorithms to make them faster. Statistical Science, 28(3):424–446, 2013.
N. Cressie. Statistics for spatial data. Wiley, 1993.
D. Crisan and A. Doucet. A survey of convergence results on particle filtering methods for practitioners. IEEE Transactions on Signal Processing, 50(3):736–746, 2002.
J. Dahlin. Sequential Monte Carlo for inference in nonlinear state space models. Licentiate's thesis no. 1652, Linköping University, May 2014.
J. Dahlin and F. Lindsten. Particle filter-based Gaussian process optimisation for parameter inference. In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, August 2014.
J. Dahlin and T. B. Schön. Getting started with particle Metropolis-Hastings for inference in nonlinear models. Pre-print, 2015. arXiv:1511.01707v4.

J. Dahlin and P. Svenson. A method for community detection in uncertain networks. In Proceedings of 2011 European Intelligence and Security Informatics Conference, Athens, Greece, August 2011.

J. Dahlin and P. Svenson. Ensemble approaches for improving community detection methods. Pre-print, 2013. arXiv:1309.0242v1.

J. Dahlin, F. Johansson, L. Kaati, C. Mårtensson, and P. Svenson. A method for community detection in uncertain networks. In Proceedings of International Symposium on Foundation of Open Source Intelligence and Security Informatics 2012, Istanbul, Turkey, August 2012a.

J. Dahlin, F. Lindsten, T. B. Schön, and A. Wills. Hierarchical Bayesian ARX models for robust inference. In Proceedings of the 16th IFAC Symposium on System Identification (SYSID), Brussels, Belgium, July 2012b.

J. Dahlin, F. Lindsten, and T. B. Schön. Particle Metropolis Hastings using Langevin dynamics. In Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, Canada, May 2013a.

J. Dahlin, F. Lindsten, and T. B. Schön. Inference in Gaussian models with missing data using equalisation maximisation. Pre-print, 2013b. arXiv:1308.4601v1.

J. Dahlin, F. Lindsten, and T. B. Schön. Second-order particle MCMC for Bayesian parameter inference. In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, August 2014a.

J. Dahlin, T. B. Schön, and M. Villani. Approximate inference in state space models with intractable likelihoods using Gaussian process optimisation. Technical Report LiTH-ISY-R-3075, Department of Electrical Engineering, Linköping University, Linköping, Sweden, April 2014b.

J. Dahlin, F. Lindsten, J. Kronander, and T. B. Schön. Accelerating pseudo-marginal Metropolis-Hastings by correlating auxiliary variables. Pre-print, 2015a. arXiv:1512.05483v1.

J. Dahlin, F. Lindsten, and T. B. Schön. Particle Metropolis-Hastings using gradient and Hessian information. Statistics and Computing, 25(1):81–92, 2015b.

J. Dahlin, F. Lindsten, and T. B. Schön. Quasi-Newton particle Metropolis-Hastings. In Proceedings of the 17th IFAC Symposium on System Identification (SYSID), pages 981–986, Beijing, China, October 2015c.

J. Dahlin, M. Villani, and T. B. Schön. Efficient approximate Bayesian inference for models with intractable likelihoods. Pre-print, 2015d. arXiv:1506.06975v1.

J. Dahlin, R. Kohn, and T. B. Schön. Bayesian inference for mixed effects models with heterogeneity. Technical Report LiTH-ISY-R-3091, Department of Electrical Engineering, Linköping University, Linköping, Sweden, March 2016.

A. C. Davison and D. V. Hinkley. Bootstrap methods and their application. Cambridge University Press, 1997.

T. A. Dean and S. S. Singh. Asymptotic behaviour of approximate Bayesian estimators. Pre-print, 2011. arXiv:1105.3655v1.
P. Debevec. Rendering synthetic objects into real scenes: bridging traditional and image-based graphics with global illumination and high dynamic range photography. In Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques, pages 189–198, Orlando, USA, July 1998.
P. Del Moral. Feynman-Kac formulae - genealogical and interacting particle systems with applications. Springer Verlag, 2004.
P. Del Moral. Mean field simulation for Monte Carlo integration. CRC Press, 2013.
P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(3):411–436, 2006.
P. Del Moral, A. Doucet, and S. Singh. Forward smoothing using sequential Monte Carlo. Pre-print, 2010. arXiv:1012.5390v1.
G. Deligiannidis, A. Doucet, and M. K. Pitt. The correlated pseudo-marginal method. Pre-print, 2015. arXiv:1511.04992v2.
A. Dempster, N. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 39(1):1–38, 1977.
P. Diaconis, S. Holmes, and R. Neal. Analysis of a nonreversible Markov chain sampler. The Annals of Applied Probability, 10(3):685–1064, 2000.
D. L. Donoho. An invitation to reproducible computational research. Biostatistics, 11(3):385–388, 2010.
R. Douc and O. Cappé. Comparison of resampling schemes for particle filtering. In Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis (ISPA), pages 64–69, Zagreb, Croatia, September 2005.
R. Douc, E. Moulines, and D. S. Stoffer. Nonlinear time series: theory, methods and applications with R examples. CRC Press, 2014.
A. Doucet and A. Johansen. A tutorial on particle filtering and smoothing: Fifteen years later. In D. Crisan and B. Rozovsky, editors, The Oxford Handbook of Nonlinear Filtering. Oxford University Press, 2011.
A. Doucet, S. Godsill, and C. Andrieu. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10(3):197–208, 2000.
A. Doucet, P. E. Jacob, and S. Rubenthaler. Derivative-free estimation of the score vector and observed information matrix with application to state-space models. Pre-print, 2013. arXiv:1304.5768v2.
A. Doucet, M. K. Pitt, G. Deligiannidis, and R. Kohn. Efficient implementation of Markov chain Monte Carlo when using an unbiased likelihood estimator. Biometrika, 102(2):295–313, 2015.

S. Duane, A. D. Kennedy, B. J. Pendleton, and D. Roweth. Hybrid Monte Carlo. Physics Letters B, 195(2):216–222, 1987.
J. Durbin and S. J. Koopman. Time series analysis by state space methods. Oxford University Press, 2 edition, 2012.
R. Eckhardt. Stan Ulam, John von Neumann, and the Monte Carlo method. Los Alamos Science, 15:131–136, 1987.
B. Efron. Bootstrap methods: another look at the jackknife. The Annals of Statistics, 7(1):1–26, 1979.
P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling extremal events. Springer Verlag, 1997.
P. Fearnhead. Markov chain Monte Carlo, sufficient statistics, and particle filters. Journal of Computational and Graphical Statistics, 11(4):848–862, 2002.
P. Fearnhead. Particle filters for mixture models with an unknown number of components. Statistics and Computing, 14(1):11–21, 2004.
T. S. Ferguson. A Bayesian analysis of some nonparametric problems. The Annals of Statistics, 1(2):209–230, 1973.
T. S. Ferguson. Prior distributions on spaces of probability measures. The Annals of Statistics, 2(4):615–629, 1974.
J. Fernández-Villaverde and J. F. Rubio-Ramírez. Estimating macroeconomic models: A likelihood approach. The Review of Economic Studies, 74(4):1059–1087, 2007.
A. Finke. On extended state-space constructions for Monte Carlo methods. PhD thesis, University of Warwick, 2015.
R. A. Fisher. On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London Series A, 222:309–368, 1922.
G. Fitzmaurice, M. Davidian, G. Verbeke, and G. Molenberghs. Longitudinal data analysis. CRC Press, 2008.
S. Fomel and J. F. Claerbout. Reproducible research. Computing in Science & Engineering, 11(1):5–40, 2009.
E. B. Fox. Bayesian nonparametric learning of complex dynamical phenomena. PhD thesis, Massachusetts Institute of Technology, 2009.
J-P. Fox. Bayesian item response modeling: Theory and applications. Springer Verlag, 2010.
A. Gelman. Bayesian model-building by pure thought: some principles and examples. Statistica Sinica, 6(1):215–232, 1996.
A. Gelman, X-L. Meng, and H. Stern. Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 6(4):733–760, 1996.
A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin. Bayesian data analysis. Chapman & Hall/CRC, 3 edition, 2013.

S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721–741, 1984.
M. Gerber and N. Chopin. Sequential quasi Monte Carlo. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 77(3):509–579, 2015.
Z. Ghahramani. Probabilistic machine learning and artificial intelligence. Nature, 521(7553):452–459, 2015.
A. Ghosh, A. Doucet, and W. Heidrich. Sequential sampling for dynamic environment map illumination. In Proceedings of the 17th Eurographics Conference on Rendering Techniques, pages 115–126, Nicosia, Cyprus, June 2006.
M. Girolami and B. Calderhead. Riemann manifold Langevin and Hamiltonian Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(2):1–37, 2011.
P. Glasserman. Monte Carlo methods in financial engineering. Springer Verlag, 2004.
S. J. Godsill, A. Doucet, and M. West. Monte Carlo smoothing for nonlinear time series. Journal of the American Statistical Association, 99(465):156–168, March 2004.
N. J. Gordon, D. J. Salmond, and A. F. M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEEE Proceedings of Radar and Signal Processing, 140(2):107–113, 1993.
P. J. Green. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82(4):711–732, 1995.
W. H. Greene. Econometric analysis. Prentice Hall, 2008.
M. U. Gutmann and J. Corander. Bayesian optimization for likelihood-free inference of simulator-based statistical models. Pre-print, 2015. arXiv:1501.03291v1.
M. Hairer, A. M. Stuart, and S. J. Vollmer. Spectral gaps for a Metropolis-Hastings algorithm in infinite dimensions. The Annals of Applied Probability, 24(6):2455–2490, 2014.
D. I. Hastie, S. Liverani, and S. Richardson. Sampling from Dirichlet process mixture models with unknown concentration parameter: mixing issues in large data implementations. Statistics and Computing, 25(5):1023–1037, 2015.
T. Hastie, R. Tibshirani, and J. Friedman. The elements of statistical learning. Springer Verlag, 2009.
W. K. Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1):97–109, 1970.
P. Hennig. Fast probabilistic optimization from noisy gradients. In Proceedings of the 30th International Conference on Machine Learning (ICML), Atlanta, USA, June 2013.
S. Henriksen, A. Wills, T. B. Schön, and B. Ninness. Parallel implementation of particle MCMC methods on a GPU. In Proceedings of the 16th IFAC Symposium on System Identification (SYSID), Brussels, Belgium, July 2012.

J. Hensman, N. Fusi, and N. D. Lawrence. Gaussian processes for big data. In Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI), Bellevue, USA, July 2013.

N. L. Hjort, C. Holmes, P. Müller, and S. G. Walker. Bayesian nonparametrics. Cambridge University Press, 2010.

A. E. Hoerl and R. W. Kennard. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1):55–67, 1970.

M. Hoffman, D. M. Blei, C. Wang, and J. Paisley. Stochastic variational inference. Pre-print, 2012. arXiv:1206.7051v3.

J. D. Hol, T. B. Schön, and F. Gustafsson. On resampling algorithms for particle filters. In Proceedings of the Nonlinear Statistical Signal Processing Workshop, Cambridge, UK, September 2006.

D. Hultqvist, J. Roll, F. Svensson, J. Dahlin, and T. B. Schön. Detection and positioning of overtaking vehicles using 1D optical flow. In Proceedings of the IEEE Intelligent Vehicles (IV) Symposium, Dearborn, USA, June 2014.

H. Ishwaran and M. Zarepour. Dirichlet prior sieves in finite normal mixtures. Statistica Sinica, 12(3):941–963, 2002.

P. E. Jacob and A. H. Thiery. On nonnegative unbiased estimators. The Annals of Statistics, 43(2):769–784, 2015.

A. Jasra, S. S. Singh, J. S. Martin, and E. McCoy. Filtering via approximate Bayesian computation. Statistics and Computing, 22(6):1223–1237, 2012.

G. L. Jones. On the Markov chain central limit theorem. Probability surveys, 1:299–320, 2004.

T. Kailath, A. H. Sayed, and B. Hassibi. Linear estimation. Prentice Hall, 2000.

R. E. Kalman. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1):35–45, 1960.

G. Kitagawa. Monte Carlo filter and smoother for non-Gaussian nonlinear state space models. Journal of Computational and Graphical Statistics, 5(1):1–25, 1996.

G. Kitagawa and S. Sato. Monte Carlo smoothing and self-organising state-space model. In A. Doucet, N. de Freitas, and N. Gordon, editors, Sequential Monte Carlo methods in practice, pages 177–195. Springer Verlag, 2001.

M. Kok, J. Dahlin, T. B. Schön, and A. Wills. Newton-based maximum likelihood estimation in nonlinear state space models. In Proceedings of the 17th IFAC Symposium on System Identification (SYSID), pages 398–403, Beijing, China, October 2015.

D. Koller and N. Friedman. Probabilistic graphical models: principles and techniques. MIT press, 2009.

Y. Koren. The BellKor solution to the Netflix grand prize. Netflix prize documentation, 2009. URL http://www.netflixprize.com/assets/GrandPrize2009_BPC_BellKor.pdf.

J. Kronander and T. B. Schön. Robust auxiliary particle filters using multiple importance sampling. In Proceedings of the 2014 IEEE Statistical Signal Processing Workshop (SSP), Gold Coast, Australia, July 2014.

J. Kronander, J. Dahlin, D. Jönsson, M. Kok, T. B. Schön, and J. Unger. Real-time video based lighting using GPU raytracing. In Proceedings of the 2014 European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, September 2014a.

J. Kronander, T. B. Schön, and J. Dahlin. Backward sequential Monte Carlo for marginal smoothing. In Proceedings of the 2014 IEEE Statistical Signal Processing Workshop (SSP), Gold Coast, Australia, July 2014b.

R. Langrock. Some applications of nonlinear and non-Gaussian state–space modelling by means of hidden Markov models. Journal of Applied Statistics, 38(12):2955–2970, 2011.

P-S. Laplace. Essai philosophique sur les probabilités. Académie des Sciences, Oeuvres complétes de Laplace, 7, 1886.

B. Larget and D. L. Simon. Markov chain Monte Carlo algorithms for the Bayesian analysis of phylogenetic trees. Molecular Biology and Evolution, 16(6):750–759, 1999.

Q. V. Le, A. J. Smola, and S. Canu. Heteroscedastic Gaussian process regression. In Proceedings of the 22nd International Conference on Machine Learning (ICML), pages 489–496, Bonn, Germany, August 2005.

E. L. Lehmann and G. Casella. Theory of point estimation. Springer Verlag, 1998.

F. Lindsten and T. B. Schön. Backward simulation methods for Monte Carlo statistical inference. In Foundations and Trends in Machine Learning, volume 6, pages 1–143, August 2013.

S. Livingstone and M. Girolami. Information-geometric Markov chain Monte Carlo methods using diffusions. Entropy, 16(6):3074–3102, 2014.

D. J. Lizotte. Practical Bayesian optimization. PhD thesis, University of Alberta, 2008.

L. Ljung. System identification: theory for the user. Prentice Hall, 1999.

V. K. Mansinghka, D. Selsam, and Y. N. Perov. Venture: a higher-order probabilistic programming platform with programmable inference. Pre-print, 2014. arXiv:1404.0099.

J-M. Marin, P. Pudlo, C. P. Robert, and R. J. Ryder. Approximate Bayesian computational methods. Statistics and Computing, 22(6):1167–1180, 2012.

F. Markowetz. Five selfish reasons to work reproducibly. Genome biology, 16(1):1–4, 2015.

A. Marshall. The use of multi-stage sampling schemes in Monte Carlo simulations. In M. Meyer, editor, Symposium on Monte Carlo Methods, pages 123–140. Wiley, 1956.

A. Martin, K. Quinn, and J. H. Park. MCMCpack: Markov chain Monte Carlo in R. Journal of Statistical Software, 42(1):1–21, 2011.
A. D. Martin and K. M. Quinn. Dynamic ideal point estimation via Markov chain Monte Carlo for the US Supreme Court, 1953–1999. Political Analysis, 10(2):134–153, 2002.
A. D. Martin and K. M. Quinn. Assessing preference change on the US Supreme Court. Journal of Law, Economics, and Organization, 23(2):365–385, 2007.
G. Matheron. Principles of geostatistics. Economic Geology, 58(8):1246–1266, 1963.
M. Matsumoto and T. Nishimura. Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation (TOMACS), 8(1):3–30, 1998.
D. Q. Mayne. Model predictive control: Recent developments and future promise. Automatica, 50(12):2967–2986, 2014.
P. McCullagh and J. A. Nelder. Generalized linear models. Chapman Hall/CRC, 1989.
G. J. McLachlan and T. Krishnan. The EM algorithm and extensions. Wiley-Interscience, 2 edition, 2008.
A. J. McNeil, R. Frey, and P. Embrechts. Quantitative risk management: concepts, techniques, and tools. Princeton University Press, 2010.
E. Meeds and M. Welling. GPS-ABC: Gaussian process surrogate approximate Bayesian computation. In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI), Quebec City, Canada, July 2014.
N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller. Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21(6):1087–1092, 1953.
S. P. Meyn and R. L. Tweedie. Markov chains and stochastic stability. Cambridge University Press, 2009.
A. Mira, R. Solgi, and D. Imparato. Zero variance Markov chain Monte Carlo for Bayesian estimators. Statistics and Computing, 23(5):653–662, 2013.
J. Mockus, V. Tiesis, and A. Zilinskas. The application of Bayesian methods for seeking the extremum. In L. C. W. Dixon and G. P. Szego, editors, Toward Global Optimization, pages 117–129. North-Holland, 1978.
C. Monteleoni, G. A. Schmidt, S. Saroha, and E. Asplund. Tracking climate models. Statistical Analysis and Data Mining, 4(4):372–392, 2011.
K. P. Murphy. Machine learning: a probabilistic perspective. The MIT Press, 2012.
I. Murray and M. M. Graham. Pseudo-marginal slice sampling. Pre-print, 2015. arXiv:1510.02958v1.
L. M. Murray. Bayesian state-space modelling on high-performance hardware using LibBi. Pre-print, 2013. arXiv:1306.3277.

L. M. Murray, A. Lee, and P. E. Jacob. Parallel resampling in the particle filter. Journal of Computational and Graphical Statistics (accepted for publication), 2015.

C. A. Naesseth, F. Lindsten, and T. B. Schön. Sequential Monte Carlo for graphical models. In Proceedings of the 2014 Conference on Neural Information Processing Systems (NIPS), Montreal, Canada, December 2014.

C. A. Naesseth, F. Lindsten, and T. B. Schön. Nested sequential Monte Carlo methods. In Proceedings of the 32nd International Conference on Machine Learning (ICML), Lille, France, July 2015.

R. M. Neal. Slice sampling. The Annals of Statistics, 31(3):705–741, 2003.

R. M. Neal. MCMC using Hamiltonian dynamics. In S. Brooks, A. Gelman, G. Jones, and X-L. Meng, editors, Handbook of Markov Chain Monte Carlo. Chapman & Hall/CRC Press, 2010.

W. Neiswanger, C. Wang, and E. Xing. Asymptotically exact, embarrassingly parallel MCMC. Pre-print, July 2014.

R. B. Nelsen. An introduction to copulas. Springer Verlag, 2007.

C. Nemeth, C. Sherlock, and P. Fearnhead. Particle Metropolis adjusted Langevin algorithms. Pre-print, 2014. arXiv:1412.7299v1.

E. Neuwirth. RColorBrewer: ColorBrewer Palettes, 2014. URL https://CRAN.R-project. org/package=RColorBrewer. R package version 1.1-2.

H. Niederreiter. Quasi-Monte Carlo methods. In Encyclopedia of Quantitative Finance. John Wiley & Sons, 2010.

J. Nocedal and S. Wright. Numerical optimization. Springer Verlag, 2 edition, 2006.

J. Nolan. Stable distributions: models for heavy-tailed data. Birkhauser, 2003.

P. Orbanz. Construction of nonparametric Bayesian models from parametric Bayes equations. In Proceedings of the 2009 Conference on Neural Information Processing Systems (NIPS), Vancouver, Canada, December 2009.

M. Osborne. Bayesian Gaussian processes for sequential prediction, optimisation and quadrature. PhD thesis, University of Oxford, 2010.

M. A. Osborne, R. Garnett, S. J. Roberts, C. Hart, S. Aigrain, and N. Gibson. Bayesian quadrature for ratios. In Proceedings of the 15th International Conference on Artificial Intelligence and Statistics (AISTATS), pages 832–840, La Palma, Canary Islands, Spain, April 2012.

A. B. Owen. Monte Carlo theory, methods and examples. Book manuscript, 2013. URL http://statweb.stanford.edu/~owen/mc/.

T. Papamarkou, A. Mira, and M. Girolami. Zero variance differential geometric Markov chain Monte Carlo algorithms. Bayesian Analysis, 9(1):97–128, 2014.

F. Pérez and B. E. Granger. IPython: a system for interactive scientific computing. Computing in Science and Engineering, 9(3):21–29, 2007.

G. W. Peters, G. R. Hosack, and K. R. Hayes. Ecological non-linear state space model selection via adaptive particle Markov chain Monte Carlo. Pre-print, 2010. arXiv:1005.2238v1.

M. Pharr and G. Humphreys. Physically based rendering: from theory to implementation. Morgan Kaufmann, 2010.

A. W. Phillips. The relation between unemployment and the rate of change of money wage rates in the United Kingdom, 1861-1957. Economica, 25(100):283–299, 1958.

M. K. Pitt, R. S. Silva, P. Giordani, and R. Kohn. On some properties of Markov chain Monte Carlo simulation methods based on the particle filter. Journal of Econometrics, 171 (2):134–151, 2012.

G. Poyiadjis, A. Doucet, and S. S. Singh. Particle approximations of the score and observed information matrix in state space models with application to parameter estimation. Biometrika, 98(1):65–80, 2011.

M. Quiroz, M. Villani, and R. Kohn. Speeding up MCMC by efficient data subsampling. Pre-print, 2016. arXiv:1404.4178v3.

R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, 2015. URL https://www.R-project.org/.

W. Raghupathi and V. Raghupathi. Big data analytics in healthcare: promise and potential. Health Information Science and Systems, 2(1):3, 2014.

C. E. Rasmussen and C. K. I. Williams. Gaussian processes for machine learning. MIT Press, 2006.

D. A. Rasmussen, O. Ratmann, and K. Koelle. Inference for nonlinear epidemiological models using genealogies and time series. PLoS Computational Biology, 7(8):1–11, 2011.

P. Rebeschini and R. van Handel. Can local particle filters beat the curse of dimensionality? The Annals of Applied Probability, 25(5):2809–2866, 2015.

C. P. Robert. The Bayesian choice. Springer Verlag, 2007.

C. P. Robert and G. Casella. Monte Carlo statistical methods. Springer Verlag, 2 edition, 2004.

C. P. Robert and G. Casella. Introducing Monte Carlo methods with R. Springer Verlag, 2009.

G. O. Roberts and O. Stramer. Langevin diffusions and Metropolis-Hastings algorithms. Methodology and Computing in Applied Probability, 4(4):337–357, 2003.

G. O. Roberts and R. L. Tweedie. Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli, 2(4):341–363, 1996.

G. O. Roberts, A. Gelman, and W. R. Gilks. Weak convergence and optimal scaling of random walk Metropolis algorithms. The Annals of Applied Probability, 7(1):110–120, 1997.
J. Rousseau and K. Mengersen. Asymptotic behaviour of the posterior distribution in overfitted mixture models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(5):689–710, 2011.
T. B. Schön, F. Lindsten, J. Dahlin, J. Wågberg, C. A. Naesseth, A. Svensson, and L. Dai. Sequential Monte Carlo methods for system identification. In Proceedings of the 17th IFAC Symposium on System Identification (SYSID), pages 775–786, Beijing, China, October 2015.
M. Segal and E. Weinstein. A new method for evaluating the log-likelihood gradient, the Hessian, and the Fisher information matrix for linear dynamic systems. IEEE Transactions on Information Theory, 35(3):682–687, 1989.
J. Sethuraman. A constructive definition of Dirichlet priors. Statistica Sinica, 4:639–650, 1994.
A. Shah, A. G. Wilson, and Z. Ghahramani. Student-t processes as alternatives to Gaussian processes. In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS), pages 832–840, Reykjavik, Iceland, April 2014.
C. Sherlock, A. H. Thiery, G. O. Roberts, and J. S. Rosenthal. On the efficiency of pseudo-marginal random walk Metropolis algorithms. The Annals of Statistics, 43(1):238–275, 2015.
R. H. Shumway and D. S. Stoffer. Time series analysis and its applications. Springer Verlag, 3 edition, 2011.
J. Snoek, H. Larochelle, and R. P. Adams. Practical Bayesian optimization of machine learning algorithms. In Proceedings of the 2012 Conference on Neural Information Processing Systems (NIPS), Lake Tahoe, USA, December 2012.
I. M. Sobol. On the distribution of points in a cube and the approximate evaluation of integrals. Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki, 7(4):784–802, 1967.
H. J. Spaeth, L. Epstein, A. D. Martin, J. A. Segal, T. J. Ruger, and S. C. Benesh. Supreme Court database. Version 2016 Release 01, 2016. URL http://supremecourtdatabase.org.
J. C. Spall. A stochastic approximation technique for generating maximum likelihood parameter estimates. In Proceedings of the 6th American Control Conference (ACC), pages 1161–1167, Minneapolis, USA, June 1987.
J. C. Spall. Implementation of the simultaneous perturbation algorithm for stochastic optimization. IEEE Transactions on Aerospace and Electronic Systems, 34(3):817–823, 1998.
D. J. Spiegelhalter. Incorporating Bayesian ideas into health-care evaluation. Statistical Science, 19(1):156–174, 2004.
Stan Development Team. Stan: A C++ library for probability and sampling, version 2.8.0, 2015. URL http://mc-stan.org/.

D. H. Stern, R. Herbrich, and T. Graepel. Matchbox: large scale online Bayesian recommendations. In Proceedings of the 18th International Conference on World Wide Web, pages 111–120, Madrid, Spain, April 2009.

L. Stewart and P. McCarty, Jr. Use of Bayesian belief networks to fuse continuous and discrete information for target recognition, tracking, and situation assessment. In Proceedings of SPIE Signal Processing, Sensor Fusion and Target Recognition, pages 177–185, Orlando, USA, April 1992.

J. Stoer and R. Bulirsch. Introduction to numerical analysis. Springer Verlag, 2 edition, 1993.

G. Storvik. Particle filters for state-space models with the presence of unknown static parameters. IEEE Transactions on Signal Processing, 50(2):281–289, 2002.

A. Svensson, J. Dahlin, and T. B. Schön. Marginalizing Gaussian process hyperparameters using sequential Monte Carlo. In Proceedings of the 6th IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), Cancun, Mexico, December 2015.

R. Tibshirani. Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 58(1):267–288, 1996.

L. Tierney. Markov chains for exploring posterior distributions. The Annals of Statistics, 22 (4):1701–1728, 1994.

M. K. Titsias and N. D. Lawrence. Bayesian Gaussian process latent variable model. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS), pages 844–851, Sardinia, Italy, May 2010.

R. S. Tsay. Analysis of financial time series. John Wiley & Sons, 2 edition, 2005.

V. Turri, B. Besselink, and K. H. Johansson. Cooperative look-ahead control for fuel-efficient and safe heavy-duty vehicle platooning. Pre-print, 2015. arXiv:1505.00447v1.

Y. Ulker, B. Gunsel, and A. T. Cemgil. Sequential Monte Carlo samplers for Dirichlet process mixtures. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS), pages 876–883, Sardinia, Italy, May 2010.

J. Unger, J. Kronander, P. Larsson, S. Gustavson, J. Löw, and A. Ynnerman. Spatially varying image based lighting using HDR-video. Computers & Graphics, 37(7):923–934, 2013.

P. E. Valenzuela, J. Dahlin, C. R. Rojas, and T. B. Schön. A graph/particle-based method for experiment design in nonlinear systems. In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, August 2014.

P. E. Valenzuela, J. Dahlin, C. R. Rojas, and T. B. Schön. On robust input design for nonlinear dynamical models. Automatica, 2016a. (provisionally accepted).

P. E. Valenzuela, J. Dahlin, C. R. Rojas, and T. B. Schön. Particle-based Gaussian process optimization for input design in nonlinear dynamical models. Pre-print, 2016b. arXiv:1603.05445v1.

A. W. Van der Vaart. Asymptotic statistics. Cambridge University Press, 2000.
P. Vandewalle, J. Kovačević, and M. Vetterli. Reproducible research in signal processing. IEEE Signal Processing Magazine, 26(3):37–47, 2009.
E. Veach and L. J. Guibas. Optimally combining sampling techniques for Monte Carlo rendering. In Proceedings of the 22nd Annual Conference on Computer Graphics, pages 419–428, Los Angeles, USA, August 1995.
J. Wågberg, F. Lindsten, and T. B. Schön. Bayesian nonparametric identification of piecewise affine ARX systems. In Proceedings of the 17th IFAC Symposium on System Identification (SYSID), pages 709–714, Beijing, China, October 2015.
M. Welling and Y. W. Teh. Bayesian learning via stochastic gradient Langevin dynamics. In Proceedings of the 28th International Conference on Machine Learning (ICML), pages 681–688, Bellevue, USA, July 2011.
N. Whiteley. Stability properties of some particle filters. The Annals of Applied Probability, 23(6):2500–2537, 2013.
F. Wood, J. W. van de Meent, and V. Mansinghka. A new approach to probabilistic programming inference. In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS), pages 1024–1032, Reykjavik, Iceland, April 2014.
Y. Xie. knitr: a comprehensive tool for reproducible research in R. In V. Stodden, F. Leisch, and R. D. Peng, editors, Implementing reproducible computational research. Chapman and Hall/CRC, 2014.
E. P. Xing, M. I. Jordan, and R. Sharan. Bayesian haplotype inference via the Dirichlet process. Journal of Computational Biology, 14(3):267–284, 2007.
H. Xue, F. Gu, and X. Hu. Data assimilation using sequential Monte Carlo methods in wildfire spread simulation. ACM Transactions on Modeling and Computer Simulation (TOMACS), 22(4):23, 2012.
J. Yang, N. A. Zaitlen, M. E. Goddard, P. M. Visscher, and A. L. Price. Advantages and pitfalls in the application of mixed-model association methods. Nature Genetics, 46(2):100–106, 2014.
Z. Zhang, E. Ersoz, C-Q. Lai, R. J. Todhunter, H. K. Tiwari, M. A. Gore, P. J. Bradbury, J. Yu, D. K. Arnett, J. M. Ordovas, and E. S. Buckler. Mixed linear model approach adapted for genome-wide association studies. Nature Genetics, 42(4):355–360, 2010.
H. S. Zhou. Modified backward sampling smoothing with EM algorithm - application to economics and finance. Pre-print, 2013. Unpublished report.
H. Zou and T. Hastie. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2):301–320, 2005.

Part II

Papers


The articles associated with this thesis have been removed for copyright reasons. For more details about these see: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-125992

M. Millnert: Identification and control of systems subject to abrupt changes. Thesis No. 82, 1982. ISBN 91-7372-542-0. A. J. M. van Overbeek: On-line structure selection for the identification of multivariable systems. Thesis No. 86, 1982. ISBN 91-7372-586-2. B. Bengtsson: On some control problems for queues. Thesis No. 87, 1982. ISBN 91-7372-593-5. S. Ljung: Fast algorithms for integral equations and least squares identification problems. Thesis No. 93, 1983. ISBN 91-7372-641-9. H. Jonson: A Newton method for solving non-linear optimal control problems with general con- straints. Thesis No. 104, 1983. ISBN 91-7372-718-0. E. Trulsson: Adaptive control based on explicit criterion minimization. Thesis No. 106, 1983. ISBN 91- 7372-728-8. K. Nordström: Uncertainty, robustness and sensitivity reduction in the design of single input control systems. Thesis No. 162, 1987. ISBN 91-7870-170-8. B. Wahlberg: On the identification and approximation of linear systems. Thesis No. 163, 1987. ISBN 91-7870-175-9. S. Gunnarsson: domain aspects of modeling and control in adaptive systems. Thesis No. 194, 1988. ISBN 91-7870-380-8. A. Isaksson: On system identification in one and two dimensions with signal processing applications. Thesis No. 196, 1988. ISBN 91-7870-383-2. M. Viberg: Subspace fitting concepts in sensor array processing. Thesis No. 217, 1989. ISBN 91-7870- 529-0. K. Forsman: Constructive commutative algebra in nonlinear control theory. Thesis No. 261, 1991. ISBN 91-7870-827-3. F. Gustafsson: Estimation of discrete parameters in linear systems. Thesis No. 271, 1992. ISBN 91- 7870-876-1. P. Nagy: Tools for knowledge-based signal processing with applications to system identification. Thesis No. 280, 1992. ISBN 91-7870-962-8. T. Svensson: Mathematical tools and software for analysis and design of nonlinear control systems. Thesis No. 285, 1992. ISBN 91-7870-989-X. S. Andersson: On dimension reduction in sensor array signal processing. Thesis No. 290, 1992. ISBN 91-7871-015-4. H. Hjalmarsson: Aspects on incomplete modeling in system identification. Thesis No. 298, 1993. ISBN 91-7871-070-7. I. Klein: Automatic synthesis of sequential control schemes. Thesis No. 305, 1993. ISBN 91-7871-090-1. J.-E. Strömberg: A mode switching modelling philosophy. Thesis No. 353, 1994. ISBN 91-7871-430-3. K. Wang Chen: Transformation and symbolic calculations in filtering and control. Thesis No. 361, 1994. ISBN 91-7871-467-2. T. McKelvey: Identification of state-space models from time and frequency data. Thesis No. 380, 1995. ISBN 91-7871-531-8. J. Sjöberg: Non-linear system identification with neural networks. Thesis No. 381, 1995. ISBN 91- 7871-534-2. R. Germundsson: Symbolic systems – theory, computation and applications. Thesis No. 389, 1995. ISBN 91-7871-578-4. P. Pucar: Modeling and segmentation using multiple models. Thesis No. 405, 1995. ISBN 91-7871-627- 6. H. Fortell: Algebraic approaches to normal forms and zero dynamics. Thesis No. 407, 1995. ISBN 91- 7871-629-2. A. Helmersson: Methods for robust gain scheduling. Thesis No. 406, 1995. ISBN 91-7871-628-4. P. Lindskog: Methods, algorithms and tools for system identification based on prior knowledge. Thesis No. 436, 1996. ISBN 91-7871-424-8. J. Gunnarsson: Symbolic methods and tools for discrete event dynamic systems. Thesis No. 477, 1997. ISBN 91-7871-917-8. M. Jirstrand: Constructive methods for inequality constraints in control. Thesis No. 527, 1998. ISBN 91-7219-187-2. U. Forssell: Closed-loop identification: Methods, theory, and applications. Thesis No. 
A. Stenman: Model on demand: Algorithms, analysis and applications. Thesis No. 571, 1999. ISBN 91-7219-450-2.
N. Bergman: Recursive Bayesian estimation: Navigation and tracking applications. Thesis No. 579, 1999. ISBN 91-7219-473-1.
K. Edström: Switched bond graphs: Simulation and analysis. Thesis No. 586, 1999. ISBN 91-7219-493-6.
M. Larsson: Behavioral and structural model based approaches to discrete diagnosis. Thesis No. 608, 1999. ISBN 91-7219-615-5.
F. Gunnarsson: Power control in cellular radio systems: Analysis, design and estimation. Thesis No. 623, 2000. ISBN 91-7219-689-0.
V. Einarsson: Model checking methods for mode switching systems. Thesis No. 652, 2000. ISBN 91-7219-836-2.
M. Norrlöf: Iterative learning control: Analysis, design, and experiments. Thesis No. 653, 2000. ISBN 91-7219-837-0.
F. Tjärnström: Variance expressions and model reduction in system identification. Thesis No. 730, 2002. ISBN 91-7373-253-2.
J. Löfberg: Minimax approaches to robust model predictive control. Thesis No. 812, 2003. ISBN 91-7373-622-8.
J. Roll: Local and piecewise affine approaches to system identification. Thesis No. 802, 2003. ISBN 91-7373-608-2.
J. Elbornsson: Analysis, estimation and compensation of mismatch effects in A/D converters. Thesis No. 811, 2003. ISBN 91-7373-621-X.
O. Härkegård: Backstepping and control allocation with applications to flight control. Thesis No. 820, 2003. ISBN 91-7373-647-3.
R. Wallin: Optimization algorithms for system analysis and identification. Thesis No. 919, 2004. ISBN 91-85297-19-4.
D. Lindgren: Projection methods for classification and identification. Thesis No. 915, 2005. ISBN 91-85297-06-2.
R. Karlsson: Particle Filtering for Positioning and Tracking Applications. Thesis No. 924, 2005. ISBN 91-85297-34-8.
J. Jansson: Collision Avoidance Theory with Applications to Automotive Collision Mitigation. Thesis No. 950, 2005. ISBN 91-85299-45-6.
E. Geijer Lundin: Uplink Load in CDMA Cellular Radio Systems. Thesis No. 977, 2005. ISBN 91-85457-49-3.
M. Enqvist: Linear Models of Nonlinear Systems. Thesis No. 985, 2005. ISBN 91-85457-64-7.
T. B. Schön: Estimation of Nonlinear Dynamic Systems — Theory and Applications. Thesis No. 998, 2006. ISBN 91-85497-03-7.
I. Lind: Regressor and Structure Selection — Uses of ANOVA in System Identification. Thesis No. 1012, 2006. ISBN 91-85523-98-4.
J. Gillberg: Frequency Domain Identification of Continuous-Time Systems — Reconstruction and Robustness. Thesis No. 1031, 2006. ISBN 91-85523-34-8.
M. Gerdin: Identification and Estimation for Models Described by Differential-Algebraic Equations. Thesis No. 1046, 2006. ISBN 91-85643-87-4.
C. Grönwall: Ground Object Recognition using Laser Radar Data – Geometric Fitting, Performance Analysis, and Applications. Thesis No. 1055, 2006. ISBN 91-85643-53-X.
A. Eidehall: Tracking and threat assessment for automotive collision avoidance. Thesis No. 1066, 2007. ISBN 91-85643-10-6.
F. Eng: Non-Uniform Sampling in Statistical Signal Processing. Thesis No. 1082, 2007. ISBN 978-91-85715-49-7.
E. Wernholt: Multivariable Frequency-Domain Identification of Industrial Robots. Thesis No. 1138, 2007. ISBN 978-91-85895-72-4.
D. Axehill: Integer Quadratic Programming for Control and Communication. Thesis No. 1158, 2008. ISBN 978-91-85523-03-0.
G. Hendeby: Performance and Implementation Aspects of Nonlinear Filtering. Thesis No. 1161, 2008. ISBN 978-91-7393-979-9.
J. Sjöberg: Optimal Control and Model Reduction of Nonlinear DAE Models. Thesis No. 1166, 2008. ISBN 978-91-7393-964-5.
D. Törnqvist: Estimation and Detection with Applications to Navigation. Thesis No. 1216, 2008. ISBN 978-91-7393-785-6.
P.-J. Nordlund: Efficient Estimation and Detection Methods for Airborne Applications. Thesis No. 1231, 2008. ISBN 978-91-7393-720-7.
H. Tidefelt: Differential-algebraic equations and matrix-valued singular perturbation. Thesis No. 1292, 2009. ISBN 978-91-7393-479-4.
H. Ohlsson: Regularization for Sparseness and Smoothness — Applications in System Identification and Signal Processing. Thesis No. 1351, 2010. ISBN 978-91-7393-287-5.
S. Moberg: Modeling and Control of Flexible Manipulators. Thesis No. 1349, 2010. ISBN 978-91-7393-289-9.
J. Wallén: Estimation-based iterative learning control. Thesis No. 1358, 2011. ISBN 978-91-7393-255-4.
J. Hol: Sensor Fusion and Calibration of Inertial Sensors, Vision, Ultra-Wideband and GPS. Thesis No. 1368, 2011. ISBN 978-91-7393-197-7.
D. Ankelhed: On the Design of Low Order H-infinity Controllers. Thesis No. 1371, 2011. ISBN 978-91-7393-157-1.
C. Lundquist: Sensor Fusion for Automotive Applications. Thesis No. 1409, 2011. ISBN 978-91-7393-023-9.
P. Skoglar: Tracking and Planning for Surveillance Applications. Thesis No. 1432, 2012. ISBN 978-91-7519-941-2.
K. Granström: Extended target tracking using PHD filters. Thesis No. 1476, 2012. ISBN 978-91-7519-796-8.
C. Lyzell: Structural Reformulations in System Identification. Thesis No. 1475, 2012. ISBN 978-91-7519-800-2.
J. Callmer: Autonomous Localization in Unknown Environments. Thesis No. 1520, 2013. ISBN 978-91-7519-620-6.
D. Petersson: A Nonlinear Optimization Approach to H2-Optimal Modeling and Control. Thesis No. 1528, 2013. ISBN 978-91-7519-567-4.
Z. Sjanic: Navigation and Mapping for Aerial Vehicles Based on Inertial and Imaging Sensors. Thesis No. 1533, 2013. ISBN 978-91-7519-553-7.
F. Lindsten: Particle Filters and Markov Chains for Learning of Dynamical Systems. Thesis No. 1530, 2013. ISBN 978-91-7519-559-9.
P. Axelsson: Sensor Fusion and Control Applied to Industrial Manipulators. Thesis No. 1585, 2014. ISBN 978-91-7519-368-7.
A. Carvalho Bittencourt: Modeling and Diagnosis of Friction and Wear in Industrial Robots. Thesis No. 1617, 2014. ISBN 978-91-7519-251-2.
M. Skoglund: Inertial Navigation and Mapping for Autonomous Vehicles. Thesis No. 1623, 2014. ISBN 978-91-7519-233-8.
S. Khoshfetrat Pakazad: Divide and Conquer: Distributed Optimization and Robustness Analysis. Thesis No. 1676, 2015. ISBN 978-91-7519-050-1.
T. Ardeshiri: Analytical Approximations for Bayesian Inference. Thesis No. 1710, 2015. ISBN 978-91-7685-930-8.
N. Wahlström: Modeling of Magnetic Fields and Extended Objects for Localization Applications. Thesis No. 1723, 2015. ISBN 978-91-7685-903-2.