Particle Filters for Mixture Models with an Unknown Number of Components
Recommended publications
Spatial Duration Data for the Spatial Regression Context
Missing Data In Spatial Regression¹
Frederick J. Boehmke (University of Iowa), Emily U. Schilling (Washington University), Jude C. Hays (University of Pittsburgh)
July 15, 2015

Paper prepared for presentation at the Society for Political Methodology Summer Conference, University of Rochester, July 23-25, 2015.

¹ Corresponding author: [email protected] or [email protected]. We are grateful for funding provided by the University of Iowa Department of Political Science. Comments from Michael Kellermann, Walter Mebane, and Vera Troeger greatly appreciated.

Abstract
We discuss problems with missing data in the context of spatial regression. Even when data are MCAR or MAR based solely on exogenous factors, estimates from a spatial regression with listwise deletion can be biased, unlike for standard estimators with independent observations. Further, common solutions for missing data such as multiple imputation appear to exacerbate the problem rather than fix it. We propose a solution for addressing this problem by extending previous work including the EM approach developed in Hays, Schilling and Boehmke (2015). Our EM approach iterates between imputing the unobserved values and estimating the spatial regression model using these imputed values. We explore the performance of these alternatives in the face of varying amounts of missing data via Monte Carlo simulation and compare it to the performance of spatial regression with listwise deletion and a benchmark model with no missing data.

1 Introduction
Political science data are messy. Subsets of the data may be missing; survey respondents may not answer all of the questions, economic data may only be available for a subset of countries, and voting records are not always complete.
Kalman and Particle Filtering
Abstract: The Kalman and Particle filters are algorithms that recursively update an estimate of the state and find the innovations driving a stochastic process given a sequence of observations. The Kalman filter accomplishes this goal by linear projections, while the Particle filter does so by a sequential Monte Carlo method. With the state estimates, we can forecast and smooth the stochastic process. With the innovations, we can estimate the parameters of the model. The article discusses how to set a dynamic model in a state-space form, derives the Kalman and Particle filters, and explains how to use them for estimation.

Kalman and Particle Filtering

The Kalman and Particle filters are algorithms that recursively update an estimate of the state and find the innovations driving a stochastic process given a sequence of observations. The Kalman filter accomplishes this goal by linear projections, while the Particle filter does so by a sequential Monte Carlo method. Since both filters start with a state-space representation of the stochastic processes of interest, section 1 presents the state-space form of a dynamic model. Then, section 2 introduces the Kalman filter and section 3 develops the Particle filter. For extended expositions of this material, see Doucet, de Freitas, and Gordon (2001), Durbin and Koopman (2001), and Ljungqvist and Sargent (2004).

1. The state-space representation of a dynamic model

A large class of dynamic models can be represented by a state-space form:

$X_{t+1} = \varphi(X_t, W_{t+1}; \gamma)$   (1)
$Y_t = g(X_t, V_t; \gamma)$.   (2)

This representation handles a stochastic process by finding three objects: a vector that describes the position of the system (a state, $X_t \in X \subset \mathbb{R}^l$) and two functions, one mapping the state today into the state tomorrow (the transition equation, (1)) and one mapping the state into observables, $Y_t$ (the measurement equation, (2)).
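As an illustration of how the state-space form (1)-(2) feeds a sequential Monte Carlo method, here is a minimal sketch of a bootstrap particle filter. It is not taken from the article: the linear-Gaussian choices $\varphi(X_t, W_{t+1}) = \rho X_t + \sigma_w W_{t+1}$ and $g(X_t, V_t) = X_t + \sigma_v V_t$, the function names `simulate_ar1` and `bootstrap_particle_filter`, and all parameter values are assumptions made only for this example.

```python
import numpy as np


def simulate_ar1(T, rho=0.9, sigma_w=1.0, sigma_v=1.0, seed=0):
    """Simulate states and observations from the assumed toy model."""
    rng = np.random.default_rng(seed)
    xs = np.empty(T)
    ys = np.empty(T)
    # Start from the stationary distribution of the AR(1) state.
    x = rng.normal(0.0, sigma_w / np.sqrt(1.0 - rho**2))
    for t in range(T):
        xs[t] = x
        ys[t] = x + sigma_v * rng.normal()      # measurement equation (2)
        x = rho * x + sigma_w * rng.normal()    # transition equation (1)
    return xs, ys


def bootstrap_particle_filter(ys, n_particles=1000, rho=0.9,
                              sigma_w=1.0, sigma_v=1.0, seed=0):
    """Return filtered state means and a log-likelihood estimate."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma_w / np.sqrt(1.0 - rho**2), size=n_particles)
    means = np.empty(len(ys))
    loglik = 0.0
    for t, y in enumerate(ys):
        # Weight each particle by the Gaussian measurement density of y_t.
        logw = (-0.5 * ((y - x) / sigma_v) ** 2
                - np.log(sigma_v * np.sqrt(2.0 * np.pi)))
        m = logw.max()
        w = np.exp(logw - m)
        loglik += m + np.log(w.mean())          # estimate of log p(y_t | y_{1:t-1})
        w /= w.sum()
        means[t] = np.sum(w * x)                # filtered mean E[X_t | y_{1:t}]
        # Multinomial resampling, then propagation through the transition.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        x = rho * x[idx] + sigma_w * rng.normal(size=n_particles)
    return means, loglik


if __name__ == "__main__":
    xs, ys = simulate_ar1(300, seed=1)
    means, loglik = bootstrap_particle_filter(ys, n_particles=500, seed=2)
    print("log-likelihood estimate:", loglik)
```

Because this toy model is linear and Gaussian, a Kalman filter would give the exact answer; the particle version is shown only because the same loop carries over when $\varphi$ or $g$ is nonlinear.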
Applying Particle Filtering in Both Aggregated and Age-Structured Population Compartmental Models of Pre-Vaccination Measles
bioRxiv preprint doi: https://doi.org/10.1101/340661; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Applying particle filtering in both aggregated and age-structured population compartmental models of pre-vaccination measles

Xiaoyan Li1*, Alexander Doroshenko2, Nathaniel D. Osgood1
1 Department of Computer Science, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
2 Department of Medicine, Division of Preventive Medicine, University of Alberta, Edmonton, Alberta, Canada
* [email protected]

Abstract
Measles is a highly transmissible disease and is one of the leading causes of death among young children under 5 globally. While the use of ongoing surveillance data and, recently, dynamic models offer insight on measles dynamics, both suffer notable shortcomings when applied to measles outbreak prediction. In this paper, we apply the Sequential Monte Carlo approach of particle filtering, incorporating reported measles incidence for Saskatchewan during the pre-vaccination era, using an adaptation of a previously contributed measles compartmental model. To secure further insight, we also perform particle filtering on an age structured adaptation of the model in which the population is divided into two interacting age groups: children and adults. The results indicate that, when used with a suitable dynamic model, particle filtering can offer high predictive capacity for measles dynamics and outbreak occurrence in a low vaccination context. We have investigated five particle filtering models in this project. Based on the most competitive model as evaluated by predictive accuracy, we have performed prediction and outbreak classification analysis.
Missing Data Module 1: Introduction, Overview
Missing Data Module 1: Introduction, overview
Roderick Little and Trivellore Raghunathan

Course Outline
9:00-10:00 Module 1: Introduction and overview
10:00-10:30 Module 2: Complete-case analysis, including weighting methods
10:30-10:45 Break
10:45-12:00 Module 3: Imputation, multiple imputation
1:30-2:00 Module 4: A little likelihood theory
2:00-3:00 Module 5: Computational methods/software
3:00-3:30 Module 6: Nonignorable models

Module 1: Introduction and Overview
• Missing data defined, patterns and mechanisms
• Analysis strategies
  – Properties of a good method
  – complete-case analysis
  – imputation, and multiple imputation
  – analysis of the incomplete data matrix
  – historical overview
• Examples of missing data problems

Missing data defined
[Figure: data matrix of cases by variables. If filled in, would the missing values be meaningful for analysis?]
• Always assume missingness hides a meaningful value for analysis
• Examples:
  – Missing data from missed clinical visit (√)
  – No-show for a preventive intervention (?)
  – In a longitudinal study of blood pressure medications:
    • losses to follow-up (√)
    • deaths (x)

Patterns of Missing Data
• Some methods work for a general
Hierarchical Dirichlet Process Hidden Markov Model for Unsupervised Bioacoustic Analysis
Hierarchical Dirichlet Process Hidden Markov Model for Unsupervised Bioacoustic Analysis
Marius Bartcus, Faicel Chamroukhi, Herve Glotin

Abstract: Hidden Markov Models (HMMs) are one of the most popular and successful models in statistics and machine learning for modeling sequential data. However, one main issue in HMMs is the one of choosing the number of hidden states. The Hierarchical Dirichlet Process (HDP)-HMM is a Bayesian non-parametric alternative for standard HMMs that offers a principled way to tackle this challenging problem by relying on a Hierarchical Dirichlet Process (HDP) prior. We investigate the HDP-HMM in a challenging problem of unsupervised learning from bioacoustic data by using Markov-Chain Monte Carlo (MCMC) sampling techniques, namely the Gibbs sampler. We consider a real problem of fully unsupervised humpback whale song decomposition. It consists in simultaneously finding the structure of hidden whale song units, and automatically inferring the unknown number of the hidden units from the Mel

[...] the standard HDP-HMM Gibbs sampling has the limitation of an inadequate modeling of the temporal persistence of states [9]. This problem has been addressed in [9] by relying on a sticky extension which allows a more robust learning. Other solutions for the inference of the hidden Markov model in this infinite state space models are using the Beam sampling [10] rather than Gibbs sampling.

We investigate the BNP formulation for the HMM, that is the HDP-HMM into a challenging problem of unsupervised learning from bioacoustic data. The problem consists of extracting and classifying, in a fully unsupervised way, unknown number of whale song units. We use the Gibbs sampler to infer the HDP-HMM from the bioacoustic data.
Coping with Missing Data in Randomized Controlled Trials
EVALUATION TECHNICAL ASSISTANCE BRIEF for OAH & ACYF Teenage Pregnancy Prevention Grantees
May 2013 • Brief 3
Coping with Missing Data in Randomized Controlled Trials

Missing data in randomized controlled trials (RCTs) can lead to biased impact estimates and a loss of statistical power. In this brief we describe strategies for coping with missing data in RCTs.¹ Specifically, we discuss strategies to:

● Describe the extent and nature of missing data. We suggest ways to provide a clear and accurate accounting of the extent and nature of missing outcome data. This will increase the transparency and credibility of the study, help with interpreting findings, and also help to select an approach for addressing missing data in impact analyses.
● Address missing data in impact analyses. Missing outcome or baseline data pose mechanical and substantive problems when conducting impact analyses. The mechanical problem is that mathematical operations cannot be performed on

[...] respondents³ if there are no systematic differences between respondents in the treatment group and respondents in the control group. However, if respondents in the treatment group are systematically different from respondents in the control group, then the impact we calculate for respondents lacks causal validity. We call the difference between the calculated impact for respondents and the true impact for respondents “causal validity bias.” This type of bias is the focus of the evidence review of teen pregnancy prevention programs (Mathematica Policy Research and Child Trends 2012).

We suggest conducting descriptive analyses designed to assess the potential for both types of bias in an RCT.

With regard to generalizability, we suggest conducting two analyses.
Monte Carlo Likelihood Inference for Missing Data Models
MONTE CARLO LIKELIHOOD INFERENCE FOR MISSING DATA MODELS
By Yun Ju Sung and Charles J. Geyer
University of Washington and University of Minnesota
Abbreviated title: Monte Carlo Likelihood Asymptotics

We describe a Monte Carlo method to approximate the maximum likelihood estimate (MLE), when there are missing data and the observed data likelihood is not available in closed form. This method uses simulated missing data that are independent and identically distributed and independent of the observed data. Our Monte Carlo approximation to the MLE is a consistent and asymptotically normal estimate of the minimizer $\theta^*$ of the Kullback-Leibler information, as both a Monte Carlo sample size and an observed data sample size go to infinity simultaneously. Plug-in estimates of the asymptotic variance are provided for constructing confidence regions for $\theta^*$. We give Logit-Normal generalized linear mixed model examples, calculated using an R package that we wrote.

AMS 2000 subject classifications. Primary 62F12; secondary 65C05.
Key words and phrases. Asymptotic theory, Monte Carlo, maximum likelihood, generalized linear mixed model, empirical process, model misspecification

1. Introduction. Missing data (Little and Rubin, 2002) either arise naturally (data that might have been observed are missing) or are intentionally chosen (a model includes random variables that are not observable, called latent variables or random effects). A mixture of normals or a generalized linear mixed model (GLMM) is an example of the latter. In either case, a model is specified for the complete data $(x, y)$, where $x$ is missing and $y$ is observed, by their joint density $f_\theta(x, y)$, also called the complete data likelihood (when considered as a function of $\theta$).
Missing Data
4 Missing Data
Paul D. Allison

INTRODUCTION

Missing data are ubiquitous in psychological research. By missing data, I mean data that are missing for some (but not all) variables and for some (but not all) cases. If data are missing on a variable for all cases, then that variable is said to be latent or unobserved. On the other hand, if data are missing on all variables for some cases, we have what is known as unit non-response, as opposed to item non-response which is another name for the subject of this chapter. I will not deal with methods for latent variables or unit non-response here, although some of the methods we will consider can be adapted to those situations.

Why are missing data a problem? Because conventional statistical methods and software

[...] they have spent a great deal of time, money and effort in collecting. And so, many methods for ‘salvaging’ the cases with missing data have become popular.

For a very long time, however, missing data could be described as the ‘dirty little secret’ of statistics. Although nearly everyone had missing data and needed to deal with the problem in some way, there was almost nothing in the textbook literature to provide either theoretical or practical guidance. Although the reasons for this reticence are unclear, I suspect it was because none of the popular methods had any solid mathematical foundation. As it turns out, all of the most commonly used methods for handling missing data have serious deficiencies.

Fortunately, the situation has changed dramatically in recent years.
Dynamic Detection of Change Points in Long Time Series
AISM (2007) 59: 349–366
DOI 10.1007/s10463-006-0053-9
Nicolas Chopin
Dynamic detection of change points in long time series
Received: 23 March 2005 / Revised: 8 September 2005 / Published online: 17 June 2006
© The Institute of Statistical Mathematics, Tokyo 2006

Abstract
We consider the problem of detecting change points (structural changes) in long sequences of data, whether in a sequential fashion or not, and without assuming prior knowledge of the number of these change points. We reformulate this problem as the Bayesian filtering and smoothing of a non standard state space model. Towards this goal, we build a hybrid algorithm that relies on particle filtering and Markov chain Monte Carlo ideas. The approach is illustrated by a GARCH change point model.

Keywords: Change point models · GARCH models · Markov chain Monte Carlo · Particle filter · Sequential Monte Carlo · State space models

1 Introduction
The assumption that an observed time series follows the same fixed stationary model over a very long period is rarely realistic. In economic applications for instance, common sense suggests that the behaviour of economic agents may change abruptly under the effect of economic policy, political events, etc. For example, Mikosch and Stărică (2003, 2004) point out that GARCH models fit very poorly too long sequences of financial data, say 20 years of daily log-returns of some speculative asset. Despite this, these models remain highly popular, thanks to their forecast ability (at least on short to medium-sized time series) and their elegant simplicity (which facilitates economic interpretation). Against the common trend of building more and more sophisticated stationary models that may spuriously provide a better fit for such long sequences, the aforementioned authors argue that GARCH models remain a good ‘local’ approximation of the behaviour of financial data,
Bayesian Filtering: from Kalman Filters to Particle Filters, and Beyond ZHE CHEN
MANUSCRIPT
Bayesian Filtering: From Kalman Filters to Particle Filters, and Beyond
ZHE CHEN

Abstract: In this self-contained survey/review paper, we systematically investigate the roots of Bayesian filtering as well as its rich leaves in the literature. Stochastic filtering theory is briefly reviewed with emphasis on nonlinear and non-Gaussian filtering. Following the Bayesian statistics, different Bayesian filtering techniques are developed given different scenarios. Under linear quadratic Gaussian circumstance, the celebrated Kalman filter can be derived within the Bayesian framework. Optimal/suboptimal nonlinear filtering techniques are extensively investigated. In particular, we focus our attention on the Bayesian filtering approach based on sequential Monte Carlo sampling, the so-called particle filters. Many variants of the particle filter as well as their features (strengths and weaknesses) are discussed. Related theoretical and practical issues are addressed in detail. In addition, some other (new) directions on Bayesian filtering

Contents (excerpt)
IV Bayesian Optimal Filtering
  IV-A Optimal Filtering
  IV-B Kalman Filtering
  IV-C Optimum Nonlinear Filtering
    IV-C.1 Finite-dimensional Filters
V Numerical Approximation Methods
  V-A Gaussian/Laplace Approximation
  V-B Iterative Quadrature
  V-C Multigrid Method and Point-Mass Approximation
  V-D Moment Approximation
  V-E Gaussian Sum Approximation
  V-F Deterministic
Monte Carlo Smoothing for Nonlinear Time Series
Monte Carlo Smoothing for Nonlinear Time Series
Simon J. GODSILL, Arnaud DOUCET, and Mike WEST

We develop methods for performing smoothing computations in general state-space models. The methods rely on a particle representation of the filtering distributions, and their evolution through time using sequential importance sampling and resampling ideas. In particular, novel techniques are presented for generation of sample realizations of historical state sequences. This is carried out in a forward-filtering backward-smoothing procedure that can be viewed as the nonlinear, non-Gaussian counterpart of standard Kalman filter-based simulation smoothers in the linear Gaussian case. Convergence in the mean squared error sense of the smoothed trajectories is proved, showing the validity of our proposed method. The methods are tested in a substantial application for the processing of speech signals represented by a time-varying autoregression and parameterized in terms of time-varying partial correlation coefficients, comparing the results of our algorithm with those from a simple smoother based on the filtered trajectories.

KEY WORDS: Bayesian inference; Non-Gaussian time series; Nonlinear time series; Particle filter; Sequential Monte Carlo; State-space model.

1. INTRODUCTION

In this article we develop Monte Carlo methods for smoothing in general state-space models. To fix notation, consider the standard Markovian state-space model (West and Harrison 1997)

$x_{t+1} \sim f(x_{t+1} \mid x_t)$ (state evolution density),

[...] and

$$p(x_{t+1} \mid y_{1:t+1}) = \frac{g(y_{t+1} \mid x_{t+1})\, p(x_{t+1} \mid y_{1:t})}{p(y_{t+1} \mid y_{1:t})}.$$

Similarly, smoothing can be performed recursively backward in time using the smoothing formula

$$p(x_t \mid y_{1:T}) = \int p(x_{t+1} \mid y_{1:T})\, \frac{p(x_t \mid y_{1:t})\, f(x_{t+1} \mid x_t)}{p(x_{t+1} \mid y_{1:t})}\, dx_{t+1}.$$
Part 2: Basics of Dirichlet Processes 2.1 Motivation
CS547Q Statistical Modeling with Stochastic Processes, Winter 2011
Part 2: Basics of Dirichlet processes
Lecturer: Alexandre Bouchard-Côté
Scribe(s): Liangliang Wang

Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications. They may be distributed outside this class only with the permission of the Instructor.
Last update: May 17, 2011

2.1 Motivation

To motivate the Dirichlet process, let us consider a simple density estimation problem: modeling the height of UBC students. We are going to take a Bayesian approach to this problem, considering the parameters as random variables. In one of the most basic models, one would define a mean and variance random parameter $\theta = (\mu, \sigma^2)$, and a height random variable normally distributed conditionally on $\theta$, with parameters $\theta$.¹

Using a single normal distribution is clearly a defective approach, since for example the male/female sub-populations create a skewness in the distribution, which cannot be captured by normal distributions:

[Figure: height distribution with male and female sub-populations]

The solution suggested by this figure is to use a mixture of two normal distributions, with one set of parameters $\theta_c$ for each sub-population or cluster $c \in \{1, 2\}$. Pictorially, the model can be described as follows:

[Figure: graphical model with nodes $\pi$; $x_1, x_2, x_3$; $y_1, y_2, y_3$; $\theta(1)$ (mean height for men) and $\theta(2)$ (mean height for women)]

1- Generate a male/female relative frequency: $\pi \sim \mathrm{Beta}(\text{male prior pseudo counts}, \text{female prior pseudo counts})$
2- Generate the sex of each student, for each $i$: $x_i \mid \pi \sim \mathrm{Mult}(\pi)$
3- Generate the mean height of each cluster $c$: $\theta(c) \sim N(\text{prior height}, \text{how confident prior})$
4- Generate student heights, for each $i$: $y_i \mid x_i, \theta(1), \theta(2) \sim N(\theta(x_i), \text{variance})$

¹ Yes, this has many problems (heights cannot be negative, normal assumption broken, etc).
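The four generative steps above can be sketched as a forward sampler. This is only an illustration, not code from the lecture notes: the function name `sample_heights`, the pseudo-counts, the prior mean and standard deviation (in centimetres), and the observation standard deviation are all assumed values, and clusters are indexed 0 and 1 rather than 1 and 2.

```python
import numpy as np


def sample_heights(n, a_male=1.0, a_female=1.0, prior_mean=170.0,
                   prior_sd=10.0, obs_sd=7.0, seed=0):
    """Draw one data set from the two-component normal mixture.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    # Step 1: relative frequency of the first cluster, pi ~ Beta(a_male, a_female).
    pi = rng.beta(a_male, a_female)
    # Step 2: cluster label of each student; label 0 with probability pi, else 1.
    x = (rng.random(n) >= pi).astype(int)
    # Step 3: mean height of each cluster, theta(c) ~ N(prior_mean, prior_sd^2).
    theta = rng.normal(prior_mean, prior_sd, size=2)
    # Step 4: height of each student, y_i ~ N(theta(x_i), obs_sd^2).
    y = rng.normal(theta[x], obs_sd)
    return pi, x, theta, y


if __name__ == "__main__":
    pi, x, theta, y = sample_heights(1000)
    print("pi =", pi, "cluster means =", theta)
```

The number of clusters is fixed at two here, which is exactly the restriction that the Dirichlet process is introduced to relax.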