MANUSCRIPT

Bayesian Filtering: From Kalman Filters to Particle Filters, and Beyond

ZHE CHEN
Abstract— In this self-contained survey/review paper, we systematically investigate the roots of Bayesian filtering as well as its rich leaves in the literature. Stochastic filtering theory is briefly reviewed with emphasis on nonlinear and non-Gaussian filtering. Following Bayesian statistics, different Bayesian filtering techniques are developed for different scenarios. Under the linear quadratic Gaussian (LQG) circumstance, the celebrated Kalman filter can be derived within the Bayesian framework. Optimal/suboptimal nonlinear filtering techniques are extensively investigated. In particular, we focus our attention on the Bayesian filtering approach based on sequential Monte Carlo sampling, the so-called particle filters. Many variants of the particle filter, as well as their features (strengths and weaknesses), are discussed. Related theoretical and practical issues are addressed in detail. In addition, some other (new) directions in Bayesian filtering are also explored.

Index Terms— Stochastic filtering, Bayesian filtering, Bayesian inference, particle filter, sequential Monte Carlo, sequential state estimation, Monte Carlo methods.

"The probability of any event is the ratio between the value at which an expectation depending on the happening of the event ought to be computed, and the value of the thing expected upon its happening."
— Thomas Bayes (1702-1761), [29]

"Statistics is the art of never having to say you're wrong. Variance is what any two statisticians are at."
— C. J. Bradfield

The work is supported by the Natural Sciences and Engineering Research Council of Canada. Z. Chen was also partially supported by a Clifton W. Sherman Scholarship.
The author is with the Communications Research Laboratory, McMaster University, Hamilton, Ontario, Canada L8S 4K1, e-mail: [email protected], Tel: (905) 525-9140 x27282, Fax: (905) 521-2922.

Contents

I Introduction
  I-A Stochastic Filtering Theory
  I-B Bayesian Theory and Bayesian Filtering
  I-C Monte Carlo Methods and Monte Carlo Filtering
  I-D Outline of Paper
II Mathematical Preliminaries and Problem Formulation
  II-A Preliminaries
  II-B Notations
  II-C Stochastic Filtering Problem
  II-D Nonlinear Stochastic Filtering Is an Ill-posed Inverse Problem
    II-D.1 Inverse Problem
    II-D.2 Differential Operator and Integral Equation
    II-D.3 Relations to Other Problems
  II-E Stochastic Differential Equations and Filtering
III Bayesian Statistics and Bayesian Estimation
  III-A Bayesian Statistics
  III-B Recursive Bayesian Estimation
IV Bayesian Optimal Filtering
  IV-A Optimal Filtering
  IV-B Kalman Filtering
  IV-C Optimum Nonlinear Filtering
    IV-C.1 Finite-dimensional Filters
V Numerical Approximation Methods
  V-A Gaussian/Laplace Approximation
  V-B Iterative Quadrature
  V-C Multigrid Method and Point-Mass Approximation
  V-D Moment Approximation
  V-E Gaussian Sum Approximation
  V-F Deterministic Sampling Approximation
  V-G Monte Carlo Sampling Approximation
    V-G.1 Importance Sampling
    V-G.2 Rejection Sampling
    V-G.3 Sequential Importance Sampling
    V-G.4 Sampling-Importance Resampling
    V-G.5 Stratified Sampling
    V-G.6 Markov Chain Monte Carlo
    V-G.7 Hybrid Monte Carlo
    V-G.8 Quasi-Monte Carlo
VI Sequential Monte Carlo Estimation: Particle Filters
  VI-A Sequential Importance Sampling (SIS) Filter
  VI-B Bootstrap/SIR Filter
  VI-C Improved SIS/SIR Filters
  VI-D Auxiliary Particle Filter
  VI-E Rejection Particle Filter
  VI-F Rao-Blackwellization
  VI-G Kernel Smoothing and Regularization
  VI-H Data Augmentation
    VI-H.1 Data Augmentation is an Iterative Kernel Smoothing Process
    VI-H.2 Data Augmentation as a Bayesian Sampling Method
  VI-I MCMC Particle Filter
  VI-J Mixture Kalman Filters
  VI-K Mixture Particle Filters
  VI-L Other Monte Carlo Filters
  VI-M Choices of Proposal Distribution
    VI-M.1 Prior Distribution
    VI-M.2 Annealed Prior Distribution
    VI-M.3 Likelihood
    VI-M.4 Bridging Density and Partitioned Sampling
    VI-M.5 Gradient-Based Transition Density
    VI-M.6 EKF as Proposal Distribution
    VI-M.7 Unscented Particle Filter
  VI-N Bayesian Smoothing
    VI-N.1 Fixed-point smoothing
    VI-N.2 Fixed-lag smoothing
    VI-N.3 Fixed-interval smoothing
  VI-O Likelihood Estimate
  VI-P Theoretical and Practical Issues
    VI-P.1 Convergence and Asymptotic Results
    VI-P.2 Bias-Variance
    VI-P.3 Robustness
    VI-P.4 Adaptive Procedure
    VI-P.5 Evaluation and Implementation
VII Other Forms of Bayesian Filtering and Inference
  VII-A Conjugate Analysis Approach
  VII-B Differential Geometrical Approach
  VII-C Interacting Multiple Models
  VII-D Bayesian Kernel Approaches
  VII-E Dynamic Bayesian Networks
VIII Selected Applications
  VIII-A Target Tracking
  VIII-B Computer Vision and Robotics
  VIII-C Digital Communications
  VIII-D Speech Enhancement and Speech Recognition
  VIII-E Machine Learning
  VIII-F Others
  VIII-G An Illustrative Example: Robot-Arm Problem
IX Discussion and Critique
  IX-A Parameter Estimation
  IX-B Joint Estimation and Dual Estimation
  IX-C Prior
  IX-D Localization Methods
  IX-E Dimensionality Reduction and Projection
  IX-F Unanswered Questions
X Summary and Concluding Remarks

I. Introduction

THE contents of this paper span three major scientific areas: stochastic filtering theory, Bayesian theory, and Monte Carlo methods. All of them are discussed closely around the subject of our interest: Bayesian filtering. In the course of telling this long story, some relevant theories are briefly reviewed to provide the reader with a complete picture. Mathematical preliminaries and background material are also provided in detail to keep the paper self-contained.

A. Stochastic Filtering Theory

Stochastic filtering theory was first established in the early 1940s through the pioneering work of Norbert Wiener [487], [488] and Andrey N. Kolmogorov [264], [265], and it culminated in 1960 with the publication of the classic Kalman filter (KF) [250] (and the subsequent Kalman-Bucy filter in 1961 [249]),1 though much credit is also due to earlier work by Bode and Shannon [46], Zadeh and Ragazzini [502], [503], Swerling [434], Levinson [297], and others. Without any exaggeration, it seems fair to say that the Kalman filter (and its numerous variants) has dominated adaptive filter theory for decades in the signal processing and control areas. Nowadays, Kalman filters are applied in various engineering and scientific areas, including communications, machine learning, neuroscience, economics, finance, political science, and many others. Bearing in mind that the Kalman filter is limited by its assumptions, numerous nonlinear filtering methods along its line have been proposed and developed to overcome its limitations.

Footnote 1: Another important event in 1960 was the publication of the celebrated least-mean-squares (LMS) algorithm [485]. The LMS filter is not discussed in this paper; the reader can refer to [486], [205], [207], [247] for more information.

B. Bayesian Theory and Bayesian Filtering

Bayesian theory2 was originally developed by the British researcher Thomas Bayes in a posthumous publication in 1763 [29]. The well-known Bayes theorem describes the fundamental probability law governing the process of logical inference. However, Bayesian theory did not gain its deserved attention in the early days, until its modern form was rediscovered by the French mathematician Pierre-Simon de Laplace in Théorie analytique des probabilités.3 Bayesian inference [38], [388], [375], devoted to applying Bayesian statistics to statistical inference, has become one of the important branches of statistics, and has been applied successfully to statistical decision, detection and estimation, pattern recognition, and machine learning. In particular, the November 19, 1999 issue of Science magazine gave the Bayesian research boom a four-page special report [320]. In many scenarios, the solutions obtained through Bayesian inference are viewed as "optimal".

Not surprisingly, Bayesian theory has also been studied in the filtering literature. One of the first explorations of iterative Bayesian estimation is found in Ho and Lee's paper [212], in which they specified the principle and procedure of Bayesian filtering. Sprangins [426] discussed the iterative application of Bayes' rule to sequential parameter estimation and called it "Bayesian learning". Lin and Yau [301] and Chien and Fu [92] discussed the Bayesian approach to optimization of adaptive systems. Bucy [62] and Bucy and Senne [63] also explored the point-mass approximation method within the Bayesian filtering framework.

Footnote 2: A generalized Bayesian theory is the so-called quasi-Bayesian theory (e.g. [100]), built on a convex set of probability distributions and a relaxed set of axioms about preferences, which we do not discuss in this paper.
Footnote 3: An interesting history of Thomas Bayes and his famous essay is found in [110].

C. Monte Carlo Methods and Monte Carlo Filtering

The early idea of Monte Carlo4 can be traced back to the problem of Buffon's needle, posed when Buffon attempted in 1777 to estimate π (see e.g. [419]). But the modern formulation of Monte Carlo methods started in the 1940s in physics [330], [329], [393] and later, in the 1950s, in statistics [198]. During World War II, John von Neumann, Stanislaw Ulam, Nick Metropolis, and others initiated the Monte Carlo method at Los Alamos Laboratory. Von Neumann also used the Monte Carlo method to calculate the elements of an inverse matrix, in the course of which the "Russian roulette" and "splitting" methods were redefined [472]. In recent decades, Monte Carlo techniques have been rediscovered independently in statistics, physics, and engineering, and many new Monte Carlo methodologies (e.g. Bayesian bootstrap, hybrid Monte Carlo, quasi-Monte Carlo) have been rejuvenated and developed. Roughly speaking, the Monte Carlo technique is a stochastic sampling approach aimed at tackling complex systems that are analytically intractable.

Footnote 4: The method is named after the city in the Monaco principality, because of a roulette wheel, a simple random-number generator. The name was first suggested by Stanislaw Ulam.
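To make the Monte Carlo idea concrete, here is a toy sketch (our own illustration, not an algorithm from this paper) that estimates π by uniform random sampling, in the same spirit as Buffon's needle experiment:

```python
import random

def estimate_pi(num_samples, seed=0):
    """Monte Carlo estimate of pi: the fraction of uniform points in the
    unit square falling inside the quarter disc x^2 + y^2 <= 1
    converges to pi/4."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(num_samples)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * hits / num_samples

print(estimate_pi(100_000))  # close to 3.1416; error shrinks as O(1/sqrt(N))
```

The standard error of such an estimate decays as O(1/√N) regardless of the dimension of the integral, which is the basic appeal of Monte Carlo integration.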
The power of Monte Carlo methods is that they can attack difficult numerical integration problems. In recent years, sequential Monte Carlo approaches have attracted more and more attention from researchers in different areas, with many successful applications in statistics (see e.g. the March 2001 special issue of Annals of the Institute of Statistical Mathematics), signal processing (see e.g. the February 2002 special issue of IEEE Transactions on Signal Processing), machine learning, econometrics, automatic control, tracking, communications, biology, and many others (see e.g. [141] and the references therein). One of the attractive merits of sequential Monte Carlo approaches lies in the fact that they allow on-line estimation by combining powerful Monte Carlo sampling methods with Bayesian inference, at a reasonable computational cost. In particular, the sequential Monte Carlo approach has been used in parameter estimation and state estimation, for the latter of which it is sometimes called the particle filter.5 The basic idea of the particle filter is to use a number of independent random variables called particles,6 sampled directly from the state space, to represent the posterior probability, and to update the posterior by incorporating the new observations; the "particle system" is properly located, weighted, and propagated recursively according to Bayes' rule. In retrospect, the earliest ideas of Monte Carlo methods used in statistical inference are found in [200], [201], and later in [5], [6], [506], [433], [258], but the formal establishment of the particle filter seems fairly attributable to Gordon, Salmond and Smith [193], who introduced a novel resampling technique into the formulation. Almost at the same time, a number of statisticians independently rediscovered and developed the sampling-importance-resampling (SIR) idea [414], [266], [303], which had originally been proposed by Rubin [395], [397] in a non-dynamic framework.7 Particle filters were rediscovered and enjoyed a renaissance in the mid-1990s (e.g. [259], [222], [229], [304], [307], [143], [40]) after a long dormant period, partially thanks to ever-increasing computing power. Recently, much work has been done to improve the performance of particle filters [69], [189], [428], [345], [456], [458], [357]. Also, many doctoral theses have been devoted to Monte Carlo filtering and inference from different perspectives [191], [142], [162], [118], [221], [228], [35], [97], [365], [467], [86].

It is noted that the particle filter is not the only leaf on the Bayesian filtering tree, in the sense that Bayesian filtering can also be tackled with other techniques, such as the differential geometry approach, variational methods, or the conjugate method. Some potential future directions will consider combining these methods with Monte Carlo sampling techniques, as we will discuss in the paper. The attention of this paper, however, remains on Monte Carlo methods and particularly on sequential Monte Carlo estimation.

Footnote 5: Many other terminologies also exist in the literature, e.g. the SIS filter, SIR filter, bootstrap filter, sequential imputation, or the CONDENSATION algorithm (see [224] for many others), though they are addressed differently in different areas. In this paper, we treat them as different variants within the generic Monte Carlo filter family. Not all Monte Carlo filters perform sequential Monte Carlo estimation.
Footnote 6: The particle filter is called normal if it produces i.i.d. samples; sometimes negative correlations are deliberately introduced among the particles for the sake of variance reduction.
Footnote 7: The earliest idea of multiple imputation, due to Rubin, was published in 1978 [394].

D. Outline of Paper

In this paper, we present a comprehensive review of stochastic filtering theory from the Bayesian perspective. [It happens to be almost three decades after the 1974 publication of Prof. Thomas Kailath's illuminating review paper "A view of three decades of linear filtering theory" [244]; we take this opportunity to dedicate this paper to him, who has contributed so much to the literature on stochastic filtering theory.] With the tool of Bayesian statistics, it turns out that the celebrated Kalman filter is a special case of Bayesian filtering under the LQG (linear, quadratic, Gaussian) circumstance, a fact first observed by Ho and Lee [212]; particle filters are also essentially rooted in Bayesian statistics, in the spirit of recursive Bayesian estimation. Of particular interest to us are the nonlinear, non-Gaussian, and non-stationary situations that we mostly encounter in the real world. Generally, for nonlinear filtering no exact solution can be obtained, or the solution is infinite-dimensional,8 hence various numerical approximation methods come in to address the intractability. In particular, we focus our attention on the sequential Monte Carlo method, which allows on-line estimation from a Bayesian perspective. The historical roots of, and remarks on, Monte Carlo filtering are traced. Bayesian filtering approaches outside the Monte Carlo framework are also reviewed. Besides, we extend our discussion from Bayesian filtering to Bayesian inference, for the latter of which the well-known hidden Markov model (HMM) (a.k.a. the HMM filter), dynamic Bayesian networks (DBN), and Bayesian kernel machines are also briefly discussed.

Nowadays Bayesian filtering has become such a broad topic, involving many scientific areas, that a comprehensive survey and detailed treatment seems crucial to cater to the ever-growing demand for understanding this important field among novices, though it is noticed by the author that there already exist in the literature a number of excellent tutorial papers on particle filters and Monte Carlo filters [143], [144], [19], [438], [443], as well as relevant edited volumes [141] and books [185], [173], [306], [82]. Unfortunately, as observed in our comprehensive bibliographies, many papers were written by statisticians or physicists with special terminologies that might be unfamiliar to many engineers. Besides, the papers were written with different nomenclatures for different purposes (e.g., convergence and asymptotic results are rarely of concern in engineering but are important to statisticians). The author thus felt obligated to write a tutorial paper on this emerging and promising area for a readership of engineers, and to introduce to the reader many techniques developed in statistics and physics.

Footnote 8: Or the sufficient statistics are infinite-dimensional.
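The particle-filter recipe sketched in Section I-C (propagate particles through the state equation, weight them by the likelihood of the new observation, then resample) can be illustrated with a minimal bootstrap filter. The scalar random-walk model, noise levels, and particle count below are our own illustrative assumptions, not an example from the paper:

```python
import math
import random

rng = random.Random(42)

def bootstrap_particle_filter(observations, num_particles=500,
                              process_std=1.0, obs_std=1.0):
    """Bootstrap (SIR) particle filter for the toy model
    x[n+1] = x[n] + d[n],  y[n] = x[n] + v[n]  (Gaussian noises).
    Returns the filtered estimate E[x_n | y_0:n] at each time step."""
    # Initialize particles from a diffuse prior over the initial state.
    particles = [rng.gauss(0.0, 1.0) for _ in range(num_particles)]
    estimates = []
    for y in observations:
        # 1. Propagate each particle through the state (transition) equation.
        particles = [x + rng.gauss(0.0, process_std) for x in particles]
        # 2. Weight each particle by the likelihood p(y | x) of the observation.
        weights = [math.exp(-0.5 * ((y - x) / obs_std) ** 2) for x in particles]
        total = sum(weights) or 1.0  # guard against degenerate underflow
        weights = [w / total for w in weights]
        # 3. Posterior mean under the weighted point-mass approximation.
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # 4. Resample (multinomial) to get an equally weighted particle set.
        particles = rng.choices(particles, weights=weights, k=num_particles)
    return estimates

# Track a noisy random walk generated from the same model.
truth, observations, x = [], [], 0.0
for _ in range(50):
    x += rng.gauss(0.0, 1.0)
    truth.append(x)
    observations.append(x + rng.gauss(0.0, 1.0))
estimates = bootstrap_particle_filter(observations)
```

The weighted particle set is exactly the point-mass (empirical) approximation of the posterior density; resampling combats weight degeneracy at the cost of extra Monte Carlo variance.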
For this purpose again, for the variety of particle filter algorithms, the basic ideas rather than the mathematical derivations are emphasized; further details and experimental results are indicated in the references. Due to the dual tutorial/review nature of the current paper, only a few simple examples and simulations are presented to illustrate the essential ideas, and no comparative results are available at this stage (see [88]); however, this does not prevent us from presenting new thoughts. Moreover, many graphical and tabular illustrations are presented. Since this is also a survey paper, extensive bibliographies are included in the references, but there is no claim that the bibliographies are complete, owing to the author's knowledge limitations as well as to space constraints.

The rest of this paper is organized as follows: In Section II, some basic mathematical preliminaries of stochastic filtering theory are given, and the stochastic filtering problem is mathematically formulated. Section III presents the essential Bayesian theory, particularly Bayesian statistics and Bayesian inference. In Section IV, the Bayesian filtering theory is systematically investigated. Following the simplest LQG case, the celebrated Kalman filter is briefly derived, followed by a discussion of optimal nonlinear filtering. Section V discusses many popular numerical approximation techniques, with special emphasis on Monte Carlo sampling methods, which result in the various forms of particle filters in Section VI. In Section VII, some other new Bayesian filtering approaches beyond Monte Carlo sampling are reviewed. Section VIII presents some selected applications and one illustrative example of particle filters. We give some discussion and critiques in Section IX and conclude the paper in Section X.

II. Mathematical Preliminaries and Problem Formulation

A. Preliminaries

Definition 1: Let S be a set and F a family of subsets of S. F is a σ-algebra if (i) ∅ ∈ F; (ii) A ∈ F implies A^c ∈ F; (iii) A_1, A_2, ... ∈ F implies ∪_{i=1}^∞ A_i ∈ F. A σ-algebra is closed under complement and under union of countably infinitely many sets.

Definition 2: A probability space is defined by the elements {Ω, F, P}, where F is a σ-algebra of Ω and P is a complete, σ-additive probability measure on F. In other words, P is a set function whose arguments are random events (elements of F) such that the axioms of probability hold.

Definition 3: Let p(x) = dP(x)/dμ denote the Radon-Nikodým density of the probability distribution P(x) w.r.t. a measure μ. When x ∈ X is discrete and μ is a counting measure, p(x) is a probability mass function (pmf); when x is continuous and μ is a Lebesgue measure, p(x) is a probability density function (pdf).

Intuitively, the true distribution P(x) can be replaced by the empirical distribution constructed from the simulated samples {x^(i)} (see Fig. 1 for an illustration):

  P̂(x) = (1/N_p) Σ_{i=1}^{N_p} δ(x − x^(i)),

where δ(·) is a Radon-Nikodým density w.r.t. μ of the point-mass distribution concentrated at the point x^(i). When x ∈ X is discrete, δ(x − x^(i)) is 1 for x = x^(i) and 0 elsewhere. When x ∈ X is continuous, δ(x − x^(i)) is a Dirac delta function, δ(x − x^(i)) = 0 for all x ≠ x^(i), and ∫_X dP̂(x) = ∫_X p̂(x) dx = 1.

[Fig. 1. Empirical probability distribution (density) function constructed from the discrete observations {x^(i)}.]

B. Notations

Throughout this paper, bold font refers to a vector or matrix; the subscript t (t ∈ R^+) refers to an index in the continuous-time domain, and n (n ∈ N) to an index in the discrete-time domain. p(x) refers to a pdf w.r.t. a Lebesgue measure or a pmf w.r.t. a counting measure. E[·] and Var[·] (Cov[·]) are the expectation and variance (covariance) operators, respectively. Unless specified otherwise, expectations are taken w.r.t. the true pdf. The notations x_{0:n} and y_{0:n}9 refer to the state and observation sets with elements collected from time step 0 up to n. The Gaussian (normal) distribution is denoted by N(μ, Σ). x_n represents the true state at time step n, whereas x̂_n (or x̂_{n|n}) and x̂_{n|n−1} represent the filtered and predicted estimates of x_n, respectively. f and g are used to represent the vector-valued state function and measurement function, respectively; f is also used for a generic (vector- or scalar-valued) nonlinear function. Additional nomenclature will be given wherever clarification is necessary.

For the reader's convenience, a complete list of the notations used in this paper is summarized in Appendix G.

Footnote 9: Sometimes it is also denoted by y_{1:n}, which differs in the assumed ordering of the state and measurement equations.

C. Stochastic Filtering Problem

Before we run into the mathematical formulation of the stochastic filtering problem, it is necessary to clarify some basic concepts:

Filtering is an operation that involves the extraction of information about a quantity of interest at time t by using data measured up to and including t.
Prediction is an a priori form of estimation. Its aim is to input ut-1 ut u derive information about what the quantity of interest t+1 will be like at some time t + τ in the future (τ> 0) by using data measured up to and including time ( ) ft-1 ft( ) t. Unless specified otherwise, prediction is referred to state xt-1 xt xt+1 one-step ahead prediction in this paper.
Smoothing is an a posteriori form of estimation in that g t-1 ( ) g t( ) g t+1 ( ) data measured after the time of interest are used for measurement y y the estimation. Specifically, the smoothed estimate at yt-1 t t+1 time t is obtained by using data measured over the interval [0,t], where t Now, let us consider the following generic stochastic fil- tering problem in a dynamic state-space form [238], [422]: of mean and state-error correlation matrix are calculated and propagated. In equations (3a) and (3b), Fn+1,n, Gn x˙ t = f(t, xt, ut, dt), (1a) are called transition matrix and measurement matrix, re- yt = g(t, xt, ut, vt), (1b) spectively. Described as a generic state-space model, the stochastic where equations (1a) and (1b) are called state equation and filtering problem can be illustrated by a graphical model measurement equation, respectively; xt represents the state (Fig. 2). Given initial density p(x0), transition density vector, yt is the measurement vector, ut represents the sys- p(xn|xn− ), and likelihood p(yn|xn), the objective of the tem input vector (as driving force) in a controlled environ- 1 N N N N filtering is to estimate the optimal current state at time n ment; f : R x → R x and g : R x → R y are two vector- given the observations up to time n, which is in essence valued functions, which are potentially time-varying; dt amount to estimating the posterior density p(xn|y0:n)or and vt represent the process (dynamical) noise and mea- p(x n|y n). Although the posterior density provides a surement noise respectively, with appropriate dimensions. 0: 0: complete solution of the stochastic filtering problem, the The above formulation is discussed in the continuous-time problem still remains intractable since the density is a func- domain, in practice however, we are more concerned about tion rather than a finite-dimensional point estimate. 
We the discrete-time filtering.10 In this context, the following should also keep in mind that most of physical systems are practical filtering problem is concerned:11 not finite dimensional, thus the infinite-dimensional system xn+1 = f(xn, dn), (2a) can only be modeled approximately by a finite-dimensional filter, in other words, the filter can only be suboptimal yn = g(xn, vn), (2b) in this sense. Nevertheless, in the context of nonlinear where dn and vn can be viewed as white noise random filtering, it is still possible to formulate the exact finite- sequences with unknown statistics in the discrete-time do- dimensional filtering solution, as we will discuss in Section main. The state equation (2a) characterizes the state tran- IV. sition probability p(xn+1|xn), whereas the measurement In Table I, a brief and incomplete development history of equation (2b) describes the probability p(yn|xn)whichis stochastic filtering theory (from linear to nonlinear, Gaus- further related to the measurement noise model. sian to non-Gaussian, stationary to non-stationary) is sum- The equations (2a)(2b) reduce to the following special marized. Some detailed reviews are referred to [244], [423], case where a linear Gaussian dynamic system is consid- [247], [205]. ered:12 D. Nonlinear Stochastic Filtering Is an Ill-posed Inverse xn+1 = Fn+1,nxn + dn, (3a) Problem yn = Gnxn + vn, (3b) D.1 Inverse Problem for which the analytic filtering solution is given by the Stochastic filtering is an inverse problem: Given collected Kalman filter [250], [253], in which the sufficient statistics13 yn at discrete time steps (hence y0:n), provided f and g are 10 The continuous-time dynamic system can be always converted known, one needs to find the optimal or suboptimal xˆn.In into a discrete-time system by sampling the outputs and using “zero- another perspective, this problem can be interpreted as an order holds” on the inputs. 
Hence the derivative will be replaced by the difference, the operator will become a matrix. inverse mapping learning problem: Find the inputs sequen- 11For discussion simplicity, no driving-force in the dynamic system tially with a (composite) mapping function which yields the (which is often referred to the stochastic control problem) is consid- output data. In contrast to the forward learning (given in- ered in this paper. However, the extension to the driven system is straightforward. puts find outputs) which is a many-to-one mapping prob- 12An excellent and illuminating review of linear filtering theory is lem, the inversion learning problem is one-to-many, in a found in [244] (see also [385], [435], [61]); for a complete treatment of sense that the mapping from output to input space is gen- linear estimation theory, see the classic textbook [247]. 13Sufficient statistics is referred to a collection of quantities which erally non-unique. uniquely determine a probability density in its entirety. A problem is said to be well-posed if it satisfies three con- MANUSCRIPT 6 TABLE I A Development History of Stochastic Filtering Theory. 
author(s) (year) method solution comment Kolmogorov (1941) innovations exact linear, stationary Wiener (1942) spectral factorization exact linear, stationary, infinite memory Levinson (1947) lattice filter approximate linear, stationary, finite memory Bode & Shannon (1950) innovations, whitening exact linear, stationary, Zadeh & Ragazzini (1950) innovations, whitening exact linear, non-stationary Kalman (1960) orthogonal projection exact LQG, non-stationary, discrete Kalman & Bucy (1961) recursive Riccati equation exact LQG, non-stationary, continuous Stratonovich (1960) conditional Markov process exact nonlinear, non-stationary Kushner (1967) PDE exact nonlinear, non-stationary Zakai (1969) PDE exact nonlinear, non-stationary Handschin & Mayne (1969) Monte Carlo approximate nonlinear, non-Gaussian, non-stationary Bucy & Senne (1971) point-mass, Bayes approximate nonlinear, non-Gaussian, non-stationary Kailath (1971) innovations exact linear, non-Gaussian, non-stationary Beneˇs (1981) Beneˇs exact solution of Zakai eqn. nonlinear, finite-dimensional Daum (1986) Daum, virtual measurement exact solution of FPK eqn. nonlinear, finite-dimensional Gordon, Salmond, & Smith (1993) bootstrap, sequential Monte Carlo approximate nonlinear, non-Gaussian, non-stationary Julier & Uhlmann (1997) unscented transformation approximate nonlinear, (non)-Gaussian, derivative-free ditions: existence, uniqueness and stability, otherwise it is where the second integral is Itˆo stochastic integral (named said to be ill posed [87]. In this context, stochastic filtering after Japanese mathematician Kiyosi Ito [233]).15 problem is ill-posed in the following sense: (i) The ubiqui- Mathematically, the ill-posed nature of stochastic filter- tous presence of the unknown noise corrupts the state and ing problem can be understood from the operator theory. 
measurement equations, given limited noisy observations, the solution is non-unique; (ii) supposing the state equation is a diffeomorphism (i.e., differentiable and regular),^14 the measurement function is possibly a many-to-one mapping (e.g., g(ξ) = ξ^2 or g(ξ) = sin(ξ); see also the illustrative example in Section VIII-G), which also violates the uniqueness condition; (iii) the filtering problem is per se a conditional posterior distribution (density) estimation problem, which is known to be stochastically ill posed, especially in high-dimensional spaces [463], let alone in on-line processing [412]. The notion of well-posedness is made precise as follows:

Definition 4: [274], [87] Let A : Y → X be an operator from a normed space Y to X. The equation AY = X is said to be well posed if A is bijective and the inverse operator A^{-1} : X → Y is continuous; otherwise the equation is called ill posed.

D.2 Differential Operator and Integral Equation

In what follows, we present a rigorous analysis of the stochastic filtering problem in the continuous-time domain. To simplify the analysis, we first consider the simple irregular stochastic differential equation (SDE)

  dx_t/dt = f(t, x_t) + d_t,   t ∈ T,                          (4)

where x_t is a second-order stochastic process, ω_t = ∫_0^t d_s ds is a Wiener process (Brownian motion), and d_t can be regarded as a white noise. Here f : T × L_2(Ω, F, P) → L_2(Ω, F, P) is a mapping into the (Lebesgue square-integrable) Hilbert space L_2(Ω, F, P) of processes with finite second-order moments. The solution of (4) is given by the stochastic integral^15

  x_t = x_0 + ∫_0^t f(s, x_s) ds + ∫_0^t dω_s.                 (5)

From an operator-theoretic viewpoint, (4) is a special case of a stochastic operator equation:

Definition 5: [418] Suppose H is a Hilbert space, let A = A(γ) be a stochastic operator mapping Ω × H into H, and let X = X(γ) be a generalized random variable (or function) in H; then

  A(γ)Y = X(γ)                                                 (6)

is a generalized stochastic operator equation for the element Y ∈ H.

Since γ is an element of a measurable space (Ω, F) on which a complete probability measure P is defined, the stochastic operator equation (6) is a family of equations; the family has a unique member when P is a Dirac measure. Suppose Y is a smooth functional with continuous first n derivatives; then (6) can be written as

  A(γ)Y(γ) = Σ_{k=0}^{N} a_k(t, γ) d^k Y / dt^k = X(γ),        (7)

which can be represented in the form of stochastic integral equations of Fredholm or Volterra type [418], with an appropriately defined kernel K:

  Y(t, γ) = X(t, γ) + ∫ K(t, τ, γ) Y(τ, γ) dτ,                 (8)

which takes a form similar to the continuous-time Wiener-Hopf equation (see, e.g., [247]) when K is translation invariant.

Definition 6: [418] Any mapping Y(γ) : Ω → H which satisfies A(γ)Y(γ) = X(γ) for every γ ∈ Ω is said to be a wide-sense solution of (6). The wide-sense solution is a stochastic solution if it is measurable w.r.t. P and Pr{γ : A(γ)Y(γ) = X(γ)} = 1.

The existence and uniqueness conditions of the solution to the stochastic operator equation (6) are given by the probabilistic Fixed-Point Theorem [418]. The essential idea of the Fixed-Point Theorem is to prove that A(γ) is a stochastic contractive operator, which unfortunately is not always true for the stochastic filtering problem.

Let us turn our attention to the measurement equation in an integral form,

  y_t = ∫_0^t g(s, x_s) ds + v_t,                              (9)

where g : R^{N_x} → R^{N_y}. For any φ(·) ∈ R^{N_x}, the optimal (in the mean-square sense) filter φ̂(x_t) is the one that attains the minimum mean-square error, as given by

  φ̂(x_t) ≡ arg min E[ ||φ − φ̂||^2 ] = ∫ π(x_t|y_{0:t}) φ(x) dx_t / ∫ π(x_t|y_{0:t}) dx_t,   (10)

where π(·) is an unnormalized filtering density. A common way to study the unnormalized filtering density is to treat it as the solution of the Zakai equation, as will be detailed in Section II-E.

D.3 Relations to Other Problems

It is conducive to a better understanding of the stochastic filtering problem to compare it with other ill-posed problems that share common features from different perspectives:

• System identification: System identification has much in common with stochastic filtering; both belong to the class of statistical inference problems. Sometimes identification is also understood as filtering in the stochastic control realm, especially with a driving force as input. However, the measurement equation can admit feedback of the previous output, i.e., y_n = g(x_n, y_{n−1}, v_n). Besides, identification is often more concerned with the parameter estimation problem instead of state estimation. We will revisit this issue in Section IX.

• Regression: From some perspective, filtering can be viewed as a sequential linear/nonlinear regression problem if the state equation reduces to a random walk. But regression differs from filtering in the following sense: regression aims to find a deterministic mapping between input and output given a finite number of observation pairs {x_i, y_i}, which is usually done off-line; whereas filtering aims to sequentially infer the signal or state process given some observations, assuming knowledge of the state and measurement models.

• Missing data problem: The missing data problem is well addressed in statistics; it is concerned with probabilistic inference or model fitting given limited data. Statistical approaches (e.g., the EM algorithm, data augmentation) serve this goal by assuming auxiliary missing variables (unobserved data) with tractable (on-line or off-line) inference.

• Density estimation: Density estimation shares some common ground with filtering in that both target a dependency estimation problem. Generally, filtering is nothing but learning the conditional probability distribution. However, density estimation is more difficult in the sense that it has no prior knowledge of the data (though sometimes assumptions are made, e.g., a mixture distribution), and it usually works directly on the state (i.e., the observation process is tantamount to the state process). Most density estimation techniques are off-line.

• Nonlinear dynamic reconstruction: Nonlinear dynamic reconstruction arises from physical phenomena (e.g., sea clutter) in the real world. Given some limited observations (possibly not continuously or evenly recorded), it is concerned with inferring the physically meaningful state information. In this sense, it is very similar to the filtering problem. However, it is much more difficult than the filtering problem in that the nonlinear dynamics involving f are totally unknown (usually a nonparametric model is assumed for estimation) and potentially complex (e.g., chaotic), and the prior knowledge of the state equation is very limited; the problem is thereby severely ill posed [87]. Likewise, dynamic reconstruction allows off-line estimation.

E. Stochastic Differential Equations and Filtering

In the following, we formulate the continuous-time stochastic filtering problem by SDE theory. Suppose {x_t} is a Markov process with an infinitesimal generator; rewriting the state-space equations (1a)(1b) in the following Itô SDE form [418], [360]:

  dx_t = f(t, x_t) dt + σ(t, x_t) dω_t,                        (11a)
  dy_t = g(t, x_t) dt + dv_t,                                  (11b)

where f(t, x_t) is often called the nonlinear drift and σ(t, x_t) the volatility or diffusion coefficient. Again, the noise processes {ω_t, v_t, t ≥ 0} are two Wiener processes, with x_t ∈ R^{N_x} and y_t ∈ R^{N_y}. First, let us look at the state equation (a.k.a. the diffusion equation). For all t ≥ 0, we define a backward diffusion operator L_t as^16

  L_t = Σ_{i=1}^{N_x} f_t^i ∂/∂x_i + (1/2) Σ_{i,j=1}^{N_x} a_t^{ij} ∂^2/(∂x_i ∂x_j),   (12)

where a_t^{ij} = σ^i(t, x_t) σ^j(t, x_t).

^14 A diffeomorphism is a smooth one-to-one mapping with a smooth inverse.
^15 The Itô stochastic integral is defined as ∫_0^t σ(s) dω(s) = lim_{n→∞} Σ_{j=1}^n σ(t_{j−1}) Δω_j. The Itô calculus satisfies dω^2(t) = dt, dω(t)dt = 0, and dt^{N+1} = dω^{N+2}(t) = 0 (N > 1). See [387], [360] for a detailed background on Itô calculus and Itô SDEs.
^16 L_t is a partial differential operator.
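Sample paths of the Itô diffusion (11a) can be generated numerically by the Euler-Maruyama scheme, which discretizes dx_t = f dt + σ dω_t with independent Gaussian increments. The sketch below is illustrative only: the Ornstein-Uhlenbeck-style drift, the diffusion constant, and the step size are our own choices, not from the text.

```python
import numpy as np

def euler_maruyama(f, sigma, x0, T=1.0, N=1000, seed=0):
    """Simulate dx_t = f(t, x) dt + sigma(t, x) dw_t on [0, T]."""
    rng = np.random.default_rng(seed)
    dt = T / N
    t = np.linspace(0.0, T, N + 1)
    x = np.empty(N + 1)
    x[0] = x0
    for k in range(N):
        dw = rng.normal(0.0, np.sqrt(dt))    # Wiener increment, Var = dt
        x[k + 1] = x[k] + f(t[k], x[k]) * dt + sigma(t[k], x[k]) * dw
    return t, x

# Illustrative choice: mean-reverting drift f = -x, constant diffusion 0.3
t, x = euler_maruyama(lambda t, x: -x, lambda t, x: 0.3, x0=2.0)
print(x[-1])
```

Averaging many such paths approximates expectations under the diffusion, and a histogram of many endpoints approximates the density whose evolution is governed by the FPK equation (16) below.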
The operator L_t corresponds to an infinitesimal generator of the diffusion process {x_t, t ≥ 0}. The goal now is to deduce conditions under which one can find a recursive and finite-dimensional (closed-form) scheme to compute the conditional probability distribution p(x_t|Y_t), given the filtration Y_t produced by the observation process (1b).^17

Let us define an innovations process^18

  e_t = y_t − ∫_0^t E[g(s, x_s)|Y_s] ds,                       (13)

where E[g(s, x_s)|Y_s] is described by

  ĝ(x_t) = E[g(t, x_t)|Y_t] = ∫_{−∞}^{∞} g(x_t) p(x_t|Y_t) dx.   (14)

For any test function φ ∈ R^{N_x}, the forward diffusion operator L̃_t is defined as

  L̃_t φ = − Σ_{i=1}^{N_x} f_t^i ∂φ/∂x_i + (1/2) Σ_{i,j=1}^{N_x} a_t^{ij} ∂^2 φ/(∂x_i ∂x_j),   (15)

which essentially is the Fokker-Planck operator. Given the initial condition p(x_0) at t = 0 as the boundary condition, it turns out that the pdf of the diffusion process satisfies the Fokker-Planck-Kolmogorov equation (FPK; a.k.a. the Kolmogorov forward equation, [387])^19

  ∂p(x_t)/∂t = L̃_t p(x_t).                                    (16)

By involving the innovations process (13) and assuming the measurement-noise covariance Cov[v_t] = Σ_{v,t}, we have the following Kushner equation (e.g., [284]):

  dp(x_t|Y_t) = L̃_t p(x_t|Y_t) dt + p(x_t|Y_t) e_t Σ_{v,t}^{−1} dt,   (t ≥ 0)   (17)

which reduces to the FPK equation (16) when there are no observations or filtration Y_t. Integrating (17), we have

  p(x_t|Y_t) = p(x_0) + ∫_0^t L̃_s p(x_s|Y_s) ds + ∫_0^t p(x_s|Y_s) e_s Σ_{v,s}^{−1} ds.   (18)

Given the conditional pdf (18), suppose we want to calculate φ̂(x_t) = E[φ(x_t)|Y_t] for some nonlinear function φ ∈ R^{N_x}. By interchanging the order of integration, we have

  φ̂(x_t) = ∫_{−∞}^{∞} φ(x) p(x_t|Y_t) dx
          = ∫_{−∞}^{∞} φ(x) p(x_0) dx + ∫_0^t ∫_{−∞}^{∞} φ(x) L̃_s p(x_s|Y_s) dx ds
            + ∫_0^t ∫_{−∞}^{∞} φ(x) p(x_s|Y_s) e_s Σ_{v,s}^{−1} dx ds
          = E[φ(x_0)] + ∫_0^t ∫_{−∞}^{∞} p(x_s|Y_s) L_s φ(x) dx ds
            + ∫_0^t ( ∫_{−∞}^{∞} φ(x) g(s, x) p(x_s|Y_s) dx − ĝ(x_s) ∫_{−∞}^{∞} φ(x) p(x_s|Y_s) dx ) Σ_{v,s}^{−1} ds.

The Kushner equation lends itself to a recursive form of the filtering solution, but the conditional mean requires all of the higher-order conditional moments and thus leads to an infinite-dimensional system.

On the other hand, under some mild conditions, the unnormalized conditional density of x_t given Y_t, denoted π(x_t|Y_t), is the unique solution of the following stochastic partial differential equation (PDE), the so-called Zakai equation (see [505], [238], [285]):

  dπ(x_t|Y_t) = L̃ π(x_t|Y_t) dt + g(t, x_t) π(x_t|Y_t) dy_t,   (19)

with the same L̃ defined in (15). The Zakai equation and the Kushner equation have a one-to-one correspondence, but the Zakai equation is much simpler;^20 hence one usually turns to solving the Zakai equation instead of the Kushner equation. In the early history of nonlinear filtering, the common way was to discretize the Zakai equation and seek a numerical solution. Numerous efforts were devoted along this line [285], [286], e.g., separation of variables [114], the adaptive local grid [65], and the particle (quadrature) method [66]. However, these methods are neither recursive nor computationally efficient.

III. Bayesian Statistics and Bayesian Estimation

A. Bayesian Statistics

Bayesian theory (e.g., [38]) is a branch of mathematical probability theory that allows people to model the uncertainty about the world and the outcomes of interest by incorporating prior knowledge and observational evidence.^21

^17 One can imagine the filtration as a sort of information encoding the previous history of the state and measurement.
^18 The innovations process is defined as a white Gaussian noise process. See [245], [247] for a detailed treatment.
^19 The stochastic process is determined equivalently by the FPK equation (16) or the SDE (11a). The FPK equation can be interpreted as follows: the first term is the equation of motion for a cloud of particles whose distribution is p(x_t), each point of which obeys the equation of motion dx/dt = f(x_t, t); the second term describes the disturbance due to Brownian motion. The solution of (16) can be solved
exactly by Fourier transform; by inverting the Fourier transform, one obtains

  p(x, t+Δt | x_0, t) = (1/√(2π σ_0 Δt)) exp( −(x − x_0 − f(x_0)Δt)^2 / (2 σ_0 Δt) ),

which is a Gaussian distribution of a deterministic path.
^20 This is true because (19) is linear w.r.t. π(x_t|Y_t), whereas (17) involves a certain nonlinearity. We do not extend the discussion here due to space constraints.
^21 In the circle of statistics, there are slightly different treatments of probability. The frequentists condition on a hypothesis of choice and put the probability distribution on the data, either observed or not; only one hypothesis is regarded as true, and they regard probability as frequency. The Bayesians condition only on the observed data and consider probability distributions on the hypotheses; they put probability distributions on the several hypotheses given some priors, and probability is not viewed as equivalent to frequency. See [388], [38], [320] for more information.

Bayesian analysis, interpreting probability as a conditional measure of uncertainty, is one of the popular methods for solving inverse problems. Before turning to Bayesian inference and Bayesian estimation, we first introduce some fundamental Bayesian statistics.

Definition 7: (Bayesian Sufficient Statistics) Let p(x|Y) denote the probability density of x conditioned on measurements Y. A statistic, Ψ(x), is said to be "sufficient" if the distribution of x conditionally on Ψ does not depend on Y. In other words, p(x|Y) = p(x|Y′) for any two sets Y and Y′ such that Ψ(Y) = Ψ(Y′).

The sufficient statistic Ψ(x) contains all of the information brought by x about Y. The Rao-Blackwell Theorem says that when an estimator is evaluated under a convex loss, the optimal procedure depends only on the sufficient statistic. The Sufficiency Principle and the Likelihood Principle are two axiomatic principles in Bayesian inference [388].

There are three types of intractable problems inherently related to Bayesian statistics:

• Normalization: Given the prior p(x) and the likelihood p(y|x), the posterior p(x|y) is obtained as the product of prior and likelihood divided by a normalizing factor,

  p(x|y) = p(y|x) p(x) / ∫_X p(y|x) p(x) dx.                   (20)

• Marginalization: Given the joint posterior p(x, z|y), the marginal posterior is

  p(x|y) = ∫_Z p(x, z|y) dz;                                   (21)

as shown later, marginalization and factorization play an important role in Bayesian inference.

• Expectation: Given the conditional pdf, some averaged statistics of interest can be calculated:

  E_{p(x|y)}[f(x)] = ∫_X f(x) p(x|y) dx.                       (22)

In Bayesian inference, all uncertainties (including the states, the parameters, which are either time-varying or fixed but unknown, and the priors) are treated as random variables.^22 The inference is performed within the Bayesian framework given all of the available information, and the objective of Bayesian inference is to use priors and causal knowledge, quantitatively and qualitatively, to infer the conditional probability given finite observations. There are usually three levels of probabilistic reasoning in Bayesian analysis (so-called hierarchical Bayesian analysis): (i) starting with model selection given the data and assumed priors; (ii) estimating the parameters to fit the data given the model and priors; (iii) updating the hyperparameters of the prior. Optimization and integration are two fundamental numerical problems arising in statistical inference. Bayesian inference can be illustrated by a directed graph: a Bayesian network (or belief network) is a probabilistic graphical model with a set of vertices and edges (or arcs), in which the probability dependency is described by a directed arrow between two nodes that represent two random variables. Graphical models also allow the possibility of constructing more complex hierarchical statistical models [239], [240].

^22 This is the true spirit of Bayesian estimation, which differs from other estimation schemes (e.g., least-squares) where the unknown parameters are usually regarded as deterministic.

B. Recursive Bayesian Estimation

In the following, we present a detailed derivation of recursive Bayesian estimation, which underlies the principle of sequential Bayesian filtering. Two assumptions are used to derive the recursive Bayesian filter: (i) the states follow a first-order Markov process, p(x_n|x_{0:n−1}) = p(x_n|x_{n−1}); and (ii) the observations are conditionally independent given the states. For notational simplicity, we denote by Y_n the set of observations y_{0:n} := {y_0, ..., y_n}, and let p(x_n|Y_n) denote the conditional pdf of x_n. From Bayes' rule we have

  p(x_n|Y_n) = p(Y_n|x_n) p(x_n) / p(Y_n)
             = p(y_n, Y_{n−1}|x_n) p(x_n) / p(y_n, Y_{n−1})
             = p(y_n|Y_{n−1}, x_n) p(Y_{n−1}|x_n) p(x_n) / ( p(y_n|Y_{n−1}) p(Y_{n−1}) )
             = p(y_n|Y_{n−1}, x_n) p(x_n|Y_{n−1}) p(Y_{n−1}) p(x_n) / ( p(y_n|Y_{n−1}) p(Y_{n−1}) p(x_n) )
             = p(y_n|x_n) p(x_n|Y_{n−1}) / p(y_n|Y_{n−1}).     (23)

As shown in (23), the posterior density p(x_n|Y_n) is described by three terms:

• Prior: The prior p(x_n|Y_{n−1}) defines the knowledge of the model,

  p(x_n|Y_{n−1}) = ∫ p(x_n|x_{n−1}) p(x_{n−1}|Y_{n−1}) dx_{n−1},   (24)

where p(x_n|x_{n−1}) is the transition density of the state.

• Likelihood: The likelihood p(y_n|x_n) essentially determines the measurement noise model in equation (2b).

• Evidence: The denominator involves an integral,

  p(y_n|Y_{n−1}) = ∫ p(y_n|x_n) p(x_n|Y_{n−1}) dx_n.          (25)

The calculation or approximation of these three terms is the essence of Bayesian filtering and inference.

IV. Bayesian Optimal Filtering

Bayesian filtering aims to apply Bayesian statistics and Bayes' rule to probabilistic inference problems, and specifically to the stochastic filtering problem. To our knowledge, Ho and Lee [212] were among the first authors to discuss iterative Bayesian filtering; they discussed in principle the sequential state estimation problem and included the Kalman filter as a special case. In the past few decades, numerous authors have investigated Bayesian filtering in a dynamic state-space framework [270], [271], [421], [424], [372], [480]-[484].

A. Optimal Filtering

An optimal filter is said to be "optimal" only in some specific sense [12]; in other words, one should define a criterion that measures the optimality. For example, some potential criteria for measuring optimality are:

1. Minimum mean-squared error (MMSE): It can be defined in terms of the prediction or filtering error (or, equivalently, the trace of the state-error covariance),

  E[ ||x_n − x̂_n||^2 | y_{0:n} ] = ∫ ||x_n − x̂_n||^2 p(x_n|y_{0:n}) dx_n,

which aims to find the conditional mean

  x̂_n = E[x_n|y_{0:n}] = ∫ x_n p(x_n|y_{0:n}) dx_n.

2. Maximum a posteriori (MAP): It aims to find the mode of the posterior probability p(x_n|y_{0:n}),^23 which is equivalent to minimizing the loss function

  E = E[ 1 − I_{x_n : ||x_n − x̂_n|| ≤ ζ}(x_n) ],

where I(·) is an indicator function and ζ is a small scalar.

3. Maximum likelihood (ML): which reduces to a special case of MAP in which the prior is neglected.^24

4. Minimax: which is to find the median of the posterior p(x_n|y_{0:n}). See Fig. 3 for an illustration of the difference between the mode, mean and median.

5. Minimum conditional inaccuracy:^25 namely,

  E_{p(x,y)}[ −log p̂(x|y) ] = ∫ p(x, y) log( 1/p̂(x|y) ) dx dy.

6. Minimum conditional KL divergence [276]: the conditional KL divergence is given by

  KL = ∫ p(x, y) log( p(x, y) / ( p̂(x|y) p(x) ) ) dx dy.

7. Minimum free energy:^26 this is a lower bound of the maximum log-likelihood, and the aim is to optimize

  F(Q; P) ≡ E_{Q(x)}[ −log P(x|y) ]
          = E_{Q(x)}[ log( Q(x)/P(x|y) ) ] − E_{Q(x)}[ log Q(x) ],

where Q(x) is an arbitrary distribution of x. The first term is the Kullback-Leibler (KL) divergence between the distributions Q(x) and P(x|y); the second term is the entropy w.r.t. Q(x). The minimization of the free energy can be implemented iteratively by the expectation-maximization (EM) algorithm [130]:

  Q^{(n+1)} ←− arg max_Q F(Q; x^{(n)}),
  x^{(n+1)} ←− arg max_x F(Q^{(n+1)}; x).

Fig. 3. Left: An illustration of three optimality criteria that seek different solutions for a skewed unimodal distribution, in which the mean, mode and median do not coincide. Right: MAP is misleading for a multimodal distribution where multiple modes (maxima) exist.

Remarks:
• The above criteria are valid not only for state estimation but also for parameter estimation (by viewing x as the unknown parameters).
• Both the MMSE and MAP methods require estimation of the posterior distribution (density), but MAP doesn't require calculation of the denominator (integration) and is thereby computationally less expensive, whereas the former requires full knowledge of the prior, likelihood and evidence. Note, however, that the MAP estimate has a drawback, especially in a high-dimensional space: high probability density does not imply high probability mass. A narrow spike with very small width (support) can have a very high density, but the actual probability of the estimated state (or parameter) belonging to it is small. Hence, the width of the mode is more important than its height in the high-dimensional case.
• The last three criteria are all ML oriented, minimizing the negative log-likelihood −log p̂(x|y) and taking the expectation w.r.t. a fixed or variational pdf. Criterion 5 takes the expectation w.r.t. the joint pdf p(x, y); when Q(x) = p(x, y), it is equivalent to Criterion 7; Criterion 6 is a modified version of the upper bound of Criterion 5.
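The recursion (23)-(25) and the optimality criteria above can be made concrete on a one-dimensional grid, where the integrals become sums. The sketch below is illustrative only — the random-walk model, grid, and noise levels are our own choices, not from the text. It propagates a discretized posterior through prediction and update, then reads off the MMSE (conditional-mean) and MAP (mode) estimates.

```python
import numpy as np

# Illustrative 1-D model (our own choice):
#   x_n = x_{n-1} + d_n,  d_n ~ N(0, q)   (random-walk state)
#   y_n = x_n + v_n,      v_n ~ N(0, r)   (noisy observation)
q, r = 0.5**2, 1.0**2
grid = np.linspace(-10.0, 10.0, 2001)      # discretized state space
dx = grid[1] - grid[0]

def gauss(z, var):
    return np.exp(-0.5 * z * z / var) / np.sqrt(2.0 * np.pi * var)

# transition density p(x_n | x_{n-1}) evaluated on the grid
trans = gauss(grid[:, None] - grid[None, :], q)

posterior = gauss(grid, 4.0)               # initial prior p(x_0)
posterior /= posterior.sum() * dx

rng = np.random.default_rng(0)
x_true, estimates = 0.0, []
for n in range(50):
    x_true += rng.normal(0.0, np.sqrt(q))          # simulate the state
    y = x_true + rng.normal(0.0, np.sqrt(r))       # simulate the observation
    prior = (trans @ posterior) * dx               # prediction, cf. (24)
    posterior = gauss(y - grid, r) * prior         # likelihood x prior, cf. (23)
    posterior /= posterior.sum() * dx              # divide by the evidence, cf. (25)
    x_mmse = (grid * posterior).sum() * dx         # criterion 1: conditional mean
    x_map = grid[np.argmax(posterior)]             # criterion 2: posterior mode
    estimates.append((x_mmse, x_map))

print(estimates[-1])
```

For this unimodal, near-Gaussian posterior, the mean and the mode nearly coincide; for a skewed or multimodal posterior (Fig. 3), the two criteria give different answers, which is exactly why the criteria must be distinguished.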
^23 When the mode and the mean of a distribution coincide, the MAP estimate is correct; however, for multimodal distributions, the MAP estimate can be arbitrarily bad. See Fig. 3.
^24 This can be viewed as a least-informative prior with a uniform distribution.
^25 It is a generalization of Kerridge's inaccuracy to the case of i.i.d. data.
^26 Free energy is a variational approximation of ML in order to minimize its upper bound. This criterion is usually used in off-line Bayesian estimation.
^27 For a discussion of the difference between Bayesian risk and frequentist risk, see [388].

The criterion of optimality used for Bayesian filtering is the Bayes risk of MMSE.^27 Bayesian filtering is optimal in the sense that it seeks the posterior distribution, which integrates and uses all of the available information expressed by probabilities (assuming they are quantitatively correct). However, as time proceeds, one needs infinite computing power and unlimited memory to calculate the "optimal" solution, except in some special cases (e.g., the linear Gaussian or conjugate family case); hence, in general, we can only seek a suboptimal or locally optimal solution.

B. Kalman Filtering

Fig. 4. Schematic illustration of the Kalman filter's update as a predictor-corrector: the time update produces the one-step prediction of the measurement; the measurement update applies the correction to the state estimate.

Kalman filtering, in the spirit of the Kalman filter [250], [253], or the Kalman-Bucy filter [249], consists of an iterative prediction-correction process (see Fig. 4). In the prediction step, the time update is taken, where the one-step-ahead prediction of the observation is calculated; in the correction step, the measurement update is taken, where the correction to the estimate of the current state is calculated. In a stationary situation, where the matrices A_n, B_n, C_n, D_n in (3a) and (3b) are constant, the Kalman filter is precisely the Wiener filter for stationary least-squares smoothing; in other words, the Kalman filter is a time-variant Wiener filter [11], [12]. Under the LQG circumstance, the Kalman filter was originally derived with the orthogonal projection method. In the late 1960s, Kailath [245] used the innovations approach developed by Wold and Kolmogorov to reformulate the Kalman filter, with the tools of martingale theory.^28 From the innovations point of view, the Kalman filter is a whitening filter.^29 The Kalman filter is also optimal in the sense that it is unbiased, E[x̂_n] = E[x_n], and is a minimum-variance estimate. A detailed history of the Kalman filter and its many variants can be found in [385], [244], [246], [247], [238], [12], [423], [96], [195]. The Kalman filter has a very nice Bayesian interpretation [212], [497], [248], [366].
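The prediction-correction cycle of Fig. 4 is short enough to state in full. The sketch below implements the standard time and measurement updates for a linear Gaussian model; the notation follows the text (F the transition matrix, G the measurement matrix, Σ_d and Σ_v the noise covariances), while the numerical values are illustrative choices of ours. Note that the gain used here is the common filtered form K_n = P_{n,n−1} G_n^T (G_n P_{n,n−1} G_n^T + Σ_v)^{−1}; the text's eq. (38) carries an extra transition-matrix factor because it is written for the predictor form.

```python
import numpy as np

def kalman_step(x_hat, P, y, F, G, Sigma_d, Sigma_v):
    """One predict-correct cycle of the Kalman filter."""
    # Time update (prediction)
    x_pred = F @ x_hat                         # one-step state prediction
    P_pred = F @ P @ F.T + Sigma_d             # predicted error covariance
    # Measurement update (correction)
    e = y - G @ x_pred                         # innovation
    S = G @ P_pred @ G.T + Sigma_v             # innovation covariance
    K = P_pred @ G.T @ np.linalg.inv(S)        # Kalman gain (filtered form)
    x_new = x_pred + K @ e
    P_new = (np.eye(len(x_hat)) - K @ G) @ P_pred
    return x_new, P_new

# Illustrative position/velocity example (values are our own choice)
F = np.array([[1.0, 1.0], [0.0, 1.0]])         # constant-velocity transition
G = np.array([[1.0, 0.0]])                     # only the position is observed
Sigma_d = 0.01 * np.eye(2)
Sigma_v = np.array([[1.0]])

x_hat, P = np.zeros(2), np.eye(2)
for y in [0.9, 2.1, 2.9, 4.2, 5.0]:            # noisy position readings
    x_hat, P = kalman_step(x_hat, P, np.array([y]), F, G, Sigma_d, Sigma_v)

print(x_hat)   # estimated [position, velocity]
```

After a handful of updates, the filter tracks both the observed position and the never-observed velocity, which is the practical payoff of propagating the full Gaussian posterior rather than the measurement alone.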
In the following, we will show that the celebrated Kalman filter can be derived within the Bayesian framework; more specifically, it reduces to a MAP solution. The derivation is somewhat similar to the ML solution given by [384]. For simplicity of presentation, we assume that the dynamic and measurement noises are both Gaussian distributed with zero mean and constant covariance. The derivation of the Kalman filter in the linear Gaussian scenario is based on the following assumptions:
• E[d_n d_m^T] = Σ_d δ_mn and E[v_n v_m^T] = Σ_v δ_mn.
• The state and process noise are mutually independent: E[x_n d_m^T] = 0 for n ≤ m, and E[x_n v_m^T] = 0 for all n, m.
• The process noise and measurement noise are mutually independent: E[d_n v_m^T] = 0 for all n, m.

Let x̂_n^{MAP} denote the MAP estimate of x_n that maximizes p(x_n|Y_n), or equivalently log p(x_n|Y_n). By using Bayes' rule, we may express p(x_n|Y_n) as

  p(x_n|Y_n) = p(x_n, Y_n) / p(Y_n) = p(x_n, y_n, Y_{n−1}) / p(y_n, Y_{n−1}),   (26)

where the joint pdf in the numerator can be further expressed as

  p(x_n, y_n, Y_{n−1}) = p(y_n|x_n, Y_{n−1}) p(x_n, Y_{n−1})
                       = p(y_n|x_n, Y_{n−1}) p(x_n|Y_{n−1}) p(Y_{n−1})
                       = p(y_n|x_n) p(x_n|Y_{n−1}) p(Y_{n−1}).   (27)

The third step is based on the fact that v_n does not depend on Y_{n−1}. Substituting (27) into (26), we obtain

  p(x_n|Y_n) = p(y_n|x_n) p(x_n|Y_{n−1}) p(Y_{n−1}) / ( p(y_n|Y_{n−1}) p(Y_{n−1}) )
             = p(y_n|x_n) p(x_n|Y_{n−1}) / p(y_n|Y_{n−1}),     (28)

which shares the same form as (23). Under the Gaussian assumption on the process and measurement noises, the mean and covariance of p(y_n|x_n) are calculated as

  E[y_n|x_n] = E[G_n x_n + v_n] = G_n x_n                      (29)

and

  Cov[y_n|x_n] = Cov[v_n|x_n] = Σ_v,                           (30)

respectively, and the conditional pdf p(y_n|x_n) can be further written as

  p(y_n|x_n) = A_1 exp( −(1/2)(y_n − G_n x_n)^T Σ_v^{−1} (y_n − G_n x_n) ),   (31)

where A_1 = (2π)^{−N_y/2} |Σ_v|^{−1/2}.

Consider the conditional pdf p(x_n|Y_{n−1}); its mean and covariance are calculated as

  E[x_n|Y_{n−1}] = E[F_{n,n−1} x_{n−1} + d_{n−1} | Y_{n−1}] = F_{n,n−1} x̂_{n−1} = x̂_{n|n−1}   (32)

and

  Cov[x_n|Y_{n−1}] = Cov[x_n − x̂_{n|n−1}] = Cov[e_{n,n−1}],   (33)

respectively, where x̂_{n|n−1} ≡ x̂(n|n−1) represents the state estimate at time n given the observations up to n−1, and e_{n,n−1} is the state-error vector. Denoting the covariance of e_{n,n−1} by P_{n,n−1}, by the Gaussian assumption we may obtain

  p(x_n|Y_{n−1}) = A_2 exp( −(1/2)(x_n − x̂_{n|n−1})^T P_{n,n−1}^{−1} (x_n − x̂_{n|n−1}) ),   (34)

where A_2 = (2π)^{−N_x/2} |P_{n,n−1}|^{−1/2}. By substituting equations (31) and (34) into (26), it further follows that

  p(x_n|Y_n) ∝ A exp( −(1/2)(y_n − G_n x_n)^T Σ_v^{−1} (y_n − G_n x_n)
                      −(1/2)(x_n − x̂_{n|n−1})^T P_{n,n−1}^{−1} (x_n − x̂_{n|n−1}) ),   (35)

where A = A_1 A_2 is a constant. Since the denominator is a normalizing constant, (35) can be regarded as an unnormalized density; this fact doesn't affect the following derivation.

Since the MAP estimate of the state is defined by the condition

  ∂ log p(x_n|Y_n) / ∂x_n |_{x_n = x̂^{MAP}} = 0,              (36)

substituting equation (35) into (36) yields

  x̂_n^{MAP} = ( G_n^T Σ_v^{−1} G_n + P_{n,n−1}^{−1} )^{−1} ( P_{n,n−1}^{−1} x̂_{n|n−1} + G_n^T Σ_v^{−1} y_n ).

By using the matrix inverse lemma,^30 this is simplified to

  x̂_n^{MAP} = x̂_{n|n−1} + K_n (y_n − G_n x̂_{n|n−1}),         (37)

where K_n is the Kalman gain, defined by

  K_n = F_{n+1,n} P_{n,n−1} G_n^T ( G_n P_{n,n−1} G_n^T + Σ_v )^{−1}.   (38)

Observing that

  e_{n,n−1} = x_n − x̂_{n|n−1}
            = F_{n,n−1} x_{n−1} + d_{n−1} − F_{n,n−1} x̂_{n−1}^{MAP}
            = F_{n,n−1} e_{n−1}^{MAP} + d_{n−1},               (39)

and by virtue of P_{n−1} = Cov[e_{n−1}^{MAP}], we have

  P_{n,n−1} = Cov[e_{n,n−1}] = F_{n,n−1} P_{n−1} F_{n,n−1}^T + Σ_d.   (40)

Since

  e_n = x_n − x̂_n^{MAP} = x_n − x̂_{n|n−1} − K_n (y_n − G_n x̂_{n|n−1}),   (41)

and noting that e_{n,n−1} = x_n − x̂_{n|n−1} and y_n = G_n x_n + v_n, we further have

  e_n = e_{n,n−1} − K_n (G_n e_{n,n−1} + v_n) = (I − K_n G_n) e_{n,n−1} − K_n v_n,   (42)

and it further follows that

  P_n = Cov[e_n^{MAP}] = (I − K_n G_n) P_{n,n−1} (I − K_n G_n)^T + K_n Σ_v K_n^T.

Rearranging the above equation, it reduces to

  P_n = P_{n,n−1} − F_{n,n+1} K_n G_n P_{n,n−1}.               (43)

Thus far, the Kalman filter has been completely derived from the MAP principle; the expression for x̂_n^{MAP} is exactly the same solution as the one derived from the innovations framework (or others).

The above procedure can be easily extended to the ML case without much effort [384]. Suppose we want to maximize the likelihood p(x_n|Y_n), which is equivalent to maximizing the log-likelihood

  log p(x_n|Y_n) = log p(x_n, Y_n) − log p(Y_n),               (44)

and the optimal estimate near the solution should satisfy

  ∂ log p(x_n|Y_n) / ∂x_n |_{x_n = x̂^{ML}} = 0.               (45)

Substituting (35) into (45), we actually want to minimize the cost function of two combined Mahalanobis norms,^31

  E = ||y_n − G_n x_n||^2_{Σ_v^{−1}} + ||x_n − x̂_{n|n−1}||^2_{P_{n,n−1}^{−1}}.   (46)

Taking the derivative of E with respect to x_n and setting it to zero, we obtain the same solution as (37).

^28 The martingale process was first introduced by Doob and discussed in detail in [139].
^29 The innovations concept can be used straightforwardly in nonlinear filtering [7]. From the innovations point of view, one criterion to justify the optimality of the solution to a nonlinear filtering problem is to check how white the pseudo-innovations are: the whiter, the more optimal.
^30 For A = B^{−1} + C D^{−1} C^T, it follows from the matrix inverse lemma that A^{−1} = B − B C (D + C^T B C)^{−1} C^T B.
^31 The Mahalanobis norm is defined as a weighted norm: ||A||^2_B = A^T B A.

Remarks:
• The derivation of the Kalman-Bucy filter [249] was rooted in SDE theory [387], [360]; it can also be derived within the Bayesian framework [497], [248].
• The optimal filtering solution described by the Wiener-Hopf equation is achieved by the spectral factorization technique [487]. By admitting a state-space formulation, the Kalman filter elegantly overcomes the stationarity assumption and provides a fresh look at the filtering problem. The signal process (i.e., the "state") is regarded as a linear stochastic dynamical system driven by white noise; the optimal filter thus has a stochastic differential structure, which makes recursive estimation possible. Spectral factorization is replaced by the solution of an ordinary differential equation (ODE) with known initial conditions. The Wiener filter doesn't distinguish between white and colored noises and also permits infinite-dimensional systems, whereas the Kalman filter works for finite-dimensional systems with the white-noise assumption.
• The Kalman filter is an unbiased minimum-variance estimator under the LQG circumstance. When the Gaussian assumption on the noise is violated, the Kalman filter is still optimal in the mean-square sense, but the estimate does not produce the conditional mean (i.e., it is biased), nor the minimum variance. The Kalman filter is not robust, because of the underlying assumption of the noise density model.
• The Kalman filter provides an exact solution for the linear Gaussian prediction and filtering problem. Concerning the smoothing problem, the off-line estimation version of the Kalman filter is given by the Rauch-Tung-Striebel (RTS) smoother [384], which consists of a forward filter in the form of a Kalman filter and a backward recursive smoother. The RTS smoother is more computationally efficient than the optimal smoother [206].
• The conventional Kalman filter is a point-valued filter; it can also be extended to set-valued filtering [39], [339], [80].
• In the literature there exist many variants of the Kalman filter, e.g., the covariance filter, the information filter, and square-root Kalman filters. See [205], [247] for more details and [403] for a unifying review.

C. Optimum Nonlinear Filtering

In practice, the use of the Kalman filter is limited by the ubiquitous nonlinearity and non-Gaussianity of the physical world. Hence, since the publication of the Kalman filter, numerous efforts have been devoted to the generic filtering problem, mostly in the Kalman filtering framework. A number of pioneers, including Zadeh [503], Bucy [61], [60], Wonham [496], Zakai [505], Kushner [282]-[285], and Stratonovich [430], [431], investigated the nonlinear filtering problem; see also the papers seeking optimal nonlinear filters [420], [289], [209]. In general, the nonlinear filtering problem per se consists in finding the conditional probability distribution (or density) of the state given the observations up to the current time [420]. In particular, the solution of the nonlinear filtering problem using the theory of conditional Markov processes [430], [431] is very attractive from the Bayesian perspective and has a number of advantages over the other methods; the recursive transformations of the posterior measures are characteristic of this theory. Strictly speaking, the number of variables replacing the density function is infinite, but not all of them are of equal importance; thus it is advisable to select the important ones and reject the remainder.

The solutions of the nonlinear filtering problem fall into two categories: global methods and local methods. In the global approach, one attempts to solve a PDE, instead of an ODE as in the linear case, e.g., the Zakai equation or the Kushner-Stratonovich equation, which are mostly analytically intractable; hence numerical approximation techniques are needed to solve the equation. In special scenarios (e.g., the exponential family) with some assumptions, nonlinear filtering can admit tractable solutions. In the local approach, finite-sum approximation (e.g., the Gaussian sum filter) or linearization techniques (i.e., the EKF) are usually used. In the EKF, by defining

  F̂_{n+1,n} = df(x)/dx |_{x = x̂_n},   Ĝ_n = dg(x)/dx |_{x = x̂_{n|n−1}},

the equations (2a)(2b) can be linearized into (3a)(3b), and the conventional Kalman filtering technique is then employed. The details of the EKF can be found in many books, e.g., [238], [12], [96], [80], [195], [205], [206]. Because the EKF always approximates the posterior p(x_n|y_{0:n}) as a Gaussian, it works well for some types of nonlinear problems, but it may provide poor performance in some cases where the true posterior is non-Gaussian (e.g., heavily skewed or multimodal). Gelb [174] provided an early overview of the uses of the EKF. It is noted that the estimate given by the EKF is usually biased, since in general E[f(x)] ≠ f(E[x]).

In summary, a number of methods have been developed for nonlinear filtering problems:
• Linearization methods: first-order Taylor series expansion (i.e., the EKF), and higher-order filters [20], [437].
• Approximation by finite-dimensional nonlinear filters: the Beneš filter [33], [34], the Daum filter [111]-[113], and the projection filter [202], [55].
• Classic PDE methods, e.g., [282], [284], [285], [505], [496], [497], [235].
• Spectral methods [312].
• Neural filter methods, e.g., [209].
• Numerical approximation methods, as to be discussed in Section V.

C.1 Finite-dimensional Filters

The on-line solution of the FPK equation can be avoided if the unnormalized filtered density admits a finite-dimensional sufficient statistic. Beneš [33], [34] first explored the exact finite-dimensional filter^32 in the nonlinear filtering scenario. Daum [111] extended the framework to a more general case and included the Kalman filter and the Beneš filter as special cases [113]. Some new developments of the Daum filter with virtual measurements were summarized in [113]. The recently proposed projection filters [202], [53]-[57] also belong to the finite-dimensional filter family.

^32 Roughly speaking, a finite-dimensional filter is one that can be implemented by integrating a finite number of ODEs, or one that has a sufficient statistic with finitely many variables.

In [111], starting from SDE filtering theory, Daum introduced a gradient function

  r(t, x) = (∂/∂x) ln ψ(t, x),

where ψ(t, x) is the solution of the FPK equation of (11a), of the form

  ∂ψ(t, x)/∂t = − (∂ψ(t, x)/∂x) f − ψ tr( ∂f/∂x ) + (1/2) tr( A ∂^2 ψ/(∂x ∂x^T) ),

with an appropriate initial condition (see [111]), and A = σ(t, x_t) σ(t, x_t)^T. When the measurement equation (11b) is linear with Gaussian noise (recalling the discrete-time version (3b)), the Daum filter admits a finite-dimensional solution

  p(x_t|Y_t) = ψ^s(x_t) exp( −(1/2)(x_t − m_t)^T P_t^{−1} (x_t − m_t) ),

where s is a real number in the interval 0

... the MAP estimate, which is partially justified by the fact that under certain regularity conditions the posterior distribution asymptotically approaches a Gaussian distribution as the number of samples increases to infinity. The Laplace approximation is useful in the MAP or ML framework; this