(2008) Fast Gaussian Process Methods for Point Process Intensity Estimation

Total Pages: 16

File Type: PDF, Size: 1020 KB

Fast Gaussian Process Methods for Point Process Intensity Estimation

John P. Cunningham [email protected]
Department of Electrical Engineering, Stanford University, Stanford, CA, USA 94305
Krishna V. Shenoy [email protected]
Department of Electrical Engineering and Neurosciences Program, Stanford University, Stanford, CA, USA 94305
Maneesh Sahani [email protected]
Gatsby Computational Neuroscience Unit, UCL, London, WC1N 3AR, UK

Abstract

Point processes are difficult to analyze because they provide only a sparse and noisy observation of the intensity function driving the process. Gaussian Processes offer an attractive framework within which to infer underlying intensity functions. The result of this inference is a continuous function defined across time that is typically more amenable to analytical efforts. However, a naive implementation will become computationally infeasible in any problem of reasonable size, both in memory and run time requirements. We demonstrate problem specific methods for a class of renewal processes that eliminate the memory burden and reduce the solve time by orders of magnitude.

1. Introduction

Point processes with temporally or spatially varying intensity functions arise naturally in many fields of study. When the intensity function is itself a random process (often a Gaussian Process), the process is called a doubly-stochastic or Cox point process. Application domains include economics and finance (e.g. Basu & Dassios, 2002), neuroscience (e.g. Cunningham et al., 2008), ecology (e.g. Moller et al., 1998), and others.

Given observed point process data, one can use a Gaussian Process (GP) framework to infer an optimal estimate of the underlying intensity. In this paper we consider GP prior intensity functions coupled with point process observation models. The problem of intensity estimation then becomes a modification of GP regression and inherits the computational complexity inherent in GP methods (e.g. Rasmussen & Williams, 2006). The data size n will grow with the length (e.g. total time) of the point process. Naive methods will be O(n^2) in memory requirements (storing Hessian matrices) and O(n^3) in run time (matrix inversions and determinants). At one thousand data points (such as one second of millisecond-resolution data), a naive solution to this problem is already quite burdensome on a common workstation. At ten thousand or more, this problem is for all practical purposes intractable.

While applications of doubly-stochastic point processes are numerous, there is little work proposing solutions to the serious computational issues inherent in these methods. Thus, the development of efficient methods for intensity estimation would be of broad appeal. In this paper, we do not address the appropriateness of doubly-stochastic point-process models for particular applications, but rather we focus on the significant steps required to make such modelling computationally tractable. We build on previous work from both GP regression and large-scale optimization to create a considerably faster and less memory intensive algorithm for doubly-stochastic point-process intensity estimation.

As part of the GP intensity estimation problem we optimize model hyperparameters using a Laplace approximation to the marginal likelihood or evidence. This requires an iterative approach which divides into two major parts. First, at each iteration we must find a modal (MAP) estimate of the intensity function. Second, we must calculate the approximate model evidence and its gradients with respect to GP hyperparameters. Both aspects of this problem present computational and memory problems. We develop methods to reduce the costs of both drastically.

We show that for certain classes of renewal process observation models, MAP estimation may be framed as a tractable convex program. To ensure nonnegativity in the intensity function we use a log barrier Newton method (Boyd & Vandenberghe, 2004), which we solve efficiently by deriving decompositions of matrices with known structure. By exploiting a recursion embedded in the algorithm, we avoid many costly matrix inversions. We combine these advances with large scale optimization techniques, such as conjugate gradients (CG, as used by Gibbs & MacKay, 1997) and fast Fourier transform (FFT) matrix multiplication methods.

To evaluate the model evidence, as well as its gradients with respect to hyperparameters, we again exploit the structure imposed by the renewal process framework to find an exact but considerably less burdensome representation. We then show that a further approximation loses little in accuracy, but makes the cost of this computation insignificant. Combining these advances, we are able to reduce a problem that is effectively computationally infeasible to a problem with minimal memory load and very fast solution time. The O(n^2) memory requirements are eliminated, and the O(n^3) computation is reduced to modestly superlinear.

(Appearing in Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, 2008. Copyright 2008 by the author(s)/owner(s).)

2. Problem Overview

Define x ∈ R^n to be the intensity function (the high dimensional signal of interest); x is indexed by input^1 time points t ∈ R^n. Let the observed data y = {y_0, ..., y_N} ∈ R^{N+1} be a set of N+1 time indices into the vector x; that is, the ith point event occurs at time y_i, and the intensity at that time is x_{y_i}. Denote all hyperparameters by θ. In general, the prior and observation models are both functions of θ. The GP framework implies a normal prior on the intensity, p(x | θ) = N(μ1, Σ), where the nonzero mean is a sensible choice because the intensity function is constrained to be nonnegative. Thus we treat μ as a hyperparameter (μ ∈ θ). The positive definite covariance matrix Σ (also a function of θ) is defined by an appropriate kernel such as a squared exponential or Ornstein-Uhlenbeck kernel (see Rasmussen & Williams, 2006, for a discussion of GP kernels). The point-process observation model gives the likelihood p(y | x, θ). In this work, we consider renewal processes (i.e. one-dimensional point processes with independent event interarrival times), a family of point processes that has both been well-studied theoretically and applied in many domains (Daley & Vere-Jones, 2002).

The GP prior is log concave in x, and the nonnegativity constraint on intensity (x ⪰ 0) is convex (constraining x to be nonnegative is equivalent to solving an unconstrained problem where the prior on the vector x is a truncated multivariate normal distribution, but this is not the same as truncating the GP prior in the continuous, infinite dimensional function space; see Horrace, 2005). Thus, if the observation model is also log concave in x, the MAP estimate x* is unique and can be readily found using a log barrier Newton method (Boyd & Vandenberghe, 2004; Paninski, 2004). Renewal processes are simply defined by their interarrival distribution f_z(z). A common construction for a renewal process with an inhomogeneous underlying intensity is to use the intensity rescaling

    m(t_i | t_{i-1}) = \int_{t_{i-1}}^{t_i} x(u) \, du

(in practice, a discretized sum of x) (Barbieri et al., 2001; Daley & Vere-Jones, 2002). Accordingly, the density for an observation of event times y can be defined

    p(y) = \prod_{i=1}^{N} p(y_i | y_{i-1}) = \prod_{i=1}^{N} |m'(y_i | y_{i-1})| \, f_z(m(y_i | y_{i-1}))        (1)

by a change of variables for the interarrival distribution (Papoulis & Pillai, 2002). Since m(t) is a linear transformation of the intensity function (our variables of interest), the observation model obeys log concavity as long as the distribution primitive f_z(z) is log concave. Examples of suitable renewal processes include the inhomogeneous Poisson, gamma interval, Weibull interval, inverse Gaussian (Wald) interval, Rayleigh interval, and other processes (Papoulis & Pillai, 2002). For this paper, we choose one of these distributions and focus on its details. However, for processes of the form above, the implementation details are identical up to the forms of the actual distributions.

To solve the GP intensity estimation, we first find a MAP estimate x* given fixed hyperparameters θ, and then we approximate the model evidence p(y | θ) (for which we need x*) and its gradients in θ. Iterating these two steps, we can find the optimal model θ̂ (we do not integrate over hyperparameters). Finally, MAP estimation under these optimal hyperparameters θ̂ gives an optimal estimate of the underlying intensity. This iterative solution for θ̂ can be written:

    θ̂ = argmax_θ p(θ) p(y | θ)                                                    (2)
      ≈ argmax_θ p(θ) p(y | x*, θ) p(x* | θ) (2π)^{n/2} |Λ + Σ^{-1}|^{-1/2},

where the last term is a Laplace approximation to the intractable form of p(y | θ), x* is the mode of p(y | x)p(x) (the MAP estimate), and Λ = -∇_x^2 log p(y | x, θ) |_{x=x*}. The log concavity of our problem in x supports the choice of a Laplace approximation.

Each of the two major steps in this algorithm (MAP estimation and model selection) involves computational and memory challenges. We address these challenges in Sections 4 and 5. The computational problems inherent in GP methods have been well studied, and much progress has been made.

^1 In this work we restrict ourselves to a single input dimension (which we call time), as it aligns with the family of renewal processes in one dimension. Some ideas here can be extended to multiple dimensions (e.g. if using a spatial Poisson process).
Recommended publications
  • Cox Process Representation and Inference for Stochastic Reaction-Diffusion Processes
    Edinburgh Research Explorer. Cox process representation and inference for stochastic reaction-diffusion processes. Citation: Schnoerr, D, Grima, R & Sanguinetti, G 2016, 'Cox process representation and inference for stochastic reaction-diffusion processes', Nature Communications, vol. 7, 11729. https://doi.org/10.1038/ncomms11729. Received 16 Dec 2015 | Accepted 26 Apr 2016 | Published 25 May 2016. David Schnoerr, Ramon Grima & Guido Sanguinetti. Complex behaviour in many systems arises from the stochastic interactions of spatially distributed particles or agents. Stochastic reaction–diffusion processes are widely used to model such behaviour in disciplines ranging from biology to the social sciences, yet they are notoriously difficult to simulate and calibrate to observational data.
  • Poisson Representations of Branching Markov and Measure-Valued
    The Annals of Probability 2011, Vol. 39, No. 3, 939–984. DOI: 10.1214/10-AOP574. © Institute of Mathematical Statistics, 2011. POISSON REPRESENTATIONS OF BRANCHING MARKOV AND MEASURE-VALUED BRANCHING PROCESSES. By Thomas G. Kurtz and Eliane R. Rodrigues, University of Wisconsin, Madison and UNAM. Representations of branching Markov processes and their measure-valued limits in terms of countable systems of particles are constructed for models with spatially varying birth and death rates. Each particle has a location and a "level," but unlike earlier constructions, the levels change with time. In fact, death of a particle occurs only when the level of the particle crosses a specified level r, or for the limiting models, hits infinity. For branching Markov processes, at each time t, conditioned on the state of the process, the levels are independent and uniformly distributed on [0, r]. For the limiting measure-valued process, at each time t, the joint distribution of locations and levels is conditionally Poisson distributed with mean measure K(t) × Λ, where Λ denotes Lebesgue measure, and K is the desired measure-valued process. The representation simplifies or gives alternative proofs for a variety of calculations and results including conditioning on extinction or nonextinction, Harris's convergence theorem for supercritical branching processes, and diffusion approximations for processes in random environments. 1. Introduction. Measure-valued processes arise naturally as infinite system limits of empirical measures of finite particle systems. A number of approaches have been developed which preserve distinct particles in the limit and which give a representation of the measure-valued process as a transformation of the limiting infinite particle system.
  • Scalable Nonparametric Bayesian Inference on Point Processes with Gaussian Processes
    Scalable Nonparametric Bayesian Inference on Point Processes with Gaussian Processes. Yves-Laurent Kom Samo [email protected], Stephen Roberts [email protected], Department of Engineering Science and Oxford-Man Institute, University of Oxford. Abstract: In this paper we propose an efficient, scalable non-parametric Gaussian process model for inference on Poisson point processes. Our model does not resort to gridding the domain or to introducing latent thinning points. Unlike competing models that scale as O(n^3) over n data points, our model has a complexity O(nk^2), where k << n. We propose a MCMC sampler and show that the model obtained is faster, more accurate and generates less correlated samples than competing approaches on both synthetic and real-life data. Finally, we show that our model easily handles data sizes not considered thus far by alternate approaches. 2. Related Work: Non-parametric inference on point processes has been extensively studied in the literature. Rathbun & Cressie (1994) and Moeller et al. (1998) used a finite-dimensional piecewise constant log-Gaussian for the intensity function. Such approximations are limited in that the choice of the grid on which to represent the intensity function is arbitrary, and one has to trade off precision with computational complexity and numerical accuracy, with the complexity being cubic in the precision and exponential in the dimension of the input space. Kottas (2006) and Kottas & Sanso (2007) used a Dirichlet process mixture of Beta distributions as prior for the normalised intensity function of a Poisson process. Cunningham et al.
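The finite-dimensional gridded construction that this excerpt criticizes can be sketched directly. This is a minimal illustrative sketch only; the grid size, kernel, hyperparameters, and seed are arbitrary choices (which is exactly the drawback the excerpt points out).

```python
import numpy as np

rng = np.random.default_rng(0)

# Grid the domain [0, 1] into bins; model log-intensity as piecewise constant
n_bins, T = 50, 1.0
dt = T / n_bins
centers = (np.arange(n_bins) + 0.5) * dt

# GP prior on the log-intensity g via a squared-exponential kernel
ell, sig2 = 0.1, 1.0
K = sig2 * np.exp(-0.5 * (centers[:, None] - centers[None, :])**2 / ell**2)
g = rng.multivariate_normal(np.zeros(n_bins), K + 1e-8 * np.eye(n_bins))
lam = np.exp(g)                      # piecewise-constant intensity

# Poisson log-likelihood of binned event counts under this intensity
# (up to the log-factorial constant)
counts = rng.poisson(lam * dt)
loglik = np.sum(counts * np.log(lam * dt) - lam * dt)
```

The cost of this construction is cubic in the number of grid cells (through the covariance factorization), which is the scaling the excerpt contrasts with the proposed O(nk^2) model.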
  • Deep Neural Networks As Gaussian Processes
    Published as a conference paper at ICLR 2018. DEEP NEURAL NETWORKS AS GAUSSIAN PROCESSES. Jaehoon Lee*, Yasaman Bahri*, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein, Google Brain. {jaehlee, yasamanb, romann, schsam, jpennin, [email protected]}. ABSTRACT: It has long been known that a single-layer fully-connected neural network with an i.i.d. prior over its parameters is equivalent to a Gaussian process (GP), in the limit of infinite network width. This correspondence enables exact Bayesian inference for infinite width neural networks on regression tasks by means of evaluating the corresponding GP. Recently, kernel functions which mimic multi-layer random neural networks have been developed, but only outside of a Bayesian framework. As such, previous work has not identified that these kernels can be used as covariance functions for GPs and allow fully Bayesian prediction with a deep neural network. In this work, we derive the exact equivalence between infinitely wide deep networks and GPs. We further develop a computationally efficient pipeline to compute the covariance function for these GPs. We then use the resulting GPs to perform Bayesian inference for wide deep neural networks on MNIST and CIFAR-10. We observe that trained neural network accuracy approaches that of the corresponding GP with increasing layer width, and that the GP uncertainty is strongly correlated with trained network prediction error. We further find that test performance increases as finite-width trained networks are made wider and more similar to a GP, and thus that GP predictions typically outperform those of finite-width networks.
  • Local Conditioning in Dawson–Watanabe Superprocesses
    The Annals of Probability 2013, Vol. 41, No. 1, 385–443. DOI: 10.1214/11-AOP702. © Institute of Mathematical Statistics, 2013. LOCAL CONDITIONING IN DAWSON–WATANABE SUPERPROCESSES. By Olav Kallenberg, Auburn University. Consider a locally finite Dawson–Watanabe superprocess ξ = (ξ_t) in R^d with d ≥ 2. Our main results include some recursive formulas for the moment measures of ξ, with connections to the uniform Brownian tree, a Brownian snake representation of Palm measures, continuity properties of conditional moment densities, leading by duality to strongly continuous versions of the multivariate Palm distributions, and a local approximation of ξ_t by a stationary cluster η̃ with nice continuity and scaling properties. This all leads up to an asymptotic description of the conditional distribution of ξ_t for a fixed t > 0, given that ξ_t charges the ε-neighborhoods of some points x_1, ..., x_n ∈ R^d. In the limit as ε → 0, the restrictions to those sets are conditionally independent and given by the pseudo-random measures ξ̃ or η̃, whereas the contribution to the exterior is given by the Palm distribution of ξ_t at x_1, ..., x_n. Our proofs are based on the Cox cluster representations of the historical process and involve some delicate estimates of moment densities. 1. Introduction. This paper may be regarded as a continuation of [19], where we considered some local properties of a Dawson–Watanabe superprocess (henceforth referred to as a DW-process) at a fixed time t > 0. Recall that a DW-process ξ = (ξ_t) is a vaguely continuous, measure-valued diffusion process in R^d with Laplace functionals E_μ e^{-ξ_t f} = e^{-μ v_t} for suitable functions f ≥ 0, where v = (v_t) is the unique solution to the evolution equation v̇ = (1/2)Δv − v^2 with initial condition v_0 = f.
  • Cox Process Functional Learning Gérard Biau, Benoît Cadre, Quentin Paris
    Cox process functional learning. Gérard Biau, Benoît Cadre, Quentin Paris. Published in: Statistical Inference for Stochastic Processes, Springer Verlag, 2015, 18 (3), pp. 257-277. 10.1007/s11203-015-9115-z. HAL Id: hal-00820838, https://hal.archives-ouvertes.fr/hal-00820838, submitted on 6 May 2013. (HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.) Gérard Biau, Université Pierre et Marie Curie & Ecole Normale Supérieure, France; Benoît Cadre, IRMAR, ENS Cachan Bretagne, CNRS, UEB, France; Quentin Paris, IRMAR, ENS Cachan Bretagne, CNRS, UEB, France. Abstract: This article addresses the problem of supervised classification of Cox process trajectories, whose random intensity is driven by some exogenous random covariable. The classification task is achieved through a regularized convex empirical risk minimization procedure, and a nonasymptotic oracle inequality is derived. We show that the algorithm provides a Bayes-risk consistent classifier. Furthermore, it is proved that the classifier converges at a rate which adapts to the unknown regularity of the intensity process.
  • Gaussian Process Dynamical Models for Human Motion
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 30, NO. 2, FEBRUARY 2008, p. 283. Gaussian Process Dynamical Models for Human Motion. Jack M. Wang, David J. Fleet, Senior Member, IEEE, and Aaron Hertzmann, Member, IEEE. Abstract—We introduce Gaussian process dynamical models (GPDMs) for nonlinear time series analysis, with applications to learning models of human pose and motion from high-dimensional motion capture data. A GPDM is a latent variable model. It comprises a low-dimensional latent space with associated dynamics, as well as a map from the latent space to an observation space. We marginalize out the model parameters in closed form by using Gaussian process priors for both the dynamical and the observation mappings. This results in a nonparametric model for dynamical systems that accounts for uncertainty in the model. We demonstrate the approach and compare four learning algorithms on human motion capture data, in which each pose is 50-dimensional. Despite the use of small data sets, the GPDM learns an effective representation of the nonlinear dynamics in these spaces. Index Terms—Machine learning, motion, tracking, animation, stochastic processes, time series analysis. 1 INTRODUCTION. Good statistical models for human motion are important for many applications in vision and graphics, notably visual tracking, activity recognition, and computer animation. It is well known in computer vision that the estimation of 3D human motion from a monocular video sequence is highly ambiguous. Models such as hidden Markov models (HMMs) and linear dynamical systems (LDS) are efficient and easily learned but limited in their expressiveness for complex motions. More expressive models such as switching linear dynamical systems (SLDS) and nonlinear dynamical systems (NLDS),
  • Modelling Multi-Object Activity by Gaussian Processes 1
    LOY et al.: MODELLING MULTI-OBJECT ACTIVITY BY GAUSSIAN PROCESSES. Modelling Multi-object Activity by Gaussian Processes. Chen Change Loy [email protected], Tao Xiang [email protected], Shaogang Gong [email protected], School of EECS, Queen Mary University of London, E1 4NS London, UK. Abstract: We present a new approach for activity modelling and anomaly detection based on non-parametric Gaussian Process (GP) models. Specifically, GP regression models are formulated to learn non-linear relationships between multi-object activity patterns observed from semantically decomposed regions in complex scenes. Predictive distributions are inferred from the regression models to compare with the actual observations for real-time anomaly detection. The use of a flexible, non-parametric model alleviates the difficult problem of selecting appropriate model complexity encountered in parametric models such as Dynamic Bayesian Networks (DBNs). Crucially, our GP models need fewer parameters; they are thus less likely to overfit given sparse data. In addition, our approach is robust to the inevitable noise in activity representation as noise is modelled explicitly in the GP models. Experimental results on a public traffic scene show that our models outperform DBNs in terms of anomaly sensitivity, noise robustness, and flexibility in modelling complex activity. 1 Introduction. Activity modelling and automatic anomaly detection in video have received increasing attention due to the recent large-scale deployments of surveillance cameras. These tasks are non-trivial because complex activity patterns in a busy public space involve multiple objects interacting with each other over space and time, whilst anomalies are often rare, ambiguous and can be easily confused with noise caused by low image quality, unstable lighting condition and occlusion.
  • Financial Time Series Volatility Analysis Using Gaussian Process State-Space Models
    Financial Time Series Volatility Analysis Using Gaussian Process State-Space Models, by Jianan Han, Bachelor of Engineering, Hebei Normal University, China, 2010. A thesis presented to Ryerson University in partial fulfillment of the requirements for the degree of Master of Applied Science in the Program of Electrical and Computer Engineering, Toronto, Ontario, Canada, 2015. © Jianan Han 2015. AUTHOR'S DECLARATION FOR ELECTRONIC SUBMISSION OF A THESIS: I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners. I authorize Ryerson University to lend this thesis to other institutions or individuals for the purpose of scholarly research. I further authorize Ryerson University to reproduce this thesis by photocopying or by other means, in total or in part, at the request of other institutions or individuals for the purpose of scholarly research. I understand that my dissertation may be made electronically available to the public. Abstract: In this thesis, we propose a novel nonparametric modeling framework for financial time series data analysis, and we apply the framework to the problem of time varying volatility modeling. Existing parametric models have a rigid transition function form and they often have over-fitting problems when model parameters are estimated using maximum likelihood methods. These drawbacks affect the models' forecast performance. To solve this problem, we take a Bayesian nonparametric modeling approach.
  • Gaussian-Random-Process.Pdf
    The Gaussian Random Process. Perhaps the most important continuous state-space random process in communications systems is the Gaussian random process, which, we shall see, is very similar to, and shares many properties with, the jointly Gaussian random variable that we studied previously (see lecture notes and chapter 4). X(t), t ∈ T, is a Gaussian r.p. if, for any positive integer n, any choice of coefficients a_k, 1 ≤ k ≤ n, and any choice of sample times t_k ∈ T, 1 ≤ k ≤ n, the random variable given by the following weighted sum of random variables is Gaussian: X = a_1 X(t_1) + a_2 X(t_2) + ... + a_n X(t_n). Using vector notation we can express this as follows: X = [X(t_1), X(t_2), ..., X(t_n)], which, according to our study of jointly Gaussian r.v.s, is an n-dimensional Gaussian r.v. Hence, its pdf is known, since we know the pdf of a jointly Gaussian random variable. For the random process, however, there is also the nasty little parameter t to worry about!! The best way to see the connection to the Gaussian random variable and understand the pdf of a random process is by example. Example: Determining the Distribution of a Gaussian Process. Consider a continuous time random process X(t), t ∈ R, with continuous state-space (in this case amplitude) defined by the following expression: X(t) = Y_1 + t Y_2, where Y_1 and Y_2 are independent Gaussian distributed random variables with zero mean and variance σ^2: Y_1, Y_2 ~ N(0, σ^2). The problem is to find the one and two dimensional probability density functions of the random process X(t). The one-dimensional distribution of a random process, also known as the univariate distribution, is denoted F_{X,1}(u; t) and defined as Pr{X(t) ≤ u}.
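The example's conclusion — that X(t) = Y_1 + tY_2 is Gaussian at each fixed t with mean 0 and variance σ^2(1 + t^2) — can be checked numerically. A quick Monte Carlo sketch (the sample size, seed, and parameter values are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, t, n = 2.0, 3.0, 200_000

# X(t) = Y1 + t*Y2 with Y1, Y2 ~ N(0, sigma^2) independent
Y1 = rng.normal(0.0, sigma, n)
Y2 = rng.normal(0.0, sigma, n)
X = Y1 + t * Y2

# Theory: X(t) ~ N(0, sigma^2 * (1 + t^2)), here variance 4 * 10 = 40
sample_mean, sample_var = X.mean(), X.var()
```

Because a sum of independent Gaussians is Gaussian and variances add as Var(Y_1) + t^2 Var(Y_2), the sample moments should match the theoretical N(0, 40) up to Monte Carlo error.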
  • Gaussian Markov Processes
    C. E. Rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, the MIT Press, 2006, ISBN 026218253X. © 2006 Massachusetts Institute of Technology. www.GaussianProcess.org/gpml. Appendix B: Gaussian Markov Processes. Particularly when the index set for a stochastic process is one-dimensional, such as the real line or its discretization onto the integer lattice, it is very interesting to investigate the properties of Gaussian Markov processes (GMPs). In this Appendix we use X(t) to define a stochastic process with continuous time parameter t. In the discrete time case the process is denoted ..., X_{-1}, X_0, X_1, ... etc. We assume that the process has zero mean and is, unless otherwise stated, stationary. A discrete-time autoregressive (AR) process of order p can be written as X_t = \sum_{k=1}^{p} a_k X_{t-k} + b_0 Z_t (B.1), where Z_t ~ N(0, 1) and all Z_t's are i.i.d. Notice the order-p Markov property: given the history X_{t-1}, X_{t-2}, ..., X_t depends only on the previous p X's. This relationship can be conveniently expressed as a graphical model; part of an AR(2) process is illustrated in Figure B.1. The name autoregressive stems from the fact that X_t is predicted from the p previous X's through a regression equation. If one stores the current X and the p − 1 previous values as a state vector, then the AR(p) scalar process can be written equivalently as a vector AR(1) process. Figure B.1: Graphical model illustrating an AR(2) process.
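The equivalence noted in the excerpt — an AR(p) scalar process rewritten as a vector AR(1) process on the state of the last p values — can be sketched directly for p = 2 (the coefficients, noise scale, and seed below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
a1, a2, b0, n = 0.5, -0.25, 1.0, 10
z = rng.normal(size=n)

# Scalar AR(2): X_t = a1*X_{t-1} + a2*X_{t-2} + b0*Z_t, with zero initial state
x = np.zeros(n)
for t in range(n):
    x[t] = (a1 * (x[t - 1] if t >= 1 else 0.0)
            + a2 * (x[t - 2] if t >= 2 else 0.0)
            + b0 * z[t])

# The same process as a vector AR(1) on the state s_t = [X_t, X_{t-1}]
A = np.array([[a1, a2],
              [1.0, 0.0]])       # second row just shifts X_t into X_{t-1}
s = np.zeros(2)
x_vec = np.zeros(n)
for t in range(n):
    s = A @ s + np.array([b0 * z[t], 0.0])
    x_vec[t] = s[0]
```

Driven by the same noise sequence, the two recursions produce identical sample paths, which is the state-space trick the appendix alludes to.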
  • FULLY BAYESIAN FIELD SLAM USING GAUSSIAN MARKOV RANDOM FIELDS Huan N
    Asian Journal of Control, Vol. 18, No. 5, pp. 1–14, September 2016. Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/asjc.1237. FULLY BAYESIAN FIELD SLAM USING GAUSSIAN MARKOV RANDOM FIELDS. Huan N. Do, Mahdi Jadaliha, Mehmet Temel, and Jongeun Choi. ABSTRACT: This paper presents a fully Bayesian way to solve the simultaneous localization and spatial prediction problem using a Gaussian Markov random field (GMRF) model. The objective is to simultaneously localize robotic sensors and predict a spatial field of interest using sequentially collected noisy observations by robotic sensors. The set of observations consists of the observed noisy positions of robotic sensing vehicles and noisy measurements of a spatial field. To be flexible, the spatial field of interest is modeled by a GMRF with uncertain hyperparameters. We derive an approximate Bayesian solution to the problem of computing the predictive inferences of the GMRF and the localization, taking into account observations, uncertain hyperparameters, measurement noise, kinematics of robotic sensors, and uncertain localization. The effectiveness of the proposed algorithm is illustrated by simulation results as well as by experiment results. The experiment results successfully show the flexibility and adaptability of our fully Bayesian approach in a data-driven fashion. Key Words: Vision-based localization, spatial modeling, simultaneous localization and mapping (SLAM), Gaussian process regression, Gaussian Markov random field. I. INTRODUCTION. Simultaneous localization and mapping (SLAM) addresses the problem of a robot exploring an unknown environment under localization uncertainty [1]. However, there are a number of inapplicable situations. For example, underwater autonomous gliders for ocean sampling cannot find usual geometrical models from measurements of environmental variables such as