
Bayesian Filtering with Online Gaussian Process Latent Variable Models

Yali Wang (Laval University), Marcus A. Brubaker (TTI Chicago), Brahim Chaib-draa (Laval University), Raquel Urtasun (University of Toronto)

Abstract

In this paper we present a novel non-parametric approach to Bayesian filtering, where the prediction and observation models are learned in an online fashion. Our approach is able to handle multimodal distributions over both models by employing a mixture model representation with Gaussian Process (GP) based components. To cope with the increasing complexity of the estimation process, we explore two computationally efficient GP variants, sparse online GP and local GP, which help to manage the computation requirements of each mixture component. Our experiments demonstrate that our approach can track human motion much more accurately than existing approaches that learn the prediction and observation models offline and do not update these models with the incoming data stream.

1 INTRODUCTION

Many real world problems involve high dimensional data. In this paper we are interested in modeling and tracking human motion. In this setting, dimensionality reduction techniques are widely employed to avoid the curse of dimensionality.

Linear approaches such as principal component analysis (PCA) are very popular as they are simple to use. However, they often fail to capture complex dependencies due to their assumption of linearity. Non-linear dimensionality reduction techniques that attempt to preserve the local structure of the manifold (e.g., Isomap [21, 8], LLE [19, 14]) can capture more complex dependencies, but often suffer when the manifold assumptions are violated, e.g., in the presence of noise.

Probabilistic latent variable models have the advantage of being able to take the uncertainties into account when learning the latent representations. Perhaps the most successful model in the context of modelling human motion is the Gaussian process latent variable model (GPLVM) [12], where the non-linear mapping between the latent space and the high dimensional space is modeled with a Gaussian process. This provides powerful prior models, which have been employed for character animation [28, 26, 15] and human body tracking [24, 16, 25].

In the context of tracking, one is interested in estimating the state of a dynamic system. The most commonly used technique for state estimation is Bayesian filtering, which recursively estimates the posterior probability of the state of the system. The two key components in the filter are the prediction model, which describes the temporal evolution of the process, and the observation model, which links the state and the observations. A parametric form is typically employed for both models.

Ko and Fox [10] introduced the GP-BayesFilter, which defines the prediction and observation models in a non-parametric way via Gaussian processes. This approach is well suited when accurate parametric models are difficult to obtain. Its main limitation, however, resides in the fact that it requires ground truth states (as GPs are supervised), which are typically not available. GPLVMs were employed in [11] to learn the latent space in an unsupervised manner, bypassing the need for labeled data. This, however, cannot exploit the incoming stream of data available in the online setting, as the latent space is learned offline. Furthermore, only unimodal prediction and observation models can be captured, due to the fact that the models learned by GPs are nonlinear but Gaussian.

In this paper we extend the previous non-parametric filters to learn the latent space in an online fashion as well as to handle multimodal distributions for both the prediction and observation models. Towards this goal, we employ a mixture model representation in the particle filtering framework. For the mixture components, we investigate two computationally efficient GP variants which can update the prediction and observation models in an online fashion and cope with the growth in complexity as the number of data points increases over time. More specifically, the sparse online GP [3] selects the active set in an online fashion to efficiently maintain sparse approximations to the models. Alternatively, the local GP [26] reduces the computation by imposing local sparsity.

Recently, a number of GP-based Bayesian filters were proposed by learning the prediction and observation models using GP regression [10, 4]. This is a promising alternative, as GPs are non-parametric and can capture complex mappings. However, training these methods requires access to ground truth data before filtering. Unfortunately, the inputs of the training set are the hidden states, which are not always known in real-world applications. Two extensions were introduced to learn the hidden states of the training set via a non-linear latent variable model [11] or a sparse pseudo-input GP regression [22]. However, these methods require offline learning procedures, which are not able to exploit the incoming data streams. In contrast, we propose two non-parametric particle filters that are able to exploit the incoming data to learn better models in an online fashion.

We demonstrate the effectiveness of our approach on a wide variety of motions, and show that both approaches perform better than existing algorithms. In the remainder of the paper we first present a review of Bayesian filtering and the GPLVM. We then introduce our algorithm and show our experimental evaluation, followed by the conclusions.

2 BACKGROUND

In this section we review Bayesian filtering and Gaussian process latent variable models.

2.1 BAYESIAN FILTERING

Bayesian filtering is a sequential inference technique typically employed to perform state estimation in dynamic systems. Specifically, the goal is to recursively compute the posterior distribution of the current hidden state x_t given the history of observations y_{1:t} = (y_1, ..., y_t) up to the current time step:

p(x_t | y_{1:t}) ∝ p(y_t | x_t) ∫ p(x_t | x_{t-1}) p(x_{t-1} | y_{1:t-1}) dx_{t-1}

where p(x_t | x_{t-1}) is the prediction model that represents the system dynamics, and p(y_t | x_t) is the observation model that represents the likelihood of an observation y_t given the state x_t.

One of the most fundamental Bayesian filters is the Kalman filter, which is a maximum-a-posteriori estimator for linear and Gaussian models. Unfortunately, it is often not applicable in practice since most real dynamical systems are non-linear and/or non-Gaussian. Two popular extensions for non-linear systems are the extended Kalman filter (EKF) and the unscented Kalman filter (UKF) [9]. However, similar to the Kalman filter, the performance of the EKF and UKF is poor when the models are multimodal [5].

In contrast, particle filters, which are not restricted to linear and Gaussian models, have been developed by using sequential Monte Carlo sampling to represent the underlying posterior p(x_t | y_{1:t}) [5]. More specifically, at each time step, N_p particles of x_t are drawn from the prediction model p(x_t | x_{t-1}), and then all the particles are weighted according to the observation model p(y_t | x_t). The posterior p(x_t | y_{1:t}) is approximated using these N_p weighted particles. Finally, the N_p particles are resampled for the next step. Unfortunately, the parametric description of the dynamic models limits the estimation accuracy of Bayesian filters.
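To make the recursion concrete, the following is a minimal bootstrap particle filter sketch in Python/NumPy. The Gaussian random-walk prediction model and the Gaussian likelihood are placeholder assumptions chosen only to keep the example self-contained; they are not the GP-based models introduced later in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict(x_prev):
    # Placeholder prediction model p(x_t | x_{t-1}): a Gaussian random walk.
    return x_prev + 0.1 * rng.standard_normal(x_prev.shape)

def likelihood(y, x):
    # Placeholder observation model p(y_t | x_t): Gaussian density around x.
    return np.exp(-0.5 * np.sum((y - x) ** 2, axis=-1))

def particle_filter_step(particles, y):
    # 1. Draw N_p particles from the prediction model.
    particles = predict(particles)
    # 2. Weight every particle according to the observation model.
    w = likelihood(y, particles)
    w /= w.sum()
    # 3. Resample N_p particles for the next step.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx], w

# Toy run: 2-D state, 100 particles, a handful of observations.
particles = rng.standard_normal((100, 2))
for y in rng.standard_normal((5, 2)):
    particles, w = particle_filter_step(particles, y)
print(particles.mean(axis=0))
```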

2.2 GAUSSIAN PROCESS DYNAMICAL MODEL

The Gaussian Process Latent Variable Model (GPLVM) is a probabilistic dimensionality reduction technique, which places a GP prior on the observation model [12]. Wang et al. [28] proposed the Gaussian Process Dynamical Model (GPDM), which enriches the GPLVM to capture temporal structure by incorporating a GP prior over the dynamics in the latent space. Formally, the model is:

x_t = f_x(x_{t-1}) + η_x
y_t = f_y(x_t) + η_y

where y ∈ R^{D_y} represents the observation and x ∈ R^{D_x} the latent state, with D_y ≫ D_x. The noise processes are assumed to be Gaussian, η_x ∼ N(0, σ_x² I) and η_y ∼ N(0, σ_y² I). Each dimension i of the nonlinear functions has a GP prior, i.e., f_x^i ∼ GP(0, k_x(x, x')) and f_y^i ∼ GP(0, k_y(x, x')), where k_x(·,·) and k_y(·,·) are the kernel functions. For simplicity, we denote the hyperparameters of the kernel functions by θ.

Let x_{1:T_0} = (x_1, ..., x_{T_0}) be the latent space coordinates from time t = 1 to time t = T_0. GPDM is typically learned by minimizing the negative log posterior −log p(x_{1:T_0}, θ | y_{1:T_0}) with respect to x_{1:T_0} and θ [28]. After x_{1:T_0} and θ are obtained, a standard GP prediction is used to construct the models p(x_t | x_{t-1}, θ, X_{T_0}) and p(y_t | x_t, θ, Y_{T_0}) with data X_{T_0} = {(x_{k-1}, x_k)}_{k=2}^{T_0} and Y_{T_0} = {(x_k, y_k)}_{k=1}^{T_0}. Tracking (t > T_0) is then performed assuming the model is fixed and can be done using, e.g., a particle filter as described above. The major drawback of this approach is that it is not able to adapt to new observations during tracking. As shown in our experimental evaluation, this results in poor performance when the training set is small.
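For reference, the "standard GP prediction" used to build p(x_t | x_{t-1}, θ, X_{T_0}) is simply the GP regression posterior. The sketch below illustrates that construction; the RBF kernel and its hyperparameter values are illustrative assumptions, and the same code applies to p(y_t | x_t, θ, Y_{T_0}) with the corresponding inputs and outputs.

```python
import numpy as np

def rbf(A, B, gamma=1.0, sf2=1.0):
    # Squared-exponential kernel; hyperparameter values are illustrative.
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return sf2 * np.exp(-0.5 * d2 / gamma**2)

def gp_predict(X, Y, x_star, noise=1e-2):
    # GP regression posterior: mean and variance at the test inputs x_star.
    K = rbf(X, X) + noise * np.eye(len(X))
    k_star = rbf(X, x_star)
    mean = k_star.T @ np.linalg.solve(K, Y)
    var = rbf(x_star, x_star) - k_star.T @ np.linalg.solve(K, k_star) + noise
    return mean, np.diag(var)

# Toy GPDM-style usage: inputs are previous latent states, outputs the next ones.
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 3))                 # x_{k-1}, k = 2..T_0
Y = X + 0.05 * rng.standard_normal(X.shape)      # x_k
mean, var = gp_predict(X, Y, X[:1])
print(mean.shape, var.shape)
```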

3 ONLINE GP

In order to solve the above-mentioned difficulties in learning and filtering with dynamic systems, we propose an Online GP Particle Filter framework to learn and refine the model during tracking, i.e., the prediction p(x_t | x_{t-1}) and observation p(y_t | x_t) models are updated online in the particle filtering framework. To account for multi-modality and the significant amount of uncertainty that can be present, we propose to represent the prediction and observation models by a mixture model. For each mixture component, we will investigate two different GP variants.

Let the prediction and observation models at t − 1 be

p(x_t | x_{t-1}, Θ_{t-1,M}) = (1/R_M) Σ_{i=1}^{R_M} p(x_t | x_{t-1}, Θ^i_{t-1,M})    (1)
p(y_t | x_t, Θ_{t-1,O}) = (1/R_O) Σ_{i=1}^{R_O} p(y_t | x_t, Θ^i_{t-1,O})    (2)

where Θ^i_{t-1,M} and Θ^i_{t-1,O} represent the parameters of the i-th component, and Θ_{t-1,M} = {Θ^i_{t-1,M}}_{i=1}^{R_M} and Θ_{t-1,O} = {Θ^i_{t-1,O}}_{i=1}^{R_O} are the parameters of all components. At the t-th time step, we run a standard particle filter to obtain a number of weighted particles. The latent space representations at time t can be obtained by resampling the weighted particles. Then, we assign each particle to the most likely mixture component of p(x_t | x_{t-1}, Θ_{t-1,M}) and p(y_t | x_t, Θ_{t-1,O}) to capture the multi-modality of the prediction and observation models. Finally, we compute the mean of the latent states of the assigned particles and use this mean state to update the corresponding component's parameters, Θ^i_{t,M} for the prediction (or motion) model and Θ^i_{t,O} for the observation model. The whole framework is summarized in Algorithm 1.

Algorithm 1 Online GP-Particle Filter
1:  Initialize model parameters Θ_{T_0} based on y_{1:T_0}
2:  Initialize particle set x_{T_0}^{(1:N_p)} based on y_{1:T_0}
3:  for t = T_0 + 1 to T do
4:    for i = 1 to N_p do
5:      x_t^{(i)} ∼ p(x_t | x_{t-1}^{(i)}, Θ_{t-1,M})
6:      ŵ_t^{(i)} = p(y_t | x_t^{(i)}, Θ_{t-1,O})
7:    end for
8:    Normalize weights w_t^{(i)} = ŵ_t^{(i)} / (Σ_{i=1}^{N_p} ŵ_t^{(i)})
9:    Resample particle set with probabilities w_t^{(1:N_p)}
10:   for i = 1 to N_p do
11:     η_M^i = arg max_j p(x_t^{(i)} | x_{t-1}^{(i)}, Θ^j_{t-1,M})
12:     η_O^i = arg max_j p(y_t | x_t^{(i)}, Θ^j_{t-1,O})
13:   end for
14:   for j = 1 to R_M do
15:     n^j_{t-1} = Σ_{i=1}^{N_p} δ(η_M^i = j)
16:     x̄^j_{t-1} = (1/n^j_{t-1}) Σ_{i=1}^{N_p} δ(η_M^i = j) x_{t-1}^{(i)}
17:     x̄^j_t = (1/n^j_{t-1}) Σ_{i=1}^{N_p} δ(η_M^i = j) x_t^{(i)}
18:     Update Θ^j_{t,M} with (x̄^j_{t-1}, x̄^j_t)
19:   end for
20:   for j = 1 to R_O do
21:     n^j_{t-1} = Σ_{i=1}^{N_p} δ(η_O^i = j)
22:     x̄^j_t = (1/n^j_{t-1}) Σ_{i=1}^{N_p} δ(η_O^i = j) x_t^{(i)}
23:     Update Θ^j_{t,O} with (x̄^j_t, y_t)
24:   end for
25: end for

What remains is to specify how the parameters of individual components are represented and updated (lines 18 and 23 in Algorithm 1). As noted above, we aim to use a GP model for each mixture component. However, a standard implementation would require O(t³) operations and O(t²) memory. As t grows linearly over time, the particle filter will quickly become too computationally and memory intensive. Thus a primary challenge is how to efficiently update the GP mixture components in the prediction and observation models.

In order to efficiently update Θ^i_{t,M} and Θ^i_{t,O} in an online manner, we consider two fast GP-based strategies: Sparse Online GP (SOGP) and Local GPs (LGP), in which the reduction in memory and/or computation is achieved by an online sparsification and a local experts mechanism, respectively. A detailed review of fast GP approaches can be found in [1, 18].

The specific contents of Θ^i_{t,M} or Θ^i_{t,O} will vary depending on the method used. In the case of SOGP it will contain some computed quantities and the active set, while for LGP it will simply be the set of all training points. While we will focus on these two strategies, we note that in principle any similar update strategy could be used instead, such as informative vector machines [13] or local regression approaches [7, 6, 20]. In what follows, to avoid confusion with the notations of the latent state and observation, we will use a and b to indicate the input and output when we describe SOGP and LGP regression, in which we consider modeling a generic b = f(a) + ξ, with ξ ∼ N(0, σ²I).
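The sketch below illustrates the assignment and averaging steps of Algorithm 1 (lines 10-18): each particle is assigned to its most likely component, and per-component means are computed for the update. The component densities are passed in as callables; the Gaussian drift densities used in the toy check are placeholders for the GP predictive distributions of the actual method.

```python
import numpy as np

def assign_and_average(x_prev, x_curr, component_logpdfs):
    # component_logpdfs[j](x_prev, x_curr) plays the role of log p(x_t | x_{t-1}, Theta^j_{t-1,M}).
    logp = np.stack([f(x_prev, x_curr) for f in component_logpdfs], axis=1)
    eta = np.argmax(logp, axis=1)                       # Alg. 1, line 11
    updates = {}
    for j in range(len(component_logpdfs)):
        mask = eta == j                                 # particles assigned to component j
        if mask.any():
            updates[j] = (x_prev[mask].mean(axis=0),    # mean input  (line 16)
                          x_curr[mask].mean(axis=0))    # mean output (line 17)
    return updates                                      # fed to the component update (line 18)

# Toy check with two isotropic Gaussian "components" around different drifts.
rng = np.random.default_rng(2)
x_prev = rng.standard_normal((50, 3))
x_curr = x_prev + rng.choice([-0.5, 0.5], size=(50, 1))
comps = [lambda xp, xc, d=d: -0.5 * np.sum((xc - (xp + d)) ** 2, axis=1)
         for d in (-0.5, 0.5)]
for j, (xb_prev, xb_curr) in assign_and_average(x_prev, x_curr, comps).items():
    print(j, xb_prev.round(2), xb_curr.round(2))
```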
3.1 SPARSE ONLINE GAUSSIAN PROCESS

The Sparse Online Gaussian Process (SOGP) of [3, 27] is a well-known algorithm for online learning of GP models. To cope with the fact that data arrives in an online manner, SOGP trains a GP model sequentially by updating the posterior mean and covariance of the latent function values of the training set. This online procedure is coupled with a sparsification strategy which iteratively selects a fixed-size subset of points to form the active set, preventing the otherwise unbounded growth of the computation and memory load.

The key of SOGP is to maintain the joint posterior over the latent function values of the fixed-size active set D_{t-1}, i.e., N(µ_{t-1}, Σ_{t-1}), by recursively updating µ_{t-1} and Σ_{t-1}. When a new observation (a_t, b_t) is available, we perform the following update to take the new data point into account [27]:

q_t = Q_{t-1} k_{t-1}(a_t)    (3)
ρ_t² = k(a_t, a_t) − k_{t-1}(a_t)ᵀ Q_{t-1} k_{t-1}(a_t)    (4)
σ̂_t² = σ² + ρ_t² + q_tᵀ Σ_{t-1} q_t    (5)
δ_t = [ Σ_{t-1} q_t ; ρ_t² + q_tᵀ Σ_{t-1} q_t ]    (6)
µ_t = [ µ_{t-1} ; q_tᵀ µ_{t-1} ] + σ̂_t^{-2} (b_t − q_tᵀ µ_{t-1}) δ_t    (7)
Σ_t = [ Σ_{t-1}, Σ_{t-1} q_t ; q_tᵀ Σ_{t-1}, ρ_t² + q_tᵀ Σ_{t-1} q_t ] − σ̂_t^{-2} δ_t δ_tᵀ    (8)

where k_{t-1}(a_t) is the kernel vector constructed from a_t and the active set D_{t-1}, Q_{t-1} is the inverse kernel matrix of the active set D_{t-1}, and [u ; v] denotes block stacking (entries within a row separated by commas, rows by semicolons). For simplicity of presentation, we assume that b is a scalar; the extension to vector-valued b is straightforward.

One of the key steps in this algorithm is to decide when to add the new point to the active set. We employ the strategy suggested by [3, 27], and ignore the new point when ρ_t² < ε for some small value of ε (we use ε = 10^{-6}). In this case, µ_t and Σ_t are updated as µ_t ← [µ_t]_{-i}, Σ_t ← [Σ_t]_{-i,-i}, where i = t is the index of the new point, [·]_{-i} removes the i-th entry of a vector, and [·]_{-i,-i} removes the i-th row and column of a matrix. Additionally, the inverse kernel matrix is simply Q_t = Q_{t-1} because the new point is not included in the active set. When ρ_t² ≥ ε, we add the new point to the active set, D_t = D_{t-1} ∪ {(a_t, b_t)}. The µ_t, Σ_t are then the same as in Eqs. (7) and (8), and the inverse kernel matrix is updated to be [27]

Q_t = [ Q_{t-1}, 0 ; 0, 0 ] + ρ_t^{-2} [ q_t q_tᵀ, −q_t ; −q_tᵀ, 1 ]    (9)

When the size of the active set is larger than the fixed size N_A because a new point was added, we must remove a point. This is done by selecting the one which minimally affects the predictions according to the squared predicted error. Following [3, 27], we remove the j-th data point with

j = arg min_j [Q_t µ_t]_j² / [Q_t]_{j,j}    (10)

where [·]_j selects the j-th entry of a vector and [·]_{j,j} selects the j-th diagonal entry of a matrix. Once a point has been selected for removal, µ_t, Σ_t and Q_t are updated as

µ_t ← [µ_t]_{-j}    (11)
Σ_t ← [Σ_t]_{-j,-j}    (12)
Q_t ← [Q_t]_{-j,-j} − [Q_t]_{-j,j} [Q_t]_{-j,j}ᵀ / [Q_t]_{j,j}    (13)

where [·]_{-j,j} selects the j-th column of the matrix with the j-th row removed, and the point is removed from the active set, D_t ← D_t \ {(a_j, b_j)}.

The joint posterior at time t can be used to construct the predictive distribution for a new input a*:

p^{SOGP}(b | a*, D_t, Θ) = N(b | b̃, σ̃²)    (14)

where b̃ = k_t(a*)ᵀ Q_t µ_t and σ̃² = σ² + k(a*, a*) + k_t(a*)ᵀ (Q_t Σ_t Q_t − Q_t) k_t(a*). We summarize the SOGP updates in Algorithm 2.

Algorithm 2 SOGP Update
input: Previous posterior quantities µ_{t-1}, Σ_{t-1}, Q_{t-1}
input: Previous active set D_{t-1}
input: New input-output observation pair (a_t, b_t)
1:  Compute ρ_t², µ_t and Σ_t as in Equations (4), (7) and (8).
2:  if ρ_t² < ε then
3:    Perform update µ_t ← [µ_t]_{-i}, Σ_t ← [Σ_t]_{-i,-i} where i is the index of the newly added row of µ_t.
4:    Set Q_t = Q_{t-1}, D_t = D_{t-1}.
5:  else
6:    Compute Q_t as in Equation (9).
7:    Add to active set D_t = D_{t-1} ∪ {(a_t, b_t)}.
8:  end if
9:  if |D_t| > N_A then
10:   Select point j to remove using Equation (10).
11:   Perform update µ_t ← [µ_t]_{-j}, Σ_t ← [Σ_t]_{-j,-j} and Q_t ← [Q_t]_{-j,-j} − [Q_t]_{-j,j} [Q_t]_{-j,j}ᵀ / [Q_t]_{j,j}.
12:   Remove j from active set D_t ← D_t \ {(a_j, b_j)}.
13: end if
output: µ_t, Σ_t, Q_t and D_t
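A minimal NumPy sketch of the SOGP update for scalar outputs is given below; it follows Equations (3)-(13) under an illustrative squared-exponential kernel (the paper's compound kernel could be substituted), and it makes no claim to match the exact implementation used in the experiments.

```python
import numpy as np

def kern(a, b, gamma=1.0):
    # Illustrative squared-exponential kernel (an assumption for this sketch).
    return np.exp(-0.5 * np.sum((a - b) ** 2) / gamma ** 2)

def sogp_update(A, mu, Sigma, Q, a_t, b_t, sigma2=0.01, eps=1e-6, N_A=20):
    """One SOGP update (Eqs. 3-13, Algorithm 2) for scalar outputs.
    A: active-set inputs; mu, Sigma: posterior over their latent values;
    Q: inverse kernel matrix of the active set."""
    k_vec = np.array([kern(a, a_t) for a in A])
    q = Q @ k_vec                                               # Eq. (3)
    rho2 = kern(a_t, a_t) - k_vec @ Q @ k_vec                   # Eq. (4)
    s = q @ Sigma @ q
    sig_hat2 = sigma2 + rho2 + s                                # Eq. (5)
    delta = np.append(Sigma @ q, rho2 + s)                      # Eq. (6)
    mu_new = np.append(mu, q @ mu) + (b_t - q @ mu) / sig_hat2 * delta   # Eq. (7)
    Sig_new = np.block([[Sigma, (Sigma @ q)[:, None]],
                        [(Sigma @ q)[None, :], np.array([[rho2 + s]])]])
    Sig_new -= np.outer(delta, delta) / sig_hat2                # Eq. (8)
    if rho2 < eps:
        # Novelty too low: keep the active set, drop the appended entry.
        return A, mu_new[:-1], Sig_new[:-1, :-1], Q
    Q_new = np.block([[Q, np.zeros((len(A), 1))],
                      [np.zeros((1, len(A))), np.zeros((1, 1))]])
    Q_new += np.block([[np.outer(q, q), -q[:, None]],
                       [-q[None, :], np.ones((1, 1))]]) / rho2  # Eq. (9)
    A, mu, Sigma, Q = A + [a_t], mu_new, Sig_new, Q_new
    if len(A) > N_A:
        scores = (Q @ mu) ** 2 / np.diag(Q)                     # Eq. (10)
        j = int(np.argmin(scores))
        keep = [i for i in range(len(A)) if i != j]
        col = Q[keep, j]
        Q = Q[np.ix_(keep, keep)] - np.outer(col, col) / Q[j, j]   # Eq. (13)
        mu, Sigma = mu[keep], Sigma[np.ix_(keep, keep)]            # Eqs. (11)-(12)
        A = [A[i] for i in keep]
    return A, mu, Sigma, Q

# Toy run: start from the GP prior at a single seed point and stream a few points.
rng = np.random.default_rng(3)
a0 = rng.standard_normal(2)
A, mu, Sigma, Q = [a0], np.zeros(1), np.eye(1), np.array([[1.0 / kern(a0, a0)]])
for _ in range(30):
    a = rng.standard_normal(2)
    A, mu, Sigma, Q = sogp_update(A, mu, Sigma, Q, a, float(np.sin(a.sum())), N_A=10)
print(len(A), mu.shape, Sigma.shape, Q.shape)
```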

3.2 LOCAL GAUSSIAN PROCESSES

An alternative to the SOGP approach is to use Local Gaussian Processes (LGP), which were developed specifically to deal with large, multi-modal regression problems [17, 23]. In LGP, given a test input a* and a set of input-output pairs D = {(a_i, b_i)}_{i=1}^N, the M_a nearest neighbors D_{a*} = {(a_ℓ, b_ℓ)}_{ℓ=1}^{M_a} are selected based on the distance in the input space, d_ℓ = ||a_ℓ − a*||. Then, for each of the M_a neighbors, the M_b nearest neighbors D_{b_ℓ} = {(a_j, b_j)}_{j=1}^{M_b} are selected based on the distance in the output space to b_ℓ. These neighbors are then combined to form a local GP expert which makes a Gaussian prediction with mean and covariance

µ_ℓ = B_{D_{b_ℓ}} K_{D_{b_ℓ},D_{b_ℓ}}^{-1} k_{D_{b_ℓ}}(a*)
σ_ℓ² = k(a*, a*) − k_{D_{b_ℓ}}(a*)ᵀ K_{D_{b_ℓ},D_{b_ℓ}}^{-1} k_{D_{b_ℓ}}(a*) + σ²

where B_{D_{b_ℓ}} is the matrix whose columns are the M_b nearest neighbors of b_ℓ, k_{D_{b_ℓ}}(a*) is the vector of kernel function values for the input a* and the points in D_{b_ℓ}, and K_{D_{b_ℓ},D_{b_ℓ}} is the kernel matrix for the points in D_{b_ℓ}. The final predictive distribution is then formed by combining all local experts in a mixture model

p^{LGP}(b | a*, D, Θ) = Σ_{ℓ=1}^{M_a} w_ℓ N(b | µ_ℓ, σ_ℓ² I)    (15)

with weights w_ℓ ∝ 1/d_ℓ.
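The following sketch computes the mean of the LGP mixture in Equation (15); the kernel, its hyperparameters, and the noise term added for numerically stable inversion are illustrative assumptions.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / gamma ** 2)

def lgp_predict_mean(A, B, a_star, M_a=3, M_b=8, sigma2=1e-2):
    # M_a anchors: nearest neighbours of a_star in input space.
    d = np.linalg.norm(A - a_star, axis=1)
    anchors = np.argsort(d)[:M_a]
    weights = 1.0 / np.maximum(d[anchors], 1e-12)        # w_l proportional to 1/d_l
    weights /= weights.sum()
    means = []
    for l in anchors:
        # M_b neighbours of b_l in output space form the local expert.
        nb = np.argsort(np.linalg.norm(B - B[l], axis=1))[:M_b]
        K = rbf(A[nb], A[nb]) + sigma2 * np.eye(len(nb))
        k_star = rbf(A[nb], a_star[None, :])[:, 0]
        means.append(B[nb].T @ np.linalg.solve(K, k_star))   # expert mean mu_l
    # Mean of the mixture sum_l w_l N(b | mu_l, sigma_l^2 I).
    return np.sum(weights[:, None] * np.array(means), axis=0)

rng = np.random.default_rng(4)
A = rng.standard_normal((200, 3))
B = np.sin(A)                                            # toy mapping b = f(a)
print(lgp_predict_mean(A, B, rng.standard_normal(3)))
```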
4 EXPERIMENTAL EVALUATION

To illustrate our approach we choose 4 very different motions, i.e., walking, golf swing, swimming, as well as exercises (composed of side twist and squat). The data consists of motion capture from the CMU dataset [2], where each observation is a 62 dimensional vector containing the 3D rotations of all joints. We normalize the data to be mean zero and subsample the observations to reduce the correlation between consecutive frames. We use a frequency of 12 frames/s for walking and swimming, 24 frames/s for golf swing and 30 frames/s for the exercise motion. We compute all results averaged over 3 trials and report the average root mean squared error as our measure of performance.

In all the experiments, the latent space dimensionality is set to 3, as is common for human motion models [28]. We use PCA to initialize the latent space and K-means to obtain the data points used for the mixture components. We choose the compound kernel function k(x, x') = σ_f² exp(−0.5 ||x − x'||² / γ²) + l² xᵀx' for both prediction and observation mappings. Unless otherwise stated, we use 50 particles, a training set of size 20/30/50/450 and 2/2/5/5 mixture components for walking/golf/swimming/exercise motions respectively. For LGP, the number of local GP experts is 2/2/2/5, and the size of each local expert is 5/8/5/20. For SOGP, the size of the active set is 20/5/50/20. The parameter values were chosen to balance computational cost against prediction accuracy, and in our experiments we demonstrate the robustness of our approach to these parameters.
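A direct transcription of the compound kernel used above; the hyperparameter values here are illustrative placeholders rather than the ones learned in the experiments.

```python
import numpy as np

def compound_kernel(x1, x2, sigma_f=1.0, gamma=1.0, l=0.1):
    # k(x, x') = sigma_f^2 exp(-0.5 ||x - x'||^2 / gamma^2) + l^2 x^T x'
    rbf_term = sigma_f ** 2 * np.exp(-0.5 * np.sum((x1 - x2) ** 2) / gamma ** 2)
    return rbf_term + l ** 2 * np.dot(x1, x2)

x, xp = np.array([0.1, 0.2, 0.3]), np.array([0.0, 0.1, 0.4])
print(compound_kernel(x, xp))
```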

4.1 COMPARISON TO STATE-OF-THE-ART

We compare our approaches to two baselines. The first one is the approach of Ko and Fox [11], where a GPDM is learned offline with gradient descent [28] before performing particle filtering for state estimation. The second baseline is similar, but learns the GPDM offline using stochastic gradient descent [29]. We tested the baselines in two different settings. First, only the initial training set is available to learn the prediction and observation models. Second, all the data (including future streamed examples) are used to learn the prediction and observation models. Note that the latter represents the oracle for Ko and Fox [11].

Number of Particles: We evaluate how the accuracy changes as a function of the number of particles, N_p. As expected, the prediction error is reduced in all the methods when the number of particles increases. As shown in the first row of Fig. 1, our approaches are superior to the baselines. Importantly, we outperformed the oracle baseline as we are able to represent multi-modal distributions effectively. This is particularly important in the exercise sequence as the dynamics are clearly multimodal due to the different motions that are performed in the sequence. Furthermore, our LGP variant outperforms SOGP. We believe this is due to the fact that SOGP has a fixed capacity while LGP is able to leverage more training data when making predictions.

Influence of Noise: In this experiment we evaluate the robustness of all approaches to additive noise in the observations. The second row of Fig. 1 shows that our LGP particle filter significantly outperforms the baselines, particularly in the exercise sequence. Our SOGP outperforms all baselines that have access to the same training set, and is only beaten by the oracle for walking.

Size of Training Set: We next evaluate how the accuracy depends on the size of the initial training set, T_0. The first row of Fig. 2 clearly indicates that our methods perform well even when the training set is very small. In contrast, the two baselines require bigger training sets to achieve comparable performance. This is expected as the baselines do not update the latent space to take the incoming observations into account.

[Figure 1 legend: GPDM+PF, GPDM(Oracle)+PF, Stochastic GPDM+PF, Stochastic GPDM(Oracle)+PF, Our SOGP+PF, Our LGP+PF]


Figure 1: Root mean squared error as a function of (1st row) the number of particles in the particle filter, (2nd row) the standard deviation of noise added to the observations. The columns (from left to right) represent walking, golf, swimming and exercise motions. Note that our approach outperforms the baselines in all settings. Handling multimodal distributions is particularly important in the exercise example as it is composed of a variety of different motions.

[Figure 2 legend: GPDM+PF, Stochastic GPDM+PF, Our SOGP+PF, Our LGP+PF]

Figure 2: Root Mean Squared Error as a function of (1st row) the number of initial training points, (2nd row) the number of the missing dimensions. The columns (from left to right) are respectively for walking, golf, swimming and exercise motions.

4.2 QUALITATIVE EXPERIMENTS

Fig. 3 shows the latent space of both the SOGP and LGP filters when employing 50 particles for each time step (depicted in blue). From the 3D latent space and predicted skeletons, we find that the manifolds of both the LGP and SOGP particle filters provide a good representation of the high-dimensional human motion data.

4.3 PROPERTIES OF OUR METHODS

We next discuss various aspects of our method and evaluate the influence of the parameters of the SOGP and LGP filters. For LGP, due to the fact that the data sizes of the walking, golf and swimming motions are small, we reduced the number (size) of local experts to be able to increase the size (number) of the local experts.

Computational Complexity: Overall the computational complexity of our method (Alg. 1) is mainly determined by the complexity of constructing a prediction distribution for each component (lines 5-6 and 11-12) and of the model updates (lines 18 and 23). Specifically, for an individual component which is either SOGP or LGP, computing the prediction distribution is O(N_A²) or O(M_a M_b³ + T M_a M_b) respectively, where N_A is the size of the active set, M_a is the number of local experts, M_b is the number of neighbors in the output space, and T M_a M_b comes from the KNN search. The model updates for the mixture components (lines 18 and 23) have a computational complexity of O(N_A²) and O(1) for SOGP and LGP respectively.


(i) Walk (t=23, 26, 29) (j) Golf (t=33, 41, 54) (k) Swim (t=71,79) (l) Exercise (t=508,555,721)

Figure 3: 3D latent spaces learned while tracking: The first row depicts the results of our SOGP variant, while the second row shows our LGP variant. In the walk/golf/swimming plots the red curve represents the predicted mean of the latent state sequence and the blue crosses are the particles at each step. In the plots for exercise (last column), the red, blue and green curves are the predicted mean of the latent state for three motions in the exercise sequence, and the black crosses are the particles at each step. The third row depicts the predicted skeletons, where ground truth is shown in green, our SOGP variant in blue and our LGP variant in red.

[Figure 4 legend: Our SOGP+PF, Our SOGP (no update)+PF, Our LGP+PF, Our LGP (no update)+PF]

Figure 4: Root Mean Squared Error as a function of the number of mixture components. The columns (from left to right) are respectively for walking, golf, swimming and exercise motions.

Number of Mixture Components: Fig. 4 shows performance as a function of the number of mixture components, R_M and R_O, for both SOGP and LGP. For LGP+PF in walk/golf/swim/exercise, the number of local GPs is 1/1/2/5 and the size of each local GP is 3/3/5/20. In all cases, we set R_M = R_O. Note that performance typically increases with the number of mixture components for SOGP, but less so for LGP. Furthermore, our approaches outperform the baselines in which the model is not updated during filtering, indicating that the online model updating is very important in practice. Also note that while LGP generally outperforms SOGP, the difference quickly declines as the number of mixture components increases. This suggests that, when the fixed memory requirements of SOGP are desirable, a larger number of mixture components will achieve performance comparable to LGP.


Figure 5: Root mean squared error as a function of (a) the size of the active set in SOGP, (b) the number of local GP experts, and (c) the size of each local GP expert in LGP. In subplot 5(c), the top x-axis is for exercise and the bottom one for the other motions. Curves correspond to the walk, golf, swim and exercise motions.

Figure 6: Predicted skeleton for missing parts (Walk: two legs; Golf, Swim and Exercise: left arm). The ground truth is shown in green, our SOGP particle filter in blue and LGP particle filter in red. We show the predicted performance at t=24, 27, 32 for walk, t=34, 38, 50 for golf, t=71,79 for swim, t=470, 592, 711 for exercise.

Active Set Size in SOGP: To explore the effect of the size of the active set, N_A, on performance we set the number of mixture components, R_M and R_O, to be 2/1/5/5 for walk/golf/swim/exercise, and use the same settings as before for the other parameters. Results are shown in Fig. 5(a). As expected, performance improves when the size of the active set increases.

Number and Size of Local Experts in LGP: Figs. 5(b) and 5(c) show the performance of our approach as a function of the number of local GP experts, M_a, as well as their size, M_b. For this experiment we set the number of mixture components, R_M and R_O, to be 1/2/1/5, and used the same settings as before for the other parameters, except when evaluating the size of each local GP expert, where we set the number of local GP experts to 5/2/5/5. As shown in the figure, even with a small number (size) of local GP experts, we still achieve good performance.

4.4 HANDLING MISSING DATA

In this setting, we evaluate the capabilities of our approaches to handle missing data. We assume that the initial set has no missing values, but a fixed set of joint angles are missing for all incoming frames. Our approach is able to cope with missing data with only two small modifications. First, particles are weighted only based on the observed dimensions. Furthermore, when updating the prediction and observation models, we employ mean imputation for the missing observation dimensions. Fig. 6 shows reconstructions of the missing dimensions for all our motions, which consist of the two legs for walking, and the left arm for the golf swing, swimming and exercise motions. We can see that our approach is able to reconstruct the missing parts well.

Finally, to evaluate the tracking performance as a function of the number of missing dimensions, we randomly generate the indices for the missing dimensions and use the same missing dimensions for all incoming frames. The second row of Fig. 2 shows that, compared to the baselines, our methods perform well even when the number of missing dimensions is 20 (1/3 of the skeleton) for all the motions. In addition, our LGP particle filter outperforms our SOGP variant.
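A minimal sketch of the two modifications for missing data, assuming a Gaussian observation density for the weighting and filling missing entries with a predicted mean (one plausible reading of the mean imputation used here):

```python
import numpy as np

def weight_observed_only(y, y_pred, obs_mask, sigma2=0.1):
    # Particle weights computed only over the observed dimensions.
    diff = (y - y_pred)[:, obs_mask]
    w = np.exp(-0.5 * np.sum(diff ** 2, axis=1) / sigma2)
    return w / w.sum()

def impute_missing(y, y_mean, obs_mask):
    # Mean imputation of the missing dimensions before the model update.
    y_full = y.copy()
    y_full[~obs_mask] = y_mean[~obs_mask]
    return y_full

rng = np.random.default_rng(5)
D_y = 62
obs_mask = np.ones(D_y, dtype=bool)
obs_mask[rng.choice(D_y, size=20, replace=False)] = False   # 20 missing dimensions
y = rng.standard_normal(D_y)
y_pred = rng.standard_normal((50, D_y))                      # per-particle predicted observations
print(weight_observed_only(y, y_pred, obs_mask)[:5])
print(impute_missing(y, y_pred.mean(axis=0), obs_mask)[:5])
```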

5 CONCLUSION

In this paper we have presented a novel non-parametric approach to Bayesian filtering, where the observation and prediction models are constructed using a mixture model with GP components learned in an online fashion. We have demonstrated that our approach can capture multimodality accurately and efficiently via online updates. We have explored two fast GP variants for updating which keep memory and computation bounded for individual mixture components. We have demonstrated the effectiveness of our approach when tracking different human motions and explored the impact of various parameters on performance. The Local GP particle filter proved superior to our SOGP variant; however, these differences can be mitigated by using more mixture components when using SOGP. In the future, we plan to investigate the usefulness of our approach in other settings such as shape deformation estimation and financial time series.

References

[1] K. Chalupka, C. K. I. Williams, and I. Murray. A Framework for Evaluating Approximation Methods for Gaussian Process Regression. JMLR, 2013.
[2] CMU Mocap Database. http://mocap.cs.cmu.edu/.
[3] L. Csato and M. Opper. Sparse online Gaussian processes. Neural Computation, 2002.
[4] M. P. Deisenroth, M. F. Huber, and U. D. Hanebeck. Analytic moment-based Gaussian process filtering. In ICML, 2009.
[5] A. Doucet, S. Godsill, and C. Andrieu. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 2000.
[6] T. Hastie and C. Loader. Local Regression: Automatic Kernel Carpentry. Statistical Science, 1993.
[7] R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton. Adaptive Mixtures of Local Experts. Neural Computation, 1991.
[8] O. C. Jenkins and M. Matarić. A spatio-temporal extension to Isomap nonlinear dimension reduction. In ICML, 2004.
[9] S. J. Julier, J. K. Uhlmann, and H. F. Durrant-Whyte. A new approach for filtering nonlinear systems. In American Control Conference, 1995.
[10] J. Ko and D. Fox. GP-BayesFilters: Bayesian filtering using Gaussian process prediction and observation models. In IROS, 2008.
[11] J. Ko and D. Fox. Learning GP-BayesFilters via Gaussian process latent variable models. In RSS, 2009.
[12] N. Lawrence. Probabilistic non-linear principal component analysis with Gaussian process latent variable models. JMLR, 6:1783–1816, 2005.
[13] N. Lawrence, M. Seeger, and R. Herbrich. Fast Sparse Gaussian Process Methods: The Informative Vector Machine. In NIPS, 2003.
[14] C.-S. Lee and A. Elgammal. Coupled Visual and Kinematics Manifold Models for Human Motion Analysis. IJCV, 2010.
[15] S. Levine, J. Wang, A. Haraux, Z. Popović, and V. Koltun. Continuous character control with low-dimensional embeddings. In SIGGRAPH, 2012.
[16] K. Moon and V. Pavlovic. Impact of dynamics on subspace embedding and tracking of sequences. In CVPR, pages 198–205, 2006.
[17] D. Nguyen-Tuong, J. Peters, and M. Seeger. Local Gaussian process regression for real time online model learning and control. In NIPS, 2008.
[18] J. Quiñonero-Candela and C. E. Rasmussen. A Unifying View of Sparse Approximate Gaussian Process Regression. JMLR, 2005.
[19] S. Roweis and L. Saul. Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science, 290(5500):2323–2326, 2000.
[20] S. Schaal and C. G. Atkeson. Constructive Incremental Learning From Only Local Information. Neural Computation, 1998.
[21] J. Tenenbaum, V. de Silva, and J. Langford. A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science, 2000.
[22] R. Turner, M. P. Deisenroth, and C. E. Rasmussen. State-space inference and learning with Gaussian processes. In AISTATS, 2010.
[23] R. Urtasun and T. Darrell. Sparse probabilistic regression for activity-independent human pose inference. In CVPR, 2008.
[24] R. Urtasun, D. Fleet, A. Hertzman, and P. Fua. Priors for people tracking from small training sets. In ICCV, 2005.
[25] R. Urtasun, D. Fleet, and P. Fua. 3D people tracking with Gaussian process dynamical models. In CVPR, 2006.
[26] R. Urtasun, D. Fleet, A. Geiger, J. Popović, T. Darrell, and N. Lawrence. Topologically-constrained latent variable models. In ICML, 2008.
[27] S. Van Vaerenbergh, M. Lázaro-Gredilla, and I. Santamaría. Kernel recursive least-squares tracker for time-varying regression. IEEE Transactions on Neural Networks and Learning Systems, 2012.
[28] J. Wang, D. Fleet, and A. Hertzmann. Gaussian process dynamical models for human motion. PAMI, 30(2):283–298, 2008.
[29] A. Yao, J. Gall, L. V. Gool, and R. Urtasun. Learning probabilistic non-linear latent variable models for tracking complex activities. In NIPS, 2011.