An algorithm for non-parametric estimation in state-space models
Thi Tuyet Trang Chau (a,1,*), Pierre Ailliot (b), Valérie Monbet (a)
(a) IRMAR-INRIA, University of Rennes, Rennes, France
(b) Univ Brest, CNRS, LMBA - UMR 6205, Brest, France
Abstract

State-space models are ubiquitous in the statistical literature since they provide a flexible and interpretable framework for analyzing many time series. In most practical applications, the state-space model is specified through a parametric model. However, the specification of such a parametric model may require an important modeling effort or may lead to models which are not flexible enough to reproduce all the complexity of the phenomenon of interest. In such situations, an appealing alternative consists in inferring the state-space model directly from the data using a non-parametric framework. Recent developments in powerful simulation techniques have made it possible to improve statistical inference for parametric state-space models. It is proposed to combine two of these techniques, namely the Stochastic Expectation-Maximization (SEM) algorithm and Sequential Monte Carlo (SMC) approaches, for non-parametric estimation in state-space models. The performance of the proposed algorithm is assessed through simulations on toy models and an application to environmental data is discussed.

Keywords: State-space models, Non-parametric statistics, SEM algorithm, Local linear regression, Conditional particle filter
arXiv:2006.09525v1 [stat.ME] 16 Jun 2020

* Corresponding author
Email addresses: [email protected] (Thi Tuyet Trang Chau), [email protected] (Pierre Ailliot), [email protected] (Valérie Monbet)
1 Present address: LSCE, CEA Saclay, 91191 Gif-sur-Yvette cedex, France
Preprint submitted to Elsevier, June 18, 2020

1. Introduction
State-space models (SSMs) provide a natural framework to study time series with observational noise in environmental science, economics, computer science, etc. They have a wide range of applications in data assimilation, system identification, model control, change detection, and missing-data imputation [see e.g. 16]. The general SSM considered in this paper is defined through the following equations,
    X_t = m(X_{t-1}, Z_t) + η_t,    [hidden]      (1a)
    Y_t = H_t(X_t) + ε_t,           [observed].   (1b)
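As a minimal sketch of the generative structure of (1a)-(1b), the following simulates a scalar-state SSM with Gaussian noises; the particular dynamical model m, the identity observation operator, and the covariate handling are illustrative placeholder choices, not the paper's:

```python
import numpy as np

def simulate_ssm(m, H, T, Q, R, z=None, x0=0.0, seed=0):
    """Simulate X_t = m(X_{t-1}, z_t) + eta_t (1a) and Y_t = H(X_t) + eps_t (1b)
    for a scalar state, with eta_t ~ N(0, Q) and eps_t ~ N(0, R) i.i.d."""
    rng = np.random.default_rng(seed)
    z = np.zeros(T) if z is None else z            # covariates Z_t (none by default)
    x, y = np.empty(T), np.empty(T)
    x_prev = x0
    for t in range(T):
        x[t] = m(x_prev, z[t]) + rng.normal(0.0, np.sqrt(Q))  # hidden state (1a)
        y[t] = H(x[t]) + rng.normal(0.0, np.sqrt(R))          # observation (1b)
        x_prev = x[t]
    return x, y

# Placeholder dynamics and identity observation operator H_t(x) = x
x, y = simulate_ssm(m=lambda x, z: 0.9 * x + z, H=lambda x: x, T=100, Q=0.1, R=0.1)
```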
The dynamical model m describes the time evolution of the latent process {X_t}. It may depend on some covariates (or control) denoted {Z_t}. The operator H_t links the latent state to the observations {Y_t}. The random sequences {η_t} and {ε_t} model respectively the random component in the dynamical model and the observational error. Throughout this paper, we make the classical assumptions that H_t is known (typically H_t(x) = x) and that {η_t} and {ε_t} are independent sequences of Gaussian random variables such that η_t ~iid N(0, Q_t(θ)) and ε_t ~iid N(0, R_t(θ)), where θ denotes the parameters involved in the parameterization of the covariance matrices.

In this paper, we are interested in situations where the dynamical model m is unknown or numerically intractable. To deal with this issue, a classical approach consists in using a simpler parametric model to replace m. However, it is generally difficult to find an appropriate parametric model which can reproduce all the complexity of the phenomenon of interest. In order to enhance the flexibility of the methodology and simplify the modeling procedure, non-parametric approaches have been proposed to estimate m.

Such non-parametric SSMs were originally introduced in Tandeo et al. [32] and Lguensat et al. [24] for data assimilation in oceanography and meteorology. In these application fields, huge amounts of historical data recorded by remote and in-situ sensors or obtained through numerical simulations are now available, and this promotes the development of data-driven approaches. It was proposed to build a non-parametric estimate m̂ of m using the available observations and plug this non-parametric estimate into the usual filtering and smoothing algorithms to reconstruct the latent states X_{1:T} = (X_1, ..., X_T) given observations y_{1:T} = (y_1, ..., y_T). Numerical experiments on toy models show that replacing m by m̂ leads to similar results if the sample size used to estimate m is large enough to ensure that m̂ is "close enough" to m. Some applications to real data are discussed in Fablet et al. [18].

Various non-parametric estimation methods have been considered to build surrogate nonlinear dynamical models in oceanography and meteorology. The most natural one is probably the nearest neighbors method, known as the Nadaraya-Watson approach in statistics [19] and as analog methods in meteorology [35]. In [24], better results were obtained with a slightly more sophisticated estimator known as local linear regression (LLR) in statistics [12] and constructed analogs in meteorology [33]. More recently, it has been proposed to use other machine learning (ML) tools such as deep learning [see 5] or sparse regression [see 7] to
better handle high-dimensional data and mimic the behaviour of numerical methods used to approximate the solutions of physical models.

In the above-mentioned references, it is generally assumed that a sequence x_{1:T} = (x_1, ..., x_T) of "perfect" observations with no observational error is available to estimate the dynamical model m. However, in practical applications, only a sequence y_{1:T} of the process {Y_t} with observational errors is given to fit the model. The main contribution of this paper is to propose a method to build a non-parametric estimate of m in this context. A simple approach would consist in "forgetting" the observational errors and computing directly a non-parametric estimate based on the sequence y_{1:T}, but this may lead to biased estimates. This is illustrated in Figure 1, obtained with the toy SSM defined as
    X_t = sin(3 X_{t-1}) + η_t,   η_t ~iid N(0, Q)
                                                        (2)
    Y_t = X_t + ε_t,              ε_t ~iid N(0, R)
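As a sketch, model (2) can be simulated as follows (the function name, sample size, and seed are illustrative choices; Q = R = 0.1 as in the text):

```python
import numpy as np

def simulate_toy(T, Q=0.1, R=0.1, x0=0.0, seed=0):
    """Simulate model (2): X_t = sin(3 X_{t-1}) + eta_t, Y_t = X_t + eps_t."""
    rng = np.random.default_rng(seed)
    x = np.empty(T)
    x_prev = x0
    for t in range(T):
        x[t] = np.sin(3.0 * x_prev) + rng.normal(0.0, np.sqrt(Q))
        x_prev = x[t]
    y = x + rng.normal(0.0, np.sqrt(R), size=T)   # observations with noise
    return x, y

x, y = simulate_toy(T=1000)
# The pairs (x[:-1], x[1:]) and (y[:-1], y[1:]) correspond to the scatter
# plots in the left and right panels of Figure 1, respectively.
```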
with Q = R = 0.1. The left plot shows the scatter plot (x_{t-1}, x_t) for a simulated sequence (x_1, ..., x_T) of the latent process {X_t} and the corresponding non-parametric estimate m̂ based on this sample, which is reasonably close to m. The right plot shows the scatter plot (y_{t-1}, y_t) of the corresponding sequence with observation noise. Note that Y_t is obtained by adding a random noise to X_t, and this has the effect of blurring the scatter plot by moving the points both horizontally and vertically. It leads to a biased estimate of m when the non-parametric estimation method is computed on this sequence. In a regression context, it is well
known from the literature on errors-in-variables models that observational errors in the covariates lead, in most cases, to a bias towards zero of the estimator of the regression function [see 10]. One of the classical approaches to reduce the bias is to introduce instrumental variables which help to get information about the observational error. This approach has been adapted to linear first-order auto-regressive models in Meijer et al. [28] and further studied in Lee et al. [23]. Carroll et al. [10] give an overview of different methods to build consistent estimators in the context of regression. Among them, we note local polynomial regression and Bayesian methods for non-parametric estimation but, as far as we know, they have not been generalized to time series.
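This attenuation effect can be illustrated numerically with a simple Nadaraya-Watson kernel regression sketch on model (2), fitted once on the latent pairs (x_{t-1}, x_t) and once on the noisy pairs (y_{t-1}, y_t); the Gaussian kernel, the bandwidth h, and the sample size are arbitrary illustrative choices, not the estimator used in the paper:

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, h=0.2):
    """Kernel regression estimate of E[Y | X = x] with a Gaussian kernel."""
    d = x_eval[:, None] - x_train[None, :]
    w = np.exp(-0.5 * (d / h) ** 2)
    return (w @ y_train) / w.sum(axis=1)

# Simulate model (2) with Q = R = 0.1
rng = np.random.default_rng(1)
T, Q, R = 5000, 0.1, 0.1
x = np.empty(T)
x_prev = 0.0
for t in range(T):
    x[t] = np.sin(3.0 * x_prev) + rng.normal(0.0, np.sqrt(Q))
    x_prev = x[t]
y = x + rng.normal(0.0, np.sqrt(R), size=T)

grid = np.linspace(-1.0, 1.0, 21)
m_true = np.sin(3.0 * grid)
m_clean = nadaraya_watson(x[:-1], x[1:], grid)   # fit on latent pairs
m_noisy = nadaraya_watson(y[:-1], y[1:], grid)   # fit on noisy pairs

# The fit on noisy covariates is flatter (biased towards zero), so its
# error against the true regression function m is larger.
err_clean = np.mean((m_clean - m_true) ** 2)
err_noisy = np.mean((m_noisy - m_true) ** 2)
```

The comparison of err_clean and err_noisy reproduces, in miniature, the bias visible between the two panels of Figure 1.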