
Joint 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference, Shanghai, P.R. China, December 16-18, 2009

Kernel regression for travel time estimation via convex optimization

Sébastien Blandin, Laurent El Ghaoui and Alexandre Bayen

S. Blandin is a Ph.D. student, Systems Engineering, Department of Civil and Environmental Engineering, University of California, Berkeley, CA 94720-1710 USA (e-mail: [email protected]). Corresponding author.
L. El Ghaoui is a Professor, Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA 94720-1710 USA (e-mail: [email protected]).
A. Bayen is an Assistant Professor, Systems Engineering, Department of Civil and Environmental Engineering, University of California, Berkeley, CA 94720-1710 USA (e-mail: [email protected]).

Abstract—We develop an algorithm aimed at estimating travel time on segments of a road network using a convex optimization framework. Sampled travel times from probe vehicles are assumed to be known and serve as a training set for an algorithm which provides an optimal estimate of the travel time for all vehicles. A kernel method is introduced to allow for a non-linear relation between the known entry times and the travel times that we want to estimate. To improve the quality of the estimate we minimize the estimation error over a convex combination of known kernels. This problem is shown to be a semi-definite program. A rank-one decomposition is used to convert it to a linear program which can be solved efficiently.

I. INTRODUCTION

Travel time estimation on transportation networks is a valuable traffic metric. It is readily understood by practitioners, and can be used as a performance measure [3] for traffic monitoring applications. Note however that the travel time estimation problem is easier to address on highways than on arterial roads. This can be intuitively explained by the fact that the properties of highways can be considered 'more spatially invariant' than those of arterial roads. Indeed the latter present complex features such as intersections and signals forcing vehicles to stop, resulting in spatially discontinuous properties. In this article, we propose a new method to estimate travel time on road segments without any elaborate model assumption. This method is shown to belong to a specific class of convex optimization problems and provides a non-linear estimate of the travel time. The kernel regression method introduced allows for estimation improvement through the online extension of the set of kernels used. In particular, we assess the performance of this technique through a rank-one kernel decomposition.

Highway traffic modeling is a mature field. Macroscopic models date back to [10], [16], [22] and usually fall under the theory of scalar conservation laws [9]. Microscopic models take into account vehicle driving behaviors and can be derived from the car-following model [17]. Flow models and driving behavior models on arterials are still the focus of significant ongoing research.

Sensors such as loop detectors are widely available on highways. They provide accurate measurements of density or speed [13], but are extremely sparse on arterials. Thus hardly any information on arterial traffic is available in real-time. It is only recently that the growth of mobile sensing has been shown to offer reliable information for traffic monitoring [12], [29]. One can hope to use this source of information to provide accurate estimates of travel time on arterials.

Travel time estimation on highways has been investigated with different tools. Efforts have been made on the modeling side, assuming local knowledge of density or speed, and producing an estimate given by a deterministic or stochastic model [5], [14], [20]. This problem has also been addressed using data analysis and machine learning techniques with various types of learning methods [18], [19], [21], [28], [30].

Arterial travel time estimation is more complex because the continuum approximation of the road might not apply at intersections, where the dynamics is not easily modeled [1]. Information about the state of traffic on arterials is also limited because of the sparsity of sensors. Some attempts have been made to estimate travel time on arterials, but in practice it is not always possible to know the traffic light cycles or to obtain a dedicated fleet of probe vehicles, often needed for estimation [24], [25]. However, the ubiquity of GPS now enables one to realistically assume the knowledge of sampled travel times, an assumption for example verified in sections of Northern California with the Mobile Millennium system [11].

We propose to focus on arterial travel time estimation using machine learning techniques and convex optimization. We use kernel methods [23] to provide a non-linear estimate of travel time on an arterial road segment. We assume the knowledge of the travel times of a subset of vehicles and estimate the travel time of all vehicles. We use convex optimization [2] to improve the performance of the non-linear estimate through kernel regression. The regression is done on a set of kernels chosen according to their usually good performance, or according to physical criteria. The kernel framework [6] enables the addition of features to the set of covariates in the regression problem. The kernel regression gives the possibility to select the most relevant features via optimization.

This article is organized as follows. In section II, we describe the optimization problem, introducing the regularization parameter and the kernel in a learning setting. In section III, we pose the kernel regression problem, show that it can be written as a convex optimization problem, transform it into a linear program which can be solved efficiently, and describe the general learning algorithm used. In section IV, we present the simulation dataset used for validating the method and the results obtained. In particular, we discuss the results showing that kernel regression enables one to obtain a better estimate on the validation set. Finally, based on these results, we enumerate in section V ongoing extensions to this work.


II. PROBLEM STATEMENT

A. Travel time estimation

We investigate travel time estimation for a given road segment. Assuming a set of entry times and travel times on the section, we apply machine learning techniques to use the knowledge of a subset of the (entry time, travel time) pairs in order to produce an estimate of travel time for every entry time. The dataset used for validation is described later in section IV-A. We assume the knowledge of a dataset of size N which reads S = {(x_i, y_i) ∈ R+ × R+ | i = 1 ··· N}, where for each value of the index i = 1 ··· N, x_i is an entry time on the road section and y_i is the realized travel time (known as the a-posteriori travel time in the transportation community) for entry time x_i. We would like to learn a function h : R+ → R+ which, given S, would provide an estimate of the travel time y for any x ∈ R+. This is a typical regression problem as described in [6]. The well-known unconstrained least-squares method can be formulated as an optimization problem:

    min_θ ||y − x^T θ||_2^2    (1)

where y ∈ R^{N×1} is the vector of realized travel times, or output, and x^T ∈ R^{N×1} is the vector of entry times, or input. One must note that here x is a row vector and y is a column vector, so θ is a scalar. The well-known solution of this problem can be computed as:

    θ_opt = (x x^T)† x y    (2)

where the notation (x x^T)† denotes the pseudo-inverse of (x x^T), and the optimal estimate is given by ŷ = (x x^T)† x^T x y. This estimate does not have bias, i.e. the mean of the output y equals the mean of the estimate ŷ.

B. Regularization

The regression problem defined in (1) is often ill-posed in the sense that the solution does not depend continuously on the data (the case of multiple solutions falls into that denomination). Formulation (1) could also lead to over-fitting in the case of non-linear regression since there is no penalization of high values of the solution θ_opt. In order to prevent these two possible flaws, it is a common practice to add to the objective function a quadratic term called Tikhonov regularization [26], which has the form ρ^2 |θ|^2 in the scalar case. Then the optimal estimate becomes:

    ŷ = (x x^T + ρ^2 I)^{-1} x^T x y.    (3)

For ρ large enough, the problem is well-posed and over-fitting with respect to θ is prevented [8].

C. Kernel methods

The regression method described in section II-A in the linear case can be extended to the non-linear case through the use of a kernel. One can consider a mapping function φ(·) and consider the linear regression problem between the mapped covariates φ(x_i) and the outputs y_i. This is the main principle of kernel methods, which consists in using a feature space, in which the dataset is represented, and in considering linear relations between objects in this feature space, and between these features and the outputs. Given a positive semi-definite matrix K = (k_ij), we define the kernel function K_f : X × X → R such that K_f(x_i, x_j) = k_ij. This implicitly defines a feature mapping φ(·) between the input set X and a Hilbert space H, φ(·) : X → H, such that ⟨φ(x_i), φ(x_j)⟩_H = K_f(x_i, x_j). In the following we denote by X_map a matrix representation of the mapping φ(·) (thus the i-th column of X_map is φ(x_i)). When φ(·) has scalar values, X_map is a row vector.

Remark 1: One does not have to define a mapping function φ(·) to define a kernel matrix, but can simply consider a positive semi-definite matrix and use it as a kernel. It is also possible to define a kernel matrix from a mapping φ(·) and one of its matrix representations X_map as K = X_map^T X_map.

The inner product in H naturally appears to be given by the Gram matrix K, called the kernel. Kernel techniques [6], [23] have several benefits:
• They enable one to work with any type of features of the initial dataset, which a priori has no particular structure, in a Hilbert space.
• They guarantee a reasonable computational cost for the algorithm by allowing a complexity related to the number of points represented and not to the number of features used (this is known as the kernel trick and is described in Remark 4).
Thus, kernel methods provide several extensions to usual regression methods, and can be easily written in a machine learning framework.

D. Learning setting

We assume the knowledge of a training set S_tr = {(x_i, y_i) | i = 1 ··· n_tr} and we look for the best estimate of the elements of the test set S_t = {x_i | i = n_tr+1 ··· n_tr+n_t}. In order to match the structure of this problem, we define the kernel matrix in block form as:

    K = [ K_tr     K_trt ]
        [ K_trt^T  K_t   ]    (4)

where k_ij = ⟨φ(x_i), φ(x_j)⟩_H for i, j = 1 ··· n_tr+n_t. The Gram matrix K_tr is the result of an optimization problem over the training set, and we learn the cross-term K_trt, which expresses the inner product in the feature space between the elements of the test set and the elements of the training set. K_t is the inner product between the elements of the test set.
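For concreteness, the scalar least-squares solution (2) and the regularized estimate (3) can be checked numerically. The following minimal sketch (Python/numpy, with synthetic entry and travel times of our own invention; the implementation used in the paper is Matlab-based) only makes the dimensions explicit:

```python
import numpy as np

# Synthetic data (hypothetical): x is the 1 x N row vector of entry times,
# y the N x 1 column vector of realized travel times, as in section II-A.
rng = np.random.default_rng(0)
N = 100
x = rng.uniform(0.0, 1800.0, size=(1, N))                 # entry times (s)
y = (60.0 + 0.01 * x + rng.normal(0.0, 5.0, (1, N))).T    # travel times (s)

# Unregularized least-squares (1)-(2); theta is a scalar here.
theta_opt = np.linalg.pinv(x @ x.T) @ x @ y               # (x x^T)^† x y
y_hat = x.T @ theta_opt                                   # optimal estimate

# Tikhonov-regularized estimate (3) for a given rho; since x x^T is a
# 1 x 1 matrix in the scalar case, the factors commute as in the text.
rho = 10.0
y_hat_reg = x.T @ np.linalg.solve(x @ x.T + rho**2 * np.eye(1), x @ y)
```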
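The block structure (4) of the kernel matrix in the learning setting can likewise be made explicit. A sketch with an arbitrary train/test split (the Gaussian kernel anticipates section III-E):

```python
import numpy as np

def gaussian_kernel(xa, xb, sigma=100.0):
    # k_ij = exp(-|x_i - x_j|^2 / sigma^2), cf. section III-E.
    return np.exp(-np.subtract.outer(xa, xb) ** 2 / sigma ** 2)

# Hypothetical entry times, split into training and test parts (section II-D).
x_all = np.linspace(0.0, 1800.0, 200)
n_tr = 100
x_tr, x_t = x_all[:n_tr], x_all[n_tr:]

# Block form (4): K = [[K_tr, K_trt], [K_trt^T, K_t]].
K_tr = gaussian_kernel(x_tr, x_tr)    # training block
K_trt = gaussian_kernel(x_tr, x_t)    # cross term between train and test
K_t = gaussian_kernel(x_t, x_t)       # test block
K = np.block([[K_tr, K_trt], [K_trt.T, K_t]])
```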


III. ANALYSIS

A. Convex formulation

Expressing the linear least-squares problem (1) for the mapped input X_map, with the regularization term described in section II-B, yields:

    p* = min_θ ||y − X_map^T θ||_2^2 + ρ^2 ||θ||_2^2    (5)

where we denote by p* the optimal value of this problem. Using the change of variable z = X_map^T θ − y yields the equivalent formulation:

    p* = min_{θ,z} ||z||_2^2 + ρ^2 ||θ||_2^2    (6)
    subject to z = y − X_map^T θ.    (7)

The Lagrangian dual of this problem reads:

    d* = max_α 2 α^T y − α^T (I + X_map^T X_map / ρ^2) α.    (8)

In this equation, we see the expression of the kernel matrix:

    K = X_map^T X_map.    (9)

If we denote by K_ρ = I + X_map^T X_map / ρ^2 the regularized kernel, the dual optimal point and dual optimal value of problem (8) can be expressed as:

    α* = K_ρ^{-1} y  and  d* = y^T K_ρ^{-1} y.    (10)

Remark 2: Since the primal (6)-(7) and dual (8) are convex and strictly feasible, strong duality holds and the primal optimal value p* and the dual optimal value d* are equal. We note that expression (8) shows that the dual optimal value is a maximum over a set of linear functions of K_ρ, so the optimal value is a convex function of the regularized kernel matrix K_ρ. Since the choice of the kernel is crucial for the optimal value, it is interesting to minimize the optimal value d* with respect to the kernel.

Remark 3: Optimizing the kernel matrix physically means looking for the best mapping function φ(·) such that there is a linear relation between the features of the inputs φ(x_i) and the outputs y_i. If one takes φ(·) as the identity mapping, then the optimal value of (8) becomes:

    y^T K_ρ^{-1} y = y^T (I + x^T x / ρ^2)^{-1} y    (11)

which may not be optimal for non-linear systems.

Remark 4: One must note that the kernel matrix (9) is a square matrix which has the dimension of x^T x, and thus its size does not depend on the number of features represented in X_map but only on the number of covariates x. The dimension of the image space of φ(·), which is the dimension of the feature space, does not appear in the kernel matrix. This is the kernel trick mentioned in section II-C.

B. Cross-validation

The optimal value of (5), as expressed in (10), depends on the kernel matrix (9) and on the regularization parameter ρ. The parameter ρ is tuned through a re-sampling procedure [7], k-fold cross-validation (here k does not denote the kernel matrix but the number of folds used in the cross-validation method). This technique consists in dividing the dataset into k parts of approximately equal size, and using a subset for training while the remainder is used for testing [27]. For instance, if the different parts are {P_i | i = 1 ··· k}, then given n ∈ {1 ··· k} one would use P_n as a training set and ∪_{i=1···k, i≠n} P_i as a test set. This is useful to make extensive use of the dataset while avoiding bias on the training set. Here we use this method on the training set to pick the optimal value of the regularization parameter ρ, and on the whole set to obtain a meaningful estimation error.
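Evaluating the dual optimal value (10) for a fixed kernel and a fixed ρ amounts to a single linear solve. A minimal sketch (assuming a precomputed kernel matrix K and a 1-D output vector y; names are ours):

```python
import numpy as np

def dual_value(K, y, rho):
    # d* = y^T K_rho^{-1} y with K_rho = I + K / rho^2, cf. (10).
    K_rho = np.eye(K.shape[0]) + K / rho**2
    return float(y @ np.linalg.solve(K_rho, y))
```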
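The re-sampling procedure of section III-B can be sketched as below. We follow the convention stated in the text (one part P_n serves as the training set, the union of the other parts as the test set) and, as a stand-in for the full kernel regression of section III-C, we score each candidate ρ with a standard kernel ridge predictor; all names and details here are ours:

```python
import numpy as np

def kfold_rho_selection(x, y, rhos, kernel, k=5, seed=0):
    """Pick rho by k-fold re-sampling (section III-B): fold P_n trains, the
    remaining folds test; the mean L2 relative test error is minimized.
    `kernel(xa, xb)` returns the Gram matrix between two sets of inputs."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(x)), k)
    best_rho, best_err = None, np.inf
    for rho in rhos:
        errs = []
        for n in range(k):
            tr = folds[n]
            te = np.concatenate([folds[i] for i in range(k) if i != n])
            # Kernel ridge estimate on the test folds (a stand-in for the
            # regularized kernel regression of section III).
            alpha = np.linalg.solve(kernel(x[tr], x[tr])
                                    + rho**2 * np.eye(len(tr)), y[tr])
            y_hat = kernel(x[te], x[tr]) @ alpha
            errs.append(np.linalg.norm(y_hat - y[te]) / np.linalg.norm(y[te]))
        if np.mean(errs) < best_err:
            best_rho, best_err = rho, float(np.mean(errs))
    return best_rho
```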

C. Kernel regression

As stated in Remark 2, the optimal value of the regularized regression problem (5) is a convex function of the regularized kernel matrix K_ρ and can be optimized over the kernel. The kernel optimization problem, which consists in minimizing the value d* defined in (10) with respect to the regularized kernel K_ρ, reads:

    min_{K_ρ} y^T K_ρ^{-1} y    (12)
    subject to K_ρ ⪰ 0    (13)

where the constraint on the kernel matrix enforces that the regularized kernel K_ρ must be a Gram matrix. This problem is convex according to Remark 2. In order to prevent over-fitting with respect to K_ρ, we follow [15] and constrain K_ρ to be a convex combination of given kernels, i.e. we define a set of kernels {K_1 ··· K_k} and consider the problem:

    min_λ y^T K_ρ^{-1} y    (14)
    subject to λ_i ≥ 0,  Σ_{i=1}^k λ_i = 1    (15)
    K_ρ = Σ_{i=1}^k λ_i K_i.    (16)

In a learning setting, the optimization problem (14) is defined only on the training set, but the expression of the kernel matrix as a linear combination of known kernels must be satisfied on the whole set. Using the notation introduced in section II-D we write K_ρ = [ K_tr  K_trt ; K_trt^T  K_t ], and under this form the problem reads:

    min_λ y^T K_tr^{-1} y    (17)
    subject to λ_i ≥ 0,  Σ_{i=1}^k λ_i = 1    (18)
    K_ρ = Σ_{i=1}^k λ_i K_i    (19)

which can be written in semi-definite program form using an epigraph property and the Schur complement:

    min_{λ,t} t    (20)
    subject to λ_i ≥ 0,  Σ_{i=1}^k λ_i = 1    (21)
    K_ρ = Σ_{i=1}^k λ_i K_i  and  [ t   y^T            ]
                                  [ y   I + K_tr/ρ^2   ] ⪰ 0.    (22)

The solution of this optimization problem is the parameter λ* giving the optimal convex combination of the set of kernels {K_1 ··· K_k} which minimizes d* from (10).

D. Rank-one kernel optimization

The kernel optimization problem in the form (20)-(21)-(22) is not tractable and cannot be efficiently solved by standard optimization software. In this section we use a rank-one decomposition of the kernels to find an equivalent formulation as a linear program. This is done through the introduction of several intermediate problems. We assume that we can write the regularized kernel as a convex combination of dyads: K_ρ = Σ_{i=1}^p ν_i l_i l_i^T, where the l_i are vectors and the ν_i are positive scalars such that Σ_{i=1}^p ν_i = 1. Since by definition K_ρ = I + K/ρ^2, the decomposition of K_ρ into a sum of dyads is possible whenever the kernel K itself can be written as a sum of dyads. In practice the kernel is a positive semi-definite matrix, so it can be diagonalized in an orthonormal basis and this property is satisfied. Thus we can write an equivalent formulation of problem (14)-(15)-(16) as:

    Ψ = min_ν y^T K_ρ^{-1} y    (23)
    subject to ν_i ≥ 0,  Σ_{i=1}^p ν_i = 1    (24)
    K_ρ = Σ_{i=1}^p ν_i l_i l_i^T    (25)

where the vectors l_i are the eigenvectors of the matrices K_j from equation (16). Introducing the change of variable κ = K_ρ^{-1}(ν) y and doing some computations enables one to rewrite problem (23)-(24)-(25) as:

    Ψ = max_κ ( 2 y^T κ − max_{1≤i≤p} (l_i^T κ)^2 )    (26)

and the optimal κ is related to the optimal ν by the relation:

    κ* = K_ρ^{-1}(ν*) y.    (27)

One can note that solving problem (26) for the vector variable κ is the same as solving the problem:

    Ψ = max_{γ,β} ( 2 γ y^T β − γ^2 max_{1≤i≤p} (l_i^T β)^2 )    (28)

for the variables γ and β. This is simply obtained by writing κ = γ β in problem (26), with γ scalar and β vector. The optimal point (γ*, β*) of problem (28) satisfies:

    Ψ^{1/2} β* = γ* β* = κ*.    (29)

If we maximize over γ in (28) we obtain the following optimization problem:

    Ψ^{1/2} = max_β y^T β    (30)
    subject to |l_i^T β| ≤ 1,  i = 1 ··· p.    (31)

The Lagrangian of this problem can be written as:

    L(β, u) = y^T β + Σ_{i=1}^p ( |u_i| − u_i l_i^T β )    (32)

and taking the Lagrangian dual of problem (30)-(31) yields:

    Ψ^{1/2} = min_u ||u||_1    (33)
    subject to y = Σ_{i=1}^p u_i l_i    (34)

using the strict feasibility of the primal and the convexity of the primal and the dual. Problem (33)-(34) is a linear program. The optimal ν* can be retrieved from the optimal u* through the relation:

    ν_i* = |u_i*| / Ψ^{1/2}.    (35)

Indeed one can check that with this value of the vector ν, equations (27)-(29) yield Ψ^{1/2} K_ρ(ν*) β* = y, and on the other hand we can write Ψ^{1/2} K_ρ(ν*) β* = Σ_{i=1}^p |u_i*| (l_i^T β*) l_i, which is equal to Σ_{i=1}^p u_i l_i using the optimality condition in the Lagrangian (32). This proves that if u* is optimal for (33)-(34), then ν* given by (35) is optimal for (23)-(24)-(25), and vice-versa.

E. Choice of kernels

Several types of kernels are commonly used in machine learning [23]. Here, we propose to combine several classical kernels with a kernel motivated by the known physical properties of the phenomenon we want to estimate.

1) Classical kernels: We consider a Gaussian kernel K^σ defined by k^σ_ij = exp(−|x_i − x_j|^2 / σ^2). We also use a linear kernel K^lin defined by k^lin_ij = x_i^T x_j. Since we use a rank-one decomposition of each kernel, the regression problem with the linear kernel K^lin is expected to produce a slightly better estimate than regular linear least-squares, because in the kernel method the weights of the eigenvectors can be different.

2) Physics of the phenomenon: Since we are interested in estimating the travel time across a traffic light intersection, we consider the physical properties of this phenomenon as described in [1]. According to the authors, a reasonable model is the following: the travel time across a traffic light intersection increases suddenly at the beginning of the red light and decreases linearly from there until the next beginning of the red light. This motivates us to use a piecewise linear function φ(·) as a mapping. In order to do so, we assume the traffic cycle length to be constant with value c. The slope of the linear function is left free and is chosen to be an optimization parameter. These considerations lead us to define the mapping function for the third kernel as φ^phy(x) = x mod c, where c is the traffic cycle length. It is motivated by the fact that according to the model the phenomenon is periodic with period c, and on a period the relation between the entry time and the travel time is linear. We assume that c = 60 seconds.
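As a sketch of the kernel combination problem (17)-(19): the paper solves the semi-definite program (20)-(22) with CVX/SDPT3 in Matlab; the Python sketch below uses the analogous disciplined convex programming package cvxpy, whose matrix_frac atom expresses y^T K^{-1} y directly, so the epigraph/Schur-complement step is handled internally. We read (16) as combining the unregularized kernels and then adding the I + K/ρ^2 regularization; this reading, and all names, are ours:

```python
import numpy as np
import cvxpy as cp

def optimal_kernel_weights(kernels_tr, y_tr, rho):
    """Solve min_lambda y^T K_rho,tr^{-1} y over the simplex, cf. (17)-(19).
    kernels_tr: list of training-block kernel matrices K_i (n_tr x n_tr).
    Requires an SDP-capable solver (e.g. SCS, shipped with cvxpy)."""
    k, n = len(kernels_tr), kernels_tr[0].shape[0]
    lam = cp.Variable(k, nonneg=True)                       # weights, cf. (18)
    K_rho = np.eye(n) + sum(lam[i] * kernels_tr[i] for i in range(k)) / rho**2
    prob = cp.Problem(cp.Minimize(cp.matrix_frac(y_tr, K_rho)),
                      [cp.sum(lam) == 1])                   # simplex, cf. (18)
    prob.solve()
    return lam.value, prob.value
```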
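The rank-one reformulation (33)-(34) is a plain linear program, and (35) recovers the dyad weights from its solution. A sketch (row i of the matrix L holds the direction l_i; variable names are ours):

```python
import numpy as np
import cvxpy as cp

def rank_one_kernel_lp(L, y):
    """Linear program (33)-(34): min ||u||_1  s.t.  sum_i u_i l_i = y,
    with l_i the rows of L. Returns nu* via (35) and Psi^{1/2}."""
    u = cp.Variable(L.shape[0])
    prob = cp.Problem(cp.Minimize(cp.norm1(u)), [L.T @ u == y])
    psi_sqrt = prob.solve()       # optimal value is Psi^{1/2}, cf. (33)
    nu = np.abs(u.value) / psi_sqrt   # relation (35)
    return nu, psi_sqrt
```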
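The three kernels of section III-E, written as Gram-matrix constructors for scalar entry times (the physical kernel is the inner product induced by the mapping φ^phy(x) = x mod c, with the assumed cycle length c = 60 s; a free slope in the piecewise linear model only rescales the mapping):

```python
import numpy as np

def linear_kernel(xa, xb):
    # k_ij = x_i x_j (scalar inputs).
    return np.outer(xa, xb)

def gaussian_kernel(xa, xb, sigma=100.0):
    # k_ij = exp(-|x_i - x_j|^2 / sigma^2).
    return np.exp(-np.subtract.outer(xa, xb) ** 2 / sigma ** 2)

def physical_kernel(xa, xb, c=60.0):
    # Mapping phi_phy(x) = x mod c (position within the signal cycle);
    # the kernel is the induced inner product phi(x_i) * phi(x_j).
    return np.outer(np.mod(xa, c), np.mod(xb, c))
```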


IV. SIMULATION RESULTS

A. Dataset description

We use a dataset generated by a traffic micro-simulator, Paramics [4]. It is based on car-following theory and driving behavior modeling and has been the subject of extensive research funded by Caltrans. It is assumed to accurately reproduce the macroscopic properties of traffic as well as inconsistent driving patterns observed in real life. Thus it provides a challenging dataset for estimating the performance of our algorithm. The dataset consists of 1055 pairs (x_i, y_i) where x_i is an entry time and y_i is the travel time of a vehicle entering the road section at time x_i. This dataset has been generated for a road segment in Berkeley, California. It consists of an arterial link of length 1207 feet, and the simulation has been run for half an hour, between 3:30 PM and 4:00 PM, on a week day.

B. Analysis method

Given an entry time x_i, we would like to provide a travel time estimate ŷ_i. We are interested in the quadratic error between this estimate ŷ_i and the effective travel time y_i. In order to evaluate the performance of the techniques described in section III, we follow the method described below. The error metric used is an L2 relative norm.
• We consider a training set whose size is one half of the size of the whole dataset. This can be considered to model the fact that we know one half of the travel times of vehicles flowing on the road section, and we want to estimate the travel times of the other vehicles.
• In order to define the optimal regularization parameter for the training set, we define a 5-fold partition of this set. We use cross-validation as defined in section III-B on this 5-fold partition. Namely, we use one of the five subsets as a training set, and the remainder serves as the test set. We solve the optimization problem described in section III on each of the five training sets, and for several values of ρ, and we pick the value which minimizes the error metric.
• Having defined a regularization parameter at the previous step, we compute the optimal weight vector from the training set and evaluate the error on the test set.
• We iterate this method for different training sets, each being in size one half of the whole dataset, and we average the errors obtained. The results are given in Table I for different convex combinations of kernels.

These computations are executed with Matlab, and the optimization problems are solved with CVX, a disciplined convex programming tool, using the solver SDPT3.

C. Results and discussion

    Kernel            Error on training set    Error on test set
    K^lin                    1.09                    1.10
    K^phy                    1.07                    1.09
    K^σ                      0.79                    0.76
    K^lin + K^phy            1.07                    1.10
    K^lin + K^σ              0.79                    0.76
    K^phy + K^σ              0.77                    0.74

Table I. Values of the L2 relative error on the training set and on the test set for different combinations of kernels, for a training set of size 50% of the size of the whole dataset. We denote by K^σ the Gaussian kernel (with σ = 100), by K^lin the linear kernel, and by K^phy the physical kernel.

The results shown in Table I yield several observations. First, the high values of the relative errors must be compared to usual techniques, such as conventional linear least-squares. For this dataset, the linear least-squares optimal estimate gives an L2 relative error of 1 on the training set, because it produces an estimate without bias. The error on the test set is as close to 1 as the distribution of the training set is close to the distribution of the test set. When using only one kernel, the estimate performs at least almost as well as the linear least-squares estimate. The performance of a combination of kernels is at least as good as the performance of the best of these kernels, and is better when the two kernels have different features. In the case of a Gaussian kernel and a physical kernel, the result is improved because the physical kernel uses a feature (the operator mod) which does not exist in the Gaussian kernel. In the case of the linear kernel and the Gaussian kernel, there is no added value compared to the Gaussian kernel alone. The benefit of a combination of kernels is that if one does not know a priori which kernel performs better, the optimization algorithm picks an optimal combination and yields a mixed non-linear estimate which, as illustrated in Figure 1, is able to locally capture trends of the phenomenon. This property of the estimate does not appear in the comparison proposed in Table I but is really useful for traffic applications. Here the linear least-squares estimate has a constant value at the mean value of the dataset, whereas the estimate given by a combination of the kernels oscillates. This is the subject of ongoing research which uses the same formalism with other norms, such as the H1 norm instead of the L2 norm, in the objective function of (5).

[Figure 1 — "Centered travel time at a traffic light intersection": centered travel time (sec) versus time (3:28 PM to 4:00 PM). Observed travel times (blue sparse points) and estimated travel times (red smooth curve) for a linear combination of a Gaussian kernel (with σ = 100) and a linear kernel.]
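The L2 relative error used in Table I requires a normalization; the sketch below normalizes by the centered output norm, which is our reading of the statement that the unbiased (constant-at-the-mean) linear least-squares estimate scores exactly 1 on the training set:

```python
import numpy as np

def l2_relative_error(y_hat, y):
    # ||y_hat - y||_2 / ||y - mean(y)||_2: the constant estimate at the
    # mean of y scores exactly 1, matching the baseline of section IV-C.
    return np.linalg.norm(y_hat - y) / np.linalg.norm(y - np.mean(y))
```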


V. CONCLUSIONS AND FUTURE WORK

The results presented in the previous sections show that even if the accuracy obtained by the kernel regression technique is not spectacular, an added value is the fact that the estimate is able to follow the trend of the travel time. The kernel regression technique makes it possible to add kernels to the set used in order to provide a richer signal with better accuracy. Thus, extensions to this work include the use of different kernels offering other features to improve the results obtained. In particular, it would be satisfying to reach sufficiently good estimation accuracy with kernels only based on the physical properties of the road and some varying parameters (weather, time of day). Applying results from support vector machine theory allowing one to bound the error of classifiers would significantly improve the quality of our travel time estimate. The estimated travel time on a road segment may be as important for practitioners as its range of variations. This is related to the discussion in section IV-C on the ongoing research focusing on the use of other norms in the regression problem (5), and on how to find a tractable and efficient way to solve the problem in these cases.

REFERENCES

[1] X. Ban, R. Herring, P. Hao, and A. Bayen. Delay pattern estimation for signalized intersections using sampled travel times. In Transportation Research Board Annual Meeting, Washington, D.C., January 10-14 2008.
[2] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[3] T. Choe, A. Skabardonis, and P. Varaiya. Freeway performance measurement system: an operational analysis tool. Transportation Research Record, 1811:67–75, 2002.
[4] L. Chu, H. Liu, and W. Recker. Using microscopic simulation to evaluate potential intelligent transportation system strategies under nonrecurrent congestion. Transportation Research Record, 1886:76–84, 2004.
[5] B. Coifman. Estimating travel times and vehicle trajectories on freeways using dual loop detectors. Transportation Research Part A, 36(4):351–364, 2002.
[6] N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines. Cambridge University Press, Cambridge, UK, 2000.
[7] B. Efron and G. Gong. A leisurely look at the bootstrap, the jackknife, and cross-validation. American Statistician, pages 36–48, 1983.
[8] H. Engl, K. Kunisch, and A. Neubauer. Convergence rates for Tikhonov regularisation of nonlinear ill-posed problems. Inverse Problems, 5(4):523–540, 1989.
[9] M. Garavello and B. Piccoli. Traffic Flow on Networks. American Institute of Mathematical Sciences, Springfield, USA, 2006.
[10] B. Greenshields. A study of traffic capacity. Proceedings of the Highway Research Board, 14(1):448–477, 1935.
[11] R. Herring, A. Hofleitner, S. Amin, T. Abou Nasr, A. Abdel Khalek, P. Abbeel, and A. Bayen. Using mobile phones to forecast arterial traffic through statistical learning. Submitted to Transportation Research Board, 2009.
[12] B. Hoh, M. Gruteser, R. Herring, J. Ban, D. Work, J.-C. Herrera, A. Bayen, and Q. Jacobson. Virtual trip lines for distributed privacy-preserving traffic monitoring. In 6th International Conference on Mobile Systems, Applications, and Services, pages 15–28, Breckenridge, CO, June 17-18 2008.
[13] Z. Jia, C. Chen, B. Coifman, and P. Varaiya. The PeMS algorithms for accurate, real-time estimates of g-factors and speeds from single-loop detectors. In Proceedings of the 2001 IEEE Intelligent Transportation Systems Conference, pages 536–541, 2001.
[14] P. Kachroo, K. Ozbay, and A. Hobeika. Real-time travel time estimation using macroscopic traffic flow models. In Proceedings of the 2001 IEEE Intelligent Transportation Systems Conference, pages 132–137, 2001.
[15] G. R. G. Lanckriet, N. Cristianini, P. Bartlett, L. El Ghaoui, and M. I. Jordan. Learning the kernel matrix with semidefinite programming. Journal of Machine Learning Research, 5:27–72, 2004.
[16] M. Lighthill and G. Whitham. On kinematic waves II: a theory of traffic flow on long crowded roads. Proceedings of the Royal Society of London, 229(1178):317–345, 1956.
[17] G. Newell. A simplified car-following theory: a lower order model. Transportation Research Part B, 36(3):195–205, 2002.
[18] D. Nikovski, N. Nishiuma, Y. Goto, and H. Kumazawa. Univariate short-term prediction of road travel times. In Proceedings of the 2005 IEEE Intelligent Transportation Systems Conference, pages 1074–1079, 2005.
[19] T. Park and S. Lee. A Bayesian approach for estimating link travel time on urban arterial road network. Lecture Notes in Computer Science, pages 1017–1025, 2004.
[20] K. Petty, P. Bickel, M. Ostland, J. Rice, F. Schoenberg, J. Jiang, and Y. Ritov. Accurate estimation of travel times from single-loop detectors. Transportation Research Part A, 32(1):1–17, 1998.
[21] J. Rice and E. van Zwet. A simple and effective method for predicting travel times on freeways. IEEE Transactions on Intelligent Transportation Systems, 5(3):200–207, 2004.
[22] P. Richards. Shock waves on the highway. Operations Research, 4(1):42–51, 1956.
[23] B. Schölkopf and A. Smola. Learning with Kernels. MIT Press, 2002.
[24] A. Skabardonis and N. Geroliminis. Real-time estimation of travel times on signalized arterials. In Proceedings of the 16th International Symposium on Transportation and Traffic Theory, 2005.
[25] K. Srinivasan and P. Jovanis. Determination of number of probe vehicles required for reliable travel time measurement in urban network. Transportation Research Record, 1537:15–22, 1996.
[26] A. Tikhonov. Solution of incorrectly formulated problems and the regularization method. Soviet Math. Dokl., 4:1035–1038, 1963.
[27] P. Turney. A theory of cross-validation error. Journal of Experimental and Theoretical Artificial Intelligence, 6:361–361, 1994.
[28] J. van Lint and H. van Zuylen. Monitoring and predicting freeway travel time reliability: using width and skew of day-to-day travel time distribution. Transportation Research Record, 1917:54–62, 2005.
[29] D. Work, O.-P. Tossavainen, S. Blandin, A. Bayen, T. Iwuchukwu, and K. Tracton. An ensemble Kalman filtering approach to highway traffic estimation using GPS enabled mobile devices. In 47th IEEE Conference on Decision and Control, Cancun, Mexico, pages 5062–5068, 2008.
[30] J. Yeon, L. Elefteriadou, and S. Lawphongpanich. Travel time estimation on a freeway using discrete time Markov chains. Transportation Research Part B, 42(4):325–338, 2008.
