Beijing City Lab
Total Page:16
File Type:pdf, Size:1020Kb
Beijing City Lab Wang J, Gu Q, Wu J, Xiong Z, 2016, Traffic speed prediction and congestion source exploration: A deep learning method. Beijing City Lab. Working paper #75 Traffic Speed Prediction and Congestion Source Exploration: A Deep Learning Method Jingyuan Wang†, Qian Gu†, Junjie Wu‡, Zhang Xiong† † ACTA Engineering Research Center, School of Computer Science and Engineering, Beihang Univeristy, Beijing 100191, China ‡ Information Systems Department, School of Economics and Management, Beihang University, Beijing 100191, China Email: {jywang, gqian, wujj, xiongz}@buaa.edu.cn Abstract—Traffic speed prediction is a long-standing and • How to model the abrupt changes of traffic speeds in case critically important topic in the area of Intelligent Transportation of emerging events such as morning peaks and traffic Systems (ITS). Recent years have witnessed the encouraging accidents? potentials of deep neural networks for real-life applications of various domains. Traffic speed prediction, however, is still These indeed motivate our study. Specifically, in this paper, in its initial stage without making full use of spatio-temporal we propose a deep learning method using an Error-feedback traffic information. In light of this, in this paper, we propose Recurrent Convolutional Neural Network (eRCNN) for contin- a deep learning method with an Error-feedback Recurrent Convolutional Neural Network structure (eRCNN) for continuous uous traffic speed prediction. The novel contributions of our traffic speed prediction. By integrating the spatio-temporal traffic study are summarized as follows. speeds of contiguous road segments as an input matrix, eRCNN First, we take the matrix containing the spatio-temporal explicitly leverages the implicit correlations among nearby seg- traffic speeds of contiguous road segments as the input of ments to improve the predictive accuracy. By further introducing separate error feedback neurons to the recurrent layer, eRCNN eRCNN. By this means, the complicated interactions of traffic learns from prediction errors so as to meet predictive challenges speeds among nearby road segments can be captured by rising from abrupt traffic events such as morning peaks and eRCNN naturally without elaborative characterization, which traffic accidents. Extensive experiments on real-life speed data is crucial to the high-accuracy prediction of eRCNN. of taxis running on the 2nd and 3rd ring roads of Beijing city demonstrate the strong predictive power of eRCNN in Second, we introduce separate error-feedback neurons to comparison to some state-of-the-art competitors. The necessity the recurrent layer of eRCNN, for the purpose of capturing of weight pre-training using a transfer learning notion has also the prediction errors from the output layer. This empowers been testified. More interestingly, we design a novel influence eRCNN the ability to model the abrupt changes in traffic function based on the deep learning model, and showcase how to speeds due to some emerging traffic events like the morning leverage it to recognize the congestion sources of the ring roads in Beijing. peaks and traffic accidents. Third, we put forward a novel weight pre-training method, I. INTRODUCTION which adopts a transfer-learning notion by clustering similar Traffic speed prediction, as a sub-direction of traffic predic- yet contiguous road segments into a group for the generation tion in the area of Intelligent Transportation Systems (ITS), has of a same set of initial weights. This “sharing scheme” not long been regarded as a critically important way for decision only helps to reduce the learning process of eRCNN for every making in transportation navigation, travel scheduling, and road segment, but also improves the chance of finding better traffic management. Traditional models, including autoregres- optimal solutions. sion methods [1] and supervised learning methods such as Finally, we design a novel influence function based on support vector regression [2] and artificial neural networks [3], the deep learning model, and illustrate how to leverage it to all treat traffic speed prediction as a time-series forecasting recognize the congestion sources of the ring roads in Beijing. problem, and thus run into the bottleneck gradually. To the best of our knowledge, we are among the earliest In recent years, with the rapid development of deep learning to explore how to learn road congestion sources from deep techniques, more and more researchers in ITS began to adopt learning models. deep neural networks for high-accuracy traffic prediction. The Extensive experiments on real-life speed data of taxis run- rich studies along this line, however, are mostly concerned ning on the 2nd and 3rd ring roads of Beijing city demonstrate with traffic flow and congestion predictions [4], [5]. Traffic the strong predictive power of eRCNN, even with the pres- speed prediction, therefore, is still an open problem in the ence of state-of-the-art competitors. The inclusion of spatio- deep-learning era, with two notable challenges as follows: temporal information of contiguous segments, the introduction • How to characterize the latent interactions of road seg- of error-feedback neurons to the recurrent layer, and the weight ments in traffic speeds so as to improve the predictive pre-training of similar segments, all give a positive boost to performance of a deep neural network? the high accuracy of eRCNN. The segment to be predicted Traffic direction vs-m,t ĊĊ vs-1,t vs,t vs+1,t ĊĊ vs+m,t v:,t v:,t-1 Ċ Ċ ĊĊ V7= Fig. 1. The framework of the eRCNN model. v:,t-n vs-m,: ĊĊ vs,: ĊĊ vs+m,: II. THE ERCNN FRAMEWORK Fig. 2. The structure of the input spatio-temporal matrix. Fig. 1 shows the framework of the eRCNN model con- taining five network layers, including the input layer (LI), B. The CNN-based Feature Extracting the convolution layer (LC), the pooling layer (LP), the error- In the eRCNN model, we adopt a CNN-based network feedback recurrent layer (LeR), and the output layer (LO). The function of the input layer is to organize the original structure to extract features from the spatio-temporal input traffic speed data as a spatio-temporal input matrix, which matrix. The CNN structure contains a convolution layer and can be processed by the CNN layers of eRCNN. The function a pooling layer, and then we introduce the two layers in this of the convolution layer and the pooling layer is to extract subsection. features from the spatio-temporal input matrix. The function of 1) The Convolution Layer: The convolution layer is a core the error-feedback recurrent layer is to compensate prediction part of the CNN model [6]. The convolution layer connects errors using predicting results of previous periods. The output the spatio-temporal input matrix with several trainable filters, layer uses a modified rectified linear unit to generate the with each being a i×i weight matrix. We define the k-th filter (C) (C) predictions of traffic speeds. as Wk . The convolution layer uses the Wk to zigzag scan the input matrix to calculate a convolution neuron matrix. The (p, q) element of the convolution neuron matrix generated by A. The Spatio-Temporal Input Matrix the filter k is calculated by i i In order to exploit spatial and temporal correlation informa- p,q x,y p+x,q+y tion, we construct a spatio-temporal input matrix in the input ck = σ bk + wk m , (2) x=0 y=0 layer. Given a road segment s, we define the traffic speed of x,y the segment s at the time t as vs,t. When we use the proposed where bk is a bias for the filter k, wk is the (x, y) element of model to predict v , the spatio-temporal matrix for the (C) p+x,q+y s,t+1 Wk , m is the (p + x, q + y) element of the spatio- input layer is defined as temporal matrix V, and σ(·) is a sigmoid activation function defined as: vs−m,t vs−m,t−1 ··· vs−m,t−n 1 σ(x)= . x . ··· . 1+ e v v ··· v More details about the implementation of the CNN’s convo- s−1,t s−1,t−1 s−1,t−n V = vs,t vs,t−1 ··· vs,t−n . (1) lution layer could be found in [7]. vs+1,t vs+1,t−1 ··· vs+1,t−n 2) The Pooling Layer: The pooling layer is another impor- . tant component of the CNN model, which is used to reduce . ··· . the dimension of the convolution neuron matrix through an v v ··· v s+m,t s+m,t−1 s+m,t−n average down sampling method. In the proposed eRCNN As shown in Fig. 2, the column vector v:t contains traffic model, the pooling layer divides the convolution neuron matrix speed data of all the segments in a range that m segments into j×j disjoint regions, and uses the averages of each region upstream and downstream of the segment s at the time t, and to represent the characteristic of the convolution neurons in the row vector vs,: contains traffic speed data of the segment s the region. Through the processing of the pooling layer, the from time t to t−n. In this way, the input matrix V contains all dimension of the spatio-temporal matrix is reduced as about the speed information that is spatially and temporally adjacent 1/(j ×j) of its original size. The output of the pooling layer is to the variate to be predicted, i.e., vs,t+1. a feature vector generated through vectoring the down sampled convolution neuron matrix, which is denoted as p. previous steps together in the same group of neurons as in traditional RNN. On the contrary, the input is connected into C.