Traffic Incident Detection Using Probabilistic Topic Model
Total Page:16
File Type:pdf, Size:1020Kb
Traffic Incident Detection Using Probabilistic Topic Model Akira Kinoshita Atsuhiro Takasu, Jun Adachi The University of Tokyo National Institute of Informatics 2-1-2 Hitotsubashi, Chiyoda, 2-1-2 Hitotsubashi, Chiyoda, Tokyo, Japan Tokyo, Japan [email protected] {takasu,adachi}@nii.ac.jp ABSTRACT economic loss to society. A technology that can detect traf- Traffic congestion is quite common in urban settings, and is fic incidents in real time and alert people accordingly would not always caused by traffic incidents. In this paper, we pro- therefore be a desirable way of reducing these ill e↵ects. pose a simple method for detecting traffic incidents by using Against this background, there have been many studies probe-car data to compare usual and current traffic states, on AID, e.g., [2, 13]. Most of the approaches exploit data thereby distinguishing incidents from spontaneous conges- sent from stationary sensors and cameras installed on roads. tion. First, we introduce a traffic state model based on a However, the installation and maintenance of such sensors probabilistic topic model to describe traffic states for a vari- is expensive, with only the main routes likely to have them ety of roads, deriving formulas for estimating the model pa- [17]. On the other hand, probe-car data (PCD), on which rameters from observed data using an expectation–maximi- we focus in this paper, are becoming increasingly important, zation algorithm. Next, we propose an incident detection as the number of probe cars and the size of the associated method based on our model, which issues an alert when a data archives increase. PCD includes timestamps and the car’s behavior is sufficiently di↵erent from usual. We con- locations of vehicles, and may contain additional values such ducted an experiment with data collected on the Shuto Ex- as the probe cars’ speed and direction. Although a PCD sys- pressway in Tokyo over the 2011 calendar year. The results tem cannot monitor all cars, it enables traffic administrators showed that our method discriminates successfully between to watch a vast area at a lower cost than by using stationary anomalous car trajectories and the more usual, slowly mov- sensors. In addition, a PCD system can follow a probe car’s ing traffic. However, our method does sometimes classify sequence of movements in detail, which is hard to achieve abnormally fast-moving cars as traffic incidents. via stationary sensors, and trajectory mining can be applied to the collected data. Using PCD for freeways, it is easy to detect any reduction Categories and Subject Descriptors in speed, which sometimes implies congestion, by analyzing H.2.8 [Database Management]: Database Applications— the speeds of the probe cars. However, this method is less Data mining, Spatial databases and GIS applicable to local streets where there are many crossings and traffic lights that cause cars to stop frequently but nor- mally. Moreover, speed reduction is not always an abnormal General Terms circumstance, even on freeways, and is not always caused by Algorithms incidents such as accidents, which we would regard as sud- den and unusual traffic events in this paper. Keywords There are two types of congestion: spontaneous and ab- normal [2]. Detecting spontaneous congestion is less impor- Anomaly detection, automatic incident detection, proba- tant, as it originates in road design and urban planning. bilistic topic model, probe-car data, traffic state estimation Any road may have potential bottlenecks, such as upslopes, curves, junctions, tollgates, and narrow sections. Vehicles 1. INTRODUCTION are likely to slow down at the bottlenecks, with vehicular Automatic incident detection (AID) is a crucial technol- gaps shortening and drivers in the following cars having to ogy in intelligent transport systems, particularly in terms brake. Congestion will then occur even without a traffic of reducing congestion on freeways [10]. Traffic incidents of- incident [7]. Spontaneous congestion also occurs when the ten cause traffic congestion, causing great inconvenience and traffic demands exceed the traffic capacity of such bottle- necks, and it is not resolved until the demand drops below the capacity [12]. The drivers may be familiar with the lo- cations of such potential bottlenecks, and they can avoid them. On the other hand, abnormal congestion originates in traffic incidents, which need to be detected in real time to prevent or resolve any sudden heavy congestion. (c) 2014, Copyright is with the authors. Published in the Workshop Pro- In this paper, we propose an AID method for detecting ceedings of the EDBT/ICDT 2014 Joint Conference (March 28, 2014, traffic incidents by discovering abnormal car movements, Athens, Greece) on CEUR-WS.org (ISSN 1613-0073). Distribution of this paper is permitted under the terms of the Creative Commons license CC- distinguishing such movements from those occurring in spon- by-nc-nd 4.0. 323 taneous congestion. Our method measures di↵erences be- tween the current and usual traffic states, and has two as- Table 1: Notation Notation Definition pects; namely, traffic state estimation and anomaly detec- K Number of traffic states. tion. First, we employ a probabilistic topic model [4] to k Index of a traffic state. model generation of PCD, which is influenced by hidden traf- S Number of segments. fic situations, such as “smooth” and “congested.” The model s Index of a segment. introduces a single set of several hidden component states, x n-th data in the s-th segment. that are associated with probabilistic distributions over the sn N Number of observations in the s-th segment. PCD values, and all the road segments have their respec- s θ Parameter of the k-th distribution. tive mixing coefficients. Using archived PCD, maximum- k π Mixing coefficient vector in segment s. likelihood parameters of the model are estimated by an ex- s Λ ( πs s=1, ,S , θk k=1, ,K ). pectation–maximization (EM) algorithm. The estimated ··· ··· σ(s, x) Tra{ ffi}c state in{s when} x was observed. model reflects the usual state over the whole observation σ(s) Usual traffic state in s. period. Our incident detection method simply follows the d(s, x) Divergence of σ(s, x) from σ(s). intuitive meaning of “anomaly.” To detect incidents, the X Set of data observed in the s-th segment, proposed method estimates the hidden state behind an ob- s i.e., X = x ,x , ,x . served PCD value and compares this current state with the s s1 s2 sNs X Whole set{ of data, i.e.,··· X = }X , ,X . usual state. If the current state is significantly di↵erent from 1 S X Data sequence from car c, { ··· } the usual state, it is recognized as an anomaly. c i.e., X = (s ,x ), (s ,x ), , (s ,x ) . We conducted an experiment using PCD observed for three c 1 1 2 2 Nc Nc D(X ) Divergenceh of X . ··· i of the routes of the Shuto Expressway system in Tokyo over c c the 2011 calendar year. The experiment showed that the proposed method can be e↵ective for AID. The main contributions of this paper are as follows. features: v(d, t, l), v(d, t, l) v(d, t 1,l), v(d, t, l 1) and v(d, t, l +1) v(d, t, l), where− link l− 1 is the next− link up- We propose a method for estimating traffic states by stream of l,− and l + 1 is the next link− downstream. These • applying a probabilistic topic model to PCD, whereby feature vectors are filtered using the heuristics above and an- road segments are characterized in terms of their ex- alyzed by distance-based outlier detection. In another AID pected performance. study, Akatsuka et al. [2] proposed an alternative feature vector. From the viewpoint of machine learning, AID can We propose a new method for detecting anomalous car • be regarded as a classification problem. Abdulhai et al. [1] trajectories according to the di↵erences between the used neural networks, and Yuan et al. [18] used support estimated states behind the trajectory and the usual vector machines, to classify the observed vectors from sta- states indicated by the learned model, whereby the tionary sensors as being incident based or otherwise. AID detection is conducted adaptively in terms of the seg- can also be regarded as an application of the change-point ments. detection problem in time-series analysis, with Wang et al. Our experiment showed that the usual traffic state [13] developing a hybrid method using time-series analysis • could be estimated using the observed PCD, and that and machine learning. our AID method had good selectivity for anomalous In this paper, we regard the AID problem as an anomaly behavior by cars encountering incidents. detection problem. Previous work exploits characteristics of congested traffic, such as slowdown, in which vehicular speed decreases even in the absence of a traffic incident. We 2. RELATED WORK take another approach to follow the intuitive meaning of Although many studies have considered the traffic state “anomaly”; namely, an event di↵erent than usual. For this estimation problem, there is no general agreement about a purpose, the traffic should be described by a probabilistic formal definition of a“traffic state.” Some research estimates model. We therefore exploit the idea of probabilistic topic the traffic state in terms of vehicular speed [11, 19], and models, which was originally studied in the field of natural this kind of estimation characterizes states, i.e., quantized language processing [5, 4]. The proposed method estimates speeds, as“free”or“congested”[6]. Yoon et al. [17] proposed both a set of traffic states over an entire route and the mix- two feature values based on vehicular speed to detect a“bad” ing coefficients for each road segment, with a traffic state traffic state, i.e., slow traffic.