Spatio-Temporal Segmentation of Metro Trips Using Smart Card Data
Total Page:16
File Type:pdf, Size:1020Kb
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TVT.2015.2409815, IEEE Transactions on Vehicular Technology JOURNAL OF LATEX CLASS FILES, VOL. 13, NO. 9, SEPTEMBER 2014 1 Spatio-Temporal Segmentation of Metro Trips Using Smart Card Data Fan Zhang1, Juanjuan Zhao1,ChenTian2, Chengzhong Xu1,3,XueLiu4,andLeiRao5 1Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China 2School of Electronic Information and Communications, Huazhong University of Science and Technology, China 3Department of Electrical and Computer Engineering, Wayne State University, Detroit, MI 48202, USA 4Department of Computer Science, McGill University, Canada 5General Motors Research Lab, USA Contactless smart card systems have gained universal prevalence in modern metros. In addition to its original goal of ticketing, the large amount of transaction data collected by the smart card system can be utilized for many operational and management purposes. This paper investigates an important problem: how to extract spatio-temporal segmentation information of trips inside a metro system. More specifically, for a given trip, we want to answer several key questions: how long it takes for a passenger to walk from the station gantry to the station platform? How much time he/she waits for the next train? How long he/she spends on the train? How long it takes to transfer from one line to another? etc. This segmentation information is important for many application scenarios such as travel time prediction, travel planning, and transportation scheduling. However, in reality we only assume that only each trip’s tap-in and tap-out time can be directly obtained; all other temporal endpoints of segments are unknown. This makes the research very challenging. To the best of our knowledge, we are the first to give a practical solution to this important problem. By analyzing the tap-in/tap-out events pattern, our intuition is to pinpoint some special passengers whose transaction data can be very helpful for segmentation. A novel methodology is proposed to extract spatio-temporal segmentation information: first for non-transfer trips by deriving the boarding time between the gantry and the platform, then for with-transfer trips by deriving the transfer time. Evaluation studies are based on a large scale real-system data of Shenzhen metro system, which is one of the largest metro systems in China and serves millions of passengers daily. On-site investigations validate that our algorithm is accurate and the average estimation error is only around 15%. Index Terms—Trip segmentation, intelligent transportation systems, metro systems, smart card, smart city. I. INTRODUCTION before the departure of the next available train, Alice waits L2 Nowadays, metro systems [1] have become one of the most seconds; she travels in the train for L3 seconds (including the preferred public transit services [2]. Compared with other ser- dwell time at the intermediate stations) and arrives the tap-out vices, metro has the benefits of high efficiency, large volume, platform; walking from the platform to the tap-out gantry takes and fast speed. Recently, contactless smart card systems have L4 seconds; she waits at the tap-out gantry for L5 seconds, due gained universal prevalence in modern metros. Compared with to passenger congestion, before she finally leaves the system. traditional magnetic cards or paper tickets, smart card is a Figure 1(b) is Alice’s one-transfer trip from Line 1 to Line more secure and convenient method for authentication and fare 2 (the wide line), and there are more segments: L6 seconds collection. for transferring to a Line 2 platform; L7 seconds have passed The transaction data collected by the smart card system can before the next Line 2 train departs; Alice also spends L8 be utilized for many operational and management purposes. seconds on the second train. Spatio-temporal segmentation of For metro systems, usually both the tap-in and tap-out records trips with multiple transfers can be modelled similarly. In this for each trip are saved in the database. Each record contains at model, L1, L4 and L6 are of particular importance. They are least the time, the station and the card’s ID of the transaction. dominated by the building structures of each metro station, With this data, we can measure user travel behaviors and their hence independent of moving passengers or trains. For the possible variances for the purpose of customer managemen- convenience of presentation, we use the term boarding time t [3], [4], [5]. We can also analyze the access data to improve to refer to both L1 and L4, and use the term transfer time for transit planning or scheduling [6], [7], [8]. L6. This paper investigates an important emerging problem: giv- More specifically, for a given trip, we want to segment it en the smart card tap-in and tap-out data, how to extract spatio- to several travel segmentations, with both location and time temporal segmentation information of metro trips? Shown in information. This spatio-temporal segmentation information is Figure 1(a) is an example of a metro user, say, Alice’s non- important to many application scenarios. With successfully de- transfer trip in Line 1 (the dotted line): after tap-in at the metro duced L1, L4 and L6 values of every station, we have developed gantry, Alice spends L1 seconds to walk to the tap-in platform; several exciting motivation cases ( details in Section II). For most modern metro systems, only each trip’s tap-in and Copyright (c) 2015 IEEE. Personal use of this material is permitted. tap-out time can be directly obtained, and all other temporal However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected]. endpoints of the segments are unknown. By subtracting the Corresponding author: Chen Tian (email: [email protected]). tap-in time from the tap-out time, the whole trip duration is 0018-9545 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TVT.2015.2409815, IEEE Transactions on Vehicular Technology JOURNAL OF LATEX CLASS FILES, VOL. 13, NO. 9, SEPTEMBER 2014 2 spatio−temporal density of shenzhen metro line 4 FTKA FM 1750 HZZX 7UDLQ SMZX 1500 SNG LHB 1250 SML ML BSL 1000 SZB 750 passager number HS ST LS 500 LH 7UDLQ QH 7:00 8:00 9:00 10:00 7UDLQ time Fig. 2: Spatio-temporal passenger density of Line 4 trains from 7:00 AM to 10:00 AM. tation information). The measured results validate that our algorithm is accurate and the average estimation error is only around 15% (Section VII). Fig. 1: Spatio-temporal segmentation of Alice’s (a) non- transfer trip, and (b) one-transfer Trip Section III presents the overview of our solution. Section- s IV to VII discuss in details of each challenge and solution. Section VIII discusses the related work. Section IX concludes the paper. composed of: L1 + L2 + L3 + L4 + L5 for a non-transfer trip, or L1 + L2 + L3 + L6 + L7 + L8 + L4 + L5 for a one-transfer trip, or even more segments for a multi-transfer trip. Given II. MOTIVATING APPLICATIONS only the start and end time, the problem is how to deduce the In this section, we present several exciting use cases, as boundaries between two consecutive segments. To solve this the motivation example for this paper. They all are our problem, we provide an efficient yet effective spatio-temporal ongoing projects, with the objective of transform our designed segmentation solution. algorithms to real applications that benefit the public trans- The intuition of our solution is as follows: there are some portation. special passengers (termed as Border-Walkers in the paper) in the system and we utilize their transaction data for segmenta- A. Realtime Train Density Estimation tion. To the best of our knowledge, we are the first to give a Many passengers care more about comfort than travel practical solution to this important problem. The contributions duration [9], [10]. In a metro system, if the realtime crowdness of this paper include: information of the following (several) trains can be informed • We define the role and illustrate the special functions of to passengers waiting on the platform (e.g., via the in-site Border-Walkers; a set of novel algorithms are proposed LED Display), some might change their travel plans by getting to identify Border-Walkers by analyzing the tap-in/tap-out on an earlier or later while more comfortable train with less events pattern (Section IV). population aboard. • We further propose a novel methodology to extract spatio- Currently this service is just unavailable. With only the tap- temporal segmentation information, first for non-transfer in and tap-out information, the authority has no method, yet, to trips by deriving the boarding time between gantry and figure out the approximated population in each running train. platform, then for with-transfer trips by deriving the Shown in Figure 2 is the passenger spatio-temporal density transfer time (Section V). of Line 4 from a typical day: between morning peak hours • We study our approach using a large scale data collected (i.e., 7:00 AM to 10:00 AM), and between stations BSC and from Shenzhen metro system, which is one of the largest SNG; the deeper the red color, the more people on a train.