arXiv:1605.08390v1 [cs.AI] 19 Apr 2016. Corresponding author: Fan Zhang (email: [email protected])

Estimation of Passenger Route Choice Pattern Using Smart Card Data for Complex Metro Systems

Juanjuan Zhao, Fan Zhang, Member, IEEE, Lai Tu, Chengzhong Xu, Fellow, IEEE, Dayong Shen, Chen Tian, Xiang-Yang Li, Fellow, IEEE, and Zhengxi Li

Nowadays, metro systems play an important role in meeting the urban transportation demand in large cities. Understanding passenger route choice is critical for public transit management. The wide deployment of Automated Fare Collection (AFC) systems opens up a new opportunity. However, only each trip’s tap-in and tap-out timestamps and stations can be obtained directly from AFC system records; the train and route chosen by a passenger, which are needed to solve our problem, are unknown. While existing methods work well in some specific situations, they do not work in complicated situations. In this paper, we propose a solution that needs no equipment or human involvement beyond the AFC systems. We develop a probabilistic model that can estimate from empirical analysis how the passenger flows are dispatched to different routes and trains. We validate our approach using a large-scale data set collected from the Shenzhen metro system. The measured results provide us with useful inputs when building the passenger path choice model.

Index Terms—Metro systems, Smart card, Data mining, Intelligent transportation systems.

I. INTRODUCTION

Nowadays, metro systems play an important role in meeting the urban transportation demand in large cities.
Due to its fast speed, high efficiency, large volume and punctuality, the urban metro has become the first choice of many people. In Shenzhen, China, in mid-June 2015, there were around 3.5 million metro trips every day, which was around one third of the total public traffic. Fig. 1 illustrates the metro operating map of Shenzhen. With further expansion of the metro system, the number of passengers may increase rapidly. On one hand, the increasing usage of metros can effectively help reduce the traffic pressure on surface roads. On the other hand, it also brings a dramatic increase in passenger demand on metro systems.

Fig. 1: Metro graph of Shenzhen

The traffic patterns of large metro systems are usually very complex. Under the condition of network operation and seamless transfer in current metro systems, the train and route chosen by a passenger are unknown. It is common to have more than one route between the origin station and the destination station, a.k.a. multi-path in transportation systems. As shown in Figure 2(a), there are two routes from station O to station D.
This means that for an OD pair with more than one route, we don’t know how passengers are distributed over these routes and trains.

This missing information at a fine granularity could be important for both passengers and metro operators. From the operators’ point of view, understanding the flow distribution of passengers in the whole metro network is important for improving the service reliability. Potential applications include a mobile trip-planning application for metro passengers, a monitoring system for metro operators, and a route suggestion and emergency management system for urban administrators. This paper aims to develop a solution to calculate the probability of each route being chosen for an OD pair, which can be used to estimate the passenger flow at the granularity of individual trains on each line, as shown in Figure 2(d).

Copyright (c) 2015 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected].
Juanjuan Zhao and Fan Zhang are with Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China and Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences (e-mail: [email protected]; [email protected]).
Lai Tu is with School of Electronic Information and Communications, Huazhong University of Science and Technology, China (e-mail: tu- [email protected]).
Chengzhong Xu is with Wayne State University and Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China (e-mail: [email protected]).
Dayong Shen is with Research Center for Computational Experiments and Parallel Systems, National University of Defense Technology, China (e-mail: [email protected]).
Chen Tian is with State Key Laboratory for Novel Software Technology, Nanjing University, China (e-mail: [email protected]).
Xiang-Yang Li is with School of Computer Science and Technology, University of Science and Technology of China, China (e-mail: xi- [email protected]).
Zhengxi Li is with Department of Automation, North China University of Technology, China (e-mail: [email protected]).

Fig. 2: (a) An OD with multiple routes (b) Trains matching for route 1 (c) Trains matching for route 2 (d) An illustration of a traffic monitor application based on the proposed model

Traditional approaches are not scalable. To understand the passengers’ route choice behavior, one traditional method is to conduct field surveys at train stations, by asking passengers which route they will take to reach their destinations. There are limitations of this method: firstly, most surveys focus on a subset of the passengers at particular locations within a limited time window, hence the results are often limited in diversity, scale and accuracy; secondly, conducting such surveys is both labor-intensive and time-consuming.

The wide deployment of Automated Fare Collection (AFC) systems opens up a new opportunity for metro network analysis: the transaction records from AFC can reveal the Origin (O) and the Destination (D) of every passenger’s trip, as passengers are required to tap their smart cards or RFID-based tickets each time they enter the O station or exit the D station. Passengers’ flows can be coarsely described by OD (origin-destination) pairs. However, AFC records fail to expose the passengers’ routes directly. Even in cases where the route of an OD is unique, the AFC records are still not able to show which train a passenger takes. There are too many factors that can affect a passenger’s final plan, i.e., the trains or train combinations one takes. For example, if a train does not have enough capacity to accommodate all passengers waiting on a platform, some passengers have to wait for another train. This phenomenon, known as “travelers left behind”, is quite common during rush hours or at large stations. There are already some studies using transaction records from AFC to understand the passengers’ route and train choice behavior [1], [2]. Although these methods work well in some specific situations, they don’t work for complicated situations, such as the case where there are various “left-behinds” at different stations caused by the imbalance of geographical distribution of passengers. Also, usually the walking time between the charge gate and the platform, and the walking [...]

[...] the time table constraints, we further derive the probability that passengers may choose each plan, i.e., $\{tr_i, tr_j\}$ or $\{tr_{i+1}, tr_j\}$.

The contributions of this paper include:
• We define two kinds of time-dependent polynomial distributions of the number of trains waited for by passengers. The first is the number of trains that a passenger waits for at his/her origin station. The other is the number of trains a passenger waits for when he/she transfers at the transfer station. A set of algorithms is proposed to calculate the parameters of the two distributions.
• We further propose a probabilistic model that can estimate how the passenger flows are distributed among different routes and trains.
• We then deploy the algorithms on a cloud platform and develop supporting modules for the system-level solution.
• Finally, we validate our approach using a large-scale data set collected from the Shenzhen metro system. The measured results provide us with useful inputs when we build the passenger path choice model.

For the rest of this paper, we discuss the related work in Section II. The overview of this study is given in Section III. Section IV discusses the solution in detail. We present the system design and the algorithm implementation on a cloud platform in Section V. Section VI presents the experimental studies. Finally, Section VII concludes the paper.

II. RELATED WORK

Building a route choice model for users is an important research direction in the field of transportation [3], and it is the basis for traffic management policy-making.
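To make the train-matching problem from the introduction concrete, the following minimal Python sketch enumerates which train combinations are compatible with a single AFC record on a route with one transfer. It is an illustration under stated assumptions, not the probabilistic model proposed in this paper: the names `Run` and `feasible_plans`, the timetable values, and the fixed walking times are all hypothetical, and treating walking times as constants is a simplification.

```python
from typing import List, NamedTuple, Tuple


class Run(NamedTuple):
    """One train's run over one leg of a route (minutes after midnight)."""
    train_id: str
    depart: float  # departure from the leg's boarding station
    arrive: float  # arrival at the leg's alighting station


def feasible_plans(
    tap_in: float,
    tap_out: float,
    leg1: List[Run],
    leg2: List[Run],
    walk_in: float = 2.0,        # assumed gate-to-platform walking time
    walk_transfer: float = 3.0,  # assumed platform-to-platform walking time
    walk_out: float = 2.0,       # assumed platform-to-gate walking time
) -> List[Tuple[str, str]]:
    """Return every (first train, second train) pair consistent with the record."""
    plans = []
    for first in leg1:
        # The passenger cannot board a train that leaves before
        # he/she reaches the platform.
        if first.depart < tap_in + walk_in:
            continue
        for second in leg2:
            catches_second = second.depart >= first.arrive + walk_transfer
            exits_in_time = second.arrive + walk_out <= tap_out
            if catches_second and exits_in_time:
                plans.append((first.train_id, second.train_id))
    return plans


# Hypothetical timetable: two candidate trains on the first line, one on the second.
leg1 = [Run("tr_i", 485, 500), Run("tr_i+1", 490, 505)]
leg2 = [Run("tr_j", 510, 525)]

# Tap-in at 08:02 (482) and tap-out at 08:48 (528).
print(feasible_plans(482, 528, leg1, leg2))
```

With these made-up values both plans, (tr_i, tr_j) and (tr_i+1, tr_j), remain feasible, which is exactly the ambiguity that the tap-in and tap-out times alone cannot resolve and that a probability over plans is meant to address.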
