
Trans-Sense: Real Time Transportation Schedule Estimation Using Smart Phones

Ali AbdelAziz* Amin Shoukry Walid Gomaa Moustafa Youssef Egypt-Japan University Egypt-Japan University Egypt-Japan University University of Science and Technology of Science and Technology of Science and Technology Alexandria, Egypt. and Aswan University. and Alexandria University. and Alexandria University. [email protected] [email protected] [email protected] [email protected]

Abstract—Developing countries suffer from traffic congestion, poorly planned road/rail networks, and lack of access to public transportation facilities. This context results in increased fuel consumption, pollution levels, monetary losses, massive delays, and reduced productivity. It also has a negative impact on commuters' feelings and moods. The availability of real-time transit information, obtained by tracking the locations of public transportation vehicles with GPS devices, helps in estimating a passenger's waiting time and addressing the above issues. However, such a solution is expensive for developing countries. This paper aims at designing and implementing a crowd-sourced, mobile-phone-based solution to estimate the expected waiting time of a passenger in public transit systems, to predict the remaining time to get on/off a vehicle, and to construct a real-time public transit schedule.

Trans-Sense has been evaluated using real data collected for over 800 hours, on a daily basis, by different Android phones, using different transit lines at different time spans. The results show that Trans-Sense can achieve an average recall and precision of 95.35% and 90.1%, respectively, in discriminating light-rail stations. Moreover, the empirical distributions governing the different time delays affecting a passenger's total trip time enable predicting the right time of arrival of a passenger to her destination with an accuracy of 91.81%. In addition, the system estimates the stations' dimensions with an accuracy of 95.71%.

*Electrical Engineering department, Aswan Faculty of Engineering, Aswan University, Egypt, 81542.

arXiv:1906.07575v1 [cs.CY] 14 Jun 2019

I. INTRODUCTION

Using public transportation means such as light rail transit (LRT), tram, or metro is a normal daily activity for work or leisure. However, traveling time is usually considered wasted time and has a negative impact on commuters' feelings [1], [2]. Often, passengers engage in some activities to make their transportation/commuting time more productive. Normally, public transportation follows a fixed schedule that may be disturbed for many reasons, including unpredictable traffic problems, especially in developing countries, resulting in massive delays, high fuel wastage, wasted human resources, and finally monetary and economic losses. These obstacles affect the competitiveness of public transit, which is more eco-friendly than private vehicles [3], [4].

A growing interest in analyzing traffic problems in developing countries has been witnessed over the years [5]–[8]. Systems like [9], [10] are among the recent studies that attempted to estimate passengers' waiting time distributions and perceptions, depending on service and station characteristics. There are many definitions of the waiting time. The first expresses it as the ratio of the actual time spent waiting (either for getting on a vehicle or In-Vehicle Time (IVT)) to the scheduled time [11]. The second, adopted by transportation models, assumes that average waiting times are half the service headway given random passenger arrivals [9], [10]. The definition adopted in this paper is the first one. The total time spent in a trip can be estimated based on the access, egress, waiting and IVT times [12]. Recently, many transit agencies have been established to provide a suitable public transportation service and take advantage of the increasingly high-amenity transit stations and stops to mitigate the burden of wasted waiting time. Also, there are many facilities on social media that provide train tracking services using manual input from users. Nonetheless, these systems depend on data made available by the service providers, dedicated devices attached to the transportation vehicles, and/or manual user input. All of these limit their deployment, especially in developing countries.

In this paper, we present Trans-Sense, a system that takes advantage of the ubiquitous devices available with the commuters, while avoiding expensive solutions based on dedicated GPS devices attached to public transit vehicles. We conduct a case study of an LRT system that takes into consideration a uniquely systematic perspective, including a wide range of station and line types, different seasons, different times within the day, and many users equipped with various mobile phones. Our results show the effectiveness of the proposed technique in capturing different timing characteristics of the transit system dynamics.

This paper is organized as follows. Section II gives a brief introduction about the investigated tramway system as well as an overview of the proposed Trans-Sense system. Section III describes the five main components of Trans-Sense. Finally, the paper is concluded in Section V.

II. PROPOSED SYSTEM

A. Alexandria Tram System

Alexandria tramway started in 1860. It is the oldest in Egypt and Africa, and is among the oldest in the world. Figure 1 and Table I show the Ramleh Tram System, one of Alexandria's LRT systems. It runs between the east and the west of Alexandria, terminating at Ramleh station, in both directions. It includes 39 stations. This system includes four different lines (four colors/identifiers, IDs) that share the same rail in some parts of their journey and split into different rails in other parts. In several locations, cars are allowed to cross the tram rails. Therefore, a tram driver has to stop until it is safe to pass
or a traffic light is green. Because most traffic lights are human controlled, the waiting time at these locations is unpredictable. With the continuous increase in Alexandria's population and migration from rural areas, the impact of heavy traffic on the schedule is considerable.

TABLE I: Station names and their references.

Ref.   Station Name           Ref.   Station Name
S 1    Ramleh                 S 21   Gleem
S 2    Al-Qa’ed Ibrahim       S 22   Fonoon Gamila
S 3                           S 23   Zizinia
S 4    Soter                  S 24   San Stefano
S 5    Shoban                 S 25   Thrwat
S 6    Shatbi                 S 26   Louran
S 7    Game3a                 S 27   AlSraya
S 8    Camb                   S 28
S 9    Ibrahemya              S 29   El Seyoof
S 10   Al-Riada Al-Sughra     S 30   Victoria
S 11   Al-Riada Al-Kubra      S 31   Zananeri
S 12   Kliopatra Al-Sughra    S 32   SidiGaber St.
S 13   Kliopatra Hamammat     S 33   Wezara
S 14   SidiGaber Sheikh       S 34   Felming
S 15   Moustafa Kamel         S 35   Bakous
S 16   Mahfouz                S 36   Safr
S 17   Roshdi                 S 37   Shods
S 18   Bokla                  S 38   Genakelese
S 19   Hedaya                 S 39   Genakelese2
S 20   Sapa Basha

Fig. 1: The investigated tram system. Two splitting areas can be identified that correlate with tram IDs.

Whenever a tram moves from a station to the next one, one of the following scenarios is expected (Figure 2):
1) The path is direct and cannot be interrupted.
2) A cross road lies next to the starting station. An extra waiting time is required until it is safe to move (e.g., stations S 5 and S 9 in Figure 8).
3) At least one cross-road, controlled by a human and/or a traffic light, interrupts the path. Hence, an extra delay time is expected.

Fig. 2: (a) Waiting time between 3 successive stations. The route is direct between the first pair of stations while it includes a traffic light between the next pair. (b) Corresponding state diagram.

The following terminology is needed in the rest of this paper:

Mandatory stop: a tram stop at a station.
Potential stop: a possible tram stop at a traffic light.

All the following time-dependent parameters are estimated as averages/expectations; the operator $E\langle x \rangle$ denotes the expectation operator.

Traffic delay $E\langle w_f \rangle$: the average tram waiting time at a traffic signal.
Station delay $E\langle w_s \rangle$: the average tram waiting time at a station.
Tram buffering delay $E\langle w_{bf} \rangle$: the average tram buffering time. At peak times a tram can be waiting at a station while another tram, directly behind the first, is waiting to enter the same station. The waiting time of the second tram in this case is the buffering waiting time.
Segment time $E\langle w_{sg} \rangle$: the average time taken by a tram to travel from a given station directly to the next station (there are no cross roads or traffic lights between these two stations).
Leg time $E\langle w_{lg} \rangle$: the average time taken by a tram to travel from a station/traffic signal to the next station/traffic signal on its route. This is for stations separated by traffic lights.
Trip time $E\langle w_T \rangle$: the average time for a passenger to travel from a source to a destination. It is an aggregation of station delays, buffering delays, segment times, leg times, and traffic delays, as in the following equation:

\[
E\langle w_T \rangle = \sum_{i=1}^{N} \sum_{j=i+1}^{N-1} \sum_{k=0}^{M} \Bigl( E\langle w_{s_i} \rangle + E\langle w_{bf_{ij}} \rangle + E\langle w_{sg_{ij}} \rangle + E\langle w_{lg_{ik}} \rangle + E\langle w_{f_k} \rangle + E\langle w_{lg_{kj}} \rangle \Bigr) \tag{1}
\]

where $i$ is the current station, $j$ is the next station, $k$ is the current traffic light, $N$ is the number of stations from source to destination, and $M$ is the number of traffic light signals.

B. Data Collection.

The data collection took place with different mobile devices including Samsung Galaxy S, Samsung Note 4, HTC M9 plus, HTC E9 plus and Google Nexus. Hundreds of thousands of data traces have been collected from different users at different times and seasons for about 18 months.

C. System Overview

In this section, we present the architecture of the Trans-Sense system. It consists of three different subsystems as shown in Figure 3: the MonoSense, R-Sense and Railway-Sense systems. Trans-Sense relies on a crowd-sensing approach to collect and accumulate data sent from the smartphones' sensors, carried by different users, to a service running on the cloud. The temporal and spatial information extracted from the users is correlated with their speeds and locations. The system starts by collecting a stream of GPS data $(d = d_0; d_1; \cdots; d_t; \cdots)$, where each $d_t$ is an ordered pair $(Lat_t, Long_t)$ representing a user's latitude and longitude at time $t$. The input data stream is first filtered and smoothed, then it is passed to MonoSense.

The MonoSense transportation mode detection system [13], [14] differentiates between walking, riding and driving users. If the sensed data represents a moving person/rider, it is then passed to R-Sense.
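As a rough illustration of how such a stream can be related to user speed, the following minimal sketch turns consecutive GPS fixes into speed estimates using the great-circle (Haversine) distance. It is not the Trans-Sense implementation: the `Fix` tuple layout, the presence of a per-fix timestamp in seconds, and the function names are assumptions made for this example. The Earth radius is the 6373 km value quoted in the paper's Haversine footnote.

```python
import math
from typing import List, Tuple

# Assumed layout of one GPS fix: (timestamp in seconds, latitude in degrees, longitude in degrees).
Fix = Tuple[float, float, float]

# Earth radius in meters (6373 km, the value quoted in the paper's Haversine footnote).
EARTH_RADIUS_M = 6_373_000.0


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.atan2(math.sqrt(a), math.sqrt(1 - a))


def speeds_mps(stream: List[Fix]) -> List[float]:
    """Speed in m/s between each pair of consecutive fixes in the stream."""
    speeds = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(stream, stream[1:]):
        dt = t1 - t0
        if dt <= 0:
            # Skip duplicated or out-of-order timestamps instead of dividing by zero.
            continue
        speeds.append(haversine_m(lat0, lon0, lat1, lon1) / dt)
    return speeds
```

For example, two fixes taken one second apart and 0.0001° apart in latitude yield a speed of roughly 11 m/s, while two fixes at the same position yield a speed of zero.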

The R-Sense system identifies whether a rider is on board a rail transportation vehicle or not [15], [16]. If yes, the R-Sense output is passed to the Railway-Sense sub-system. The Railway-Sense system estimates and predicts the expected arrival times of the LRT users. A brief explanation of the Railway-Sense operation is given below. The station location detection component (Section III-B) discriminates stations' stops from traffic lights' stops. The Dynamic Time Estimation component (described in Section III-C) answers two basic vehicle scheduling queries based on the users' locations and views: Station-View (SV) (for a user waiting in a station, when will the next tram arrive) and Vehicle-View (VV) (for a user riding a tram, when will she arrive at her destination). Answering these queries necessitates the estimation/prediction of the remaining time to get on/off a vehicle. The history component (described in Section III-D) is used to aggregate previous riders' traces over time. The Tram ID and direction detection component (described in Section III-E) is finally used to identify the tram direction and its ID (that reflects its line number). In the following section, each component of the Railway-Sense system architecture is described.

Fig. 3: Trans-Sense system architecture (MonoSense: transport mode detection; R-Sense: rail/road transport detection; Railway-Sense: dynamic transit schedule estimation).

III. RAILWAY-SENSE SYSTEM COMPONENTS

Figure 3 shows the detailed components of Railway-Sense. The five components of Railway-Sense are detailed in the following subsections.

A. Preprocessing Component.

The main goal of this component is to reduce the noise and remove spatial and temporal outliers in the input raw sensors' measurements.

1) Temporal outliers removal: Temporal outliers are the outliers in the waiting time in the following parameters: $E\langle w_s \rangle$, $E\langle w_f \rangle$, $E\langle w_{sg} \rangle$, $E\langle w_{lg} \rangle$, which correspond to a tram waiting at a station, a traffic delay, and segment and leg times. Outliers occur due to many reasons such as accidents or tram failure delays.

2) Spatial outliers removal: Spatial data contains noise and outliers mainly due to the urban canyon effect and/or inaccurate position information (e.g. glitches). Such errors are more frequent when the GPS signal is reflected [17]–[19], so that a user reading changes vastly back and forth among different nearby areas, as shown in Figures 4 and 5. Moreover, misleading readings are due to the well-known 'multipath effect' [20]–[22], where the direct path to the GPS receiver is blocked. For the above reasons, spikes and spurious changes occur and lead to substantial deviations in the measured locations. To handle these errors, we apply three different filtering phases: coarse outlier removal (Phase 1), outlier noise filtering (Phase 2), and duplicates removal (Phase 3).

The coarse filtering removes identifiable data that is significantly different from the other data (unusual values). It uses the empirical three-sigma rule of thumb [23]–[25] as shown in Figure 5. (The rule states that for a normal distribution 99.7% of the data lie within the range $[\mu - 3\sigma, \mu + 3\sigma]$, i.e. within three standard deviations of the mean.) The filtering phase uses a smoothing filter with non-overlapping windows to smooth the data and groups similar data points into clusters using the density-based spatial clustering algorithm (DBSCAN) [26]. DBSCAN is a well-known clustering algorithm that requires only two parameters, not including the number of clusters:
• MinPts: specifies the minimum number of points in the neighborhood of a given point in order to be included in a cluster.
• Epsilon $\epsilon$: specifies the size (radius) of the (circular) neighborhood.
Finally, remaining dissimilar points are removed. Among the obtained clusters, there are clusters with lower variance. They represent multiple data points at the same location that have duplicate entries, which are removed as shown in Figure 5 (b, d).

The filtered data (98.65% of the raw data) is then passed to the station location detection component to discriminate between stations and traffic lights according to passengers' behavior.

Fig. 4: Distribution of GPS data received from tram users before and after the three preprocessing phases: (a) original data including noise, (b) data after Phase 1, (c) data after Phase 2, (d) data after Phase 3.

Fig. 5: Data in different phases (axes: Longitude, Latitude): (a) original data with noise, (b) after applying Phase 1 (IQR), (c) after applying Phase 2 (DBSCAN clustering, $\epsilon$ = 0.0005, Noise = 1.0868%, MinPts = 300), (d) after applying Phase 3 (DBSCAN clustering, $\epsilon$ = 0.0005, Noise = 1.9345%, MinPts = 300); (e), (f), (g), (h) histograms of the original data, after Phase 1, after Phase 2 and after Phase 3, respectively.

B. Stations Locations Detection Component.

The main function of this component is to extract the semantics hidden in the filtered GPS points. Since a physical station occupies a relatively large area, it corresponds to a large number of GPS measurements. GPS measurements at the same physical location can vary by meters (due to differences in GPS characteristics [22] or in the receiving capabilities and GPS receiver accuracy among the sensing devices). Therefore, we cluster these data into meaningful places. These places correspond to either stations (at which passengers are
waiting for a tram) or moving/stationary trams. Stationary trams are those waiting at stations or traffic lights. In order to discriminate between these two tram states, we make use of the travel environment as well as passengers' behavior.

1) Stationary Points Detection Unit: The main purpose of this unit is the automatic detection of the stationary points (detection of passengers inside a standstill tram that is waiting at a station or traffic light). The unit starts with the stream of preprocessed random points. These points are clustered into stationary or moving points. Points that have nearly zero speed are considered stationary points. Further, stationary points are clustered into either stations or traffic lights. We have used DBSCAN to detect the stationary points. The optimal parameters MinPts* and Epsilon* for the collected data have been found based on ground truth data (the true locations of the tram stations and traffic lights) and repeated application of the DBSCAN algorithm with a range of values for MinPts and Epsilon. The centroids of the found clusters are calculated and matched to the true centroids. If the matching distance is less than a small distance threshold (DT), it is considered a positive hit. Figure 6 illustrates several ROC curves corresponding to different trials of the DBSCAN algorithm (and different values of MinPts, Epsilon, DT). From Figure 6, the optimal parameters MinPts*, Epsilon* and DT* that achieve maximum TPR (True Positive Rate) = 0.9535 and minimum FPR (False Positive Rate) = 0.2432, with DT = 0.0003, have been recorded and listed in Table II.

TABLE II: Optimal clustering parameters and corresponding performance metrics

Parameter/Metric              Value
MinPts                        100
Epsilon $\epsilon$            0.0002°
Distance Threshold DT         0.0003°
TPR (True Positive Rate)      0.9535
FPR (False Positive Rate)     0.2432
Precision                     90.1%
Recall                        95.35%

Note that $\epsilon$ and DT are in degrees: 1° ≈ 111 km (110.57 km equatorial, 111.70 km polar); 1′ ≈ 1.85 km (= 1 nautical mile); 1″ ≈ 30.9 m; 0.01° ≈ 1.11 km; 0.0001° ≈ 11.1 m.

Fig. 6: Optimal Clustering Parameters (ROC space, Distance Threshold = 0.0003; axes: FPR or (1 − Specificity), TPR or Sensitivity).

Fig. 7: The red box denotes a tram waiting at a traffic light between S 16 and S 17 stations (blue boxes). Panel: DBSCAN clustering ($\epsilon$ = 0.0001, MinPts = 60) with the ground-truth station locations; axes: Longitude, Latitude.

2) Station Location Detection Unit: The function of this component is to discriminate between stations and traffic lights based on either temporal or spatial methods. From the temporal point of view, the real data shows that, in general,
the magnitude of $E\langle w_s \rangle$ is greater than $E\langle w_f \rangle$. Figure 8 shows samples of stations' and traffic lights' waiting times ($E\langle w_s \rangle$, $E\langle w_f \rangle$) of trams in one of the tram lines. At S 5 station one may note that the waiting times are, in general, longer than at other stations because all types of waiting times ($E\langle w_s \rangle$, $E\langle w_f \rangle$ and $E\langle w_{bf} \rangle$) usually occur at this station.

From the spatial point of view, both stations and traffic lights are places of frequent stops. However, the frequency of stops at stations is considerably larger than at traffic lights. Moreover, stations and traffic lights can be differentiated based on the passengers' in/out-flow patterns to/from the trams at these locations. Commuters engage in different activities while getting on/off trams, such as moving up/down, standing, sitting, turning, etc. Since travelers are expected to get on/off a tram only at stations while remaining relatively motionless at traffic lights, the corresponding passengers' activity patterns can discriminate between these two situations. These activity patterns [27], [28] (illustrated in Table III) can be recognized using either dedicated wearable sensors or the ubiquitous sensors in smart phones via pervasive computing [29].

TABLE III: Getting on/off passengers' activity patterns.

Activity: Getting on a Tram (*)
Activity Description: Standing, walking, turning left or right, walking, climbing up stairs, turning left or right, walking again till reaching a seat, turning left or right 90° or 180°, sitting down, sitting for a long time.

Activity: Getting off a Tram (**)
Activity Description: Sitting, standing up, turning left or right 90° or 180°, walking till reaching the door, turning left or right to be in front of the stairs, getting down, walking on the ground.

(*) Note that the Getting On sequence will start from steady standing in the station.
(**) Note that the Getting Off sequence will start from a steady sitting state in the tram.

3) Platform Length Estimation Unit: The function of this unit is to estimate the platform length after splitting stations from traffic lights using the previous unit. Platform length detection identifies whether the tram reached a station or is just buffering outside the station, as shown in Figure 7. The minimum bounding rectangle (MBR) of the clustered points representing a station is used to estimate the platform length. The length of the larger side of the bounding rectangle represents the platform length. Given that $(Long_1, Lat_1)$ and $(Long_2, Lat_2)$ are the coordinates of the end points of the larger side, Eq. (2) shows how to calculate a station platform length ($D$) using the Haversine formula¹ [30].

\[
\begin{aligned}
(D_{long}, D_{lat}) &= (Long_2, Lat_2) - (Long_1, Lat_1) \\
A &= \cos(Lat_1) \cdot \cos(Lat_2) \cdot \left(\sin\left(\frac{D_{long}}{2}\right)\right)^2 + \left(\sin\left(\frac{D_{lat}}{2}\right)\right)^2 \\
C &= 2 \cdot \operatorname{atan2}\left(\sqrt{A}, \sqrt{1 - A}\right) \\
D &= R \cdot C \quad \text{(where $R$ is the radius of the Earth)}
\end{aligned}
\tag{2}
\]

¹ This formula does not take into consideration the (ellipsoidal) shape of the Earth. It will tend to overestimate trans-polar distances and underestimate trans-equatorial distances. The values used for the radius of the Earth (3961 miles, 6373 km) are optimized for locations around 39 degrees from the equator (roughly the latitude of Washington, DC, USA).

We could correctly estimate stations' platform lengths with an accuracy of 95.7%. For example, the true platform length of S 17 station is 70 meters, while the estimated length is 67 meters. As a result, we can integrate new stations into Google Maps and other similar services.

4) Stations locations database unit: The function of this unit is to store stations' locations (Long, Lat) and dimensions and to add new stations directly after finding them from previous components.

C. Dynamic Time Estimation Component.

This component estimates the expectation of each of the random parameters defined in Eq. 1. Towards this goal, the distribution governing each parameter is plotted and its expectation is calculated. For example, Figure 8 plots samples of the waiting time ($E\langle w_s \rangle$) for different stations for many trips. A GPS sample observation is identified as belonging to a passenger in a tram waiting at a station if the spatial coordinates lie within the station boundaries (identified in the previous sub-section). The time stamp of this observation belongs to the set of time stamps that define the tram waiting time interval. In some stations in Figure 8, such as S 5 station, the waiting time is long because it is directly followed by a traffic light. Based on real data, it ranges from 30 to 400 seconds.

Fig. 8: Stations' and traffic lights' samples waiting times ($E\langle w_s \rangle$, $E\langle w_f \rangle$) of trams in one of the tram lines.

Figure 9 shows the PDF (probability density function) of the waiting time $E\langle w_s \rangle$ random variable for three different stations as well as their CDFs. The KsTest² has been conducted successfully for the normality check. The red marks show the maximum absolute difference between the calculated and the hypothesized CDFs based on the following equation.

\[
D^{*} = \max_{x} \left( \left| \hat{f}(x) - G(x) \right| \right) \tag{3}
\]

where $\hat{f}(x)$ is the empirical CDF and $G(x)$ is the CDF of the hypothesized distribution.

Fig. 9: Waiting time distribution over a long time for several tram stations: (a) S 6 tram station, (b) S 6 CDF, (c) S 7 tram station, (d) S 7 CDF, (e) S 8 tram station, (f) S 8 CDF.

Station waiting times $E\langle w_s \rangle$ as well as travel times $E\langle w_{sg} \rangle$ affect a whole trip schedule. If an instance of a segment time duration $w_{sg} > E\langle w_{sg} \rangle$, this implies that an extra delay of $w_{sg} - E\langle w_{sg} \rangle$ is expected. To estimate $E\langle w_{sg} \rangle$, we have collected history data for a long period of time. Figure 10 plots samples of the traveling time between consecutive stations without traffic light interruptions ($E\langle w_{sg} \rangle$) or with traffic light interruptions ($E\langle w_{lg} \rangle$). Moreover, Figure 11 (a, c) and (b, d) show the PDF and CDF of $E\langle w_{sg} \rangle$ between consecutive stations without traffic light interruptions, whereas Figure 11 (e, f) shows the same distribution between two given consecutive stations with a traffic light interruption ($E\langle w_{lg} \rangle$).

Fig. 10: Travel time observations between successive stations ($w_{sg}$) or between stations and traffic lights ($w_{lg}$).

Since the random parameters $w_s$, $w_{sg}$, $w_f$, $w_{bf}$ and $w_{lg}$ have been empirically found to follow normal distributions with means $\mu_s$, $\mu_{sg}$, $\mu_f$, $\mu_{bf}$, $\mu_{lg}$ and variances $\sigma^2_s$, $\sigma^2_{sg}$, $\sigma^2_f$, $\sigma^2_{bf}$, $\sigma^2_{lg}$, respectively, a trip time follows a normal distribution of the form:

\[
\mathcal{N}\left( \sum_{i=1}^{n} c_i \mu_i, \; \sum_{i=1}^{n} c_i^2 \sigma_i^2 \right) \tag{4}
\]

If a passenger rides a tram from S 7 station (source) heading to S 9 station (destination), the expected trip time would be:

\[
\begin{aligned}
E\langle w_t \rangle &= \mu_{S7\,Station} + \mu_{S7/S8\,Segment} + \mu_{S8\,Station} + \mu_{S8/S9\,Segment} \\
&= 44 + 29 + 76 + 28 = 177 \text{ seconds}
\end{aligned}
\tag{5}
\]

with standard error of:

\[
\begin{aligned}
\sigma_{total} &= \left( \sigma^2_{S7\,Station} + \sigma^2_{S7/S8\,Segment} + \sigma^2_{S8\,Station} + \sigma^2_{S8/S9\,Segment} \right)^{\frac{1}{2}} \\
&= \sqrt{20^2 + 12^2 + 34^2 + 14^2} = 14.5 \text{ seconds.}
\end{aligned}
\tag{6}
\]

Note that in Eq. 6 the waiting time at the destination station should not be taken into consideration. Trans-Sense estimates the arrival time with an accuracy of 91.81%.

² One-sample Kolmogorov-Smirnov test, which returns a test decision for the null hypothesis that the data in random variable x comes from a standard normal distribution, against the alternative that it does not come from such a distribution.

Fig. 11: Travel time distribution between different successive stations in the given LRT system: (a) S 7 to S 8, (b) S 7 to S 8 CDF, (c) S 8 to S 9, (d) S 8 to S 9 CDF, (e) S 17 to S 18, (f) S 17 to S 18 CDF.

D. History Component.

At initialization time, the system is bootstrapped using historical data. Historical data has been collected from many users over a long time duration. This data is useful in estimating and storing the means $\mu_s$, $\mu_{sg}$, $\mu_f$, $\mu_{bf}$, $\mu_{lg}$ and variances $\sigma^2_s$, $\sigma^2_{sg}$, $\sigma^2_f$, $\sigma^2_{bf}$ and $\sigma^2_{lg}$ characterizing the obtained distributions for the investigated random parameters. Because of the increase in population and traffic densities as well as possible service upgrades, there is a need to update the former parameters by a kind of stochastic adaptation (incremental learning)³.

³ The exact strategy of this type of learning is under investigation.

E. Tram ID and Direction Detection Component.

In the adopted LRT system, there are only two possible directions for a tram: from east to west or from west to east. A tram route direction can be estimated by getting the difference between two sets of GPS readings. As for the tram ID, it can be identified using map matching [31], [32] at splitting areas (see Figure 1).

IV. RELATED WORK

Special devices have been installed in transportation vehicles to estimate the transportation system parameters. This includes GPS devices or specific smart phones attached to the tracked transportation vehicles. However, independent GPS devices are relatively expensive and using a small number of smart phones (typically one per vehicle) does not provide enough data to estimate the required variables.

[33] experimented with a lightweight system for the prediction of bus arrival times that needs neither GPS equipment nor GPS-enabled mobile phones. Instead, it relies on using commodity mobile phones to sense the nearby celltower IDs and to record the beep audio responses of the IC transit card readers deployed for collecting bus fees. A beep sound is used to confirm that a passenger is in a bus, while a celltower sequence is associated with a bus route.

EasyTracker [34] enables bus tracking by analysing GPS traces collected from a mobile phone installed in each bus. This requires a lot of time to converge. In addition, it cannot be applied to vehicles that go inside tunnels. This is not the case for Trans-Sense, which can identify the get on/get off sequences using the passenger behaviour.

The UrbanEye system [35] describes some peculiarities of bus transit systems, especially in developing countries. These include chaotic stoppage patterns and unpredictable speed variations. However, analyzing light-rail systems can lead to better results as their motion and stopping patterns are more predictable.

Estimating the location of a cellular phone or transportation vehicle can be based on different sensors including GPS, sensor-based landmarks [16], [32], or cellular signals [13], [14], [36], [37]. In this paper, we leveraged the GPS system in smart phones. However, other localization systems with comparable accuracy to GPS and lower energy consumption can also be used.

V. CONCLUSION

This paper proposes Trans-Sense, which estimates the expected waiting time of a passenger for getting on/off an LRT. Trans-Sense components are explained in detail. The basic idea is that a phone's speed can be correlated with the GPS data to extract the semantics hidden in the filtered samples, such as the average tram waiting time at stations and traffic lights and the travel time between stations. This makes it possible to construct a real-time tram schedule for the investigated system. Over 800 hours of daily passengers' traces have been collected using different tram lines at different time periods. Trans-Sense achieved an average recall and precision of 95.35% and 90.1%, respectively, in discriminating between stations and traffic lights. Moreover, Trans-Sense is able to calculate the stations' dimensions with an accuracy of 95.714% and can incorporate more stations based only on the information provided from GPS. The system estimates the right time of arrival with an accuracy of 91.81%.

The Trans-Sense working scenario, described in this paper, can be easily applied to trains/metros whose stopping patterns are predictable as well as to similar light-rail transportation systems with a larger number of lines/routes. Generalization to vehicles with chaotic stoppage patterns as well as the exact strategy of the stochastic learning for the adaptation of the learned distributions are future research topics.

REFERENCES

[1] E. St-Louis, K. Manaugh, D. v. Lierop, and A. El-Geneidy, “The happy commuter: A comparison of commuter satisfaction across modes,” Transportation Research Part F: Traffic Psychology and Behaviour, vol. 26, pp. 160–170, 2014. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1369847814001107
[2] Y. Tyrinopoulos and C. Antoniou, “Public transit user satisfaction: Variability and policy implications,” Transport Policy, vol. 15, no. 4, pp. 260–272, 2008. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0967070X08000346
[3] K. E. Watkins, B. Ferris, A. Borning, G. S. Rutherford, and D. Layton, “Where Is My Bus? Impact of mobile real-time information on the perceived and actual wait time of transit riders,” Transportation Research Part A: Policy and Practice, vol. 45, no. 8, pp. 839–848, 2011. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0965856411001030
[4] A. El-Geneidy, J. Hourdos, and J. Horning, “Bus Transit Service Planning and Operations in a Competitive Environment,” Journal of Public Transportation, vol. 12, no. 3, pp. 39–59, Sep. 2009. [Online]. Available: http://scholarcommons.usf.edu/jpt/vol12/iss3/3/
[5] E. A. Vasconcellos, Urban Transport Environment and Equity: The case for developing countries. Routledge, 2014.
[6] P. Muhammad Tahir Masood PhD, “Transportation problems in developing countries pakistan: a case-in-point,” International Journal of Business and management, vol. 6, no. 11, p. 256, 2011.
[7] V. Jain, A. Sharma, and L. Subramanian, “Road traffic congestion in the developing world,” in Proceedings of the 2nd ACM Symposium on Computing for Development, ser. ACM DEV ’12. New York, NY, USA: ACM, 2012, pp. 11:1–11:10. [Online]. Available: http://doi.acm.org/10.1145/2160601.2160616
[8] B. Strauch, Investigating human error: Incidents, accidents, and complex systems. CRC Press, 2017.
[9] Y. Fan, A. Guthrie, and D. Levinson, “Waiting time perceptions at transit stops and stations: Effects of basic amenities, gender, and security,” Transportation Research, Part A: Policy and Practice, vol. 88, pp. 251–264, 6 2016.
[10] J. B. Ingvardson, O. A. Nielsen, S. Raveau, and B. F. Nielsen, “Passenger arrival and waiting time distributions dependent on train service frequency and station characteristics: A smart card data
[20] B. Townsend and P. Fenton, “A practical approach to the reduction of pseudorange multipath errors in a l1 gps receiver,” in Proceedings of the 7th International Technical Meeting of the Satellite Division of the Institute of Navigation, Salt Lake City, UT, USA, 1994.
[21] R. D. Van Nee, J. Siereveld, P. C. Fenton, and B. R. Townsend, “The multipath estimating delay lock loop: approaching theoretical accuracy limits,” in Position Location and Navigation Symposium, 1994., IEEE. IEEE, 1994, pp. 246–251.
[22] J. Soubielle, I. Fijalkow, P. Duvaut, and A. Bibaut, “Gps positioning in a multipath environment,” IEEE Transactions on Signal Processing, vol. 50, no. 1, pp. 141–150, 2002.
[23] D. J. Wheeler, D. S. Chambers et al., Understanding statistical process control. SPC press, 1992.
[24] Z. Akbari and R. Unland, Automated Determination of the Input Parameter of DBSCAN Based on Outlier Detection. Cham: Springer International Publishing, 2016, pp. 280–291. [Online]. Available: https://doi.org/10.1007/978-3-319-44944-9_24
[25] K. Black, Business statistics: for contemporary decision making. John Wiley & Sons, 2011.
[26] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, “Density-based spatial clustering of applications with noise,” in Int. Conf. Knowledge Discovery and Data Mining, vol. 240, 1996.
[27] M. Elhamshary, M. Youssef, A. Uchiyama, H. Yamaguchi, and T. Higashino, “Crowdmeter: Congestion level estimation in railway stations using smartphones,” in IEEE International Conference on Pervasive Computing and Communications (PerCom 2018), 03 2018.
[28] M. Elhamshary, M. Youssef, A. Uchiyama, H. Yamaguchi, and T. Higashino, “Activity recognition of railway passengers by fusion of low-power sensors in mobile phones,” in Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems, ser. SIGSPATIAL ’15. New York, NY, USA: ACM, 2015, pp. 57:1–57:2. [Online]. Available: http://doi.acm.org/10.1145/2820783.2820847
[29] E. Moustafa, U. Akira, Y. Hirozumi, and H.
Teruo, “Landmarksense: analysis,” Transportation Research Part C: Emerging Technologies, A mobile sensing system for automatic detection of railway stations vol. 90, pp. 292 – 306, 2018. [Online]. Available: http://www. landmarks,” Osaka University, Tech. Rep. 20, dec 2015. sciencedirect.com/science/article/pii/S0968090X1830319X [30] B. Chamberlain, “Nasa jet propulsion laboratory (jpl) - space [11] M. Wardman, “Roundtable summary and conclusions,” p. 74, 2014. mission and science news, videos and images.” [Online]. Available: [12] T. Nygaard, M.F., “Waiting time strategy for passen- https://andrew.hedges.name/experiments/haversine/ gers,” in In: Proceedings from the Annual Transport Conference at [31] H. Aly and M. Youssef, “semmatch: Road semantics-based accurate Aalborg University, ser. ACM DEV ’12, 2016, pp. 14:1–11:10. map matching for challenging positioning data,” in Proceedings [13] A. M. AbdelAziz and M. Youssef, “The diversity and scale matter: of the 23rd SIGSPATIAL International Conference on Advances Ubiquitous transportation mode detection using single cell tower in- in Geographic Information Systems, ser. SIGSPATIAL ’15. New formation,” in 2015 IEEE 81st Vehicular Technology Conference (VTC York, NY, USA: ACM, 2015, pp. 5:1–5:10. [Online]. Available: Spring), May 2015, pp. 1–5. http://doi.acm.org/10.1145/2820783.2820824 [14] A. M. AbdelAziz, “Poster: Monosense: An energy efficient [32] R. Mohamed, H. Aly, and M. Youssef, “Accurate real-time map match- transportation mode detection system,” in Proceedings of the 14th ing for challenging environments,” IEEE Transactions on Intelligent Annual International Conference on Mobile Systems, Applications, Transportation Systems, vol. 18, no. 4, pp. 847–857, April 2017. and Services Companion, ser. MobiSys ’16 Companion. New [33] P. Zhou, Y. Zheng, and M. Li, “How long to wait?: Predicting York, NY, USA: ACM, 2016, pp. 3–3. [Online]. 
Available: bus arrival time with mobile phone based participatory sensing,” http://doi.acm.org/10.1145/2938559.2948798 in Proceedings of the 10th International Conference on Mobile [15] A. Mao, C. G. Harrison, and T. H. Dixon, “Noise in gps coordinate Systems, Applications, and Services, ser. MobiSys ’12. New time series,” Journal of Geophysical Research: Solid Earth, vol. 104, York, NY, USA: ACM, 2012, pp. 379–392. [Online]. Available: no. B2, pp. 2797–2816, 1999. http://doi.acm.org/10.1145/2307636.2307671 [16] H. Aly and M. Youssef, “Dejavu: An accurate energy-efficient outdoor [34] J. Biagioni, T. Gerlich, T. Merrifield, and J. Eriksson, “Easytracker: localization system,” in Proceedings of the 21st ACM SIGSPATIAL Automatic transit tracking, mapping, and arrival time prediction International Conference on Advances in Geographic Information using smartphones,” in Proceedings of the 9th ACM Conference Systems, ser. SIGSPATIAL’13. New York, NY, USA: ACM, 2013, on Embedded Networked Sensor Systems, ser. SenSys ’11. New pp. 154–163. [Online]. Available: http://doi.acm.org/10.1145/2525314. York, NY, USA: ACM, 2011, pp. 68–81. [Online]. Available: 2525338 http://doi.acm.org/10.1145/2070942.2070950 [17] A. Giremus, J.-Y. Tourneret, and V. Calmettes, “A particle filtering [35] R. Verma, A. Shrivastava, B. Mitra, S. Saha, N. Ganguly, S. Nandi, and approach for joint detection/estimation of multipath effects on gps S. Chakraborty, “Urbaneye: An outdoor localization system for public measurements,” IEEE Transactions on Signal Processing, vol. 55, no. 4, transport,” IEEE INFOCOM 2016 - The 35th Annual IEEE International pp. 1275–1285, 2007. Conference on Computer Communications, pp. 1–9, 2016. [18] T. Kos, I. Markezic, and J. Pokrajcic, “Effects of multipath reception [36] M. Ibrahim and M. Youssef, “CellSense: An accurate energy-efficient on gps positioning performance,” in Elmar, 2010 Proceedings. IEEE, GSM positioning system,” IEEE TVT, 2012. 2010, pp. 399–402. [37] M. 
Abbas, M. Elhamshary, H. Rizk, M. Torki, and M. Youssef, [19] M. M. Chansarkar, “Resolving time ambiguity in gps using over- “Wideep: Wifi-based accurate and robust indoor localization system determined navigation solution,” Sep. 9 2003, uS Patent 6,618,670. using deep learning,” 01 2019.