Optimizing travel routes using temporal networks constructed from GPS data
Tatsuro Mukai, Yuichi Ikeda Graduate School of Advanced Integrated Studies in Human Survivability Kyoto University Kyoto, Japan [email protected], [email protected]
Abstract—Because of the complexity of urban transportation mation terminals, such as smartphones, are now widely used networks and the temporal changes in traffic conditions, it and carried by most individuals at all times. Therefore, it is difficult to assess real-time traffic situations. However, the has become possible to compose big data from the global development of information terminals has made it easier to obtain personal mobility information. In this study, we propose positioning system (GPS) measured using these devices and methods for evaluating the mobility of people in a city using use the data to analyze movement [6]. global positioning system data. There are two main methods for To analyze human mobility, network science is useful [7][8]. evaluating movement. One is to create a temporal network from Networks whose connection status changes with time are real data and check the change in travel time according to time called temporal networks [9]. They are more difficult to eval- zones or seasons. Temporal networks are difficult to evaluate because of their time complexity, and in this study, we proposed uate than static networks because of their time-varying nature, an evaluation method using the probability density function of but since they represent real-world problems well, they have travel time. The other method is to define a time-dependent been used in various fields such as interpersonal communica- traveling salesman problem and find an efficient traveling route tion [10][11][12][13], transmission to an unspecified number by finding the shortest path. By creating a time-dependent of people using SNS and the web [14][15][16][17], physi- traveling salesman problem in an existing city and solving it, a traveler can choose an efficient route by considering traffic cal contact [18][19][20][21], and cytology [22][23][24][25]. conditions at different times of the day. We used 2 months of Temporal networks are also preferred to represent movement data from Kyoto City to conduct a traffic evaluation as a case as a network, to capture changes in traffic conditions over study. time. Kujala et al. used temporal networks to evaluate the Index Terms—Temporal network, Dynamic network, Route public transportation system in Helsinki [26]. They constructed recommendation the networks from timetables and open map information and evaluated public transportation by focusing on travel time and I.INTRODUCTION the number of transfers. One of the major issues confronting urban areas is trans- Besides, the shortest path problem can be used to find portation planning [1]. Smooth transportation contributes sig- an efficient travel route. In recent years, the shortest path nificantly to residents’ life satisfaction, but transportation problem is as a real-world problem with a time component planning is becoming more difficult because of the global [27]. Heuristic solutions to such a problem, such as the time- population growth in urban areas. In addition, the development dependent traveling salesman problem (TDTSP), have been of tourism has increased the number of visitors to these areas, devised [28][29][30][31]. and a known tourism problem has emerged [2]. This negatively In this study, we propose two methods: one is to evaluate affects both visitors and residents [2][3][4]. human mobility using temporal networks constructed from arXiv:2106.00328v1 [cs.SI] 1 Jun 2021 Three reasons make road traffic in urban areas difficult GPS data, and the other is to search for the shortest path by [5]. First, the urban road networks are typically compli- constructing and solving the TDTSP. In both methods, location cated. Cities are often densely populated with tunnels and information measured from smartphones is converted into a elevated structures that, with numerous roads, form a complex timetable of location transitions called ”transfer connections”, transportation network. Second, travelers may be heading for from which an optimal set of paths to the destination is different destinations. This creates a great deal of uncertainty obtained to construct mobility networks. In the evaluation of in transportation planning as the number of travelers increases. movement using temporal networks, networks are visualized Third, the road traffic situation changes over time. Depending and compared according to seasons, time zones, and whether on traffic congestion, travelers may change their transportation the moving person is a resident or a visitor. In a methodology and traveling route choices, and travel times for the same of finding the shortest path, the time weights of the TDTSP are traveling route can often vary significantly. determined from the set of optimal paths, and the ant colony However, thanks to the development of information termi- optimization (ACO) method is used to solve the problem. nals in recent years, it has become easier to obtain mobility In the application of the ACO method, a congestion level is data and provide mobility information to individuals. Infor- obtained from GPS and used in the calculation. The main contributions of this study include the following: destination, update the label. Repeat until you have scanned • By applying existing methods for constructing temporal all of C. networks and evaluating public transportation, we con- 2) pCSA: An extension of the CSA for more complex struct temporal networks from the GPS of smartphones. scenarios is the profile CSA (pCSA). This algorithm returns a With the GPS, it is possible to take into account other set of Pareto-optimal (departure time, arrival time) paths from means of transportation other than public transportation, each stop to a certain destination. Pareto-optimal here means and it is expected that the evaluation will be realistic, that the departure time is optimized to be slower and the arrival including delays in transportation because of congestion. time is optimized to be faster. With the development of information terminals in recent This algorithm uses an array C of timetable connections years, GPS has been commonly used, so it can be easily sorted in descending order of departure time. Whengiven a extended to other cities. Temporal networks can describe destination pt as input, it creates the empty set of Pareto- real-world problems more precisely than can static net- optimal paths for each stop, c ∈ C is scanned in turn, and works, but evaluating temporal networks is difficult. In check if pt is reachable. If it is reachable and (τdep(c), τ∗) is this study, we proposed a new evaluation method for Pareto-optimal compared to the set of paths held by the starting mobile networks using probability density functions. point of c, it is added to the set. Here τdep(c) represents the • In this study, we determined the time weights of edges departure time of c and τ∗ represents the arrival time to pt. in the TDTSP from GPS data based on the method used Do this until you have scanned all of C. to create temporal networks. Although existing studies 3) mcpCSA: The mcpCSA is an extension of the pCSA. have created the TDTSP from GPS data, the weights The mcpCSA extends the Pareto-optimal (departure time, were determined originally. In addition, to apply ACO arrival time) treated in the pCSA to the multi-criteria Pareto- the TDTSP, we decide a congestion level from the GPS optimal like (departure time, arrival time, number of transfers). and calculate the transition probability using this level. B. Time Dependent Traveling Salesman Problem Selecting a route is difficult if the destination is crowded at certain times of the day. To reduce traffic congestion, it is important to recommend efficient travel routes to facilitate movement. To determine the The rest of the paper is organized as follows. In Section II, travel route, we formulate and solve the TDTSP. we introduce the basic methodology. In Section III, we discuss 1) Problem Formulation: The TDTSP is a shortest path improvements to the basic methodology. In Section. IV, we problem on a time-dependent network represented by the describe the proposed method. Following that we discuss the directed graph G = (V, E, W, T ). Here V represents the set of experimental results in Section V. Finally, we conclude our nodes, E represents the set of edges, W represents the set of study in Section VI. time weights, and the time interval T is the set of time-varying II.BASIC METHODOLOGY periods. The weight w(vi, vj, τ) ∈ W depends on the starting point A. Temporal network configuration method vi ∈ V , the ending point vj ∈ V , and the time τ ∈ T . Although networks are used in various fields, several real- P ath(v1, vn) = [v1, v2., vn] represents the transition per- world networks change over time. Networks, those in which mutation of nodes from v1 ∈ V to vn ∈ V . the edge connections change with time are called temporal When the time starting from v1 is τ0 ∈ T , the sum networks [9]. Temporal networks are often more realistic than of the time weights of P ath(v1, vn), fτ0 (P ath(v1, vn)), is static networks and have been used in road traffic analysis. calculated recursively as follows, Kujala et al. [26] constructed a temporal network from pub- lic transport timetables to model the movement of people. f (v , v ) = w(v , v , τ ) , (1) They used a multi-criteria profile connection scan algorithm τ0 1 2 1 2 0 f (v , v ) = w(v , v , τ + f (v , v )) (mcpCSA) to construct the network. This algorithm is used τ0 1 i i−1 i 0 τ0 1 i−1 (2) to compute optimal travels in a dynamic public transportation +fτ0 (v1, vi−1). network, which can represent many events occurring between The TDTSP is a combinatorial optimization problem to find nodes as a temporal network. a path such that fτ0 (P ath(v1, vn)) is minimized. 1) CSA: The mcpCSA is an extension of the connection 2) ACO: The ACO method [32] can produce a solution to scan algorithm (CSA). The CSA can calculate the fastest time the TDTSP [28][29][30][31]. to reach each stop from a given starting point. In the ACO method, each edge e(vi, vj) ∈ E has a Instead of using a graph as Dijkstra’s algorithm does, this pheromone σi,j(τ) for each time τ ∈ T . The agent k at node algorithm uses a single array C of timetable connections sorted k i ∈ V at time τ calculates the transition probability pi,j(τ) to by departure time. Upon receiving the origin ps and departure a next node according to the pheromone, time τ as input, labels τ(p), which represent the shortest arrival time at each stop, are initialized to infinity. Then c ∈ C is α β ( [σi,j (τ)] ·[ηi,j (τ)] P α β if j ∈ Ω scanned to determine whether it is reachable. If it is reachable k [σi,s(τ)] ·[ηi,s(τ)] pi,j(τ) = s∈Ω . (3) and the arrival time of c improves the label τ(parr(c)) of the 0 else TABLE I: Number of daily IDs and logs of GPS data per day in each month of 2019. Month The number of ID per day The number of logs per day February 31,456 1,672,421 April 32,498 1,599,644
The next node may be randomly determined using a certain probability, to avoid a local solution. Here Ω is the node that agent k has not visited yet, ηij(τ) is the heuristic information, and parameters α and β are constants that control the relative importance of pheromone versus heuristic information ηij. The heuristic information ηij is calculated using the follow- ing equation, Q ηi,j(τ) = . (4) w(vi, vj, τ)
Q represents a constant and w(vi, vj, τ) is the weight of e(i, j) at time τ.
III.IMPROVEMENTS In this study, we have independently improved the method Fig. 1: The total number of data logs by time zone. Notably, described in Section II to suit our purposes. number of measurements is high during the daytime when people are actively moving and low at night. A. Improvement of pCSA In this study, we used the pCSA to construct a temporal network. It is an algorithm to calculate the optimal traveling route in a dynamic public transportation network. Since this The update of the basic pheromone is represented by the research uses GPS, we processed the GPS data so that the following equation, pCSA could use it. Transfer connections are formed from GPS movements of σ0(i,j) = ρ · σ0(i,j) + ∆σi,j. (7) the same terminal. GPS data show the location by latitude and longitude, but the granularity is too fine, so we divide the map ρ is the rate at which the pheromone remains after evapo- into meshes and form transfer connections by mesh transitions. ration. The increment of pheromone ∆σi,j is determined by ant k, B. Improvement of ACO m In this study, we set a congestion level and obtain the X k ∆σi,j = ∆σi,j, (8) pheromone for each period from the congestion level. k=1 ( Q The pheromone σi,j(τ) is given as follows: k γk if ant k pass the road e (i, j) ∆σi,j = . (9) σ 0 else σ (τ) = 0(i,j) . (5) i,j θ (τ) j Q is a constant and γkis the total moving cost of ant k. σ0(i,j) is the basic pheromone of e(i, j), and θ(τ) is the congestion level of the destination at time τ. IV. METHODOLOGY Congestion is given as a number between 0 and 1, and the With the development of information terminals, it has be- more congested a location is, the more difficult it is to be come possible to track individual movements. Even handheld chosen. Congestion is expressed by the following equation, ,including smartphones, can use the GPS. In this trend, it is becoming possible to analyze big data of lj(τ) θ (τ) = . (6) personal mobility data. It is expected that with large amounts j l j(max) of data, mobility can be evaluated in greater detail, such as We can detect the number of people at node j for each time when and where people are more likely to move. from the GPS data. lmax is the number of people at node j In this study, we propose methods for analyzing mobility per time when the number of people staying at node j was using temporal networks and creating the TDTSP to search the highest, and l(τ) is the number of people at node j during for the shortest path with meta-heuristics, using actual GPS time τ. data of individuals. B. Movement analysis with temporal network We constructed a temporal network from GPS data and analyzed the movement of people. Figure 3 shows the research flowa, and the details are as follows. • Sort GPS data by the terminal measured. • Transfer connections formed from the same terminal GPS data, and an array C is created by sorting them in descending order of the departure time. • Calculate sets of optimal paths to the destination pt with the pCSA. • Analyze the resulting set. To use the pCSA, GPS data are converted into transfer connections as described above. In this study, we divided the study area into 50-m square meshes and formed transfer connections between the meshes. From these transfer connections, the pCSA is used to cal- culate the optimal path to the destination. The pCSA provides sets of optimized paths to the input pt. For analysis, we compared the travel time by seasons and time zones by creating a probability density function of the travel time to the destination. This will be discussed in detail Fig. 2: Estimated percentage of residential areas in all logs. in the next section. Approximately 80% were Kyoto residents, and 1% were visitors from abroad. C. Travel route search using meta-heuristics The TDTSP from GPS data was used to calculate the shortest path. Figure 3 shows the research flow, and the details are as follows. A. GPS data • Calculate the optimal path from the GPS data in the same We used GPS data provided by Agoop Inc. These data were way as described above. obtained from users of a smartphone application who gave • From the obtained sets of optimal paths, determine the their consent. The data are linked to a daily ID that identifies travel cost and TDTSP. a device and allows it to be tracked for 1 day. • Meta-heuristics explore paths with low travel costs This study used 58 days of data in February and April 2019 around specified points. measured in Kyoto. In 2019, February was the month when With the pCSA, we find the optimal sets of paths to move Kyoto had the fewest number of tourists, and April was the the edges in the network. Afterward, we need to determine the month when Kyoto had the highest number of tourists. Table time weight W of the dynamic graph G. I shows the amount of data measured in each month. The We now consider how to determine the time weight number of terminals used for the measurement was higher w(vi, vj, τ) from node vi to vj. We use the pCSA with vj as in April, but the number of logs was lower. The GPS data input, and divide the sets of optimal paths obtained by dividing used in this study is obtained when the user performs some the sets by time τ ∈ T . Considering real-world scenarios, action when the application is launched. While the application w(vi, vj, τ) should be close to the cost of moving directly is running in the background, it measures when a user moves from vi to vj in that period. Since there are many paths with significantly or stays in facilities, or after a certain amount of long travel times formed by connecting intermediate points time, depending on the OS. Since the number of logs depends using the pCSA, we adopt the shortest travel time in the lower on user activities, it is difficult to determine the cause of the probability 5% of subgroups as w(vi, vj, τ) in this study. low number of logs in April. The search for the solution of the TDTSP is calculated on Figure 1 shows the number of logs measured by the hour the basis of Eqs. (3) - (9). for all data. It shows that the number of logs is low during the late-night hours when people are immobile, and the number V. RESULTS AND DISCUSSION of logs increases as people become mobile around 8:00. Several case studies of movement analysis using temporal These data are also accompanied by an estimate of the networks and shortest travel route search using meta-heuristics residence location of the terminal owner. The percentage of are conducted with location information observed in Kyoto all logs classified by immigration is shown in Fig. 2. Twenty City. percent of visitors were from outside Kyoto Prefecture, and Figure 4 is a map of Kyoto City showing the locations 1% of all visitors were foreigners. used in the case study. The locations surrounded by a red - - - - C C