Discovering the Space-Time Dimensions of Schedule Padding and Delay from GTFS and Real-Time Transit Data

Thesis submitted in May of 2015 to the

Graduate School of the University of Cincinnati

In partial fulfillment of the requirements for the degree of Master of Arts

In the Department of Geography

of the College of Arts and Sciences by

Nathan S. Wessel

Bachelor of Urban Planning,

University of Cincinnati

Committee chaired by Dr. Michael Widener

Abstract

Schedule padding is the extra time added to transit schedules due to expected random variability in travel times throughout a route. To this point, methods for applying padding to certain route segments and times have been relatively unsophisticated, largely reacting to observed changes in travel time variability relative to the existing schedule. By comparing schedule data and real-time vehicle locations, we aim to locate the segments of routes that are most affected by this random variability, and thus have the most padding. These segments could most benefit from targeted delay reduction techniques, such as signal prioritization or multi-door boarding. We also outline cartographic methods that could be used to depict such results to lay people and policy-makers. Our approach is relevant to any city with both General Transit Feed Specification (GTFS) data and a real-time vehicle location feed, though we take a single large city as our case study. For this research, we focus on Toronto, Ontario, and the Toronto Transit Commission. We use real-time transit vehicle locations, obtained from a publicly available API, to establish what we take to be a reasonable maximum speed for each segment of a route, or any set of routes. We then compare this ideal performance to the scheduled performance, derived from GTFS data, where the difference between the two can be interpreted as the amount of schedule padding on a segment at a particular time. Since schedule padding is a response to stochastic delay, this technique should lead us to the spatio-temporal locations of the most significant sources of delay and guide attempts to reduce delay and maintain acceptable on-time performance. Choke-points are observed to crop up in expected places, such as downtown at rush hour or near major signalized intersections. More interestingly, we are able to begin quantifying delays in one part of a transit system and comparing them to delays anywhere else in the same system. This should help to guide infrastructure investments that can minimize the impacts of random delay and to justify such expenses with explicit potential time or money savings. Transit delay has been discussed extensively in the literature and in the press, but this delay is relative to a schedule which can be fairly arbitrary. This thesis is novel in its emphasis on schedule padding as a signal of avoidable delay below the level of the scheduled expectation.

Table of Contents

List of Figures, Maps, Tables and Videos...... V
1 Introduction...... 1
2 Literature Review...... 5
2.1 Data-derived performance measures for industry...... 5
2.2 The great opening of transit data and the real-time revolution...... 6
2.3 Performance measures outside of industry...... 9
2.4 What remains to be considered...... 11
3 Outline of the Project...... 12
4 Passengers as Sources of Delay...... 15
4.1 Stopping...... 15
4.2 Internal congestion...... 16
5 Data...... 18
5.1 General Transit Feed Specification...... 18
5.1.1 Simplification, Segmentation, and Summation...... 18
5.2 Real-time Vehicle Locations...... 21
5.2.1 Handling Errors...... 22
5.2.2 Distinguishing Tracks...... 23
6 Taking Measures...... 25
6.1 Establishing reasonable minimum speeds on edges...... 27
6.2 Calculating schedule padding...... 28
7 Cartographic Depiction...... 31
7.1 Maps...... 31
7.2 Animated Maps...... 57
8 Interpretation and Discussion...... 61
8.1 Do the results match basic expectations?...... 61
8.2 Some exemplary segments...... 63
9 Concluding remarks...... 67
9.1 Lessons learnt and directions for future research...... 67
9.2 In Summary...... 70
References...... 72
Appendix 1: Code...... 76
Appendix 2: Example padding calculation...... 84

List of Figures, Maps, Tables and Videos

Figures

1. State-space of schedule padding and random delay. page 3.

2. A metaphor for schedule padding. page 12.

3. Map showing redundancy of stops in TTC GTFS data. page 19.

4. Map showing intended effect of stop clustering technique. page 20.

5. Map showing unintended effect of stop clustering technique. page 20.

6. Map showing sample polylines generated from NextBus API vehicle locations feed. page 22.

7. Map showing location reporting error in Downtown Toronto. page 23.

8. Taking measurements: matching vehicle tracks to scheduled edges. page 25.

9. Taking measurements: intersections of vehicle tracks with stop cluster geometries. page 26.

10. Taking measurements: points on line intersections from which the measurements are finally taken. page 26.

11. Cumulative distribution of service hours at observed speeds across all segments. page 28.

12. Scheduled temporal distribution of service hours and the proportion of them which appears to be padding. page 30.

13. Satellite image of East Eglinton. page 66.

Maps

1A & 1B: Observed mean speeds on network edges. pages 33, 35.

2A & 2B: Scheduled mean speeds on network edges. pages 37, 39.

3A & 3B: Observed fastest speeds (9th decile observations). pages 41, 43.

4A & 4B: Speed of fastest scheduled trips. pages 45, 47.

5A & 5B: Padding per trip per kilometer given 9th decile observations as maximum speed. pages 49, 51.

6A & 6B: Padding per trip per kilometer given fastest scheduled speeds as maximum speed. pages 53, 55.

7: Padding on Eglinton Avenue East between Laird Drive and Yonge Street. page 64.

8: Padding on Sheppard Avenue West from Allen Road to Keele Street. page 64.

9: Padding on Broadview Avenue and Pape Avenue. page 65.

Tables

1. Variables returned by the NextBus API's vehicleLocations command. page 21.

2. Correlation between various values on network edges. page 62.

Videos

1. Map 5A animated with 15 minute time steps. page 58.

2. Map 6A animated with 15 minute time steps. page 59.

1 Introduction

Public transportation is a complex phenomenon, fundamental to the growth and maintenance of major cities in the industrial and postindustrial parts of the world. A great deal has been written about public transit in recent decades, particularly as the bloom has worn off modernity's experiments in automotive mass-mobility. Most of this work demonstrates that there is little agreement, outside of the transit industry itself, about what constitutes good or appropriate transit, how much needs to be spent to get it right, or even what the real point of transit is supposed to be in the first place. One thing does seem certain though: where transit does exist, all else being equal, everyone wants it to be faster and more reliable.

In scheduled transit operations, speed and reliability can be thought of as being a function of three things:

1. Fixed speed limits of a legal, physical or social variety.

2. Random delay which a vehicle actually encounters along its route.

3. The amount of random delay planners expect a vehicle to encounter along its route, and for which they account in their schedules: this is called schedule padding.

The combination of these three factors determines how fast a scheduled transit service can and does operate, and the latter two determine by how much it should be expected to deviate from its schedule. Random delay is best understood as something not directly affected by schedules. It is anything that can delay a transit vehicle, or a vehicle generally, but which is not strictly predictable. This sort of delay is due to many possible factors [1]: traffic conditions of various sorts, wrecked or double-parked vehicles, the number of red traffic lights encountered, the number and type of passengers needing special attention, the number of tortoises observed to be crossing the street, etc. By contrast, any fixed ceiling on a vehicle's speed is outside of this category; speed limits include things like stop signs, actual legal speed limits, the laws of physics, and the psychological constraints enforced by the need to operate safely around humans [2]. Of course, 'random delay' does not imply that the randomness is unpredictable in the aggregate; this should be an intuitive concept: if someone is heading to a meeting, they often plan to leave a bit earlier than they think they absolutely need to so that if anything 'unexpected' happens, they are still likely to get there on time. The 'unexpected' is expected. Scheduled transit operations do the same thing on a larger scale, planning to allow some amount of extra time to get to each stop on the line or each 'timepoint'. If vehicles encounter no delay and arrive early, they can simply linger until their scheduled departure before moving on to the next stop. The extra time built into a schedule for this reason is called schedule padding. (The use of the exact phrase 'schedule padding' is surprisingly scarce in the academic literature, given that such a thing so clearly exists and commonly goes by that name. But rare or not, schedule padding has indeed been mentioned in government documents [3], academic research [4], and the popular press [5].) Another way of describing padding is to say that schedule padding is the difference between how fast you could get somewhere in the best case, and how fast you expect to be able to get there in the average or somewhat-worse-than-average case. To be clear, schedule padding is a response to random delay, paying in potential speed to purchase reliable adherence to a prearranged schedule [Figure 1].

If there were absolutely no chance of delay, there would be no need for schedule padding.

Ideally we could eliminate the potential for delay altogether; things like stop consolidation, transit-only lanes, signal prioritization or multi-door boarding all could have the effect of reducing travel time variability and increasing speed to the point where it bumps into stronger limits. In theory, and with enough resources, we could get much closer to operating a system not subject to significant random delay. Indeed, this sort of reliability is one of the benefits of something like a metro or subway system: a vehicle that stops at every single stop, and which takes roughly the same time to board one passenger as a hundred, and which never gets caught in highway gridlock should experience relatively little travel time variability.

Figure 1: The theoretical state space of schedule padding and delay.

To say prosaically what Figure 1 describes, there is an interaction between random delay and schedule padding that determines the overall speed and reliability of a trip. If no delay is encountered and no padding scheduled, then everything is running maximally fast and perfectly on schedule. If a great deal of delay is encountered and still there is no padding, then the trip runs very slowly and is also very late. If a great deal of delay is encountered and the padding is commensurate, then things are running very slowly but they are still on time. If a great deal of padding is scheduled but no delay encountered then the trip runs much slower than it needed to, but remains on time. There is an obvious balance between the two: padding should be directly proportional to the amount of delay expected, and the relationship, at least in theory, should go in that direction: expected random delay determines a proportionate response in padding from the makers of schedules.

Speed and reliability are, in scheduled transit services, good things; we want to go as fast as possible and we want to eliminate any discord between the schedule and reality. If speed and reliability are indeed a function of random delay and schedule padding, and schedule padding is itself a function of random delay, then the whole root of the matter could be unearthed by observing the incidence of random delay in space and time. This however is difficult, even conceptually, because vehicles are always simultaneously operating under the constraints of both the delay itself and expectations about that delay: schedule padding. Once padding is introduced into a schedule, it becomes difficult to tell whether a vehicle is slowing down for a delay or to adhere to its schedule or both. Therefore, instead of looking for the delay directly, we will look for schedule padding, which we hypothesize to be a direct function of actual travel time variability. Once we can quantify where and when schedule padding exists, we should also be able to quantify the local impacts of random delay. With quantifiable measures of the financial and time costs of delay, efforts to reduce delay and improve speed and reliability can be better targeted to the times and places where they can have the most impact. As civil governments decide what to do with their inevitably limited resources, quantitative measures of the potential for improvement should help to guide their efforts.

2 Literature Review

There are three broad areas of the literature that will be of interest here. In the first, we will explore the research, starting in the 1990s, which discusses the use of new technologies and the data they produced to create performance measures for transit agencies themselves. In the second area, we look at research done by academics and others outside of the transit industry once the same types of data became available to the general public. This research focuses on the way transit users use these data to navigate and on efforts to optimize these systems for actual use by passengers. In the third area of the literature, researchers, still generally outside of the transit industry, look to see what can be learned from taking a wider view of these data, applying them for purposes not originally intended.

2.1 Data-derived performance measures for industry

For as long as the technology has been available, transit agencies have been using computerized scheduling systems and, more recently, real-time vehicle position sensors and other types of onboard sensors to manage and assess their daily operations. Such tools have provided an abundance of data which has helped researchers and professionals develop techniques and performance measures to improve the efficiency of transit services. One of the most tangible examples of these are the real-time schedule displays now shown to many bus drivers throughout the course of their duties. The displays relate the vehicle's position to a clock and a digital schedule to give the driver an estimate of how far behind or ahead of schedule they are, allowing them to adjust their speed appropriately. Another innovation, this one affecting the planning more than the operation of transit lines, is the automated passenger counter or APC. APCs are sensors that hang over the doors of a vehicle and count the number of passengers boarding and alighting at each stop [6]. These passenger counts can be highly disaggregate, allowing transit planners to make informed decisions about even fairly fine-grained details like where a new stop should be placed or an existing one removed.

There are many other and more esoteric examples, some of which are listed here. In 1989, Bookbinder and Ahlin looked at the interaction effects of routes which diverge from a shared segment while attempting to maintain consistent headways on that segment. They find that once headways on the segment go below a certain threshold they may as well be random, allowing operational savings on certain very frequent routes [7]. In 1999, Dessouky et al. wrote about using GPS bus tracking technology to better manage dispatching near coordinated transfer points [8]. In 2002, Chien et al. discussed the use and benefits of several different types of artificial neural networks to improve arrival time predictions [9]. In 2003, Bertini and El-Geneidy demonstrated the possibility for a wide range of simple performance measures derived from archived spatio-temporal data in Portland. They break the measures into system-, route-, segment- and point-level metrics ranging from average dwell-time at a particular stop to average passenger loads across the system [10]. In 2004, Zolfaghari et al. proposed a model to optimize stop dwell times using real-time location information [11]. In 2006, Zhao et al. proposed a mathematical model for optimizing schedule padding in simple scheduled services under the assumption that travel times are exponentially distributed [12].

Transit agencies find themselves working with more data, sensors, and digital systems than ever before. The trajectory toward more digitally and algorithmically managed operations seems likely to continue in the transit industry, and this sort of research into potential industry applications is ongoing.

2.2 The great opening of transit data and the real-time revolution

These digital schedules and real-time location tools were reserved for specialists in the transit industry until very recently, when the open-data movement [13] and its early fruits began to lure transit agencies out of their institutional shells and into collaboration with the private sector and eventually the general public. In 2005, Portland's TriMet Transit Authority collaborated with Google to produce what became known as the General Transit Feed Specification or GTFS [14]. The goal was to integrate transit directions with the Google Maps trip planner, and TriMet's schedules did indeed go live in the new 'Google Transit' with data supplied to Google in the new format. Fatefully, TriMet decided that their schedule data was public information and that it should not be available only to one corporation; they decided to make the same data they were giving Google available to the general public on the same terms [13]. Little else happened with the new TriMet data for a while, but other American transit agencies quickly began to see the benefit of having a well-functioning trip planner already built into a popular navigation tool [13]. TriMet did not stop there, and in April of 2006 they were publishing real-time vehicle locations and arrival predictions. By now though, other transit agencies were catching up. In May of 2007 Bay Area Rapid Transit released their schedule information in GTFS format, followed about a year later by real-time data [15]. 2009 saw the start of an avalanche of schedule and real-time data, including releases by San Francisco's MUNI, the Chicago CTA, Washington's WMATA, LA's Metro, Boston's MBTA and others [13] [15]. By now, hundreds of agencies all over the world have released their own GTFS feeds [16] and many also have real-time data published for at least a subset of their services. Following TriMet's example, most of the transit agencies publishing GTFS to Google for use in Google Transit also published their data on the web, available at least in theory for the general public to consume. From the beginning, and with increasing speed, quite a large number of non-Google user interfaces, mostly cellular phone 'apps', sprung up to help transit customers use this data more easily. Most of these apps were designed primarily to reproduce in a localized way what Google had already done, but some also surpassed it or adapted the basic idea to specialized customer use-cases like making the data accessible to blind or cognitively impaired users [17].

Unlike the GTFS, which quickly became the standard for data on scheduled operations, real-time data has not yet converged on one standard format. At least a few agencies developed and manage their own application programming interface (API) suited to their own local and technological idiosyncrasies. Some proposed standards have emerged, but they have yet to dominate the market. The first is 'GTFS-realtime', proposed by Google; the standard is well documented [18] but we have not been able to find more than a few agencies actually using this format. Another emerging standard is that implicit in the NextBus API, a service provided to transit agencies by a private company; about 65 “agencies”1 are using this service as of March of 2015 [19].

GTFS was explicitly designed to facilitate customer-facing trip planning applications (like Google Transit), and most relevant research to date has focused on ways to produce and employ such applications or on the measurable or perceived effects these tools have on customer utility and satisfaction. The same is largely true for real-time data APIs. In general, the literature shows that Google Transit and other trip-planners based on GTFS and real-time data have been a success as far as passengers are concerned [13][20][21]. This is perhaps because agencies have only increased the amount and types of information available to customers, without cutting out older forms like paper schedules and system maps. One study in Seattle found that when customers had access to real-time information, they were much less likely to consult published schedules, preferring instead the more up-to-date source [22]. The authors drew some conclusions about the relative importance of regular headways for such customers as opposed to on-time performance. Another study, also in Seattle, showed that people with access to mobile real-time information feel like they spend less time waiting for transit, and in fact that they actually do spend slightly less time waiting [23]. In some cases, this is the result of passengers delaying their walk to a stop until the vehicle is actually nearby, or of using some extra time in a productive or social way, such as by running errands at a nearby store.

1 'Agencies' is in quotes because the API is not always used for an agency's full operations. At least a few cities publish real-time data for only one line, such as a streetcar, but not for any others. Other 'agencies' are basically small university campus shuttle services and the like.

Other research has focused on ways GTFS and real-time information can be presented visually. Since the GTFS encodes so much spatio-temporal information, it should in theory be possible to algorithmically construct a decent transit map. It is in this vein that Alan Joyce in 2011 proposed a generalizable, stationary “spider” type map that ingests GTFS and real-time updates and produces a digital display showing incoming trains and their destinations [24]. Similarly, Wang and Chi describe an algorithm for presenting a localized train map on a small screen [25]. Neither of these projects would have been conceivable without the widespread availability of GTFS data.

Still other research in this general area has proposed methods for approximating agency-provided real-time information in its absence. With the spread of location-aware smartphones, and as volunteered geographic information began to demonstrate its value [26], Cuong, in 2013, proposed an algorithm for incorporating user-volunteered real-time updates into a central routing application for other users [27]. Similarly, in 2010, Arvind et al. proposed a method to essentially bootstrap real-time data into existence by using people's cell-phones, determining when they are on transit by the way they are moving and accelerating, and relaying to the cloud their anonymized positions and thus the positions of the buses [28].

2.3 Performance measures outside of industry

The third broad category explores the use of publicly available schedule and position data to learn something new about transit, cities or transportation as such. These create performance measures of a sort, but they differ from the first category in that they are more exploratory. They try to answer the question: what is transit like? What is our city like?

To start with a fairly simple case, Patrick Brosi, a computer scientist, proposed in 2014 a system for integrating various real-time vehicle location feeds into a single web map [29]. The challenge may be a more technical than theoretical one, but the interest is general, and the question implicit: what would it look like if we could see on a map all of the transit vehicles in New York City, or in Europe?

Van Oort et al. in 2013 describe their approach to the analysis of several months of real-time stop-arrival information from the Netherlands [30]. They look primarily at several measures of schedule adherence and illustrate examples where the data could be used to better match the schedule to reality or vice-versa. The most interesting of these measures described schedule adherence using mean performance and 15th and 85th percentile bounds to show average performance and variability through the length of a single route. Their analysis focused on particular lines and stopped short of showing significant patterns at the system level.

Gasparini et al. describe a more comprehensive citizen-facing application in Dublin that ingests real-time transit data and other traffic data sources to make a much broader range of real-time statistics and predictions: a 'dashboard' [31]. This system was designed for highway administrators (who presumably have the immediate power to interact with traffic) as well as citizen-users who can use the system for live trip-planning. Though in this way it seems to propose a predictable hierarchy of access, the platform is designed to be extensible and we see no reason why the detailed analytics available to administrators and planners would not also be available to lay people in the style of other dashboards like the Dublin Dashboard [32]. In fact, since this paper was written in 2011, the Dublin Dashboard is almost certainly the (currently) ultimate fate of the system the paper describes.

This area of research may tackle problems that are of interest to people working in transit, but it also begins to answer the questions of citizens, politicians and curious transit users who do ultimately have an enormous impact on the way transit is planned and operated in any given city.

2.4 What remains to be considered

The literature that has just been discussed is interested primarily in reliability, measured as adherence to a schedule published in advance, and in managing expectations about that reliability. Do transfers work as we expect them to when there are delays? Do headways on a schedule relate well to headways in reality? Is it even worth looking at a schedule when service is supposed to be frequent? Or when real-time data is available? These studies have implicitly explored the potential for improvement in schedule adherence, but they have done much less to look at the other side of that balance: the potential for speed increases. Recalling that speed is traded for reliability (and vice-versa) through the addition or removal of schedule padding, it becomes clear that we cannot focus on reliability without bringing speed into the equation. With so much focus on schedule adherence, it is a wonder that schedules have not slowed transit to a crawl under the pressure to make conservative estimates of the required travel time.

3 Outline of the Project

As previously noted, we should expect schedule padding to be an adaptive response to systematic random delay. The distribution of this random delay is something we might reasonably expect transit planners to know somewhat intimately; they are constantly tweaking schedules to keep things from running, on average, too late or too slow. Each such adjustment to a schedule is an implicit acknowledgement of some newly recognized change in the distribution of systematic random delay2. But this is not to say that transit planners should be expected to have an implicit knowledge of the whole structure of delay. Rather, it is to say that the schedule itself might be expected to evolve step by step toward encoding a response to this distribution as it incrementally attempts to better match reality. It could be that transit planners sculpt the surface of an iceberg without ever knowing its shape beneath the water [Figure 2]. Even with well adapted schedule padding, transit vehicles still run late and early relative to their schedule just as well-adapted animals in a balanced ecosystem still die with a certain frustrating regularity. If we were to look at the space-time distribution of variance from the schedule in an ideally padded system, we should expect to see that its distribution over scheduled services is more or less uniformly random. This is because any systematic variation (such as a particular route always being late at 7pm every Tuesday) should have been accounted for (such as by giving that route a bit more padding on its 7pm run).

Figure 2: A metaphor for schedule padding.

The balance between speed and reliability is one that can be difficult to reassess once it has been established. For one thing, transit passengers are generally much more concerned about schedule adherence than about potential speed improvements; written complaints about early or late arrivals seem to be the second most common type of complaint that agencies receive, right after complaints about unpleasant or unhelpful drivers [33]. Transit agencies almost never hear that schedules are not as fast as they could be. This bias is probably due to the fact that schedule padding is generally less visible to transit users than late-running buses. Padding takes many forms. Bus drivers dwelling longer than necessary at a stop, for example, are not easily interpreted as trying to 'waste time' if they use that time to allow people to get to their seats before they accelerate. Drivers going more slowly than they need to are easily interpreted as driving cautiously. Sometimes the excess padding can be used up by trying to catch red lights; again, in this case the passenger's most straightforward interpretation will be that catching red lights is simply the result of bad luck rather than of any conscious effort expended by the driver. Late buses on the other hand are easy to see, particularly if the passenger has checked or memorized the relevant parts of the schedule ahead of time. Anyone can mentally calculate the number of minutes a bus is late.

2 Or a change in tolerance for variance from the schedule, while the actual underlying distribution remains the same.

But padding is also difficult to see for another reason. Since drivers slow down to exhaust their padding, and may be scheduled for heavily padded routes, it may be that we only rarely see a driver attempting to go as fast as reasonably possible. Anyone getting on a very late bus may notice for instance that the driver will most certainly not wait for them to pay their fare before taking off, and that what might have been a red light is often transformed magically into a yellow. This is not to condone aggressive driving of course, but to admit that there is a difference between driving leisurely and driving quickly in order to make up time. Comedian Jerry Seinfeld once noted the frustration of going more slowly than necessary, in this case, on an airplane,

“Pilot says, 'We're going to be making up some time'. They just 'make up time'. That's why you have to reset your watch. Of course when they say they're making up time, obviously they're increasing the speed of the aircraft. Now my question is, if you can go faster, why don't you just go as fast as you can all the time? Come on. There's no cops up here. Nail it. Give it some gas. We're flying!”

The job of the comedian is to comment on things that generally are not reflected upon by the audience, drawing attention to the things we take for granted; Seinfeld's comment here may serve to indicate again that potential speed is not something that most people think about regularly. If it were, the routine would not still be funny.

Despite these tendencies toward misunderstanding, if we seek to improve the overall speed and reliability of scheduled transit services, we should follow schedule padding to the source of random delay. This paper proposes a simple model to measure and visualize the spatio-temporal structure of random delay in scheduled transit services. First, we simplify the scheduled operations of a transit system into a graph with stops as nodes and trips between stops as edges representing streets or sets of streets. Next we observe actual transit vehicles traversing these edges and try to establish from these a reasonable minimum time required to traverse each edge. That is, we are trying to find the floor of the time required to operate on any given segment of a route. Finally, we compare this theoretical best time to the average and scheduled times for each segment. By subtracting the best times from the padded, we should be able to see where the most time is being lost to random delay and schedule padding. In so doing, we hope to raise interesting questions about how we might fruitfully intervene to reduce delay and what the time costs are for failing to. We take the Toronto Transit Commission (TTC) [34] as our case study and use publicly available GTFS [35] and real-time vehicle location data [19] for the analysis.

4 Passengers as Sources of Delay

Some sources of variable delay should already be obvious to the attentive reader and obviously undesirable in any situation; red lights and general traffic congestion come to mind. But others may prove more ambiguous. Foremost among these is the simple act of stopping to pick up or drop off a passenger.

4.1 Stopping

Many, perhaps most scheduled transit services do not always stop at all of their scheduled stops. They only stop when someone wants to get off there or when someone is waiting to get on.

Further variability is introduced when the number of passengers varies, since many of these same vehicles are only capable of boarding passengers in Θ(n) time: a single-file queue for the till. First, it must be stated plainly that passengers are not a bad thing that slows down transit service; they are the point of transit service and nothing should be done to diminish their absolute numbers. There are however ways of reducing the time it takes to board passengers and, more importantly here, the total variability in the time it takes to board them. There are three basic approaches which may be divided into policies, technologies and service patterns. An example of each will be given.

One policy that can reduce boarding times is a method of collecting fares: what might be called the honor system. In these arrangements, which are not at all uncommon, fare is paid not during boarding, but at some other time, most likely before boarding the vehicle from a vending machine at the stop. Proof of payment is held by the passengers and occasional spot checks by enforcement officers keep the honor of the system from degenerating. In this system, everyone can use all of the vehicle's doors as quickly as their legs will take them across the threshold. An example of a technology that has exactly the same effect is the turnstile-walled platform. These are familiar, if not by that name, from almost every subway or 'metro' system. Everyone queues to pay to get on the platform, and everyone on the platform can board the vehicle all at once when it arrives. This leads us to the final and perhaps most important example: service patterns. If stops can be spaced far enough apart that it is almost a sure thing that at least one person will always be waiting at each, then every vehicle is almost guaranteed to stop at each. Stopping and starting can be made to take a fairly predictable amount of time, so stopping as such does not cause unexpected delay. Not knowing in advance whether a stop will be made is what can result in variable travel times and thus padding. The close spacing of bus stops in many cities means that the number of stops that must actually be made on a given trip should be expected to be highly variable if there are not enough people to keep all of the stops active all of the time.

Changes to service patterns are worth dwelling on for a minute longer. It seems like an arrangement of stops that kept all of the stops busy during rush hour would possibly fail to do so at midnight. What happens then? Service becomes less frequent and passengers are allowed more time to accumulate to the desired densities. This may be hard to imagine for bus services as they typically exist these days, but think of a subway system and imagine how many stations are ever totally empty when a car pulls up. We will not contest that all transit lines could or should be operated with stops spaced as far apart as the average subway, but it is worth acknowledging that different stop spacing patterns should be expected to result in stops that are more or less full on average, and that stops that are only sometimes occupied will result in more travel time variability.

4.2 Internal congestion

Another source of delay is the time spent shuffling people around after the stop is made.

Standing while a vehicle is accelerating is difficult, particularly for the infirm or over-laden, and many people wait until the vehicle is at rest before making for the door. In a typical sparsely filled vehicle, this adds little if any time since the passengers may alight from a second door while fares are paid at the other. In congested vehicles though, even multi-door traincars, when people are standing in the narrow walkways, this internal reshuffling can take some time. At least two solutions to this problem exist. First, the interior of the vehicle can be decompartmentalized, allowing passengers to spread out and decongest themselves before they select a seat. This can be seen in practice in many new train designs currently coming into use that join the cars with flexible and passable segments. Another approach is to maximize the openings through which an exit can be made. The San Francisco cable cars exemplify this technique with their unusually outward-facing benches and their famously out-of-doors hangers-on. This sort of vehicle design is quite unusual in transit of the 21st century, but there is no reason it should not make a comeback and even be extended and improved by new technologies.3

3 Imagine for instance a typical bus or trolley, the walls of which are almost completely made of doors. Passengers would less often need to shuffle from their seats into the aisle—they could jump right off, and others on.

5 Data

To actually measure the size and location of schedule padding and delay we need some data. There are two distinct data sources used for the analysis that follows and we will describe them one at a time before getting into the ways they are combined. Both data sources are available for the TTC, a large agency with a wide variety of transit services. The goal though is that the methods described here will be easily extensible, and applicable in particular to other cities with similarly structured data.

5.1 General Transit Feed Specification

A schedule published in GTFS is a set of CSV text files that form a relational database which explicitly describes the entirety of an agency's scheduled operations. Each route has a set of trips and each trip is given a “shape” (a polyline) and a set of stops (points along that polyline) with arrival and departure times at each stop. A “trip” is defined loosely as one instance of a vehicle going in one direction on one route, something such as “#17 eastbound, from 5:30”. Many routes have several types of trips, one for each possible deviation and direction [14]. We will be using GTFS data from the TTC, issued for the period from January 4th, 2015 through February 14th, 2015.

5.1.1 Simplification, Segmentation, and Summation

Our analysis takes place at the level of distinct, scheduled route segments. That is, any ordered pair of stops such that a vehicle on any route is scheduled to go from stop A directly to stop B. These segments can be conceptualized as edges of a directed graph, with stops as the nodes.
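To make this concrete, a minimal sketch of how such segments might be derived from a GTFS feed follows. It assumes pandas and the standard stop_times.txt column names; the file path is illustrative, and this is a sketch of the idea rather than the code used for the analysis (for that, see Appendix 1).

```python
# Sketch: derive directed stop-to-stop segments (graph edges) from a GTFS feed.
# The 'gtfs/' path is illustrative; column names follow the GTFS specification.
import pandas as pd

stop_times = pd.read_csv('gtfs/stop_times.txt',
                         usecols=['trip_id', 'stop_id', 'stop_sequence',
                                  'arrival_time', 'departure_time'])
stop_times = stop_times.sort_values(['trip_id', 'stop_sequence'])

# Pair each stop with the next stop on the same trip.
nxt = stop_times.groupby('trip_id').shift(-1)
segments = pd.DataFrame({
    'trip_id':   stop_times['trip_id'],
    'from_stop': stop_times['stop_id'],
    'to_stop':   nxt['stop_id'],
    'depart':    stop_times['departure_time'],
    'arrive':    nxt['arrival_time'],
}).dropna(subset=['to_stop'])

# Distinct directed edges, and the number of scheduled traversals of each.
edges = segments.groupby(['from_stop', 'to_stop']).size().rename('n_trips')
print(len(edges), 'distinct directed segments')
```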

The GTFS data from the TTC contains 183 routes, 10,749 stops, and 12,969 distinct segments (as segments have just been defined). In the course of a week, these segments are scheduled to be traversed (a vehicle is scheduled to go from one stop to another) a total of 9,538,851 times. Computationally, these numbers present little difficulty, but a visual inspection shows that there appears to be significant redundancy in the data, particularly near bus terminals and major intersections where multiple stops may be used to represent what is more easily thought of as one large station [Figure 3]. There are also at least several hundred places where stops are paired on opposite sides of a street or clustered tightly around an intersection. Because one of the goals of this paper is to visualize the results for a general audience, and since we want to get the largest possible number of observations on each segment, we will cluster these stops and re-segment the data. To do this, we draw 30 meter buffers around all stops and derive the union of the geometries where there is any overlap. This new unified buffer geometry will form the boundary for clusters of stops, where all stops within the bound are in the same cluster. The centroid of the stops in a cluster is established as the new node for the purposes of visualization. The unified buffer geometry will be used later in the analysis. A visual inspection shows that for the most part this clustering works about as expected [Figure 4]. In most instances, stops are left alone, paired with another directly across a street, or joined as part of a larger intersection or station cluster. There are however a few places where the clustering appears to be somewhat less successful in simplifying things [Figure 5]. In these cases, stops that obviously form distinct clusters to the human eye were merged by the algorithm. In a number of cases, this pulled the centroid of a cluster decidedly out of an important intersection, as in the right side of Figure 5. These errors were relatively few and were balanced by occasional poor performance in the other direction: stops failing to cluster when they definitely should have. In any case, establishing a good clustering distance was difficult, but after trying several we believe we have converged on a good solution in the general case, at least for Toronto. Other agencies will have different data quality standards and stop distributions. To emphasize again though, this step is primarily to simplify the visualization of the results of the analysis.

Figure 3: Multiple stops are used to represent a single transfer station, complicating attempts to visualize the graph.

Figure 4: Clustered stop and segment geometry in red is compared to original stop and segment geometry in white. Merged 30 meter buffers around the original stops are shown with dashed lines.
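A minimal sketch of this buffer-and-union clustering is given below. It assumes stop coordinates have already been projected into a metric coordinate system (for Toronto, something like UTM zone 17N) and uses shapely purely for illustration; the function and variable names are not those of the analysis code itself.

```python
# Sketch of the 30 m buffer-and-union stop clustering described above.
# Assumes coordinates are already in a metric projection (units of meters).
from shapely.geometry import Point, MultiPoint
from shapely.ops import unary_union

def cluster_stops(stops, radius=30.0):
    """stops: dict of stop_id -> (x, y) in meters. Returns a list of clusters,
    each with its member stop_ids, a centroid node, and the merged buffer."""
    points = {sid: Point(xy) for sid, xy in stops.items()}
    # Overlapping 30 m buffers dissolve into a single polygon per cluster.
    dissolved = unary_union([p.buffer(radius) for p in points.values()])
    polygons = getattr(dissolved, 'geoms', [dissolved])  # Polygon or MultiPolygon
    clusters = []
    for poly in polygons:
        members = [sid for sid, p in points.items() if poly.contains(p)]
        centroid = MultiPoint([points[sid] for sid in members]).centroid
        clusters.append({'stops': members,
                         'node': (centroid.x, centroid.y),
                         'buffer': poly})
    return clusters
```

The centroid of the member stops becomes the node used for visualization, and the merged buffer geometry is the one retained for the later track-matching step.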

Because the transit system is being represented as a graph, it can be seen that this method may have done more than innocently combine unconnected nodes: it may also have collapsed edges. In fact it did do just that; at this level of clustering, approximately 2% of all scheduled weekly vehicle hours (~3,000 of ~145,000 hours) operate on segments that now exist completely inside of clusters. This is assumed to be a tolerable payment for the graphic simplicity afforded by this method. These within-cluster trip segments will be disregarded in the rest of the analysis. After clustering, we observe in our dataset 5,261 stop clusters (reduced from 10,749 stops), 141,598 weekly vehicle hours scheduled on segments between clusters, 3,319 undirected (or bidirected) edges, and 3,710 directed edges. These simplified edges will become the basic unit of analysis.

Figure 5: Clusters merging which clearly should not.

5.2 Real-time Vehicle Locations

The schedule must be compared to some measure of reality and it is for this purpose that the NextBus API [19] is used. NextBus is a company whose business it is to publish, on behalf of transit agencies, useful information about where their fleet is right now (in 'real-time'). They maintain a public web API which, for example, makes predictions about when the next vehicle will be arriving at any given stop. The part of this API relevant to this paper reports the current locations of all vehicles in the agency's fleet along with several other variables [Table 1].

Variable Description

id Unique identifier for actual vehicle

routeTag Public route number

dirTag Some combination of the routeTag and a 0 or a 1 for inbound or outbound

lat Latitude to seven digits after the decimal

lon Longitude to seven digits after the decimal

secs Integer number of seconds since last update from vehicle to server

predictable Vehicle is on-route and arrival time predictions can be made

heading Integer bearing in degrees between 0 and 360

time Epoch in milliseconds that the server generated the report

Table 1: Description of variables reported by the NextBus API

The API serves up XML in response to HTTP GET requests and a Python script was used to request updated vehicle locations once every several seconds4. These recordings were gathered on and off for several weeks, for several hours or days at a stretch during the period between January 4th and February 14th of 2015. This was not a systematic effort because of the difficulty of finding a truly stable Internet connection, but at least one recording was gathered through each distinct part of the service period, including a weekday and both days of the weekend5. These records were stored in a PostGIS database alongside the GTFS data discussed in the last section. Just over 33 million vehicle locations were recorded. Unfortunately, the API does not report the position of any of the TTC operated subways, presumably because of a reliance on satellite-based GPS devices. It does however report for all other lines, 179 in total, including frequent urban buses, infrequent suburban buses, streetcars, night routes and lines operating in transit-only rights of way.
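For illustration only, a polling loop of this sort might look like the following. The endpoint, command name, and the 'ttc' agency tag are taken from NextBus's public feed documentation but should be treated as assumptions here, and the database insert is stubbed out with a print().

```python
# Sketch of a NextBus vehicleLocations polling loop; endpoint and parameters
# are assumptions based on the public feed documentation, not verified here.
import time
import requests
import xml.etree.ElementTree as ET

FEED = 'http://webservices.nextbus.com/service/publicXMLFeed'

def poll_vehicle_locations(agency='ttc', interval=10):
    last_time = 0  # epoch ms of the most recent report already seen
    while True:
        resp = requests.get(FEED, params={'command': 'vehicleLocations',
                                          'a': agency, 't': last_time},
                            timeout=30)
        root = ET.fromstring(resp.content)
        for vehicle in root.iter('vehicle'):
            record = dict(vehicle.attrib)  # id, routeTag, dirTag, lat, lon, ...
            print(record)                  # in practice: INSERT into PostGIS
        last = root.find('lastTime')       # server timestamp of this batch
        if last is not None:
            last_time = int(last.get('time'))
        time.sleep(interval)
```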

5.2.1 Handling Errors

The data, while generally of a high quality [Figure 6], had a somewhat confounding number of seemingly erroneous location reports. This was most notable in downtown Toronto, presumably where tall buildings interrupted GPS signals. A map of downtown [Figure 7] in the same style as Figure 6 serves to show the obvious positional error in the data for the city's core. Outside of downtown though, locations would occasionally seem to be running along fine just before jumping a block or more and then returning to their previous course. The worst of these sorts of errors tended to occur near bus depots; it seemed as though when a vehicle left service it would sometimes, instead of ceasing to report its location, report a location as much as 50km outside of the city, as though to place itself off of a map. Handling these errors was the easiest; points that were clearly not on any route, such as those in Lake Ontario, were manually deleted after all the data had been collected. This included something on the order of 10,000 points clearly outside the bounds of any route. Many smaller errors remained though, such as those in downtown, and these were tougher to detect. For these, we measured the speed of every reported segment (Euclidean distance over time from one reported location to the next) and dropped the later location when the speed of the segment exceeded 120 kmph (~75 mph). Again, only ~5,000 points were removed, but the data visibly improved.

Figure 6: An example of some of the GPS tracks created from the NextBus data; these, near Toronto's Pearson International Airport, demonstrate the fairly high resolution of the data.

Figure 7: Map showing erratic GPS positions in downtown Toronto, most apparent here on Bay Street, King Street and Queen Street.

4 Several different amounts of delay between requests were tried, mostly between 5 and 10 seconds, but they seemed to make little difference. The vehicles were not updating their locations fast enough for it to matter. Some amount of randomness was introduced to all request times anyway by the network delay between the computer and the NextBus servers, which, while mostly in the millisecond range, got as high as 5-10 seconds.

5 The 'period' of transit schedules generally, as in the amount of time before the repeat of a cycle, is one week. Weekday service is all the same, but the weekend days are distinct from the weekdays and from each other.
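A sketch of this 120 kmph filter follows, assuming each report carries an id, lat, lon, and epoch-millisecond time; a great-circle distance stands in for the Euclidean distance used in the text, and this is a sketch of the logic rather than the code actually used (see Appendix 1).

```python
# Sketch: drop a location report when the straight-line speed from the previous
# report of the same vehicle exceeds 120 km/h. Reports are assumed to be dicts
# with 'id', 'lat', 'lon', and 'time' (epoch milliseconds).
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + \
        cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 6371000 * 2 * asin(sqrt(a))

def drop_speed_outliers(reports, max_kmph=120):
    kept, last = [], {}  # last accepted report per vehicle id
    for r in sorted(reports, key=lambda r: (r['id'], r['time'])):
        prev = last.get(r['id'])
        if prev is not None:
            dt = (r['time'] - prev['time']) / 1000.0  # seconds
            dist = haversine_m(prev['lat'], prev['lon'], r['lat'], r['lon'])
            if dt > 0 and (dist / dt) * 3.6 > max_kmph:
                continue  # implausible jump; drop the later point
        kept.append(r)
        last[r['id']] = r
    return kept
```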

5.2.2 Distinguishing Tracks

For the analysis, these (point) location reports had to be strung together into polylines representing the trajectories of discrete transit services. These polylines, based on observed vehicle locations, will be referred to as 'tracks' (as in 'GPS tracks') throughout this paper. Without getting into the details of implementation, it will suffice to say that a new track was started in any of the following cases:

1. A vehicle appears for the first time in more than 60 seconds.

2. A vehicle reports operating on a new route, different from the last one, if any, that it reported.

3. A vehicle reports operating in a different direction on the same route as the last one it reported.

These tracks simply continue until the vehicle creating them meets one of those criteria again, at which point a new track is started and the old track ended at the previous point. Excluding tracks of 5 or fewer points, and tracks shorter than 300 meters, the original ~33 million points generated ~540,000 tracks with a mean length of 6.3 kilometers, a median length of 4.92 kilometers, and an average of 61 points per track. Track length was naturally somewhat variable, but a visual inspection showed that almost all tracks appeared to extend the whole length of the route to which the vehicle was assigned. After removing outliers and constructing tracks, the average temporal delay between location reports (inside of all the tracks) was about 20 seconds.
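A sketch of these track-splitting rules, using the same hypothetical report dictionaries as above, might look like this; the 300 meter length filter is omitted for brevity.

```python
# Sketch of the three track-splitting rules described above. A track is simply
# an ordered list of reports from one vehicle on one route and direction.
def build_tracks(reports, gap_seconds=60):
    tracks, current = [], {}  # currently open track per vehicle id
    for r in sorted(reports, key=lambda r: r['time']):
        open_track = current.get(r['id'])
        start_new = (
            open_track is None
            or (r['time'] - open_track[-1]['time']) / 1000.0 > gap_seconds  # rule 1
            or r['routeTag'] != open_track[-1]['routeTag']                  # rule 2
            or r['dirTag'] != open_track[-1]['dirTag']                      # rule 3
        )
        if start_new:
            open_track = []
            tracks.append(open_track)
            current[r['id']] = open_track
        open_track.append(r)
    # Discard trivially short tracks; the 300 m length check is omitted here.
    return [t for t in tracks if len(t) > 5]
```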

6 Taking Measures

In order to get observations at the level of network edges, it was necessary to match tracks to the edges on which they seemed to be operating. Essentially, the goal was to locate two points on each track where that track passes nearest to the stops at the ends of an edge and to take the measurements from between those points. There are two measures of interest: the distance between the stops and the time between the stops. Both of these measures, one for each matched track, were stored with each edge for later analysis.

Each vehicle track has a route assignment and each edge has a set of routes that potentially operate on it, making it easy to match an edge to a set of potential tracks: if the edge serves routes 3 & 4, then select all tracks from routes 3 & 4. Next, a simple spatial intersect operation filters out any tracks that do not pass within 30 meters of any stop in the clusters at both ends of a segment [Figure 8].


This leaves only the tracks that are potentially scheduled for the edge and that also come very close to both ends of it. From here, the portion of the tracks that intersects the cluster buffer geometry6 is isolated [Figure 9]. For each track, the center of this intersection is chosen as the point from which measurements will be taken [Figure 10]. These points will either be coincident with a point on the track (an actual location report), in which case the time for that point can be found easily, or they will be between two such points, in which case simple linear interpolation gives us an estimate of the time. Only tracks going in the correct direction are matched to a segment. Bi-directed segments divide the tracks into a group for each direction and store the measurements accordingly.

Since the distance is measured from the center of the track's intersection with the cluster, it is measured not from the nearest points to the individual stops, which are all in slightly different positions, but from an approximate middle of the cluster relative to a direction of travel. Since we wish to assume that an edge represents a distinct (possibly bi-directed) part of a roadway, measuring from the centroid of a cluster has the benefit of selecting a point somewhere between staggered stops which would otherwise overlap.

Figure 8: An example segment and the set of tracks observed for all routes scheduled to operate on that segment, with those intersecting the segment in green, and those not intersecting in red.

Figure 9: Track intersections are shown in black. Green tracks are those that intersect with both clusters and red tracks fail to intersect with at least one cluster. The thick black line represents the edge and the rounded shape the 30 meter buffer of the cluster.

Figure 10: Red dots show the points on the intersecting track segments that are nearest to their own centroid. It is from these points that all measurements are taken.

6 This is that same 30 meter buffer geometry that was used earlier to cluster the stops.
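The linear interpolation of a time at one of these measurement points can be sketched as follows. It assumes track points carry projected x/y coordinates (in meters) and epoch-millisecond times, and that the measurement point has already been expressed as a distance along the track; this is illustrative only, not the code used for the analysis.

```python
# Sketch: linearly interpolate the epoch time (ms) at a given distance along a
# track. Track points are assumed to have projected 'x'/'y' (meters) and 'time'.
from math import hypot

def time_at_distance(track, target_dist):
    cumulative = 0.0
    for i in range(1, len(track)):
        p0, p1 = track[i - 1], track[i]
        seg_len = hypot(p1['x'] - p0['x'], p1['y'] - p0['y'])
        if cumulative + seg_len >= target_dist:
            frac = (target_dist - cumulative) / seg_len if seg_len else 0.0
            return p0['time'] + frac * (p1['time'] - p0['time'])
        cumulative += seg_len
    return track[-1]['time']

# The observed duration on an edge is then the difference between the times
# interpolated at the two cluster-intersection midpoints of a matched track.
```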

Using this method, the data gives ~11 million matches between the 10,085 directed edges and the ~540,000 tracks. This is an average of around 1,100 trips matched to each directed segment.

Significantly, some edges had fewer than 10 matched tracks, and these were discarded from the rest of the analysis. In almost every case, such edges appeared in one of two conditions. Either the edge had only a few scheduled trips and there were accordingly not enough tracks, or the edge happened to be right where many vehicles declared new routes or directions and those broke into a new and separate track.

6.1 Establishing reasonable minimum speeds on edges

From the observed speeds we must find something like a reasonable best-case travel time – undelayed but neither erratic nor breaking any speed limits. A naive approach would be to select the fastest observed trip and be done with it, but the presence of occasional erratic positional jumps in the data makes this approach unreasonable even after the removal of obvious outliers. Instead, we assume that the speeds observed on each edge will be distributed with outliers at both ends, slow and fast. To get a reasonable but not extreme maximum speed, the observations on each edge are sorted according to their durations (the time between the intersection with the first stop cluster and that with the second), and the observation at the boundary of the first decile is used [Figure 11]. To say it another way, the observation at the 90th percentile of observed speeds is used as the theoretical maximum. It would be interesting to investigate more advanced algorithms for establishing a theoretical maximum speed for network edges, but this metric will be sufficient for our purposes in elaborating the basic method we set out to discuss.
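The decile rule itself is simple; as a sketch, with durations in seconds for a single directed edge:

```python
# Sketch of the decile rule: sort observed traversal durations for one edge and
# take the value at the boundary of the first decile (the 90th-percentile speed).
def minimum_duration(durations_seconds):
    ordered = sorted(durations_seconds)    # fastest traversals first
    index = int(0.1 * (len(ordered) - 1))  # boundary of the first decile
    return ordered[index]
```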

27 6.2 Calculating schedule padding

One interesting measure of the reasonableness of a choice of minimum duration (maximum speed) is to multiply the new time by the number of trips traversing that edge for a given period [Equation 1].

MinimumServiceHours_Observed = Σ (MinimumDuration_Observed · NumTrips_Scheduled)   [Equation 1]

Doing this for the average scheduled times on each edge and summing the result [Equation 2], one would get the total number of service hours that the agency is actually scheduled to operate [Figure 11].

ServiceHours_Scheduled = Σ (AverageDuration_Scheduled · NumTrips_Scheduled)   [Equation 2]

We take the week as the service period, and observe for the TTC ~131,500 service hours at the average scheduled speed and ~74,500 service hours at the new first-decile minimum speed on the same number of trips. Now clearly this should not be read too easily to say that the difference, about 57,000 service hours, is due purely and simply to the confounding forces of delay and padding, though it does seem to imply that. A more cautious way to approach this might be to also compare all scheduled times on the same edges, selecting the minimum scheduled time and multiplying that by the number of trips scheduled [Equation 3].

Figure 11: Cumulative distribution of potential total service hours at a range of observed speeds relative to the observations on each segment. Service hours are calculated as the time taken for a given trip times the number of such trips scheduled. If, to invent a limiting example, all edges could be traversed at 120 kmph on each trip, the TTC would only need to operate for ~21,900 hours each week to make the same number of trips on the same routes.

MinimumServiceHours_Scheduled = Σ (MinimumDuration_Scheduled · NumTrips_Scheduled)   [Equation 3]

To get some idea of what a reasonable range might be, we did this for two agencies: the TTC and the Southwest Ohio Regional Transit Authority (SORTA), a smaller transit agency in Cincinnati, Ohio. For the TTC, we found a padding proportion of 29.4%, and for SORTA, 20.7%.

This analysis is simple to implement with plain GTFS data, but of course it only looks at the schedules and not at any measure of how well those schedules reflect reality. We can use this measure, though, to give an idea of how realistic any estimate of minimum speed may be.
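A minimal sketch, in Python, of the padding proportion implied by Equations 2 and 3, using invented per-edge numbers rather than real TTC or SORTA data:

    def padding_proportion(edges):
        """Padding as a share of scheduled service hours (Equations 2 and 3).

        edges: iterable of dicts with invented field names; durations are in
        seconds and trips are counts over the chosen service period.
        """
        scheduled = sum(e['avg_scheduled_duration'] * e['n_trips'] for e in edges)
        floor = sum(e['min_scheduled_duration'] * e['n_trips'] for e in edges)
        return (scheduled - floor) / scheduled

    example = [  # two invented edges, not real agency values
        {'avg_scheduled_duration': 95.0, 'min_scheduled_duration': 80.0, 'n_trips': 1287},
        {'avg_scheduled_duration': 60.0, 'min_scheduled_duration': 45.0, 'n_trips': 1354},
    ]
    print(round(padding_proportion(example), 3))  # 0.195, i.e. about 19.5% padding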

Figure 10 shows the relation of these two numbers, TTC's actual scheduled service hours [Equation 2] compared to the summed scheduled floor of the segments for the same trips [Equation 3], as well as the range of observed trips on those segments.

Ultimately, this sort of analysis may raise an empirical question interesting in its own right: what portion of an agency's schedule is, just is, padding? Naturally, the amount would vary widely between agencies and there is probably a normal amount of padding that depends on the urban and political context.

Also of interest is the temporal distribution of scheduled service and schedule padding [Figure 12]. The highest curve on the chart represents the total number of service hours per hour, or to put it more intuitively: the number of vehicles in operation. The lowest curve is the total portion of those service hours that can be identified by the schedule alone as due to padding, or the difference between Equation 3 and Equation 2, from moment to moment. The middle curve is the same as for the lowest but using the observed maximum speeds [Equation 1]. Plotting a distribution of service and padding through time shows that first, and obviously, there has to be service for there to be padding. Second, padding, and a lack of padding, show up right where we should expect them to. Padding is thickest in the weekday rush-hours, even when accounting for overall service levels, and thinnest in the early mornings and late evenings. Similarly, weekdays have relatively more padding overall than either Saturday or Sunday. The most important lesson to draw though is that total padding is dominated by the overall level of service.

Figure 12: Proportion of padding to service.

7 Cartographic Depiction

7.1 Maps

There are many dimensions of this data that call out to be mapped, and mapped at a significantly greater scale than this format allows. In the following pages, we have provided two maps of each of a number of variables along with a prose summary of each. One map of the pair overviews the entire TTC system, covering an area of several hundreds of square miles, and the other depicts a more manageable subset of that area, allowing an analysis at the level of streets and edges. All maps use the same basic cartographic method:

1. The width of edges is set according to the number of trips scheduled to traverse that edge in a one-week schedule period. This has the effect of giving the edges a visual weight roughly equal to their share in the number of scheduled vehicle miles traveled.7

2. Edges are offset to their right, making directionality visible even on bidirected edges.

3. Edges that were discarded from the final dataset for any reason are depicted as a narrow dashed line.

4. Color is used to depict the variable of interest, which is always divided by a measure of distance. This means that something like the amount of padding on a segment becomes the amount of padding per kilometer. This controls for the length of the edges and prevents long edges from dominating the map (a small sketch of rules 1 and 4 follows below).

Maps 1 & 2 compare scheduled and observed mean speeds. Maps 3 & 4 depict maximum speeds, both observed and scheduled. Maps 5 & 6 derive a measure of schedule padding.

7 A more advanced but slightly more obscure technique might try to control for the difference in length between edges and actual paths by multiplying the number of trips by a simple factor derived as the edge length over the actual length.
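A minimal sketch, in Python, of rules 1 and 4 above (the argument names and the width constant are illustrative; the maps themselves were produced with GIS tooling, not with this code):

    def edge_symbol(weekly_trips, padding_seconds_per_trip, length_km, width_per_trip=0.002):
        """Stroke width and mapped value for one directed edge.

        Width is linear in the number of weekly trips (rule 1); the mapped value
        is normalized by edge length (rule 4). width_per_trip is an arbitrary
        cartographic constant.
        """
        width_mm = weekly_trips * width_per_trip
        value_per_km = padding_seconds_per_trip / length_km
        return width_mm, value_per_km

    print(edge_symbol(weekly_trips=1287, padding_seconds_per_trip=14.5, length_km=0.311))
    # (2.574, 46.62...) -- a fairly heavy line carrying roughly 47 seconds of padding per km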

It is worth noting briefly that a weakness of this approach to representing a transit network as a graph of stop-to-stop edges is that express lines, lines that do not stop at all the stops that local lines do, can cause problems by overlapping the local segments. This also means that levels of service are occasionally improperly depicted (by line thickness) because the trips do not compound to create thicker lines when they are on separate edges. This problem is visible particularly in the subset maps that follow, but addressing it seriously is beyond the scope of this paper.

Map 1A: Observed Mean Speeds (full system, 1:250,000)

Mean speed is calculated as the sum of the observed lengths of matched tracks divided by the sum of the observed durations [Equation 4].

MeanSpeedObserved = ∑ ObservedLengthₑ / ∑ ObservedDurationₑ [Equation 4]

For the most part, this map shows that the observed speeds are spatially distributed as should be expected. On the whole, speeds are lower in the center of the city and somewhat higher in the suburbs. Limited access highways, those long red edges on the west side of the map, have the highest average speeds of all. In many of the outer neighborhoods, where the regular grid pattern of the network establishes itself most firmly, trips become slower on segments adjacent to the major intersections and faster on segments not adjacent to these. The next map will show this effect in more detail.

The numerical distribution of average observed speeds across edges appears to be more or less normal, though it should be noted that there is a slight tendency for edges with fewer trips to move faster than those with more.

Map 1B: Observed Mean Speeds (subset, 1:40,000)

To establish the pattern for the next several pages, it should be reiterated that this map is a subset of the previous one, shown at a considerably larger scale (1:40,000, against 1:250,000 for the full-system maps). The subset is drawn from the northwest quadrant of the city.

Again, we see the effect of trips appearing slower near intersections and faster on edges that do not adjoin intersections.

Map 2A: Scheduled Mean Speeds (full system, 1:250,000)

Mean speed here is calculated in the same basic way as for Maps 1A and 1B.

MeanSpeedScheduled = ∑ ScheduledLengthₑ / ∑ ScheduledDurationₑ [Equation 5]

While similar to map 1A at the highest level, particularly in the central slow-down, scheduled speeds are clearly different at the edge level in a number of ways. Most notably, it seems that there is a much stronger network autocorrelation. Contiguous edges have close or identical speeds in the schedule. Breaks in speed seem to occur at, rather than near, major intersections. This would seem to indicate that planners may be writing schedules for larger blocks of edges and interpolating across the edges that we take here as the unit of analysis. If true, this would have important implications for the calculation of schedule padding. These concerns will be addressed in more detail later.

Map 2B: Scheduled Mean Speeds (subset, 1:40,000)

Again, we see that the schedule does not seem to depict the change in speed near intersections to the same extent that was so visible in map 1B. Routes have a much more continuous speed, and it seems as though speeds may be set between major intersections, but not at the level of stop-to-stop segments.

Map 3A: Fastest Reasonable Observed Traversals (full system, 1:250,000)

Recall that what is shown here is the ninth decile of the observed speeds for a segment.

Clearly faster than the average speeds, this is more than just a subtraction; it is a definite change in the relative distribution. Speeds appear to be slightly more continuous than those in Map 1A, indicating that perhaps something like a ceiling/floor for many edges has indeed been reached. This effect is due in part, of course, to our having exhausted the range of the color scheme; however, the speeds do appear more continuous even where red does not predominate.

Map 3B: Fastest Reasonable Observed Traversals (subset, 1:40,000)

It is difficult to interpret this map because of the predominance of values in the highest, and unlimited, category. One thing that does seem clear is that intersections still have their slowing effect. Whether this is due to more passengers boarding at major transfer points, the timing of traffic lights, or something else entirely, there is no immediate way of knowing.

Map 4A: Fastest Scheduled Traversals (full system, 1:250,000)

This map is much more continuous than Maps 3A & 3B, indicating again that the schedulers may be writing schedules for whole blocks of what we take as our basic unit and interpolating across stops. Similar to what the earlier, non-spatial distributions showed [Figures 11 & 12], this map is on the whole slower than 3A and 3B. This is because we selected a set of maximum speed observations based on the expectation that the schedule would be padded throughout and that the floor was on average lower.

Map 4B: Fastest Scheduled Traversals (subset, 1:40,000)

Speeds on Map 4B are still more continuous than the observed speeds [Map 3B]. It also becomes clear that there are big differences in the speeds scheduled for overlapping segments on Finch Avenue East, the main street crossing the right/north side of the map. This street is served by the 39 Finch East and the 199 Finch Rocket. As the name implies, the Rocket is an express bus8, stopping only at major intersections. The Finch East runs local, stopping three or four times as often as the Rocket.

8 Sometimes called a 'limited' service.

Map 5A: Average number of seconds of padding per kilometer at observed maximum speed (full system, 1:250,000)

This map shows the average number of seconds of padding scheduled per trip, per kilometer. That is, if a vehicle traverses a one-kilometer edge, we expect to observe this many seconds of padding on that portion of the trip. Or to put it a different way, this is the difference between the time needed to cover a kilometer in the (presumably) undelayed case and the average time scheduled to do so. Note that since observations are being compared to the schedule, there are some negative values. These indicate that the schedule is probably setting unrealistic expectations in those places, since fewer than 10% of trips were observed to be going that fast.
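For concreteness, the quantity mapped here can be computed per edge roughly as follows (a sketch with illustrative argument names, using the northwest-bound edge from Appendix 2 as input):

    def padding_per_km_per_trip(scheduled_hours, n_trips, min_seconds, length_km):
        """Seconds of padding per trip, per kilometer, on one directed edge.

        scheduled_hours: scheduled service hours on the edge over the service period
        n_trips:         trips scheduled over the edge in that period
        min_seconds:     traversal time at the selected (90th-percentile) speed
        length_km:       mean observed edge length
        """
        padding_per_trip = (scheduled_hours * 3600.0 - n_trips * min_seconds) / n_trips
        return padding_per_trip / length_km

    print(round(padding_per_km_per_trip(13.708, 1287, 23.801, 0.311), 2))
    # 46.76 -- matching the Appendix 2 value of 46.75 up to rounding of the inputs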

A large amount of padding appears in downtown as expected, but several other corridors also stand out sharply. Several of these will be explored in more detail in section 8.2. Also of interest are the areas that seem to have little padding: much of the far west side of the city for example.

Map 5B: Average number of seconds of padding per kilometer at observed maximum speed (subset, 1:40,000)

Two corridors in this subset appear to have quite a bit of padding, both of them on Sheppard Avenue East:

1. Between Warden Avenue and Victoria Park Avenue (upper left / southwest of map): This is a five- and six-lane roadway with a sixty kilometer per hour speed limit and a mix of strip-commercial and residential uses fronting it.

2. Between Brimley Road and McCowan Road (bottom/east of map): This is a four-lane roadway with a sixty kilometer per hour speed limit and a mix of mostly big-box and industrial/warehousing uses fronting it.

It seems rather curious that these segments should have a lot of padding when they are not very different from the rest of Sheppard Avenue; it is perhaps as curious that what seems like a suburban thoroughfare should have much padding at all. One guess, proffered without having actually visited these places, is that they suffer from tremendous traffic congestion. Another is that the traffic lights at four- to six-lane intersections have particularly long periods, making the effects of catching one more disastrous. A third possibility is that despite the steady bus service, this corridor is beset by lower ridership than other similar areas because of its orientation toward the automobile. Sporadic boardings and alightings could account for the large amount of variability in travel times. These speculations are offered without evidence.

Map 6A: Average number of seconds of padding per kilometer at scheduled minimum speed (full system, 1:250,000)

Fitting with our other observations, this map shows more network autocorrelation than the maps based on observational data. The schedule appears to have been constructed of larger units and that is evident at the level of maximum speeds, average speeds, and here with the combination of the two. Note that there is less overall schedule padding in this map because of the way maximum speeds were selected from the real-time data; they were selected on the assumption that padding would be everywhere in the schedule and that scheduled maximum speeds would be conservative.

Still, many of the same edges pop out with 140+ seconds of padding per km. It seems here that those edges are more likely to be on approaches to the subways. The subways tend to be the ends of routes in Toronto, so it may also be that extra padding is being included at the end of routes to allow them to make timed transfers at the bus stations over the subway entrances.

Map 6B: Average number of seconds of padding per kilometer at scheduled minimum speed (subset, 1:40,000)

7.2 Animated Maps

The previous section (7.1) looked at averages across time. Because, as has been demonstrated already with the GTFS data alone, schedule padding should be expected to vary in time, it makes good sense to look for that temporal variability in space too. To that end, the two animated maps that follow use the same cartographic techniques as the maps in the last section, but apply them to 15-minute periods, each constituting a video frame. Recall that line thickness is linearly related to the number of trips scheduled for a segment and that the number of trips scheduled also fluctuates in time. For each map, the histogram should be read as approximating, by its total area, the total amount of padding in the system at that moment. The distribution of that padding around the different values of padding per kilometer gives an idea of how spatially concentrated the sources of that padding are.
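A minimal sketch, in Python, of the binning that produces one frame per 15-minute period (the input times are invented):

    from collections import defaultdict

    FRAME_SECONDS = 15 * 60  # one animation frame per 15-minute period

    def trips_per_frame(arrival_times):
        """Count the scheduled trips reaching an edge in each 15-minute frame.

        arrival_times: seconds past midnight; the same binning applies when
        totalling padding for the histogram shown beside each frame.
        """
        frames = defaultdict(int)
        for t in arrival_times:
            frames[int(t // FRAME_SECONDS)] += 1
        return dict(frames)

    print(trips_per_frame([6 * 3600, 6 * 3600 + 300, 6 * 3600 + 1200]))
    # {24: 2, 25: 1} -- two trips in the 06:00 frame, one in the 06:15 frame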

Animated map here: OBSERVED

Video 1: Temporal variability in padding per kilometer per trip at observed maximum speed. Note that edges in green have zero padding or negative padding. This video is also available on YouTube at https://www.youtube.com/watch?v=5GrFkVNcBeI

Animated map here: SCHEDULED

Video 2: Temporal variability in padding per kilometer per trip at scheduled maximum speed. Note that edges in green have exactly zero padding. Green segments define the minimum for an edge. Nothing can be scheduled for less than the minimum, which was not the case for the last animation. This video is also available on YouTube at https://www.youtube.com/watch?v=8Q6yLfygOxo

There is a tremendous amount going on in these animations; in fact, they may seem rather overwhelming9. But they are able to draw out a few interesting things that the averaged static maps leave relatively hidden. First, there is a lot of systematic, seemingly explicable, temporal variation across space. Naturally, on the whole, scheduled speeds approach or achieve their maximums in the off-peak and nighttime hours. But beyond that, it seems clear that certain sections of streets operate in different ways. In some cases, a street will appear to be jammed all day, with only a brief respite in the middle of the night. Other streets achieve their maximum speeds for most of the day, but show particular distress during the evening rush hour or even on weekends. Some of the minor routes seem to have relatively low padding all the time.

9 Perhaps an interactive display would be more appropriate. The author was surprised enough to learn that videos could be embedded in PDFs.

8 Interpretation and Discussion

8.1 Do the results match basic expectations?

Before becoming enamored of the compelling power of maps, we should first verify that the observations are plausibly related to some basic expectations. As ever, if they are not, there are probably errors and/or fundamental problems with the theory. The following correlation matrix [Table 2] compares 1) different variables on each network edge, and 2) a variable on one direction of bi-directed edges to the same variable on its partner going the other direction. We expect that:

1. The correlation between any variables on the two directions of a bidirected edge should be quite high. These will typically represent the two sides of the same street or set of streets, so we expect conditions to be roughly the same in both directions.

2. There should be some correspondence between the length of a segment and the average speed achieved over it, with longer segments allowing faster average speeds. Routes with infrequent stops should be expected to go faster.

3. There should be a strong correspondence between the schedule and reality, particularly in the average and maximum speeds in the schedule and the average and maximum speeds observed.

                        obs mean  sched mean  sched max  obs max  obs mean  sched mean  # sched  # obser-
                        length    length      speed      speed    speed     speed       trips    vations
obs mean length         0.9928    0.9822      0.0511     0.0592   0.1075    0.1331      -        -
sched mean length       0.9822    0.9887      0.0529     0.0222   0.1032    0.1346      -        -
sched max speed         0.0511    0.0529      0.6014     0.2650   0.2358    0.6147      -        -
obs max speed           0.0592    0.0222      0.2650     0.8345   0.8472    0.3328      -        -
obs mean speed          0.1075    0.1032      0.2358     0.8472   0.8873    0.3707      -        -
sched mean speed        0.1331    0.1346      0.6147     0.3328   0.3707    0.7121      -        -
# scheduled trips       -         -           -          -        -         -           0.9081   0.9123
# observations          -         -           -          -        -         -           0.9123   0.9773

(The diagonal compares the two directions of bi-directed segments.)

Table 2: Correlation matrix showing the interrelationship between the schedule and the observations.

To address things in order, the correlations between the two directions of bi-directed edges (the diagonal of Table 2) do seem to be uniformly high. The two lowest numbers on the diagonal, scheduled max speed and scheduled mean speed, 0.6014 and 0.7121 respectively, may indicate something interesting about the quality of the schedule data, since both comparable measures in the observed data are much more strongly correlated, as was expected. Alternatively, it may be that the way measurements were taken, from the intersections of stop clusters rather than from actual stop locations, made the observed data differ systematically from the schedule data. The high correlation between scheduled and observed lengths (0.9822) would, however, seem to undermine that possible explanation.

Our second assumption, that longer segments should allow higher average speeds, does not hold up well. The signs are right, but correlations ranging from 0.10 to 0.13 indicate a weak relationship at best.

The third assumption, that the schedule should closely reflect observed reality, does not seem to hold up well either. The correlations between observed and scheduled mean speeds (0.3707) and observed and scheduled max speeds (0.2650) indicate that the observations are fairly different, at least at this level, from what is in the schedule. Again, this does not seem obviously explicable by observational error because of the high correlation between observed and scheduled segment lengths. One further bit of reassurance lies in the correlation between the number of trips scheduled to traverse an edge and the number actually observed to be doing so. These values, all above 0.9, indicate that the observations do show vehicles operating where they should be in roughly the right proportion.
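The pairwise numbers in Table 2 are ordinary Pearson correlations; a minimal sketch, using numpy as in Appendix 1, of how any one cell can be computed from invented per-edge values:

    import numpy

    def correlation(a, b):
        """Pearson correlation between two per-edge variables."""
        return float(numpy.corrcoef(numpy.asarray(a, dtype=float),
                                    numpy.asarray(b, dtype=float))[0, 1])

    # invented per-edge values, one entry per directed edge
    observed_mean_speed = [21.8, 23.6, 30.1, 18.4, 26.9]
    scheduled_mean_speed = [22.9, 25.4, 28.0, 20.1, 24.3]
    print(round(correlation(observed_mean_speed, scheduled_mean_speed), 4))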

8.2 Some exemplary segments

Looking at the maps, particularly the animations, a few corridors stand out for having a large amount of both padding and scheduled trips. Three such corridors are highlighted in Maps 7, 8, and 9. Interestingly, they are all streets where several different routes come together in their approach to one of the transit terminals overlaying a subway station. Map 7 shows routes 34, 51, 54, 56, 100, and 103, all surface-running bus lines in mixed traffic, approaching the Eglinton Station bus terminal via Eglinton Avenue East. Map 8 shows routes 84, 101, 106 and 108 approaching Downsview Station from the west via Sheppard Avenue West. Map 9 shows 8 different routes on three separate approaches to two different subway stations. Six of these are bus routes and two are streetcar routes, all operating in mixed traffic.

Map 7: Eglinton Avenue East between Laird Drive and Yonge Street

Map 8: Sheppard Avenue West from Allen Road to Keele Street

Map 9: Broadview Avenue and Pape Avenue

If the stop at the subway station itself were the source of the time variability, we might expect to see the padding in the final segment before or after the station10, but the padding appears to be distributed somewhat evenly across the whole block approaching those stations. These are not only lines approaching subway stations; they are lines approaching their own ends, and it seems equally possible that padding in the final portion of a line might exist for either purpose. It is also the case that development in Toronto is generally much denser around subway stations [Figure 13], so it may be that padding correlates closely with some measure of density.

10 This is because the data does not model dwell times. If it did, we would expect to see the variability there instead.

Figure 13: Satellite image of the section of East Eglinton from Map 7, going from lower left to upper right. Source: Bing.

9. Concluding remarks

The initial results of this research are ambiguous and extremely interesting. Schedule padding is a critical part of all scheduled transit operations, but it is rarely discussed, never defined, and there is no standard way of measuring it. This paper makes an attempt to imagine a transit system without schedule padding, to subtract it from the actual schedule, and to examine the residue for signs. But we do not suppose that this is a sure method of effectively surfacing schedule padding in all situations, or even in the case study. An unpadded system is almost an impossibility, and trying to define one is like trying to define a doctrine of human rights: nice in theory, but never practical when the war is on – or even when the market is open. Still, we should try to have some standards. Further, for the Toronto Transit Commission, the GTFS schedule data does not seem to have been totally adequate to the job asked of it. Our limited knowledge of other cities' GTFS data indicates that many other published schedules may be of a similar quality. Since GTFS data is generally the best source of information on a system's schedule, and the schedule must be (one half of) the final word on schedule padding, the researcher of sub-optimal transit performance is presented with a dilemma.

9.1 Lessons learnt and directions for future research

There are a few distinct weaknesses in this work that are worth noting in the hope that others may improve upon them. Since weaknesses do not often comport themselves well to the exigencies of narrative, we will list them in no particular order.

• Overlapping and redundant edges: One of the weaknesses of GTFS as a data source is that it does not recycle basic street geometry. Stops are reused, but each route has several different "shapes" that bear no explicit relation to each other. These shapes were designed to ease map rendering on the computers and web browsers of transit users rather than to allow people to apply the methods of graph theory to an analysis of schedules. To approximate a graph where edges represent shared portions of a common street, we ignored the shapes completely and drew straight edges between clusters of stops. The clustering helped enormously in reducing visual clutter, but there were still many instances of edges that almost totally overlapped one another. This was a particular problem where express and local services share the same street, and even essentially the same route, but serve different sets of stops. A data model for GTFS, or one to which it could be linked, that made explicit and recycled the geometry of the actual streets, and that thereby made it clear that vehicles were or were not operating on a street, would have simplified the visualization enormously. One such data model is used by OpenStreetMap, where basic street segments can be organized into 'relations' that define a transit route that operates over many such segments [36].

• The method for selecting a minimum time necessary to traverse an edge was unsophisticated. While the spatial resolution of the NextBus data was good when it was working, the temporal resolution did not allow any obvious ways of detecting erroneous location reports when they happened to coincide roughly with the expected route. As we discussed, these positional jumps could result in realistic, but not real, high-speed observations where perhaps no such thing ever existed. A much higher temporal resolution is easily technically possible (for transit agencies to provide) and would allow more ways of detecting errors.

• The model described here does not account for dwell time11, which obviously must exist and must take up a large proportion of the total time a vehicle spends in service. Interestingly, TTC's GTFS data itself did not account for dwell time either, though GTFS is designed to do so. This is not completely problematic though, since one source of random delay is the need to stop (or not) for passengers who may (or may not) be waiting to get on or off. Still, to talk of average speeds without acknowledging that much time may be spent without moving at all may miss something important. A model that only measured dwell-time variability might be quite interesting for its presumed implications for schedule padding.

11 "Dwell time" is the time a vehicle spends dwelling, stationary, at a stop. It is usually distinguished from "travel time" or something like that, but the distinction begins to break down when one considers that most transit vehicles need not stop if no one is waiting. What portion of any extra time spent accelerating before and after a stop is allotted to which category of time?

• It may be that transit lines operate in sufficiently different environments at different times of day that something like a rolling ideal transit time might better be established than the fixed maximum speed that we sought to observe. This could be particularly true in cases where there are enough transit vehicles that they cause significant congestion for each other, apart from any other non-transit traffic. We expect though that this would require a longer period of observation to obtain a reasonably large sample for each part of the newly enlarged problem domain.

• This analysis depends enormously on the quality of the schedule data, and that schedule data seems to be imperfect in some important ways, at least as provided by the TTC. First, it appears that scheduled times for any given route were established between major intersections rather than between stops. This is distinctly visible in the scheduled versus observed maps, where the scheduled maps show more discrete speed changes at major intersections rather than near them as in the observed values. This makes sense, and fits with the way planners might be expected to chop up and adjust a schedule before computers were available to do the work. We seem to have incorrectly assumed that since the data was available at the level of stop-to-stop edges, it would be adjusted at that level. It may rather be the case that the stops between timepoints are ignored completely in the planning process, only later to be filled out by something like an automatic linear interpolation. The total neglect of dwell-time expectations would also seem to indicate that this may be the case. If this is indeed the way that the GTFS data are generated, agencies may be hurting themselves and their passengers in subtle ways. If passenger expectations are based on schedules that purport to be accurate to the second, it could be that many people are not able to time their arrivals as accurately as they think they can if the data does not actually resolve to anywhere near that level. This concern is compounded by the fact that any number of measures of on-time performance can be and are derived from GTFS data at the level of individual stops. It is fairly easy to imagine how historic vehicle location data, such as we had to derive from a real-time API, could be used to adjust padding and schedule times generally in an automatic or semi-automatic way. Such subtle improvements, if we are anywhere close to correct with our initial assumption, could potentially yield great benefits in on-time performance.
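To make that suggestion concrete, one very simple form such a semi-automatic adjustment could take is to propose, for each stop-to-stop segment, a running time that some chosen share of observed traversals already meet. This is a sketch with invented numbers, not a procedure used in this thesis:

    def suggest_segment_time(observed_durations, share=0.85):
        """Suggest a scheduled running time for one stop-to-stop segment.

        Picks the duration that a chosen share of observed traversals already
        beat; the share is a policy choice, not a finding of this thesis.
        """
        ordered = sorted(observed_durations)
        index = min(int(len(ordered) * share), len(ordered) - 1)
        return ordered[index]

    observed = [48, 52, 55, 58, 61, 64, 70, 77, 85, 110]  # invented seconds
    print(suggest_segment_time(observed))  # 85: roughly an 85th-percentile running time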

9.2 In Summary

All transit services can be thought of as existing on a spectrum of removal from sources of potential delay. A subway or 'metro' line is much less likely than a streetcar operating in mixed traffic to be delayed in rush hour by heavy traffic or high passenger loads. Such a metro might make every stop on every trip and spend a relatively fixed amount of time both dwelling at stops and traveling between them. The limits to speed that such transit encounters are much firmer than those encountered by the streetcar, which often has unpredictable stop patterns, let alone dwell and travel times. Such predictably unpredictable delays are accounted for in schedules that then, themselves, limit the speed of the vehicles still further. These scheduled delays are difficult to see once they have been put in place because they can be managed discretely by the driver, moment to moment, and because they help to establish normal travel times which few people see reason to question.

In the last decade, data has begun seeping out of transit agencies around the world. Detailed and standardized schedule data (GTFS), particularly when verified with actual systematic observations (real-time location APIs or other sources) can provide a way for people outside of those agencies to start to assess some of the trade-offs that their public servants are making on their behalf. One such trade-off is the one between how late vehicles generally are and how fast they can go. Another is between the time and money lost every day to random delay, and the money and political capital that would need to be spent to avoid it outright. Technological and political solutions to unpredictable delay exist and their costs are well documented. What is not so well known is the opportunity cost of not implementing them.

The delay caused by traffic and other random events and by the schedule's expectation of them is very real. If our preliminary measurements in one city are any indication, as much as 30% of all scheduled time may be spent preemptively coping with predictable but perhaps mostly avoidable delays. Of course, actually making improvements on that scale is not a realistic goal in the short term in almost any case, but such measures may provide a useful starting point for discussions about how improvement can proceed, and what sorts of interventions can be most effective.

References:

1: Jarrett Walker, Human transit: How clearer thinking about public transit can enrich our communities and our lives, Island Press, 2012.
2: J Kennedy, R Gorell, L Crinson, A Wheeler & M Elliott, Psychological Traffic Calming, Transport Research Laboratory, 2005.
3: Patti L Bass et al., Travel Forecasting Guidelines, 1994.
4: Gerasimos Skaltsas, Analysis of airline schedule padding on U.S. domestic routes, 2011.
5: Scott McCartney, Why a Six-Hour Flight Now Takes Seven, The Wall Street Journal, February 4th 2010.
6: Tao Yang, Yanning Zhang, Dapei Shao & Ying Li, Clustering method for counting passengers getting in a bus with single camera, Optical Engineering 49.3, 2010.
7: James H. Bookbinder & Frank J. Ahlin, Synchronized Scheduling and Random Delays in Urban Transit, European Journal of Operational Research 48.2 p204-218, 1990.
8: Maged Dessouky et al., Bus Dispatching at Timed Transfer Stations using Bus Tracking Technology, Transportation Research Part C: Emerging Technologies 7.4 p187-208, 1999.
9: Steven I-Jy Chien, Yuqing Ding & Chienhung Wei, Dynamic Bus Arrival Time Prediction with Artificial Neural Networks, Journal of Transportation Engineering, 2002.
10: Robert L Bertini & Ahmed El-Geneidy, Generating Transit Performance Measures with Archived Data, Transportation Research Record 1841.1 p109-119, 2003.
11: Saeed Zolfaghari, Nader Aziz & Mohamad Y Jaber, A Model for Holding Strategy in Public Transit Systems with Real-time Information, International Journal of Management 2.2 p99-110, 2005.
12: Iamin Zhao, Maged Dessouky & Satish Bukkapatnam, Optimal Slack Time for Schedule Based Transit Operations, Transportation Science 40.4 p529-539, 2006.
13: Brett Goldstein et al., Beyond Transparency, http://beyondtransparency.org/, 2013.
14: General Transit Feed Specification Reference, https://developers.google.com/transit/gtfs/reference, Google, Accessed April 2015.
15: Francisca Rojas, Transit Transparency: Effective Disclosure Through Open Data, Transparency Policy Project, 2012.
16: GTFS Data Exchange, http://www.gtfs-data-exchange.com/agencies, Accessed March 2015.
17: Bus app for blind developed at UW, http://www.king5.com/story/tech/2014/08/02/13034280/, November 21 2011.
18: What is GTFS-realtime?, https://developers.google.com/transit/gtfs-realtime/, Google, Accessed March 2015.
19: NextBus API Documentation, http://api-portal.anypoint.mulesoft.com/nextbus/api/nextbus-api/docs/concepts, 2015.
20: Brian Ferris, Kari Watkins & Alan Borning, Location-Aware Tools for Improving Public Transit Usability, Pervasive Computing 9.1 p13-19, 2010.
21: Aaron Antrim & Sean J. Barbeau, The many uses of GTFS data – opening the door to transit and multimodal applications, Location-Aware Information Systems Laboratory at the University of South Florida, 2013.
22: Robert L Bertini & Ahmed El-Geneidy, OneBusAway: Results from providing real-time arrival information for public transit, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2003.
23: Kari Edison Watkins, Brian Ferris, Alan Borning, Scott Rutherford & David Layton, Where Is My Bus? Impact of mobile real-time information on the perceived and actual wait time of transit riders, Transportation Research Part A: Policy and Practice 45.8 p839-848, 2011.
24: Alan Joyce, Automated Visualization of Public Transportation Time Schedules, 2011.
25: Yu-Shuen Wang & Ming-Te Chi, Focus + Context Metro Maps, Visualization and Computer Graphics 17.12 p2528-2535, 2011.
26: Sarah Elwood, Volunteered geographic information: future research directions motivated by critical, participatory, and feminist GIS, GeoJournal 72.3-4 p173-183, 2008.
27: To Tu Cuong, CrowdRoute: A Crowd-Sourced Routing Algorithm in Public Transit Networks, Proceedings of the Second ACM SIGSPATIAL International Workshop on Crowdsourced and Volunteered Geographic Information, 2013.
28: Arvind Thiagarajan, James Biagioni, Tomas Gerlich & Jakob Eriksson, Cooperative Transit Tracking Using Smartphones, Proceedings of the 8th ACM Conference on Embedded Networked Sensor Systems, 2010.
29: Patrick Brosi, Real-Time Movement Visualization of Public Transit Data, 2014.
30: Niels van Oort, Ties Brands & Rob MP Goverde, Optimizing Public Transportation Planning and Operations Using Automatic Vehicle Location Data: The Dutch Example, International Conference on Models and Technologies for Intelligent Transport Systems (MT-ITS), Dresden, 2-4 Dec. 2013.
31: L Gasparini et al., System and analytics for continuously assessing transport systems from sparse and noisy observations, Intelligent Transportation Systems, 2011.
32: Dublin Dashboard, http://www.dublindashboard.ie, Accessed March 2015.
33: Bo Edvardsson, Causes of customer dissatisfaction: studies of public transport by the critical incident method, Managing Service Quality 8.3 p189-197, 1998.
34: Toronto Transit Commission Website, https://www.ttc.ca/, Accessed March 2015.
35: GTFS Data Exchange - TTC, http://www.gtfs-data-exchange.com/agency/ttc/, Accessed March 2015.
36: Public Transport, OpenStreetMap Wiki, http://wiki.openstreetmap.org/wiki/Public_transport, Accessed March 2015.

Appendix 1: Code

Script, in Python, for requesting vehicle locations:

# working entirely with the nextbus feed:
# http://www.nextbus.com/xmlFeedDocs/NextBusXMLFeed.pdf
import xml.etree.ElementTree as ET # XML parsing
import requests # HTTP
import psycopg2 # DB interaction
import threading # timer

# DB connection
conn_string = "host='' dbname='' user='' password=''"
conn = psycopg2.connect(conn_string)

# API business
URL = 'http://webservices.nextbus.com/service/publicXMLFeed'
deflate_header = {'Accept-Encoding':'gzip, deflate'}
request_time = 0

# go ahead and get the starting track id from the table
cursor = conn.cursor()
cursor.execute('SELECT MAX(tid) from nb_vehicles')
current_tid = cursor.fetchone()[0] + 1

# one big function
def get_recent_vehicles():
    global current_tid
    global request_time
    # get updated vehicle locations
    payload = { 'command':'vehicleLocations', 'a':agency, 't':request_time }
    # if not connected, skip the rest of the function and try again after the specified delay
    try:
        vr = requests.get(URL, params=payload, headers=deflate_header, timeout=3)
    except requests.exceptions.ConnectionError:
        print 'not connected to the Internet'
        return
    except:
        print 'unknown error, probably "connection reset by peer"'
        return
    vxml = ET.fromstring(vr.text)
    # get the server time
    request_time = int(vxml.find('.//lastTime').attrib['time'])
    vehicles = vxml.findall('.//vehicle')
    for v in vehicles:
        # see what variables are set
        vid = v.attrib['id'] if 'id' in v.attrib else None
        lon = v.attrib['lon'] if 'lon' in v.attrib else None
        lat = v.attrib['lat'] if 'lat' in v.attrib else None
        direction = v.attrib['dirTag'] if 'dirTag' in v.attrib else None
        rid = v.attrib['routeTag'] if 'routeTag' in v.attrib else None
        secs_since_report = int(v.attrib['secsSinceReport']) if 'secsSinceReport' in v.attrib else None
        if vid not in tracks:
            # vehicle is new to us
            tracks[vid] = {
                't':current_tid,   # track
                'r':rid,           # route
                'd':direction,     # direction
                'u':request_time } # update time
            current_tid = current_tid + 1
        elif ( request_time - tracks[vid]['u'] > 60000) or (tracks[vid]['r'] != rid) or (tracks[vid]['d'] != direction ):
            # track needs to be re-started for one of three reasons
            tracks[vid] = {
                't':current_tid,   # track
                'r':rid,           # route
                'd':direction,     # direction
                'u':request_time } # update time
            current_tid = current_tid + 1
        else:
            # carry on
            tracks[vid]['u'] = request_time
        # insert each vehicle
        cursor.execute(
            """INSERT INTO nb_vehicles ( vid, location, rid, did, report_time, tid )
               VALUES ( %s, ST_SetSRID(ST_MakePoint(%s, %s),4326), %s, %s,
                        TIMESTAMP WITH TIME ZONE 'epoch' + %s * INTERVAL '1 millisecond', %s )""",
            ( vid, lon, lat, rid, direction, request_time - secs_since_report * 1000, tracks[vid]['t'] )
        )
    # after finished looping, commit
    conn.commit()
    print 'i:', len(vehicles), ', d:', delay, ', t:', current_tid

# the timer function!
def tloop():
    # timer reset
    threading.Timer(
        delay, # delay in seconds
        tloop  # function name
    ).start()
    # timer action
    get_recent_vehicles()

# dictionary of vehicle tracks
tracks = {}

# prompt user input
agency = raw_input('agency tag: ')
delay = int(raw_input('delay (seconds): '))

# initial call
tloop()

Python script for comparing real-time traversal of edges:

# go through one segment at a time
# and each track that matches it
# and calculate statistics for the segment
import psycopg2, numpy
import warnings
warnings.simplefilter("error")

# DB connection
conn_string = "host='' dbname='' user='' password=''"
conn = psycopg2.connect(conn_string)
c1 = conn.cursor()
c2 = conn.cursor()
c3 = conn.cursor()
c4 = conn.cursor()

# interpolate a time on a segment of a trip
# will be called many times, so should be
# as efficient as possible
def interpolate(time1,time2,m1,m2,i):
    # time1, time2 -- epoch times in seconds
    # m1, m2 -- measures between 0 and 1 along a full track
    # i -- intersection, between or equal to m1, m2
    if i == m1:
        return time1
    percent_of_segment = (i - m1) / (m2 - m1)
    additional_time = percent_of_segment * (time2 - time1)
    return time1 + additional_time

# clear the columns we're about to fill
#print "clearing the flag field"
#c1.execute("""UPDATE gtfs_ttc_segments SET flag = TRUE""" )
#conn.commit()

# get the segment ids which we'll query one at a time below
print 'getting a list of segments to use...'
c1.execute("""
    SELECT
        ARRAY_AGG(uid) AS idset,
        edge_id,
        SUM(tracks_used)
    FROM gtfs_ttc_segments
    WHERE flag = TRUE
    GROUP BY edge_id
    ORDER BY SUM(tracks_used) ASC
    LIMIT 20000
""")
print 'starting on the first one now...'
# SEGMENT level
for r in c1:
    uid = r[0][0]
    previous_tracks = r[2]
    print '\n','previously', previous_tracks
    if len(r[0]) == 1:
        # directed segment
        print '\n\nid:',uid
        directed = True
    else:
        # bi-directed segment
        uid2 = r[0][1]
        print '\n\nid:',uid,uid2
        directed = False
    # create variables to hold the lists
    all_times1, all_times2 = [],[]
    all_speeds1, all_speeds2 = [],[]
    all_errors1, all_errors2 = 0.0,0.0
    all_tids1, all_tids2 = [],[]
    all_distances1, all_distances2 = [],[]
    # get all of the matched tracks on uid
    c2.execute("""
        SELECT
            t.etimes,
            t.m,
            t.km,
            ST_LineLocatePoint( t.the_geom, ST_Centroid( ST_Intersection( s.c1_geom, t.the_geom) ) ) AS i1,
            ST_LineLocatePoint( t.the_geom, ST_Centroid( ST_Intersection( s.c2_geom, t.the_geom) ) ) AS i2,
            tid
        FROM nb_tracks AS t
        JOIN gtfs_ttc_segments AS s ON
            ARRAY[t.rid] <@ s.routes AND
            ST_Intersects(s.c1_geom, t.the_geom) AND
            ST_Intersects(s.c2_geom, t.the_geom)
        WHERE s.uid = %s
    """, (uid,) )
    # TRACK (matched) level
    for r2 in c2:
        # first assign variables to decent names
        times = r2[0]
        measure = r2[1]
        km = r2[2]
        i1 = r2[3]
        i2 = r2[4]
        tid = r2[5]
        distance = abs(i2 - i1) * km
        # skip track if going the wrong way on a one-way
        if i1 > i2 and directed:
            continue
        # now loop through each point on the current track looking for the intersections
        first = True
        atime1 = atime2 = None # times not yet found
        for m,t in zip(measure,times): # loop through pairs
            if first:
                first = False
                m1, t1 = m, t # m should always == 0 here
                continue
            m2, t2 = m, t
            # now we're in the proper part of the loop
            if m1 <= i1 < m2: # first intersection is between these
                atime1 = interpolate(t1,t2,m1,m2,i1)
            if m1 <= i2 < m2: # second intersection is between these
                atime2 = interpolate(t1,t2,m1,m2,i2)
            # set for the next iteration
            m1,t1 = m,t
        # now we have our times and need to verify that they are reasonable
        try:
            atime2 - atime1
        except TypeError:
            # in case one or more times is not found
            if i1 < i2:
                all_errors1 += 1
            elif not directed:
                all_errors2 += 1
            continue
        span = abs( atime1 - atime2 )
        # reset these variables for the next iteration
        atime1 = atime2 = None
        # check that the calculated time is plausible before storing it
        min_time = distance * (3600/120.0) # 120kmph
        max_time = distance * (3600/0.1)   # 0.1kmph
        if min_time < span < max_time:
            speed = distance / (span / 3600)
            if i1 < i2: # is on uid
                all_tids1.append(tid)
                all_distances1.append( distance )
                all_times1.append( span )
                all_speeds1.append( speed )
            elif not directed:
                all_tids2.append(tid)
                all_distances2.append( distance )
                all_times2.append( span )
                all_speeds2.append( speed )
        else:
            # time was an outlier
            if i1 < i2:
                all_errors1 += 1
            elif not directed:
                all_errors2 += 1
    # print at end of segment level loop
    all_times1.sort()
    all_speeds1.sort()
    if not directed:
        all_times2.sort()
        all_speeds2.sort()
    print '# tracks: ', c2.rowcount
    print '# used1:  ', len(all_tids1)
    print '# used2:  ', len(all_tids2)
    print 'errors:   ', all_errors1 + all_errors2
    if len(all_tids1) > 1:
        print '% error:  ', (all_errors1 + all_errors2) / (len(all_tids1) + len(all_tids2)) * 100.0
        print 'mean span:', numpy.mean(all_times1)
        print 'mean dist:', numpy.mean(all_distances1)
        print 'span std: ', numpy.std(all_times1)
        print 'kmph:     ', numpy.mean(all_speeds1)
    if len(all_times1) > 1:
        # insert into DB
        c3.execute("""
            UPDATE gtfs_ttc_segments SET
                mean_span = %s * interval '1 second',
                span_std = %s * interval '1 second',
                spans = %s,
                tracks_ignored = %s,
                speeds_kmph = %s,
                mean_kmph = %s,
                mean_length = %s,
                lengths = %s,
                tids = %s,
                tracks_used = %s,
                flag = %s
            WHERE uid = %s;
        """, (
            numpy.mean(all_times1),
            numpy.std(all_times1),
            all_times1,
            all_errors1,
            all_speeds1,
            numpy.mean( all_speeds1 ),
            numpy.mean( all_distances1 ),
            all_distances1,
            all_tids1,
            len(all_tids1),
            False,
            uid
        ) )
    # only for bidirected do we do this twice
    if not directed:
        if len(all_tids2) > 1:
            c3.execute("""
                UPDATE gtfs_ttc_segments SET
                    mean_span = %s * interval '1 second',
                    span_std = %s * interval '1 second',
                    spans = %s,
                    tracks_ignored = %s,
                    speeds_kmph = %s,
                    mean_kmph = %s,
                    mean_length = %s,
                    lengths = %s,
                    tids = %s,
                    tracks_used = %s,
                    flag = %s
                WHERE uid = %s;
            """, (
                numpy.mean(all_times2),
                numpy.std(all_times2),
                all_times2,
                all_errors2,
                all_speeds2,
                numpy.mean( all_speeds2 ),
                numpy.mean( all_distances2 ),
                all_distances2,
                all_tids2,
                len(all_tids2),
                False,
                uid2 # CRITICAL 2 here
            ) )
    conn.commit()

SQL for clustering GTFS stops:

DROP TABLE IF EXISTS gtfs_ttc_merged_stops;

WITH buffers AS (
    SELECT ST_Buffer(the_geom, 30) AS buffer
    FROM gtfs_ttc_stops
    WHERE NOT subway_only
), dumper AS (
    SELECT ST_Dump(ST_Multi(ST_Union(buffer::geometry))) AS d
    FROM buffers
)
SELECT
    (d).geom::geography AS the_geom,
    ST_Centroid((d).geom)::geography AS centroid
INTO gtfs_ttc_merged_stops
FROM dumper;

ALTER TABLE gtfs_ttc_merged_stops
    ADD COLUMN cluster_id serial,
    ADD COLUMN stop_ids varchar[];

-- add stop_id's to cluster
WITH aggs AS (
    SELECT
        tm.cluster_id,
        array_agg(stop_id) AS the_stops
    FROM gtfs_ttc_merged_stops AS tm
    JOIN gtfs_ttc_stops AS s
        ON ST_Intersects(s.the_geom, tm.the_geom)
    WHERE NOT s.subway_only
    GROUP BY tm.cluster_id
)
UPDATE gtfs_ttc_merged_stops AS thm
SET stop_ids = the_stops
FROM aggs
WHERE thm.cluster_id = aggs.cluster_id;

--ALTER TABLE gtfs_ttc_stops ADD COLUMN cluster_id integer;
UPDATE gtfs_ttc_stops SET cluster_id = NULL;
UPDATE gtfs_ttc_stops AS s
SET cluster_id = cs.cluster_id
FROM gtfs_ttc_merged_stops AS cs
WHERE ARRAY[s.stop_id] <@ cs.stop_ids;

--ALTER TABLE gtfs_ttc_stop_times ADD COLUMN cluster_id integer;
UPDATE gtfs_ttc_stop_times SET cluster_id = NULL;
UPDATE gtfs_ttc_stop_times AS st
SET cluster_id = s.cluster_id
FROM gtfs_ttc_stops AS s
WHERE s.stop_id = st.stop_id;

SQL for deriving tracks from reported vehicle locations:

-- delete records of small tracks
WITH sub AS (
    SELECT tid, COUNT(DISTINCT location) AS count
    FROM nb_vehicles
    GROUP BY tid
)
DELETE FROM nb_vehicles AS v
USING sub AS s
WHERE s.tid = v.tid AND count < 5;

-- prep the table and do the big select into
DROP TABLE IF EXISTS nb_tracks;
SELECT
    tid,
    rid,
    did,
    COUNT(*) AS count,
    array_agg(EXTRACT(EPOCH FROM report_time) ORDER BY report_time ASC) AS etimes,
    array_agg(report_time ORDER BY report_time ASC) AS times,
    array_agg(uid ORDER BY report_time ASC) AS stop_ids,
    array_agg(location ORDER BY report_time ASC) AS stops,
    ST_MakeLine(location::geometry ORDER BY report_time ASC) AS the_geom,
    -- leave these here for later...
    NULL::real AS km,
    NULL::real[] AS m,
    NULL::real AS azimuth
INTO nb_tracks
FROM nb_vehicles
GROUP BY tid, rid, did;

-- add length measurement
UPDATE nb_tracks SET km = ST_Length(the_geom::geography, FALSE)/1000;
DELETE FROM nb_tracks WHERE km < 0.3;

-- add the point measures along each track
WITH sub AS (
    SELECT
        t.tid,
        array_agg(
            ST_LineLocatePoint( t.the_geom::geometry, v.location::geometry )
            ORDER BY report_time ASC
        ) AS m
    FROM nb_tracks AS t
    JOIN nb_vehicles AS v ON t.tid = v.tid
    GROUP BY t.tid
)
UPDATE nb_tracks
SET m = sub.m
FROM sub
WHERE nb_tracks.tid = sub.tid;

-- azimuth of start -> end points in degrees
UPDATE nb_tracks
SET azimuth = ST_Azimuth(
    ST_StartPoint(the_geom)::geography,
    ST_EndPoint(the_geom)::geography
)/(2*pi())*360;

Appendix 2: Example padding calculation

It will be useful to give a less abstract example of the way schedule padding has been calculated throughout this thesis. To that end, a typical edge from the TTC network graph has been selected at random, and the process of isolating its padding will be described below.

The example edge is pictured on the right and summarily described in the table below. After the table follows a histogram showing the distribution of speeds observed on each segment. From each of these distributions, an observation was taken from the start of the 9th decile (90th percentile) and used as the selected minimum speed. These values are shown in the table.

Figure: The example edge, with travel in two directions.

                                    Traveling Northwest       Traveling Southeast
Routes in operation                 24, 324                   24, 144, 324
Trips scheduled per week            1,287                     1,354
Tracks matched to edge              1,674                     1,710
Mean observed length                0.311 km                  0.361 km
Mean observed speed                 23.612 kmph               21.807 kmph
Mean scheduled speed                25.425 kmph               22.878 kmph
Service hours scheduled             13:42:27 (13.708 hours)   20:48:53 (20.815 hours)
Selected minimum speed              47.05 kmph                51.09 kmph
Time required at selected minimum   23.801 seconds            25.439 seconds
Padding per km at selected minimum  46.75 seconds             82.82 seconds

For the northwest edge, we calculate the padding in seconds (per trip, per kilometer) as follows:

46.75 = ((13.708 ⋅ 3600 − 1287 ⋅ 23.801) / 1287) / 0.311

For the southeast edge, we calculate the padding in seconds (per trip, per kilometer) as follows:

82.82 = ((20.815 ⋅ 3600 − 1354 ⋅ 25.439) / 1354) / 0.361
