On the Accuracy of Schedule-Based GTFS for Measuring Accessibility


Nate Wessel¹ and Steven Farber²
¹ Department of Geography and Planning, University of Toronto
² Department of Human Geography, University of Toronto Scarborough
April 9, 2019

Abstract

In this paper we assess the accuracy with which General Transit Feed Specification (GTFS) schedule data can be used to measure accessibility by public transit as it varies over space and time. We use archived Automatic Vehicle Location (AVL) data from four North American transit agencies to produce a detailed reconstruction of actual transit vehicle movements over the course of five days in a format that allows for travel time estimation directly comparable to schedule-based GTFS. With travel times estimated on both schedule-based and retrospective networks, we compute and compare a variety of accessibility measures. We find that origin-based accessibility even when averaged over one-hour periods can vary widely between locations. Origins with lower scheduled access tend to produce less reliable estimates with more variability from hour to hour in real accessibility, while higher access zones seem to converge on an estimate 5-15% lower than the schedule predicts. Such over- and under-predictions exhibit strong spatial patterns which should be of concern to those using accessibility metrics in statistical models. Momentary measures of accessibility are briefly discussed and found to be weakly related to momentary changes in real access. These findings bring into question the validity of some recent applications of GTFS data and point the way toward more robust methods for calculating accessibility.

1 Introduction

Over the last decade, the General Transit Feed Specification (GTFS) has been established as a standard format for exchanging information on scheduled transit operations.
GTFS was designed to enable point-to-point routing applications for transit users, and around a thousand (Zervaas, 2018) transit agencies around the world are now engaged in creating and updating GTFS datasets, primarily for the role these play in enabling popular applications such as Google Transit (Antrim, Barbeau, et al., 2013). Transport researchers, however, have been quick to see other applications for this rich new source of data; the same format that allows efficient point-to-point route-finding in user-facing applications also allows for the efficient estimation of transit travel times between many points all across a city or even a country (Owen and D. Levinson, 2016). The large travel time datasets thus derived have been used as the basis for studies of accessibility by transit in a number of different contexts. These applications range from a concern with an equitable distribution of access (e.g. Pereira et al., 2018; M. Widener et al., 2015) to accessibility as an input to mode-share models (e.g. Owen and D. Levinson, 2015; Boisjoly and El-Geneidy, 2016) to accessibility as a metric for use in transport planning decisions (e.g. Farber and Grandez, 2016; Stewart, 2017b; Farber and Fu, 2017).

It is important to remember, however, that GTFS data is only a schedule, an expectation for future transit service, and not necessarily a realistic description of service as it actually happens. Transit vehicles often run late, get stuck in traffic, depart the station early, and require detours around obstacles. Thus there is potentially a large gap between the accessibility we would expect based only on a schedule and the accessibility that people actually experience in the real world. If this is so, then our measures of accessibility may have substantial error or, more likely, may be systematically biased.
This brings into question the validity and accuracy of numerous studies that make use of GTFS-based travel time calculations and accessibility scores, and may suggest the need either to explicitly acknowledge the limitations of schedule-based analyses or to find a more realistic way of measuring accessibility.

While there have been numerous studies assessing schedule adherence as a general performance metric (e.g. El-Geneidy, Horning, and Krizek, 2011; Bertini and El-Geneidy, 2003), very few have yet considered the network effects of schedule non-adherence on door-to-door travel time estimates and the aggregate accessibility measures derived from them. This study takes four North American transit agencies as case studies:

• the Toronto Transit Commission (TTC)
• the Jacksonville Transportation Authority (JTA)
• the Massachusetts Bay Transportation Authority (MBTA)
• the San Francisco Municipal Transportation Agency (SF Muni)

For each agency, we use archived Automatic Vehicle Location (AVL) data to construct a routable ground-truth dataset to which the schedule-based GTFS data can be compared directly. Our assumption is that observations of the transit fleet based on AVL systems, while imperfect, are likely much more accurate representations of what happens on the ground than schedules produced before the events actually take place. These paired datasets, which we refer to as schedule-based and retrospective, are then each used to estimate travel times and accessibility scores across the four agencies at every minute of the morning and evening peak periods over the course of five weekdays. Our analytic strategy is to calculate identical accessibility scores with the schedule-based and retrospective datasets and compare these to understand the nature of any systematic and/or random differences between them.

The first goal of this paper is to uncover any systematic bias present in schedule data.
Schedule-based accessibility measures that are systematically too high or too low could bias a comparison of transit accessibility with other modes or between agencies and regions. A second goal is to find out how much observed levels of accessibility vary around these typical values, and whether that variation exhibits strong spatial patterns. This should give researchers some idea how confident they should be when using schedule data to estimate accessibility levels at any particular time or place.

2 Background

2.1 Review of GTFS Accessibility Studies

Travel time estimates have long been used as a way of understanding transport accessibility. With the rise of GTFS data, detailed time-based accessibility studies for public transit were made possible at a large scale, and there has been a great deal of recent work making use of schedule-based transit accessibility measures. At their core these studies use the cost of travel, usually measured only in total travel time¹, as the basis for assessing the comparative utility of transit from different places and times. It would be impossible to mention every application, but our purpose in this section is to review some of the more common themes and to survey the methods employed.

One of the most common applications of GTFS in the literature is to assess urban accessibility in terms of environmental justice. For example, Farber, Morang, and M. Widener (2014) use GTFS to look at small-scale temporal variability in transit access to supermarkets in Cincinnati, finding that few of the city's low-income residents have adequate access to healthy food. Pereira et al. (2018) look at the social distribution of change in access to schools and jobs in Rio de Janeiro after a major restructuring of the transit system. Fransen et al.
(2015) combine socio-demographic data and a GTFS accessibility metric to look for gaps in service provision in Belgium where households without cars are not collocated with high levels of transit access. And El-Geneidy, Buliung, et al. (2016) compare the spatial distribution of transit access to the relative social advantage of neighborhoods in Toronto.

¹ For examples of other metrics applied to accessibility, see El-Geneidy, D. Levinson, et al. (2016) and Cui and D. Levinson (2018).

Another common theme is the use of GTFS to project changes in accessibility from the current state to a future planned state. Researchers can do this by taking a published GTFS package as the status quo and then simply adding in some proposed transit line while holding other service more or less constant. For example, Ma and Jan-Knaap (2014) make use of this technique to demonstrate an expected change in access to jobs that would result from the development of a proposed light rail project in Maryland. Farber and Grandez (2016) explore competing transit development schemes in the Greater Toronto area. And J. Lee and Miller (2018) compare accessibility outcomes before and after a proposed bus rapid transit project in Columbus, Ohio. Conway, Byrd, and Eggermond (2018) discuss some of the practical problems with fabricating GTFS schedule data before a project is built and suggest strategies for estimating accessibility with a variety of alternative schedules for proposed services.

Other applications of GTFS accessibility analysis include assessing historical levels of access over time within a single city (Farber and Fu, 2017), estimation of block-group level transit mode share at a metropolitan scale (Owen and D. Levinson, 2015), and measuring students' access to campus as related to activity participation (Allen and Farber, 2018b).

Such applications make use of a wide range of techniques for calculating accessibility.
As transit vehicles arrive and depart at discrete times, transit travel times can vary widely from moment to moment (Anderson, Owen, and D. M. Levinson, 2012) and there has been some disagreement on how large a temporal sample is necessary to generate a representative travel time and thus accessibility metric. Estimating travel times from GTFS can be a computationally expensive process and many are reluctant to take a larger sample than they feel is necessary for their purposes (Stepniak et al., 2019). At one extreme, researchers have picked just one time as representative (e.g. 8am for the morning commute) and estimated travel times from that moment only (e.g. Ma and Jan-Knaap, 2014; M. Widener et al., 2015). Boisjoly and El-Geneidy (2016) conducted a comparative analysis of time-sensitive transit accessibility measures, finding them to be generally correlated and appropriate at least for a mode share regression model.
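
To make the temporal sampling question concrete, the following is a minimal sketch, not the method used in this paper. It assumes a simple cumulative-opportunity measure (opportunities reachable within a fixed travel time threshold) and contrasts a single-departure snapshot with an average over several departure minutes, for both a scheduled and an observed set of travel times. The function names, the 45-minute threshold, and all of the numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean

def cumulative_access(travel_times, opportunities, threshold_min=45):
    """Count opportunities reachable within threshold_min from one origin.

    travel_times: dict of destination id -> minutes (None if unreachable).
    opportunities: dict of destination id -> opportunity count (e.g. jobs).
    """
    return sum(
        opportunities[dest]
        for dest, minutes in travel_times.items()
        if minutes is not None and minutes <= threshold_min
    )

def access_over_window(times_by_minute, opportunities, threshold_min=45):
    """Accessibility score at each sampled departure minute, in order."""
    return [
        cumulative_access(tt, opportunities, threshold_min)
        for tt in times_by_minute
    ]

# Made-up toy data: one origin, three destinations, four departure minutes.
opportunities = {"A": 100, "B": 250, "C": 50}
scheduled = [
    {"A": 20, "B": 40, "C": 50},
    {"A": 25, "B": 44, "C": 55},
    {"A": 30, "B": 48, "C": 44},
    {"A": 20, "B": 41, "C": 46},
]
observed = [
    {"A": 22, "B": 47, "C": 58},
    {"A": 28, "B": 52, "C": 60},
    {"A": 33, "B": 44, "C": 49},
    {"A": 24, "B": 46, "C": None},  # destination C unreachable at this minute
]

sched_scores = access_over_window(scheduled, opportunities)
obs_scores = access_over_window(observed, opportunities)

# A single-moment estimate uses only the first departure minute.
snapshot_ratio = obs_scores[0] / sched_scores[0]
# A window-averaged estimate uses every sampled minute.
averaged_ratio = mean(obs_scores) / mean(sched_scores)

print("scheduled:", sched_scores, "observed:", obs_scores)
print("snapshot ratio:", round(snapshot_ratio, 3))
print("averaged ratio:", round(averaged_ratio, 3))
```

In this toy case the per-minute scores jump sharply as individual destinations cross the threshold, so the snapshot and window-averaged ratios of observed to scheduled access differ noticeably. A threshold-based measure is only one of several possible choices, and a real analysis would sample many more departure times, origins, and destinations.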