
Transit Fare Policy: Use of Automated Data to Improve Incremental Decision Making

By

Andrew W. Stuntz

S.B. in Economics Institute of Technology (2013)

Submitted to the Department of Civil and Environmental Engineering and the Department of Urban Studies and Planning in partial fulfillment of the requirements for the degrees of

Master of Science in Transportation and Master in City Planning

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June 2018

© 2018 Massachusetts Institute of Technology. All rights reserved.

Author: Department of Civil and Environmental Engineering; Department of Urban Studies and Planning; May 11, 2018

Certified by: John P. Attanucci, Research Associate, Center for Transportation and Logistics; Thesis Supervisor

Certified by: Frederick P. Salvucci, Senior Lecturer, Center for Transportation and Logistics; Thesis Supervisor

Accepted by: Professor of the Practice, Ceasar McDowell; Chair, MCP Committee; Department of Urban Studies and Planning

Accepted by: Jesse Kroll, Professor of Civil and Environmental Engineering; Chair, Graduate Program Committee


Transit Fare Policy: Use of Automated Data to Improve Incremental Decision Making by Andrew W. Stuntz

Submitted to the Department of Civil and Environmental Engineering on May 11, 2018 in Partial Fulfillment of the Requirements for the Degrees of Master of Science in Transportation and Master in City Planning

Abstract

Incremental changes in fare policy can have substantial and long-term impacts on transit ridership and revenue, but they are often driven by near-term revenue needs and determined within short time frames with limited analysis. This thesis proposes a procedural framework to organize analysis of incremental fare changes, linking exploration of current pricing strategies to estimation of behavioral parameters and modeling of fare change scenarios. Within this framework, empirical case studies are presented at two of the five largest transit agencies in the U.S. – the Massachusetts Bay Transportation Authority (MBTA) and the Chicago Transit Authority (CTA). These agencies have increased the price of passes relative to pay-per-use in recent years, motivating three particular applications that make extensive use of automated fare collection (AFC) data: 1) differentiating employer-based, pre-tax, automatically-renewing pass sales from other pass sales, 2) estimating cost sensitivity of both ridership frequency and fare product choice using only recent experience at a single agency, and 3) incorporating fare product choice in a traditional elasticity spreadsheet model to predict impacts of fare change scenarios. Passes sold through employer programs and online are found to have lower use than other passes, contributing substantially to revenue while increasing ridership; expanding these programs or extending tax benefits to all transit commuters could further increase revenue and ridership. Individual-level AFC data are used to estimate fare-related behavioral parameters: resulting MBTA elasticity estimates of -0.7 for pay-per-use and -0.5 for employer-based passes are higher than current agency assumptions of -0.25 and -0.15, use of a CTA 30-day or 7-day pass appears to boost a customer’s ridership by up to 11% or 21% (respectively), and a CTA product choice model is estimated without reliance on stated preference data. A CTA fare model combining product choice and elasticities predicts substantial switching between fare products when pass multiples are changed, and a simplified model illustrates that passes should be priced below revenue maximization to capture low-cost gains in ridership. The procedural framework in this thesis applies to all transit agencies, and the empirical applications are relevant to agencies that collect AFC data and offer multiple payment structures.

Thesis Supervisor: John P. Attanucci
Title: Research Associate, Center for Transportation and Logistics

Thesis Supervisor: Frederick P. Salvucci
Title: Senior Lecturer, Center for Transportation and Logistics


Acknowledgments

I am grateful to many people for supporting this work and supporting me over the past three years.

I would like to thank the CTA and the MBTA for providing the financial, administrative, and supervisory support that made this research possible.

At the CTA, Maulik Vaishnav provided invaluable guidance and essential context for my work. Thank you for taking an interest in this project and giving me a window into the CTA (even from afar). I am grateful to many others as well: to Jeremy Fine, Scott Wainwright, Alex Cui, Michael Connelly, Sonali Tandon, and others who provided lively discussion and insightful feedback during my visits to Chicago; to Tom McKone, Carole Morey, Laura De Castro, Silvia Garcia, and President Carter for supporting and facilitating this research; and to Marge Keller, Garrett Vandendries, and Bryan Post for always making sure I had what I needed.

At the MBTA, I would not have gotten far without Laurel Paget-Seekins and Ian Thistle. I am grateful for your extensive knowledge, dedication to improving the status quo, and good humor. The same goes for Anna Gartsman and the rest of OPMI, Evan Rowe, Brendan Fogarty, Steven Andrews, and Annette Demchur – thank you all for your support and advice, and for valuing the ideas of a lowly grad student. A special thanks to Jim D'Arcangelo, Bill Haberlin, Will Kingkade, Lynne O’Neill, and Prateek Agarwal for helping me access and understand different fare-related data.

At MIT, my research advisors, John Attanucci and Fred Salvucci, provided valuable direction and practical wisdom throughout my program, urging me every week toward a mix of idealism about what urban transit could look like and pragmatism about how to get there. Thank you for the confidence you placed in me, and for always making clear that my well-being was even more important than my research. The other faculty in the MIT Transit Lab — Jinhua Zhao, Nigel Wilson, and Haris Koutsopoulos — created a wonderful environment to share and develop ideas; I have learned a great deal from Friday seminars and will miss attending. Gabriel Sanchez-Martinez and Jay Gordon made my work possible by building the data infrastructure for AFC-based research at the MBTA, and they helped me many times to use it more efficiently and effectively. I am grateful to my MST cohort (especially Josh, David, Malivai, Adam, Scott, Sid, and Alex), my MCP cohort (especially my Intro to HCED and Brockton finance / life sciences buddies), fellow MBTA and CTA RAs (Chris, Catherine, Josh, David, Mihir, Katya, Gabe, Eli, Apaar, Ari, Ru, and Jintai), and the rest of the Transit Lab for being kind friends and generous colleagues. I thank my DUSP instructors and academic advisor Karl Seidman for helping me to see and imagine the world differently.

I am also very thankful for my church family at Christ the King Presbyterian Church and my immediate family. This master’s program was a gift, but you have been with me through both sorrows and joys. Most of all, I am grateful for Emily. Thank you for knowing me and still loving me, and for being my companion in grad school and for life.


Table of Contents

1 Introduction
  1.1 Motivation
  1.2 Research Scope and Questions
  1.3 Methodology
  1.4 Thesis Organization
2 Case Study Background
  2.1 MBTA
  2.2 CTA
  2.3 Conclusions
3 Literature Review
  3.1 Fare Policy Elements
  3.2 Pricing Theory and Strategy
  3.3 Exploratory Analysis of Fare-Related Behavior Using Automated Transit Data
  3.4 Demand Modeling and Fare Scenario Prediction
4 A Framework for Incremental Transit Fare Policy Analysis
  4.1 Step 1: Identify Current Pricing Strategies
  4.2 Step 2: Describe and Segment Transit Use
  4.3 Step 3: Model Demand by Fare Structure and Market Segment
  4.4 Conclusions
5 Describing the Role of Pass Sale Channels
  5.1 Potential Significance of Pass Sale Channels and Empirical Questions
  5.2 Pass Sale Channels at the MBTA and CTA
  5.3 Prior Work on Pass Sale Channels
  5.4 Using AFC Data to Describe the Role of Pass Sale Channels at the CTA and MBTA
  5.5 Conclusions and Implications
6 Estimating Behavioral Parameters of Fare Product Purchase and Use
  6.1 Introduction
  6.2 Empirical Setting
  6.3 Estimation of Induced Ride Factors at the CTA
  6.4 Estimation of Elasticities at the MBTA
  6.5 Estimation of Fare Product Choice Utility Parameters at the CTA
  6.6 Summary and Conclusions
7 Incorporating Fare Product Choice in Modeling of Fare Change Scenarios
  7.1 Introduction
  7.2 Methodology
  7.3 Model Demonstration
  7.4 Model Evaluation (January 2018 Fare Change)
  7.5 Conclusions and Extensions
8 Summary and Conclusions
  8.1 Key Findings
  8.2 Overall Conclusions and Recommendations
  8.3 Research Extensions
References
Appendix: Cluster-Based Segmentation Using AFC Data
  Motivation and Literature Review
  Data and Features
  K-Means Clustering
  Conclusions and Extensions


List of Tables

Table 1-1: Case Study Application Questions
Table 2-1: MBTA Selected Transit Service Characteristics (2015)
Table 2-2: MBTA Modal Ridership by Minority Status
Table 2-3: Selected MBTA Fares, July 2012-Present ($)
Table 2-4: MBTA Fare Revenue by Sale Channel, FY2015
Table 2-5: CTA Selected Transit Service Characteristics (2016)
Table 2-6: Selected CTA Fares, January 2009-Present ($)
Table 2-7: CTA System Fare Revenue by Sale Channel, 2017
Table 3-1: Single-Objective Pricing Rules
Table 3-2: Example of Elasticity Adjustment Logic in CTA Fare Models
Table 4-1: Initial Parameter Assumptions for Pass Pricing Model Example
Table 5-1: Sale Channels at the MBTA and CTA
Table 5-2: MBTA Full-Fare Pass Sales by Sale Channel, FY2017 (July 2016 – June 2017)
Table 5-3: CTA Full-Fare Pass Sales by Sale Channel, 2017
Table 5-4: Monthly Revenue Loss if MBTA Corporate Monthly LinkPasses Were Eliminated (October 2016)
Table 5-5: Monthly Revenue Loss if CTA Pre-Paid Benefits 30-Day Passes Were Eliminated (Passes Sold in October 2017)
Table 6-1: Example Classification of Individual Changes in Sales and Ridership
Table 6-2: Example Aggregation of Individual-Level Changes in Sales and Ridership
Table 6-3: Calculation of Unadjusted Elasticities for MBTA LinkPasses
Table 6-4: Difference Between Assumed Pay-Per-Use Elasticity and Observed Change in Pay-Per-Use Ridership
Table 6-5: Calculation of Adjusted Elasticities for MBTA LinkPasses
Table 6-6: Example Data Format for Choice Model Estimation
Table 6-7: Fare Product Market Shares in Sample and Population of Daily Choices
Table 6-8: Weights for Choice Model Estimation
Table 6-9: Fare Product Choice Model Estimates
Table 6-10: Summary of Fare-Related Parameters and Estimation Methods
Table 7-1: Model Baseline Trips by Trip Type (Millions)
Table 7-2: Multinomial Logit Utility Parameters for Fare Product Choice
Table 7-3: Calibration of Synthetic Baseline Fare Product Market Shares to Observed Baseline Shares
Table 7-4: Calibration of Pass Revenue per Customer-Week to Observed Pass Sales
Table 7-5: Model Elasticity and Induced Ride Factor Assumptions
Table 7-6: FERRET Predicted Bus and Rapid Transit Ridership With and Without Diversion Factors (MBTA July 2016 Fare Change Scenario, Selected Full-Fare Tariff Types)
Table 7-7: Predicted Ridership and Revenue Impacts for Selected Fare Change Scenarios
Table 7-8: Comparison of Predicted and Actual Early 2018 YOY Ridership and Revenue Impacts of the January 2018 Fare Change


List of Figures

Figure 1-1: Fare Policy “Life Cycle”
Figure 2-1: MBTA Ridership and Passenger Miles by Mode (2015)
Figure 2-2: MBTA Unlinked Passenger Trips by Mode (2002 – 2017)
Figure 2-3: MBTA Service Area (2011)
Figure 2-4: Racial Dot Map of Boston (2010)
Figure 2-5: Income Dot Map of Boston (2010)
Figure 2-6: MBTA Fiscal Year 2015 Revenues and Expenses
Figure 2-7: Revenue by MBTA Fare Structure Dimension, FY2015
Figure 2-8: Bus and Rapid Transit Ridership (AFC Taps) by MBTA Fare Structure Dimension, FY2015
Figure 2-9: MBTA Monthly Ridership (AFC Taps) and Fare Revenue Before and After Fare Change (FY15-17)
Figure 2-10: MBTA Monthly Year-Over-Year (YOY) Change in Ridership (AFC Taps) and Fare Revenue Before and After Fare Change
Figure 2-11: MBTA Organizational Chart
Figure 2-12: CTA Ridership and Passenger Miles by Mode (2016)
Figure 2-13: CTA Unlinked Passenger Trips by Mode (2002 – 2017)
Figure 2-14: CTA ‘L’ (Left) and Metra (Right) Maps
Figure 2-15: Racial Dot Map of Chicago (2010)
Figure 2-16: Income Dot Map of Chicago (2010)
Figure 2-17: CTA 2016 Revenues and Expenses
Figure 2-18: Revenue by CTA Fare Structure Dimension, 2017
Figure 2-19: Ridership by CTA Fare Structure Dimension, 2017
Figure 2-20: CTA Ridership by Tariff, 2009 - 2017
Figure 3-1: Conceptual Elements of Fare Policy
Figure 3-2: Example Metrics for Multi-Criteria Decision-Making
Figure 3-3: MBTA Bus and Subway Validations Dashboard, October 2015
Figure 3-4: Time Distribution of MBTA Bus and Subway Validations by Time Period and 4 Pass Type, October 2015
Figure 3-5: MBTA FERRET Model Baseline (FY15) Revenue and Ridership by Assumed “Mid” Elasticity
Figure 3-6: Typology of Parameter Estimation and Scenario Modeling Methods
Figure 4-1: Three Key Fare-Related Customer Behaviors
Figure 4-2: Assumed Weekly Trip Frequency Distribution for Pass Pricing Model Example
Figure 4-3: Revenue and Ridership Changes Relative to Pay-Per-Use-Only Baseline for Alternative Pass “Multiples” (Under Initial Parameter Assumptions)
Figure 4-4: Sensitivity of Ridership, Revenue, and “Optimal” Pass Multiples to Parameter Assumptions
Figure 5-1: Employee Market for Passes With Introducing a Pre-Tax Employer-Based Pass Program (Impact on Employees)
Figure 5-2: Employee Market for Passes With Introducing a Pre-Tax Employer-Based Pass Program (Impact on Transit Agency)
Figure 5-3: MBTA Monthly LinkPass Frequency of Use (Taps), October 2016
Figure 5-4: MBTA Monthly LinkPass Time of Use, October 2016
Figure 5-5: MBTA Monthly LinkPass Transit Mode Shares, October 2016
Figure 5-6: CTA Pass Frequency of Use (Taps), October 2017
Figure 5-7: CTA Pass Time of Use, October 2017
Figure 5-8: CTA Pass Transit Mode Shares, October 2017
Figure 5-9: Use Value of MBTA Monthly LinkPasses Sold in October 2016
Figure 5-10: Use Value of CTA Passes Sold in October 2017
Figure 5-11: Relative Change in MBTA Monthly LinkPass Sales by Sale Channel, FY16-17
Figure 5-12: Relative Change in CTA Pass Sales by Sale Channel, July 2017 – March 2018
Figure 6-1: Difference in Average Weekly Trips Within Account (30-Day Pass minus Pay-Per-Use)
Figure 6-2: Percent Difference in Average Weekly Trips Within Account (30-Day Pass Relative to Pay-Per-Use)
Figure 6-3: Filtering Accounts on Minimum Average Weekly Ridership
Figure 6-4: Filtering Accounts on Number of Switches Between Pay-Per-Use and Pass
Figure 6-5: Percent Difference in Average Weekly Trips Within Account After Filtering (30-Day Pass Relative to Pay-Per-Use)
Figure 6-6: Percent Difference in Average Weekly Trips by Trip Type (30-Day Pass Relative to Pay-Per-Use)
Figure 6-7: Percent Difference in Average Weekly Trips by Trip Type (7-Day Pass Relative to Pay-Per-Use)
Figure 6-8: Distribution of Tap Frequency for Pay-Per-Use Panel and Panel Subsets
Figure 6-9: Distribution of Tap Frequency by Fiscal Year (Pay-Per-Use Cards Active in All 36 Months)
Figure 6-10: Regression Results for Annual Analysis of Pay-Per-Use Cards Active in All 36 Months
Figure 6-11: Median of Average Daily Trips (Demeaned at Card Level) Shows Smooth Trend for Weekend and Farebox Trips
Figure 6-12: Regression Results for Monthly Analysis of Pay-Per-Use Cards Active in All 36 Months
Figure 6-13: Regression Results for Pay-Per-Use Cards Active in At Least 18 Months
Figure 6-14: Monthly Sales and Ridership on Selected MBTA Fare Products, FY 2015-2017
Figure 6-15: Traceable Switching and Churn in Cards for MBTA LinkPasses
Figure 6-16: Monthly Year-Over-Year Change in Net Churn for Non-Corporate Monthly LinkPasses with Alternative Baseline Trends
Figure 6-17: Monthly Year-Over-Year Change in Net Churn for Corporate Monthly LinkPasses with Alternative Baseline Trends
Figure 6-18: Monthly Year-Over-Year Change in Net Churn for 7-Day LinkPasses with Alternative Baseline Trends
Figure 6-19: Sensitivity of LinkPass Elasticity Adjustment to Assumed Pay-Per-Use Elasticity
Figure 6-20: Comparison of Selected MBTA LinkPass Elasticity Estimates with MBTA FERRET Model Assumptions and Naïve Before-and-After Estimates
Figure 6-21: Previous Week’s Use Value and Percent Rail Rides in Sample and Population of Daily Choices
Figure 6-22: Predicted Fare Product Choice Probabilities for an Individual Customer by Weekly Use Value
Figure 7-1: Structure of Fare Product Choice and Elasticity Fares Scenario Model
Figure 7-2: Model Segments Defined by Customer-Week Type and Trip Type
Figure 7-3: Example Calculation of Fare Product Market Shares for One Customer-Week Type
Figure 7-4: Conceptual Example of Model Logic
Figure 7-5: Trips by Customer-Week Type in the Observed Baseline
Figure 7-6: FERRET Predictions Versus Actual Percent Changes in MBTA Revenue (FY16 to FY17, MBTA July 2016 Fare Change Scenario, Selected Full-Fare Tariff Types)
Figure 7-7: CTA Choice and Elasticity Model Predicts Substantial Switching to Pay-Per-Use as 30- and 7-day Pass Multiples are Increased
Figure 7-8: MBTA FERRET Model Predicts Little Switching to Pay-Per-Use as Monthly and 7-Day LinkPass Multiples are Increased
Figure 7-9: CTA Choice and Elasticity Model Captures Fare Product Switching as 30-Day Pass Prices are Increased and 7-day Pass Prices are Held Constant
Figure 7-10: Sensitivity of Results to Elasticities (Scenario: PPU Bus Fare +$0.25, PPU Rail Fare +$0.50, Pass Prices Unchanged)
Figure 7-11: Sensitivity of Results to Induced Ride Factors (Scenario: PPU Bus Fare +$0.25, PPU Rail Fare +$0.50, Pass Prices Unchanged)
Figure 7-12: Predicted Change in Ridership and Revenue Under Alternative Pay-Per-Use Fares
Figure 7-13: Comparison of Predicted and Actual Early 2018 YOY Ridership and Revenue Impacts of the January 2018 Fare Change
Figure 7-14: Year-Over-Year Changes in 7-day and 30-day Pass Sales, Before and After Jan 2018 Fare Change
Figure 7-15: Year-Over-Year Changes in Ridership by Fare Product, Before and After Jan 2018 Fare Change
Figure 7-16: Comparison of Predicted and Actual Impacts for Two Periods in Early 2018


1 Introduction

1.1 Motivation

On a simple level, transit agencies are service providers that deliver rides connecting people to places and activities – school, work, commerce, worship, community, and recreation. More broadly, however, public transit has the potential to help cities function and grow efficiently and equitably. With respect to the economy, transit contributes to a landscape of accessibility and opportunity. It provides workers with improved access to jobs and businesses with improved access to labor and customers, and it allows for a level of density and a pattern of land uses that can create and enhance economic activity and opportunity. Considering the environment, transit reduces greenhouse gas and particulate emissions by reducing congested automobile travel, and it improves health by facilitating more active commute patterns. And in terms of equity, many areas rely on transit systems to provide a baseline level of accessibility to opportunities and critical services for everyone, across all demographics.

Transit fare policy plays a critical and multifaceted role in helping transit agencies deliver on their potential to help cities function and grow efficiently and equitably. Fares are a primary source of operating revenues for transit agencies, second only to public subsidy; in 2014, passenger fares accounted for 32% of operating funding across U.S. transit agencies (Neff and Dickens 2017). The pricing strategies selected by transit agencies (or implicit in their fare structures and levels) also affect ridership via the competitiveness of transit relative to alternative modes, the financial incentives for specific ridership behaviors, and the financial burden placed on different groups of riders. Distribution and collection of fare products – including the payment and boarding process, the simplicity of fare rules, and the salience of fares – further influence ridership behavior, customer experience, and transit operations. Through these impacts on ridership and through public debate surrounding fare changes, fare policy helps to secure and sustain necessary external financial support for transit capital and operations. Fare collection and validation also provide the single best source of system-wide data on transit use and transit customer behavior, with myriad applications from fare policy to service planning.

Within fare policy, the specific scope and questions of this thesis are motivated in two ways. First, this research was conducted in collaboration with two of the five largest transit agencies in the U.S. – the Massachusetts Bay Transportation Authority (MBTA) and the Chicago Transit Authority (CTA). These agencies serve as case studies, demonstrating some common challenges and questions related to fare policy analysis. Second, review of relevant academic literatures reveals several weaknesses or gaps in addressing the challenges at the MBTA and CTA. The following sections describe this motivation in brief, and Chapters 2 and 3 explore the two case studies and the literature in more detail.

1.1.1 Case Studies

The MBTA and the CTA are similar in several important ways and grapple with some similar questions related to fare policy analysis. One similarity is the need to balance fare revenue goals against ridership and equity objectives. Within the last five years, these agencies have faced operating budget shortfalls and have used fare increases as one element of their response. However, both agencies also have a bottom-line interest in ridership and equity, creating a tension around fare increases. Over half of their operating budgets and nearly all of their capital funding derive from public sources (not fares); the accessibility, mobility, congestion and pollution reduction, economic development, and other benefits provided by transit are most easily measured by ridership, making it the political currency that ensures continued public support for transit. The MBTA and CTA also have similar fare structures – providing “flat” fares on bus and rail, but allowing customers a choice between passes and pay-per-use fares – and both agencies chose to increase their pass prices proportionally more than pay-per-use fares (the CTA in 2013 and the MBTA in 2016); those price changes increased pass “multiples,” or the number of rides needed to break even on a pass purchase. In its most recent fare change, the CTA partially reversed that pattern, concentrating fare increases on pay-per-use fares. Finally, both are experiencing multi-year declines in bus ridership and growth in competing travel alternatives like ride hailing.
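As a rough illustration of the pass “multiple” described above, the following sketch computes the break-even number of rides before and after a hypothetical fare change in which the pass price rises proportionally more than the pay-per-use fare. The function name and all prices are illustrative assumptions; they are not actual MBTA or CTA fares.

```python
def pass_multiple(pass_price: float, pay_per_use_fare: float) -> float:
    """Rides needed to break even on a pass at a given pay-per-use fare."""
    if pay_per_use_fare <= 0:
        raise ValueError("pay_per_use_fare must be positive")
    return pass_price / pay_per_use_fare


# Hypothetical prices (assumptions for illustration only).
multiple_before = pass_multiple(pass_price=75.00, pay_per_use_fare=2.00)
multiple_after = pass_multiple(pass_price=90.00, pay_per_use_fare=2.25)

print(f"Multiple before: {multiple_before:.1f} rides")
print(f"Multiple after:  {multiple_after:.1f} rides")
```

In this hypothetical case the pass price rises by 20% while the pay-per-use fare rises by 12.5%, so the multiple increases from 37.5 to 40 rides.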

Within this context, these agencies regularly consider whether incremental changes to their fare policies can be used to advance their objectives for ridership, equity, and revenue. The more specific questions that flow out of those discussions are determined by the current fare structure. As a result, one important question for the CTA and MBTA is how to price or promote or modify pass products relative to pay-per-use fares. Both while deciding about fare policy changes and after changes have been made, looking backward also raises analytical questions: What can be learned from recent fare change experiences, and what can be learned about fare policy by observing customer behavior between fare changes?

While similar in some regards, the MBTA and CTA also differ in many ways, including the services they operate, their fare technologies, and their fare policy decision making processes. These differences may either enhance or constrain their ability to make informed fare policy decisions that advance their objectives.

More detailed background and context for these two case studies are provided in Chapter 2.

1.1.2 Literature

Looking beyond these two agency case studies, there are also several strains of academic and industry research that relate to the same broad fare policy questions. Three of these bodies of literature are reviewed in Chapter 3:

1. transit pricing theory and strategy,
2. analysis of transit user behaviors using automated fare collection (AFC) data, and
3. transit demand modeling.

While the literature provides many useful insights, there are several weaknesses or gaps in relation to the particular context and questions at the CTA and MBTA:

• Pricing theory for pass products and other real-world fare structures. Every major transit agency in the U.S. and most around the world offer passes and must regularly decide how to price them relative to pay-per-use fares; however, since Carbajo (1988) and the “deep discounting” literature in the late 1980s, there has been little work on theory or practical questions related to pricing.

• Exploratory analysis of pricing strategies and behavior using AFC data. While the quality of transit fare collection data and the ease of using a variety of analytical techniques have greatly improved over the past three decades, there are relatively few studies using AFC data to explore and describe connections between transit purchase and travel behavior and transit pricing strategies.

• AFC-based segmentation for fare policy analysis. Recent studies such as Basu (2018) demonstrate the potential to create new and useful segmentations from AFC data using data fusion or machine learning techniques like clustering; however, only a few studies such as Halvorsen (2015) have taken advantage of these techniques to segment fare policy analyses.

• Elasticities by payment structure (pass versus pay-per-use). Exhaustive reviews of elasticity estimates have found that little is known about relative price sensitivities for different fare product payment structures (even though it is commonplace for transit agencies to offer both passes and pay-per-use fares).

• Before-and-after elasticity estimation using individual-level AFC data. After changing fares, transit agencies typically attempt to estimate fare elasticities using simple before-and-after comparisons of ridership and fares. By contrast, academic studies mostly estimate elasticities using long, aggregate time series models or large cross-sections across multiple agencies. As a result of this disconnect, there has been little work trying to improve simple before-and-after studies of a single fare change by taking advantage of the AFC panel data available to transit agencies.

• Fare product choice modeling using AFC data. Following survey-based fare product choice models developed in the 1980s and 90s, only Zureiqat (2008) has made use of individual-level AFC data to model fare product choice using revealed preferences in actual choice situations.

• Integrating fare product choice into fare change scenario models of ridership and revenue. Agencies mostly use simple elasticity spreadsheet models to predict the ridership and revenue impacts of potential fare changes. These models are appropriate for across-the-board fare changes (increasing prices by the same proportion across products), but they do not predict fare product choice and may perform poorly when relative prices of alternative products are changed. This becomes important when agencies wish to change pass prices and per-ride fares differently, but only a few academic studies integrate fare product choice into fare change scenario models.
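To make the elasticity-related gaps above more concrete, the sketch below shows a simple before-and-after (midpoint arc) elasticity calculation and a single-product, constant-elasticity prediction of the kind a basic spreadsheet model might apply. The functional forms, function names, and input values are assumptions chosen for illustration; they are not the agencies' actual models. The elasticity of -0.25 echoes the agency assumption for pay-per-use cited in the abstract, and the ridership and fare figures are hypothetical.

```python
def arc_elasticity(q_before: float, q_after: float,
                   p_before: float, p_after: float) -> float:
    """Midpoint (arc) elasticity from a before-and-after comparison."""
    pct_q = (q_after - q_before) / ((q_after + q_before) / 2)
    pct_p = (p_after - p_before) / ((p_after + p_before) / 2)
    return pct_q / pct_p


def predict_constant_elasticity(q_before: float, p_before: float,
                                p_after: float, elasticity: float):
    """Predict ridership and revenue for one product in isolation,
    assuming a constant-elasticity demand curve."""
    q_after = q_before * (p_after / p_before) ** elasticity
    revenue_after = q_after * p_after
    return q_after, revenue_after


# Hypothetical scenario: 1,000,000 monthly rides, fare raised from $2.00 to $2.20.
rides_before, fare_before, fare_after = 1_000_000, 2.00, 2.20
rides_after, revenue_after = predict_constant_elasticity(
    rides_before, fare_before, fare_after, elasticity=-0.25
)
revenue_before = rides_before * fare_before

print(f"Predicted rides:   {rides_after:,.0f}")
print(f"Predicted revenue: ${revenue_after:,.0f} (was ${revenue_before:,.0f})")
print(f"Implied arc elasticity: "
      f"{arc_elasticity(rides_before, rides_after, fare_before, fare_after):.3f}")
```

Because each product is treated in isolation, a calculation of this form cannot represent customers switching between passes and pay-per-use when relative prices change, which is the limitation noted in the last item above.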

The case study applications in this thesis relate to each of these gaps, and in some cases aim to address them.

1.2 Research Scope and Questions

There are many potential research topics related to these agency case studies and relevant bodies of literature. The specific research scope for this thesis is narrowed at three levels:

1. Broadly, this thesis focuses on the use of AFC data to inform decisions about incremental changes to fare structure and levels.
2. Within that scope, different tools and techniques are organized into a proposed framework for analyzing fare changes.
3. Three empirical case study applications are developed at the MBTA and CTA to demonstrate selected steps in the proposed framework.

The following sections describe this narrowing scope.


1.2.1 Broad Scope

This thesis is focused on a particular subset of fare policy topics relating to the use of automated fare collection (AFC) data to improve decision making about incremental changes to fare structure and levels. This scope is narrow in that many other important aspects of fare policy design and decision making are not addressed; however, this thesis still only scratches the surface of several persistent fare policy questions with broad application to transit agencies around the world. This broad scope can be broken down into four components:

1. Fare structure and levels. Chapter 3 describes the many interrelated drivers, functions, and technologies that play a role in transit fare policy. While this thesis touches on many topics, it focuses on only three functional elements of fare policy – fare structure (or pricing rules), fare levels or prices, and to a lesser degree fare product distribution and fare collection. It does not take up topics such as fare payment technologies or and enforcement.

2. Incremental changes. Many important decisions are made during initial design of a transit fare policy (such as selection of fare products offered) and during fundamental changes or redesigns (such as regional fare integration). As shown in Figure 1-1 below, this thesis focuses instead on incremental changes – modifying fare levels and tweaking products and policies. These changes are generally less dramatic, but they are the most common fare policy decisions made by existing transit agencies. In the U.S. and many places around the world, transit agencies grapple with growing operating deficits and backlogs of deferred maintenance. As political opportunities present themselves, agencies regularly make incremental changes to fare levels in an attempt to address these fiscal needs (weighing many other concerns in the balance).

Figure 1-1: Fare Policy “Life Cycle”

Initial Design: Products; Policies; Technologies
Incremental Change: Fare levels; Product tweaks; Policy tweaks
Fundamental Change: New product; New technology; Integration; New alternatives

3. Decision making. This research is not focused on decision making, but it is shaped by the typical context for fare policy decisions. Unlike many transit planning and operating decisions, fare levels and structures are often highly public decisions made personally by senior transit leadership (executives and boards). These decisions are frequently influenced “from above” by elected officials such as mayors and governors and “from below” by the public and by transit agency staff who analyze fare policy options. There are thus many stakeholders with many different objectives and interests. This thesis approaches this context mostly from the perspective of transit agency analysts. The role that should be played by analysts at public agencies is a topic of considerable debate and disagreement (see e.g. Davidoff (1965) for a seminal expression of one perspective). Practically, however, all transit fare policy analysts need to understand the objectives of different stakeholders, describe the implications of fare policy alternatives for different objectives, and frame the questions and results effectively for both the public and the senior officials making final decisions. In practice, this deliberate process of informed debate on tradeoffs (between revenue needs, ridership goals, equitable distribution of costs and burdens, etc.) is often short-circuited by narrow political windows to make fare changes; short timeframes often preclude quantitative fare policy analysis, which requires time-consuming data collection and cleaning, model development, and scenario evaluation. It is no surprise, then, that transit agencies in the U.S. continue to rely heavily on professional judgment and rules of thumb to inform fare policy decisions (Boyle 2006). This thesis studies ways that transit agencies might organize and use existing AFC data to more quickly, cheaply, and accurately understand the potential impacts of fare change options on ridership and revenue. (Equity is not a focus of this study, but one major component of equity analysis is disaggregation of predicted ridership and revenue impacts; the distribution of impacts depends on first understanding the level of impacts.) The hope is that better information about the implications of fare policy changes, both internally at transit agencies and externally for the public, will be one ingredient for more robust public debate, a higher degree of accountability, and ultimately wiser decision making.

4. AFC data. Fare policy alternatives could be analyzed using customer surveys, sampled counts, aggregate sales and revenue, and either aggregate or disaggregate AFC data. This thesis focuses on the use of disaggregate AFC data, which is already available at most major transit agencies. AFC data provides cheaper and more accurate ridership information than surveys and counts, and it is already used widely for agency reporting and analysis of aggregate ridership and revenue. However, there has been relatively little fare policy research that takes advantage of the rich detail on transit travel behavior in transaction-level AFC data (rather than aggregating AFC data into measures that could have been derived from traditional counts). AFC data does have limitations, however. It is often only available on bus and light/heavy rail (which restricts AFC analysis to those modes); it does not include non-transit travel, user demographics, and stated preferences that might be derived from a customer survey; it does not contain information on other transit factors or any “external factors” that might be relevant to fare policy, such as service quality or economic trends; and it can be computationally demanding, especially for analyses over the multi-year periods often needed to learn from past fare policy changes.

This broad scope is explored further through a more detailed overview of fare policy in Chapter 3.

1.2.2 Framework for Incremental Fare Policy Analysis

The bodies of literature mentioned above and described in Chapter 3 are often disconnected from each other or are not applied to transit agencies. This raises a general procedural question – is there a sensible framework that can sequence or relate these elements in the specific context of incremental fare policy analysis?

This thesis proposes a procedural framework for analyzing incremental fare changes, which can be summarized as three steps:

1. Identify current pricing strategies
2. Describe and segment transit use
3. Model demand by fare structure and market segment

The final step, modeling the potential ridership and revenue impacts of fare changes, is perhaps the most obvious one. However, there are many considerations in setting up a fare scenario model – selection of scenarios, the implicit theory of behavior, model mechanics, and parameter estimation – and it is helpful to first develop a good understanding of pricing strategies and fare-related behaviors (in the first two steps of the framework). A “toy model” of pass pricing is presented in Chapter 4 to demonstrate the demand modeling process and to develop some basic intuition and lessons for relative pricing of passes and pay-per-use fares.

This framework is developed in Chapter 4, and it provides the basis for specific case study applications in Chapters 5, 6, and 7.

1.2.3 Case Study Application Questions

Chapters 5, 6, and 7 of this thesis present in-depth study of specific empirical application topics at the MBTA and CTA:

Chapter 5: Describing the Role of Pass Sale Channels
Chapter 6: Estimating Behavioral Parameters of Fare Product Purchase and Use
Chapter 7: Incorporating Fare Product Choice in Modeling of Fare Change Scenarios

Research questions for these chapters are shown in Table 1-1 below, situated within the framework described above. The specific application questions are driven by the history, current fare structure, policy interests, and data at the case study agencies. Apart from the first question, this thesis does not provide parallel analyses at both case study agencies; one or the other is used as convenient for each question.

Table 1-1: Case Study Application Questions

Ch. | Step in Procedural Framework | Specific Application Question | Case Study
5 | 1. Identify current pricing strategies; 2. Describe and segment transit use | What can AFC data teach us about the strategic role of pass sale channels at the MBTA and the CTA? | MBTA & CTA
6 | 3. Model demand by fare structure and market segment | How can AFC data be used to estimate parameters describing fare-related behaviors at the CTA and MBTA? | CTA: induced ride factors, fare product choice utility parameters; MBTA: elasticities
7 | 3. Model demand by fare structure and market segment | How can fare product choice be incorporated into MBTA and CTA fare change scenario modeling while relying on AFC data rather than customer surveys? | CTA


1.3 Methodology

1.3.1 AFC Data

The primary data source for this research is the automated fare collection (AFC) data at the MBTA and the CTA. These data include fare validations (also called taps and fare stages), which are recorded when customers use cards or tickets at fare gates and bus fareboxes or pay cash on buses. They also include most sale transactions, such as purchase of a monthly pass from a fare vending machine.

The AFC data from the CTA Ventra system includes an account ID for individual customers, the time of the transaction, the transit service that was used (bus or rail), the fare product that was used, the dollar amount of the transaction, and the “use value” of the transaction (the applicable pay-per-use fare for the transaction, regardless of whether a pay-per-use or pass product was actually used). Sale transaction data at the CTA additionally includes the sale channel used to purchase each fare product.
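As a rough illustration of the record layout just described, the sketch below models a single validation as a Python dataclass and shows how the “use value” field lets rides be valued consistently regardless of the fare product used. The class name, field names, and sample records are hypothetical and are not taken from the Ventra schema or from CTA fares.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Validation:
    """One fare validation (tap); field names are hypothetical."""
    account_id: str      # account ID for the individual customer
    timestamp: datetime  # time of the transaction
    service: str         # "bus" or "rail"
    fare_product: str    # e.g. "pay-per-use" or "30-day pass"
    amount: float        # dollar amount actually charged
    use_value: float     # applicable pay-per-use fare for this ride


def total_use_value(validations):
    """Sum the pay-per-use value of rides, whatever product paid for them."""
    return sum(v.use_value for v in validations)


def rides_per_account(validations):
    """Count validations by account ID."""
    counts = {}
    for v in validations:
        counts[v.account_id] = counts.get(v.account_id, 0) + 1
    return counts


# Invented records for illustration only (not CTA data or CTA fares).
sample = [
    Validation("A1", datetime(2017, 3, 1, 8, 5), "rail", "30-day pass", 0.0, 2.5),
    Validation("A1", datetime(2017, 3, 1, 17, 40), "bus", "30-day pass", 0.0, 2.0),
    Validation("B7", datetime(2017, 3, 1, 9, 15), "bus", "pay-per-use", 2.0, 2.0),
]
```

In this toy sample, the pass holder's two rides carry a use value even though no money changes hands at the tap, which is what makes pass and pay-per-use riders comparable.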

Automated fare collection data at the MBTA does not directly include several of these elements – the transit service used, the “use value” of transactions, or the sale channel for fare product sales – so these fields are constructed where possible using available fields and an administrative database for several sale channels. Several major sources of MBTA fare product sales and revenue are not recorded automatically; commuter rail validations are performed visually (not by the AFC system), several commuter rail sale channels operate independently of the AFC system (such as the mTicket mobile app and on-board ticket sales), and Corporate and Semester Pass Program sales are not recorded in AFC. In light of these AFC limitations, aggregate MBTA accounting data on monthly sales of each fare product across AFC and non-AFC sale channels is also used.

1.3.2 Analytical Methods

Different analytical methods are used in each of the three application chapters:

• Chapter 5: Analysis of the role of pass sale channels primarily uses descriptive summaries of sales and ridership data.
• Chapter 6: Behavioral parameters measuring sensitivity to fares are estimated using several different methods. Induced ride factors are approximated using descriptive summaries. Elasticities are estimated using informal trend analysis and linear regression analysis. Choice utility parameters are estimated using multinomial logit models.
• Chapter 7: Modeling of potential fare change scenarios at the CTA is inspired by discrete-continuous econometric models of demand. While statistical properties of the proposed model are not developed, the model is compared qualitatively to other related modeling approaches.

1.4 Thesis Organization

Chapter 2 provides relevant background information on the MBTA and CTA case studies, and Chapter 3 reviews the academic literature. Chapter 4 presents a framework for incremental fare policy analysis and a simple pass pricing model. Chapters 5, 6, and 7 present empirical case study applications on pass sale channels, estimation of fare-related behavioral parameters, and incorporation of fare product choice into fare change scenario modeling (respectively). Chapter 8 summarizes findings, highlights general conclusions, and suggests future work.


2 Case Study Background

This section provides some relevant background information on the two transit agencies used as case studies in this thesis: the Massachusetts Bay Transportation Authority (MBTA) and the Chicago Transit Authority (CTA).

2.1 MBTA

2.1.1 Transit Service and Ridership

The MBTA provides five different transportation services in the Greater Boston metropolitan area: Subway, Bus, Commuter Rail, Ferry, and Paratransit. Selected characteristics of these services are summarized in Table 2-1 and Figure 2-1 below. As of 2015, the MBTA provided roughly 1.3 million rides on an average weekday across all modes, with 90% of those rides on bus, heavy rail, and light rail. The MBTA’s commuter rail network provides much longer trips than other services, accounting for 38% of total passenger-miles traveled on the system; by a couple of different calculations, MBTA commuter rail carries between one third and one half of all commuters traveling into Boston during the peak period (Ofsevit 2017). (For additional information on current MBTA service, see the MBTA Focus40 State of the System reports and the MBTA Performance Dashboard.1,2)

1 https://www.mbtafocus40.com/mbta-today/
2 http://www.mbtabackontrack.com/performance


Table 2-1: MBTA Selected Transit Service Characteristics (2015)

Service | Routes | Stations and Stops | Annual Vehicle Service Hours | Average Weekday Rides | Annual Rides | Annual Passenger Miles
Bus (incl. Silver Line) | ~170 | ~8,100 | 2,382,048 | 446,700 | 133,807,610 | 334,648,550
Heavy/Light Rail | 4 lines | 121 | 2,054,954 | 756,300 | 235,782,274 | 733,660,958
Red Line | 2 branches | 29 | | 281,000 | |
Orange Line | 1 branch | 19 | | 209,000 | |
Blue Line | 1 branch | 12 | | 66,300 | |
Green Line | 4 branches | 66 | | 200,000 | |
Commuter Rail | 14 lines | 138 | 744,459 | 121,700 | 32,869,874 | 678,185,066
Ferry | 3 routes | 7 | 22,577 | 4,740 | 1,341,397 | 11,568,376
Paratransit | N/A | N/A | 1,270,369 | 6,800 | 2,149,718 | 17,868,150
Total System | | | | 1,336,000 | 405,951,000 | 1,775,931,000

Sources: MBTA Blue Book (2014b), FTA Transit Profiles 2015 (2016), MBTA FMCB Strategic Plan (2017)
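The mode shares quoted in the text can be reproduced directly from the Table 2-1 figures. The short sketch below is only an arithmetic check; the dictionary keys and the helper name are illustrative choices, not part of the original analysis.

```python
# Average weekday rides and annual passenger miles by mode (Table 2-1, 2015).
weekday_rides = {
    "bus": 446_700,
    "heavy_light_rail": 756_300,
    "commuter_rail": 121_700,
    "ferry": 4_740,
    "paratransit": 6_800,
}
passenger_miles = {
    "bus": 334_648_550,
    "heavy_light_rail": 733_660_958,
    "commuter_rail": 678_185_066,
    "ferry": 11_568_376,
    "paratransit": 17_868_150,
}


def share(values, modes):
    """Fraction of the system total accounted for by the given modes."""
    return sum(values[m] for m in modes) / sum(values.values())


bus_rail_ride_share = share(weekday_rides, ["bus", "heavy_light_rail"])
commuter_rail_mile_share = share(passenger_miles, ["commuter_rail"])

print(f"Bus and heavy/light rail share of weekday rides: {bus_rail_ride_share:.0%}")
print(f"Commuter rail share of passenger miles: {commuter_rail_mile_share:.0%}")
```

The contrast between the two shares reflects the much longer average trip on commuter rail.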

Figure 2-1: MBTA Ridership and Passenger Miles by Mode (2015)

Sources: MBTA Blue Book (2014b), FTA Transit Profiles 2015 (2016), MBTA FMCB Strategic Plan (2017)

MBTA ridership has been relatively flat over the last 15 years, as shown in Figure 2-2 below. MBTA commuter rail ridership declined slightly over that time, but has been stable for the last few years. After rising gradually from the late 2000s through 2014 or 2015, both bus and rail ridership began to decline in the last two years. This has raised questions about the specific sources and causes of ridership losses on bus and rail. The decline could stem from changes in service or fares made by the MBTA or from changes outside the agency (such as new transportation alternatives, gas prices, population growth, etc.).


Figure 2-2: MBTA Unlinked Passenger Trips by Mode (2002 – 2017)

Source: FTA (2018)

2.1.2 Geography and Demographics

The MBTA serves 175 of the many cities and towns that make up the Boston metropolitan area.3 The MBTA’s heavy and light rail system extends into 10 cities and towns; the bus and commuter rail networks cover a much wider area still, shown in Figure 2-3. The municipal fragmentation within the MBTA’s service area has a significant impact on transit policy and planning, from finance and fare policy decision making to bus operations on locally-owned roads and commuter rail parking in locally-owned lots.

3 https://mbta.com/history


Figure 2-3: MBTA Service Area (2011)

Source: Cosgrove et al (2011)

While the region as a whole is racially and socioeconomically diverse, racial and income composition varies dramatically across neighborhoods and municipalities within the MBTA’s service area. One way to visualize these divides is using dot maps, which plot individuals or small groups of people as random dots within Census geographies. Figure 2-4 and Figure 2-5 provide a geographic picture of race and income extremes in the Boston area. While not a focus of this thesis, the concentration of races and of wealth and poverty raises critical questions about the MBTA’s role in providing transportation services that are accessible and affordable to all.


Figure 2-4: Racial Dot Map of Boston (2010)

Source: Cable (2013)

Table 2-2: MBTA Modal Ridership by Minority Status

Source: Lozada et al (2017), Table 4-1


Figure 2-5: Income Dot Map of Boston (2010)

Source: ESRI (2016) Notes: Blue dots represent households with income over $200,000. Orange dots represent households with income under $25,000.

2.1.3 Operating Finances

The MBTA’s operating costs are roughly $2 billion per year, including both costs of running transit and over $400 million per year servicing historical debts. Operating revenue includes both “own-source” revenue (fares and other operating income) and other dedicated revenue sources. The MBTA’s financial history is complicated. Ridership-based assessments on municipalities used to fund a large share of operations, but these assessments were partially shifted to the state in the 1970s and further eroded after municipal property tax increases were capped by Proposition 2 ½ in 1982; since 2001 the MBTA’s largest revenue source has been 20% of the Massachusetts state sales tax. Figure 2-6 shows that fares generated roughly $600 million and accounted for around one third of the total operating budget in Fiscal Year 2015 (July 2014 through June 2015).

Following severe winter storms in February 2015 (“Snowmageddon”), a new Fiscal Management and Control Board (FMCB) appointed by Governor Charlie Baker was charged with closing a persistent operating deficit that had risen to $119 million in FY2015 (shown in Figure 2-6). Without additional financial assistance from the state, the FMCB has pursued that goal through a combination of reducing operating costs and increasing own-source revenues. One element of that strategy was raising fare revenues through a fare change in July 2016, which is described in the next section.

Figure 2-6: MBTA Fiscal Year 2015 Revenues and Expenses

Source: MBTA (2015)

2.1.4 Fare Structure and Levels

Under the MBTA’s fare structure (or pricing rules), fares are differentiated in four important ways: by service, tariff, user type, and medium. The most significant differences are by service. Different fares or prices are charged for seven different services -- Local Bus, Inner Express Bus, Outer Express Bus, Rapid Transit (including bus rapid transit, light rail, and heavy rail), Commuter Rail, Ferry, and paratransit (“The RIDE”). Pricing for bus and rapid transit is “flat,” meaning that it does not vary by trip distance, zone, or origin-destination pair. Pricing for commuter rail is “zone-based,” increasing for longer trips according to pre-defined zones (roughly concentric arcs moving out from downtown Boston). The MBTA also offers multiple tariffs (i.e. payment structures):

• pay-per-use (also called single-ride, pay-as-you-go, or stored value), available on all services;
• multi-ride tickets, available for round trips or 10 rides on commuter rail;
• monthly passes, available separately for each service and each commuter rail zone, and typically allowing unlimited travel on all less expensive services (e.g. any Commuter Rail pass covers travel on closer Commuter Rail zones, Rapid Transit, and Local Bus); and
• 7-day and 1-day passes, available only for Rapid Transit (also covering Local Bus).

For pay-per-use riders using a smartcard (CharlieCard), the MBTA offers “step-up” transfers between Bus and Rapid Transit (i.e. for a transfer trip, customers pay a single fare for the most expensive service used in the trip). Across these tariffs or payment structures, there is different pricing for different types of users. This is framed as a “base”/“adult”/“full” fare (the default); discounted fares for students, seniors, and people with disabilities; and free fares for children, the blind, and service officials (like police). Finally, within pay-per-use, fares vary depending on the fare medium that is used, with a higher fare charged for cash and ticket (CharlieTicket) rides than for smartcard (CharlieCard) rides. Details of the MBTA fare structure are described exhaustively in the MBTA’s tariff document (MBTA 2016b).
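The “step-up” transfer rule described above can be sketched in a few lines: the total paid for a transfer trip equals the highest single fare among the services used. The sketch assumes that the rule is applied by charging only the difference at each later boarding, and it ignores transfer time limits and other tariff details; the function name is hypothetical. Fares are the CharlieCard pay-per-use fares in effect from July 2016 (Table 2-3), expressed in cents.

```python
# CharlieCard pay-per-use fares in cents (Table 2-3, July 2016 onward).
FARES_CENTS = {"local_bus": 170, "rapid_transit": 225}


def step_up_charges(services, fares=FARES_CENTS):
    """Charge at each boarding so that the trip total equals the highest fare.

    Assumption: the difference is collected when a rider transfers to a more
    expensive service, and nothing is collected otherwise.
    """
    charges = []
    paid = 0
    for service in services:
        due = max(fares[service] - paid, 0)  # only the step-up, never a refund
        charges.append(due)
        paid += due
    return charges


bus_then_rail = step_up_charges(["local_bus", "rapid_transit"])
rail_then_bus = step_up_charges(["rapid_transit", "local_bus"])
```

Under these assumptions both orderings cost the same in total, namely the Rapid Transit fare.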

Figure 2-7 shows how different fares and fare products contribute to total fare revenue in FY2015, broken out by the defining elements of the MBTA's fare structure as discussed above. Figure 2-8 provides a similar breakdown of ridership for only bus and rapid transit, excluding commuter rail and ferry. (Ridership by service including commuter rail is shown in Figure 2-1.)

Figure 2-7: Revenue by MBTA Fare Structure Dimension, FY2015

Source: MBTA Accounting Notes: Percentages are out of $596,431,016 total fares on relevant products (excluding paratransit, parking, etc.) in MBTA accounting data for FY2015. The pay-per-use tariff includes one-way commuter rail tickets; the multi-ride (commuter rail) tariff includes round-trip and 10-ride commuter rail tickets. * “Highest Service” is the most expensive service that can be used for each fare product or fare type. This is not an allocation of revenue to services based on ridership; Commuter Rail and Boat passes can also typically be used on Bus and Rapid Transit, so a ridership-based allocation would attribute a greater share of revenue to those services.


Figure 2-8: Bus and Rapid Transit Ridership (AFC Taps) by MBTA Fare Structure Dimension, FY2015

Source: MBTA AFC Notes: Percentages are out of 262,886,304 total AFC validations on bus and rapid transit in FY2015. Commuter rail, ferry, paratransit, and parking are not included. (Note that commuter rail accounted for about 9% of total MBTA rides in 2015; see Figure 2-1.)

The MBTA’s fare levels (and to a lesser degree fare structure) have evolved over time. Table 2-3 shows recent changes in selected fares and prices from July 2012 to the present (all “full-fare” or “adult”). There were several notable changes to fare levels in July 2016:

• Changes in July 2016 varied across trips and products. Fare levels were raised by an average of 9.3%, but different fares and prices were changed by very different proportions (Andrews and Demchur 2016). This stands in contrast to the fare change in July 2014, where all fares and prices were increased by a similar percentage (with differences apparently caused by rounding to convenient denominations).
• Fare increases in July 2016 were roughly twice as large as those in July 2014. Many fares increased by 7-13% in July 2016. The previous change increased fares by about 5%.
• The July 2016 fare change increased LinkPass “multiples.” When an agency offers multiple tariff options, the relative prices of a single ride and a period pass for the same service become an important policy decision. This is often described as a "pass multiple" – the number of pay-per-use rides that would cost the same as the price of an unlimited use pass (i.e. the "breakeven" number of rides). In July 2016, the prices of monthly and 7-day LinkPasses (valid on Rapid Transit and Local Bus) were increased by about 12%, while the pay-per-use fare for Local Bus and Rapid Transit increased by only about 6% (half as much). These changes increased the multiple from 35.7 to 37.6 for the Monthly LinkPass and from 9.0 to 9.4 for the 7-day LinkPass (dividing the pass price by the Rapid Transit pay-per-use fare on a CharlieCard). This means that after July 2016, a customer who takes 38 rail rides (or 19 round trips) pays roughly the same amount ($85.50) as a customer who purchases a Monthly LinkPass ($84.50). By contrast, the “one-way” or pay-per-use fares for Commuter Rail were increased by roughly the same proportion as the corresponding monthly Commuter Rail pass prices.
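The pass multiple arithmetic in the last bullet can be checked with a short calculation. The sketch below uses the LinkPass prices and the Rapid Transit CharlieCard fare from Table 2-3; the function names are illustrative, and taking the breakeven as the next whole ride above the multiple is a simplifying assumption.

```python
import math


def pass_multiple(pass_price, pay_per_use_fare):
    """Number of pay-per-use rides that cost the same as the pass."""
    return pass_price / pay_per_use_fare


def breakeven_rides(pass_price, pay_per_use_fare):
    """Smallest whole number of rides costing at least as much as the pass."""
    return math.ceil(pass_price / pay_per_use_fare)


# (pass price, Rapid Transit CharlieCard fare) before and after July 2016
monthly_before = pass_multiple(75.00, 2.10)
monthly_after = pass_multiple(84.50, 2.25)
weekly_before = pass_multiple(19.00, 2.10)
weekly_after = pass_multiple(21.25, 2.25)

print(f"Monthly LinkPass multiple: {monthly_before:.1f} -> {monthly_after:.1f}")
print(f"7-day LinkPass multiple:   {weekly_before:.1f} -> {weekly_after:.1f}")
print(f"Breakeven rides after July 2016: {breakeven_rides(84.50, 2.25)}")
```

A rider who expects fewer rides than the breakeven number in a month pays less with pay-per-use, so raising the multiple shrinks the group for whom a pass is the cheaper option.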


Table 2-3: Selected MBTA Fares, July 2012-Present ($)

Fare Product | Jul 2012 - Jun 2014 Fare | Change | Jul 2014 - Jun 2016 Fare | Change | Jul 2016 - Present Fare
Local Bus Pay-Per-Use - CharlieCard | 1.50 | +0.10 (+7%) | 1.60 | +0.10 (+6%) | 1.70
Local Bus Pay-Per-Use - CharlieTicket/Cash | 2 | +0.10 (+5%) | 2.10 | -0.10 (-5%) | 2
Rapid Transit Pay-Per-Use - CharlieCard | 2 | +0.10 (+5%) | 2.10 | +0.15 (+7%) | 2.25
Rapid Transit Pay-Per-Use - CharlieTicket/Cash | 2.50 | +0.15 (+6%) | 2.65 | +0.10 (+4%) | 2.75
7-Day "LinkPass" (Rapid Transit) | 18 | +1 (+6%) | 19 | +2.25 (+12%) | 21.25
Monthly "LinkPass" (Rapid Transit) | 70 | +5 (+7%) | 75 | +9.50 (+13%) | 84.50
Commuter Rail One-Way - Zone 1 | 5.50 | +0.25 (+5%) | 5.75 | +0.50 (+9%) | 6.25
Commuter Rail One-Way - Zone 2 | 6 | +0.25 (+4%) | 6.25 | +0.50 (+8%) | 6.75
Commuter Rail One-Way - Zone 3 | 6.75 | +0.25 (+4%) | 7 | +0.50 (+7%) | 7.50
…
Commuter Rail One-Way - Zone 10 | 11 | +0.50 (+5%) | 11.50 | +1 (+9%) | 12.50
Monthly Commuter Rail Pass - Zone 1 | 173 | +9 (+5%) | 182 | +18.25 (+10%) | 200.25
Monthly Commuter Rail Pass - Zone 2 | 189 | +9 (+5%) | 198 | +19.75 (+10%) | 217.75
Monthly Commuter Rail Pass - Zone 3 | 212 | +10 (+5%) | 222 | +22.25 (+10%) | 244.25
…
Monthly Commuter Rail Pass - Zone 10 | 345 | +17 (+5%) | 362 | +36.25 (+10%) | 398.25

Sources: MBTA (2012, 2014a, 2016a)

The July 2016 fare change increased revenue but also coincided with a visible drop in ridership. Figure 2-9 and Figure 2-10 show monthly AFC taps on bus and rail and monthly fare revenue, both as totals and year-over-year percent changes around the July 2016 fare change. The year-over-year spike in ridership and revenue in February 2016 is actually caused by exceptionally low ridership and sales in February 2015 due to severe winter storms; the spike in year-over-year revenue in May 2016 is likewise caused by a one-time fare discount in May 2015. Apart from these spikes, total ridership and revenue are relatively flat from FY2015 to FY2016. By contrast, there is an increase in total fare revenue and decline in ridership following the fare change, with revenue up 6% and ridership down 4% on average relative to the previous year.
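The base-period effect described above, where an unusually low month one year produces a spike in the following year's year-over-year change, can be seen in a small sketch. The monthly values below are invented placeholders chosen only to show the mechanics; they are not MBTA ridership figures, and the function name is hypothetical.

```python
def yoy_pct_change(series):
    """Percent change versus the same month of the prior year.

    `series` maps (year, month) to a monthly total; months without a
    prior-year value are skipped.
    """
    changes = {}
    for (year, month), value in series.items():
        previous = series.get((year - 1, month))
        if previous:
            changes[(year, month)] = 100.0 * (value - previous) / previous
    return changes


# Placeholder monthly taps in millions (illustrative only).
taps = {
    (2015, 1): 22.0,
    (2015, 2): 16.0,  # depressed month, e.g. severe weather
    (2016, 1): 22.0,
    (2016, 2): 20.0,  # still below January, yet YOY shows a large gain
}

yoy = yoy_pct_change(taps)
```

In this toy series the second February is lower than either January, but its year-over-year change is strongly positive because the comparison month was depressed.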


Figure 2-9: MBTA Monthly Ridership (AFC Taps) and Fare Revenue Before and After Fare Change (FY15-17)

Sources: MBTA AFC, MBTA Accounting

Figure 2-10: MBTA Monthly Year-Over-Year (YOY) Change in Ridership (AFC Taps) and Fare Revenue Before and After Fare Change

Sources: MBTA AFC, MBTA Accounting


2.1.5 Fare Product Distribution and Fare Collection

The MBTA distributes fare products and collects payments through a variety of different sale channels. Table 2-4 shows total fare revenue in FY2015 by sale channel. By far the two most significant channels from a revenue perspective are fare vending machines (FVMs) located throughout the system (primarily in rail stations and major bus terminals) and the Corporate Pass Program, which fulfills pre-tax purchase of passes by employees at participating companies through payroll deduction. FVMs account for 44% of MBTA fare revenue, and the Corporate Pass Program makes up another 28%. Sale channels for MBTA passes are described in more detail in Chapter 5.

Table 2-4: MBTA Fare Revenue by Sale Channel, FY2015

Sale Channel | Revenue ($millions) | Revenue (%)
Fare Vending Machine | $260.7 | 44%
Corporate Pass Program | $166.6 | 28%
Ticket Office Machine / Sales Outlet Terminal | $43.7 | 7%
mTicket Mobile App | $36.0 | 6%
Farebox | $23.9 | 4%
Retail Sales Terminal | $21.1 | 4%
Commuter Rail Conductor | $18.8 | 3%
MBTA Web Site | $10.7 | 2%
Semester Pass Program | $7.6 | 1%
Student CharlieCard Program | $4.6 | 1%
Visitor and Group Program | $2.3 | 0%
Other | $0.5 | 0%
Total | $596.4 | 100%

Sources: MBTA Accounting
Notes: FY2015 is from July 2014 through June 2015

The MBTA has an integrated fare collection system (called the Charlie system), which consists broadly of fare vending (FVMs, fareboxes onboard buses and light rail, and other devices), fare validation (at station gates, bus fareboxes, and other validators), and fare media that interact with these devices (smart cards called CharlieCards and magnetic stripe tickets called CharlieTickets). With a few exceptions, bus and light/heavy rail fares are validated by tapping cards or inserting tickets at station gates or fareboxes (or paying cash at fareboxes), while commuter rail and ferry fares are validated visually by conductors.

In contrast to newer fare collection systems like the Ventra system in Chicago, the MBTA's Charlie system is card-based rather than account-based; information such as fare products and stored transit value "live" on CharlieCards and CharlieTickets. The MBTA is currently in the process of designing and implementing a new fare collection system (“AFC 2.0”) that will be account-based and allow for “open payment” (MBTA 2016c).4 The MBTA expects implementation of this new technology to improve operations (via faster boarding and alighting) and create new possibilities for fare policy changes (Block-Schachter 2016). Inexpensive, ubiquitous validators allow integration of commuter rail validation into the AFC system and make possible new fare structures on bus and rail (such as distance-based pricing, which requires customers to tap out at the end of each transit trip). Decoupling of hardware such as validators from “back end” fare management software will also make modifications to fare levels and complex fare rules (such as distance-based fares or fare capping) easier to implement. While AFC 2.0 will make some fare structure changes technologically feasible for the first time, the desirability and political feasibility of these changes are separate questions for policy makers.

4 https://mbta.com/afc2

2.1.6 Fare Policy Analysis and Decision Making

The context described above shapes the current practice of fare policy analysis and decision making at the MBTA. Under the agency’s current management structure (shown in Figure 2-11), fare policy decisions are approved by the Fiscal Management and Control Board (FMCB), which was created by Governor Charles Baker in July 2015 with a significant focus on reducing the MBTA’s operating deficit; increasing revenue to cut the deficit was one important motivation of the July 2016 fare change. The FMCB has also been responsible for guiding and approving the development of the MBTA’s next-generation fare collection system (AFC 2.0).

The Governor of the state assumes final decision-making authority over MBTA management and policy decisions. However, fare policy decisions are also constrained by the state legislature, which has imposed caps on the frequency and magnitude of fare increases. Following the fare change in July 2016, the state legislature tightened these caps to no more than a 7% increase in any single price (whether a pay-per-use fare or a pass price) once every two years.5 (As seen in Table 2-3, the July 2016 fare change would not be allowed under this new legislation.)
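To illustrate why the July 2016 change would not satisfy the tightened cap, the sketch below compares a handful of the price changes from Table 2-3 against a 7% limit. It reads the cap simply as "no individual price may rise by more than 7%"; how the legislation measures or rounds increases is not described here, so that reading, along with the variable and function names, is an assumption.

```python
CAP = 0.07  # maximum increase in any single price, per the tightened cap

# (price before, price after) for selected products, from Table 2-3
changes_jul_2016 = {
    "Local Bus Pay-Per-Use - CharlieCard": (1.60, 1.70),
    "Rapid Transit Pay-Per-Use - CharlieCard": (2.10, 2.25),
    "7-Day LinkPass": (19.00, 21.25),
    "Monthly LinkPass": (75.00, 84.50),
    "Commuter Rail One-Way - Zone 1": (5.75, 6.25),
    "Monthly Commuter Rail Pass - Zone 1": (182.00, 200.25),
}


def exceeds_cap(old_price, new_price, cap=CAP):
    """True if the proportional increase is larger than the cap."""
    return (new_price - old_price) / old_price > cap


over_cap = sorted(
    name
    for name, (old_price, new_price) in changes_jul_2016.items()
    if exceeds_cap(old_price, new_price)
)

for name in over_cap:
    old_price, new_price = changes_jul_2016[name]
    print(f"{name}: +{(new_price - old_price) / old_price:.1%}")
```

Among these six products, only the Local Bus CharlieCard fare stays within the limit under this reading.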

Since 2015, fare policy analysis at the MBTA has been performed on an as-needed basis by staff in the MassDOT/MBTA Office of Performance Management and Innovation (OPMI), with periodic modeling assistance from the Central Transportation Planning Staff (CTPS) at the Boston MPO. CTPS developed and updates a spreadsheet model for predicting the ridership and revenue impacts of fare change scenarios (the FERRET model), which is described in Chapter 3; leading up to recent fare changes, results from this model have been presented to MBTA executive leadership to inform selection of final fare changes. More recently, new Revenue staff under the CFO are also involved in fare policy analysis, fare product distribution, and marketing.

5 https://malegislature.gov/Laws/SessionLaws/Acts/2016/Chapter164


Figure 2-11: MBTA Organizational Chart

Source: MBTA FMCB Strategic Plan (2017) Note: Luis Ramirez became General Manager and CEO of the MBTA in September 2017 (following Interim General Managers Brian Shortsleeve and Steven Poftak).

2.2 CTA

2.2.1 Transit Service and Ridership

The CTA provides bus and heavy rail transportation in the Chicago metropolitan area. The CTA operates alongside two other transit agencies – Metra providing regional commuter rail services, and Pace providing suburban bus and paratransit services.

Selected characteristics of CTA services are summarized in Table 2-5 and Figure 2-12. The CTA provided approximately 1.6 million rides on an average weekday, split almost evenly between bus and heavy rail. The rail network provides longer rides than the bus network on average, so rail accounts for over 2/3 of total passenger-miles at the CTA.


Table 2-5: CTA Selected Transit Service Characteristics (2016)

Service | Routes | Stations and Stops | Annual Vehicle Service Hours | Average Weekday Rides | Annual Rides | Annual Passenger Miles
Bus | 129 | 10,768 | 5,758,937 | 826,322 | 259,058,440 | 633,607,162
Heavy Rail | 8 lines | 145 | 4,004,874 | 759,866 | 238,645,812 | 1,445,244,645
Total System | | | | 1,586,000 | 497,704,000 | 2,078,852,000

Sources: CTA Facts (https://www.transitchicago.com/facts/), CTA Annual Ridership Report (2017b), FTA Transit Profiles 2016 (2017)
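The observation that rail rides are longer on average follows from dividing annual passenger miles by annual rides in Table 2-5. The sketch below does that division and also recomputes rail's share of passenger miles; the variable names are illustrative, and the result is an average per unlinked ride rather than per complete journey.

```python
# Annual rides and passenger miles by mode (Table 2-5, 2016).
annual_rides = {"bus": 259_058_440, "heavy_rail": 238_645_812}
passenger_miles = {"bus": 633_607_162, "heavy_rail": 1_445_244_645}

# Average miles per ride for each mode.
avg_trip_miles = {
    mode: passenger_miles[mode] / annual_rides[mode] for mode in annual_rides
}

# Rail's share of total system passenger miles.
rail_mile_share = passenger_miles["heavy_rail"] / sum(passenger_miles.values())

for mode, miles in avg_trip_miles.items():
    print(f"{mode}: {miles:.1f} miles per ride")
print(f"Rail share of passenger miles: {rail_mile_share:.0%}")
```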

Figure 2-12: CTA Ridership and Passenger Miles by Mode (2016)

Sources: CTA Facts (https://www.transitchicago.com/facts/), CTA Annual Ridership Report (2017b), FTA Transit Profiles 2016 (2017)

From 2002 to 2012, bus ridership was flat at around 300 million unlinked trips per year while rail ridership grew about 28%. Beginning in 2012, growth in rail ridership began to slow and bus ridership began to decline; by 2017, bus had declined about 21% from 2012 levels and rail had also begun to decline. This break in trend and significant loss in bus ridership has prompted questions about the specific sources of decline (whether changes in service or fares at the CTA or changes outside the agency such as new transportation alternatives, congestion affecting CTA buses, gas prices, or population shifts). This thesis does not aim to explain this bus ridership decline; however, as discussed later and shown in Figure 2-13, the beginning of the ridership decline coincided with a large increase in CTA pass prices in January 2013 and large shift in ridership away from pass products and toward pay-per-use products. Fare increases reduce ridership and pass use stimulates ridership (since there is zero marginal cost for each additional ride), so this fare policy change likely contributed to subsequent ridership decline.

Figure 2-13: CTA Unlinked Passenger Trips by Mode (2002 – 2017)

Source: FTA (2018)

2.2.2 Geography and Demographics

The CTA serves Chicago and 35 surrounding suburbs. As shown in Figure 2-14, the large majority of CTA rail service falls inside the City of Chicago. As a result, Chicago and its Mayor play a major role in CTA governance; this is reflected in the CTA’s governing board, which has four members appointed by the Mayor of Chicago and three appointed by the Governor of Illinois. 6

6 https://www.transitchicago.com/facts/

Figure 2-14: CTA ‘L’ (Left) and Metra (Right) Maps

Source: CTA ‘L’ System Map (https://www.transitchicago.com/maps/), Metra Maps and Schedules (https://metrarail.com/maps-schedules/system-map)

Like the MBTA, the CTA plays an important role in providing accessible and affordable mobility options in a service area marked by both overall diversity and local concentrations of races and socioeconomic groups. The dot maps in Figure 2-15 and Figure 2-16 illustrate the concentration of both white residents and above-average wealth on the north side of the city, with fairly sharp lines also dividing black, Hispanic, and Asian residents. These patterns make the geographic distribution of policy and service impacts critically important.

Figure 2-15: Racial Dot Map of Chicago (2010)

Source: Cable (2013)

Figure 2-16: Income Dot Map of Chicago (2010)

Source: ESRI Mapping Incomes 2018, http://storymaps.esri.com/stories/2018/mapping-incomes/index.html Notes: Blue symbols represent tracts where the proportion of households with incomes over $100,000 exceeds the national rate. Red dots represent tracts where the proportion of households with incomes under $25,000 exceeds the national rate.

2.2.3 Operating Finances

The CTA’s annual operating budget of roughly $1.5 billion is shown in Figure 2-17. Fare revenue accounted for 39% of the operating budget in 2016, with most of the remainder coming from several public sources – Illinois Sales Tax revenue, City of Chicago Real Estate Transfer Tax revenue, and the Illinois Public Transportation Fund (which partially matches the other public sources). Due to the State’s fiscal problems, public funding is somewhat unpredictable year to year, and the CTA often has to control expenses during its annual budget cycle to balance revenues and expenses. From 2013 through 2017, the CTA was able to balance its budget without raising fares by cutting costs and expanding non-fare revenues. In 2018, faced with cuts of over $33 million in state funding, the CTA raised fares as one measure to close its projected operating deficit.

Even in the 2018 budget proposal, however, fares still account for about 39% of total projected revenue.7 Given the CTA’s reliance on public support, maximizing fare revenue is not the only objective of CTA fare policy. Retention and growth in ridership is also important for the CTA’s bottom line because it demonstrates the critical role that the CTA will continue to play in the Chicago region and the state, helping to secure ongoing public support for CTA operations.

Figure 2-17: CTA 2016 Revenues and Expenses

Source: CTA (2017a)

2.2.4 Fare Structure and Levels

CTA fares are differentiated in the same four primary ways as MBTA fares: by service, tariff, user type, and medium.

7 The proposed budget for 2018 projects $583 million in fare revenue out of a budget of $1.514 billion.

The CTA offers several tariffs (or payment structures) – pay-per-use and four different rolling period passes valid for 1, 3, 7, or 30 days. The passes are valid on CTA rail and bus; 30-day passes are also valid on Pace buses, and 7-day passes are valid on Pace for a small premium. The CTA also offers a discounted monthly pass to Metra (commuter rail) monthly pass-holders, valid on the CTA only during weekday peak periods. Pay-per-use fares are higher for CTA ‘L’ trains than for CTA buses, but both have “flat” fares (not varying by distance or time). Customers transferring between CTA services or between the CTA and Pace using a Ventra card pay a flat $0.25 transfer fee regardless of which services are used in the trip. Transfers between the CTA and Metra are not discounted; the only integrated CTA-Metra fare product is the $55 monthly Link-Up pass, which is added to an existing Metra monthly pass to provide CTA access on weekdays during peak periods. As explained below, fare collection for Metra and the CTA is integrated in the Ventra mobile app.

By default, CTA users pay “full” fares. Lower (“reduced”) fares and prices are available to children, grade school students, seniors, and people with disabilities. The CTA is free to active military personnel, veterans, and certain seniors and people with disabilities. University students also ride at a pre-paid reduced rate if their school is enrolled in the CTA U-Pass program. Schools in the U-Pass program sign contracts with the CTA and are charged directly based on student enrollment; schools pass costs to students through enrollment fees rather than per-ride fares or pass prices.

Pay-per-use fares also vary by medium, with a premium for using cash, disposable single-ride tickets, and unregistered contactless credit cards. There is a $5 charge to purchase Ventra cards (transit smart cards), but it is credited as account value if the card is registered.

Figure 2-18 shows how different fares and fare products contributed to total fare revenue in 2017, broken out by the defining elements of the CTA’s fare structure as discussed above. Figure 2-19 provides a similar breakdown of ridership.

Figure 2-18: Revenue by CTA Fare Structure Dimension, 2017

Source: CTA Ventra, CTA (2017a) Notes: Percentages are out of a projected $560,377,000 for 2017 from the 2018 Budget Recommendations. Within that total, pay-per-use revenue is based on ride validations on CTA bus and rail and sales of CTA limited-use fare products (such as single-ride tickets) recorded in Ventra. Pass revenue is based on sales of CTA pass products recorded in Ventra. U-Pass and other non-Ventra sales make up the remainder.

Figure 2-19: Ridership by CTA Fare Structure Dimension, 2017

Source: CTA historical ridership figures Notes: Percentages for Service (bus and rail) are out of 479,435,218 total rides in 2017, including cross-platform transfers on rail. Other percentages are out of 437,516,437 rides, ignoring cross-platform transfers.

Table 2-6 shows selected CTA fares and prices since January 2009. In 2013, the CTA raised pass prices substantially while leaving pay-per-use fares unchanged. This had the effect of raising the pass multiple (measured as the "breakeven" number of rail rides) from 38.2 to 44.4 for a 30-day pass and 10.2 to 12.4 for a 7-day pass. This represented a significant departure from "deep discounting" at the CTA in the 1990s and 2000s. Following the strategy advanced by Oram (1990) and Fleishman, Koppelman, and Schofer (1991), the CTA had focused fare increases on base fares while pricing its passes at favorable multiples in a bid to increase revenue while growing or maintaining ridership. As described below, the January 2013 fare change was followed quickly by replacement of the CTA's legacy AFC system with the new Ventra system. Some customers were unsure whether passes were being sold during the transition to Ventra, and the added convenience of loading value onto a transit account under Ventra further enhanced the attractiveness of pay-per-use relative to passes. In January 2018, after five years without a fare change, the CTA somewhat reversed course on pass multiples, raising base fares but leaving passes unchanged except for a $5 increase in the 30-day pass price. Even with this price increase, the 30-day pass multiple was reduced to 42.0; the 7-day multiple fell to 11.2.

Table 2-6: Selected CTA Fares, January 2009-Present ($)

Fare Product | Jan 2009 - Dec 2012 Fare | Change | Jan 2013 - Dec 2017 Fare | Change | Jan 2018 - Present Fare
Bus - Cash | 2.25 | - | 2.25 | +0.25 (+11%) | 2.50
Bus - Card | 2 | - | 2 | +0.25 (+13%) | 2.25
'L' Train - Card | 2.25 | - | 2.25 | +0.25 (+11%) | 2.50
Transfer | 0.25 | - | 0.25 | - | 0.25
3-Day Pass | 14 | +6 (+43%) | 20 | - | 20
7-Day Pass | 23 | +5 (+22%) | 28 | - | 28
30-Day Pass | 86 | +14 (+16%) | 100 | +5 (+5%) | 105
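The pass multiples quoted in the text can be reproduced from the fares in Table 2-6. The short Python sketch below assumes that the multiple is simply the pass price divided by the 'L' train card fare in the same period; the dictionary and function names are illustrative and are not taken from any CTA tool.

```python
# Fares in dollars, copied from Table 2-6. The period labels are shorthand.
RAIL_FARE = {"2009-2012": 2.25, "2013-2017": 2.25, "2018-present": 2.50}
PASS_PRICE = {
    "7-day": {"2009-2012": 23, "2013-2017": 28, "2018-present": 28},
    "30-day": {"2009-2012": 86, "2013-2017": 100, "2018-present": 105},
}


def pass_multiple(pass_price, base_fare):
    """Breakeven number of rail rides: pass price divided by the rail fare."""
    return pass_price / base_fare


def multiples_by_period():
    """Pass multiple for each pass product and fare period, rounded to 0.1."""
    return {
        product: {
            period: round(pass_multiple(price, RAIL_FARE[period]), 1)
            for period, price in prices.items()
        }
        for product, prices in PASS_PRICE.items()
    }


for product, by_period in multiples_by_period().items():
    print(product, by_period)
```

Under this assumption the sketch returns 38.2, 44.4, and 42.0 for the 30-day pass and 10.2, 12.4, and 11.2 for the 7-day pass, matching the figures cited above.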

It is likely too early for the full impact of the 2018 fare change to be visible; some preliminary summaries are presented in Chapters 5 and 7. Looking back at 2013, however, Figure 2-20 shows large shifts in total ridership and fare product market shares following the fare change in January 2013. Ignoring free rides, cross-platform transfers, and U-Pass rides, the share of all other CTA ridership on pass products and pay-per-use flipped within the span of one year, from about 56% on passes before 2013 to about 36% on passes since 2014. The drop in pass share of CTA ridership was complete by the fourth quarter of 2013; given that the Ventra system roll-out did not begin until August 2013 and did not complete until July 2014, the fare change was clearly the main driver of the substantial shift from passes to pay-per-use. The fare change and shift toward pay-per-use also coincided with the beginning of the substantial, ongoing decline in CTA bus ridership; however, other events around the same time – “de-crowding” service changes in December 2012, closure of the Dan Ryan branch from May to October 2013, and the Ventra roll-out – confound impacts of the fare change on ridership.

Figure 2-20: CTA Ridership by Tariff, 2009 - 2017

Source: CTA historical ridership figures

2.2.5 Fare Product Distribution and Fare Collection

The CTA uses many different sale channels to distribute fare products and collect payments. Table 2-7 shows fare revenue recorded in the Ventra system in 2017, divided by sale channel. Note that the Ventra system does not include U-Pass revenue, and Ventra stored value can be used to pay fares and purchase products for Pace Bus (via most CTA sale channels) and Metra (via the Ventra mobile app); this means that the stored value loaded onto Ventra accounts is not all CTA revenue, since some of it is later used on Pace and Metra. The largest sale channel from a revenue perspective is ticket vending machines (TVMs) located throughout the system; however, online sale channels – the Ventra mobile app, threshold autoload, and the Ventra web site – also combine to account for about 1/3 of revenue recorded in Ventra. The Pre-Paid Benefits (PPB) program, which works with employers to facilitate pre-tax transit purchases via payroll deductions, also provides a significant share of revenue. Sale channels for CTA passes are described in more detail in Chapter 5.

Table 2-7: CTA Ventra System Fare Revenue by Sale Channel, 2017

Sale Channel | Revenue in Ventra ($ millions)
Ticket Vending Machine (TVM) | $217.3
Mobile Ventra | $73.1
Retailers | $69.4
Threshold Autoload | $55.8
Pre-Paid Benefits (PPB) | $54.0
Patron Website | $36.0
Distributor Order | $16.9
Other | $8.3
Mobile Metra | -$5.7
Other Non-CTA Spending | ~ -$37.3
Total | $487.8

Source: CTA Ventra
Notes: U-Pass revenue is not included in Ventra. Ventra stored value can be used to pay fares and purchase products for Pace Bus (via most CTA sale channels) and Metra (via the Ventra mobile app), resulting in negative CTA revenue. Non-CTA spending is based on the difference between stored value sales (which could include revenue later spent on non-CTA services) and stored value deductions from CTA bus and rail ride validations.
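The channel shares discussed in the text can be checked directly against Table 2-7. The Python sketch below rests on two assumptions that are not stated in the source: that the "online" channels correspond to the table rows Mobile Ventra, Threshold Autoload, and Patron Website, and that shares are taken out of the net total of $487.8 million. The names are illustrative.

```python
# Revenue in $ millions, copied from Table 2-7 (negative rows are stored
# value later spent on non-CTA services).
REVENUE_MILLIONS = {
    "Ticket Vending Machine (TVM)": 217.3,
    "Mobile Ventra": 73.1,
    "Retailers": 69.4,
    "Threshold Autoload": 55.8,
    "Pre-Paid Benefits (PPB)": 54.0,
    "Patron Website": 36.0,
    "Distributor Order": 16.9,
    "Other": 8.3,
    "Mobile Metra": -5.7,
    "Other Non-CTA Spending": -37.3,
}

# Assumed mapping of the "online sale channels" named in the text to table rows.
ONLINE_CHANNELS = ("Mobile Ventra", "Threshold Autoload", "Patron Website")


def total_revenue():
    """Net revenue recorded in Ventra, in $ millions."""
    return sum(REVENUE_MILLIONS.values())


def channel_share(channels):
    """Combined share of net Ventra revenue for the given channels."""
    return sum(REVENUE_MILLIONS[name] for name in channels) / total_revenue()


print(f"Total: ${total_revenue():.1f} million")
print(f"Online share: {channel_share(ONLINE_CHANNELS):.1%}")
print(f"TVM share: {channel_share(['Ticket Vending Machine (TVM)']):.1%}")
```

Under these assumptions the online channels account for roughly 34% of net Ventra revenue, consistent with the "about 1/3" figure in the text.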

The CTA Ventra system is an account-based and open-payment fare collection system. In contrast to card-based systems like the MBTA’s, different payment and validation methods can all be associated with a single user account. This allows for fare validation using a variety of media – transit smart cards, transit tickets, and contactless bank cards – and allows purchases to be made in real time via the Ventra mobile app or web site (in addition to ticket vending machines and other traditional sale channels). The Ventra system was rolled out between August 2013 and July 2014, replacing a card-based fare collection system that included magnetic stripe tickets (the Transit Card) and earlier smart cards (the ).

2.2.6 Fare Policy Analysis and Decision Making

As described above, the CTA operates under the Chicago Transit Board with four members appointed by the Mayor of Chicago and three appointed by the Governor of Illinois. 8 As a result, the Mayor of Chicago plays an influential role in high-level policy decisions at the CTA (including fare policy changes). The CTA President leads operation of the CTA, including development and proposal of budgets and fare policies. Fare policy analysis is conducted primarily by budget and ridership analytics staff as part of the CTA’s annual budget cycle. With the help of outside consultants, the CTA has developed and updated several different fare prediction tools in recent decades; some of these tools are described in Chapters 3, 6, and 7. Ventra system staff also play an active role in fare policy decision making by communicating and developing the technological potential of the Ventra system; Ventra opens fare policy opportunities that were infeasible under previous fare collection technologies.

8 https://www.transitchicago.com/board/

2.3 Conclusions

This chapter provided essential background on the two transit agencies used as case studies in this thesis, the MBTA and the CTA. These two agencies have several important similarities. With respect to fare policy objectives, they must balance tightening budgets and the desire to increase operating revenue against the need to advance ridership and equity (which justify sustained support for public subsidies). Both have flat fares with period passes and recently increased pass “multiples,” and both are operating in a context of declining bus ridership and new, competing travel alternatives (chiefly ride hailing). They also differ in important ways. The MBTA provides bus, subway, and commuter rail under an integrated fare policy, while commuter rail in the Chicago metropolitan area is provided by a separate agency (Metra) with little integration of fare rules. CTA pass multiples are currently much higher than at the MBTA, and CTA passes have a correspondingly lower fare product market share than at the MBTA. The CTA’s fare collection system is account-based, while the MBTA’s card-based system has at least a couple more years before replacement. And the greater municipal fragmentation in the Boston area affects the roles of agency, city, and state officials in shaping fare policy.

The next chapter reviews the academic literature on several topics relevant to fare policy analysis at the CTA and MBTA.

3 Literature Review

This chapter describes existing work in several areas of academic research and industry practice that are relevant to analyzing and determining incremental fare changes. The first section provides a broad overview of transit fare policy, which further clarifies the scope of this thesis. Following sections describe pricing theory and strategy, exploratory analysis of transit user behavior using AFC data, and demand modeling.

3.1 Fare Policy Elements

As described in Chapter 1, fare policy is broad and multifaceted. To provide essential context and further clarify the focus of this thesis, it is helpful to describe the different elements of fare policy.

A comprehensive framework for fare-related decision making was proposed in Fleishman et al. (1996), which described five fare decision parameters: policy, strategy, structure, payment technology, and collection system. Similar frameworks are described in McCollum and Pratt (2004) and TranSystems (2007). These terms (especially “policy,” “strategy,” and “structure”) often overlap in meaning. In an attempt to better follow the everyday meaning of terms and to provide a fresh perspective, fare policy is described in Figure 3-1 and the sections below in terms of three conceptual categories – drivers (“why”), functions (“what”), and infrastructure (“how”).

Figure 3-1: Conceptual Elements of Fare Policy

3.1.1 Drivers

The drivers of fare policy are the objectives, purposes, goals, or stimuli that motivate fare-related decisions. Fleishman (1996) describes three categories of drivers – policy, service, and technology – which are borrowed to organize the discussion below.

Policy Drivers

Policy drivers can be formal or informal and short-term or long-term. In some cases, transit agencies explicitly identify and formally adopt a set of principles and objectives (and occasionally measurable targets) to help guide the fare policy decision making process. Policy objectives that were commonly identified at U.S. agencies in 1996 and 1998 included the following (Fleishman 1996, 2003):

• Regular review (primarily at large agencies)
• Fare revenue growth or mandated
• Ridership growth
• Simplicity and comprehensibility
• Other short-term crises

While these “top” objectives often stimulate a fare policy review or change, other policy objectives come into play during evaluation of fare policy alternatives. For example, the objectives in the MBTA Fare Policy approved in December 2015 include improved operational efficiency and system utilization, improved equity and accessibility, and alignment with broader city or regional goals (such as transit mode share targets). Other policy drivers may be less explicit or more circumstantial, such as personal priorities of agency leadership, external political forces such as mayoral or gubernatorial influence, or legislative interventions.

A fundamental challenge of fare policy decision-making is that different policy objectives and agendas conflict with each other. For example, the goal of increasing fare recovery ratios (the share of transit operating costs covered by fares) typically conflicts with social goals of providing a minimum standard of accessibility, and the goal of improving transit service in urban areas may compete with other regional or state-level goals for infrastructure investment and improved mobility. Table 3-1 provides an example of the general “rules” for setting fare structures and levels that are implied by different common objectives. All of the implied pricing rules conflict with each other.

Table 3-1: Single-Objective Pricing Rules

Objective: Maximize… | Implied Pricing Rule: Set fares…
Ridership | ≤ $0
Revenue | where fare elasticity_it = -1
Efficiency | = marginal-cost_it – adjustments
Equity | based on demographic geography
Simplicity | = p (single flat price)
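The revenue rule in Table 3-1 follows from standard pricing arithmetic: with revenue R = p · q(p), the derivative dR/dp = q · (1 + elasticity), which equals zero where the fare elasticity is -1. The Python sketch below illustrates this numerically. The linear demand curve and its parameters are purely hypothetical and are not estimates for any agency.

```python
# Hypothetical linear demand curve: rides = INTERCEPT - SLOPE * fare.
INTERCEPT = 1000.0  # rides demanded at a zero fare (illustrative)
SLOPE = 200.0       # rides lost per $1 of fare increase (illustrative)


def demand(fare):
    """Rides demanded at a given fare, floored at zero."""
    return max(INTERCEPT - SLOPE * fare, 0.0)


def revenue(fare):
    """Fare revenue at a given fare."""
    return fare * demand(fare)


def point_elasticity(fare):
    """Point elasticity (dq/dp) * (p / q); undefined where demand is zero."""
    return -SLOPE * fare / demand(fare)


def revenue_maximizing_fare():
    """Search candidate fares from $0.01 to $4.99 in one-cent steps."""
    candidates = [cents / 100 for cents in range(1, 500)]
    return max(candidates, key=revenue)


best = revenue_maximizing_fare()
print(best, revenue(best), point_elasticity(best))
```

For this illustrative curve, the search selects a fare of $2.50, where the point elasticity is exactly -1. At lower fares demand is inelastic, so raising the fare increases revenue; at higher fares demand is elastic, so raising the fare reduces revenue.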

In practice, multiple objectives are important and pricing rules implied by a single objective are rarely selected. Multi-criteria decision-making is one framework that can be used to frame this kind of decision (with multiple objectives). Metrics are defined and calculated to measure the performance of different possible decisions relative to each objective (as in Figure 3-2). These metrics can either be presented independently to inform stakeholders and decision-makers (who will emphasize different objectives), or they can be formally weighted and combined into overall scores for different alternatives. While perhaps more objective and transparent than less formal decision-making processes, the selection and presentation of metrics plays a significant role in framing objectives and determining outcomes.

Figure 3-2: Example Metrics for Multi-Criteria Decision-Making

Objective | Metric | Decision Alternative A | Decision Alternative B | Decision Alternative C
1. Ridership | a) Annual ridership | # | # | #
 | b) Ridership growth rate | # | # | #
 | c) Transit mode share | # | # | #
2. Revenue | a) Annual revenue | # | # | #
 | b) Fare recovery ratio | # | # | #
3. Efficiency | … | # | # | #
4. Equity | … | # | # | #
5. Simplicity | … | # | # | #
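The formal weighting option described above can be sketched in a few lines of Python. The example assumes a simple weighted sum of min-max scaled metrics, which is only one of several possible scoring methods. All metric values and weights are invented for illustration, and every metric is assumed to be "higher is better."

```python
def normalize(values):
    """Min-max scale one metric to [0, 1] across the alternatives."""
    low, high = min(values.values()), max(values.values())
    if high == low:
        return {alt: 0.0 for alt in values}
    return {alt: (v - low) / (high - low) for alt, v in values.items()}


def weighted_scores(metrics, weights):
    """Weighted sum of scaled metrics for each decision alternative."""
    scores = {}
    for name, values in metrics.items():
        for alt, scaled in normalize(values).items():
            scores[alt] = scores.get(alt, 0.0) + weights[name] * scaled
    return scores


def best_alternative(metrics, weights):
    """Alternative with the highest weighted score."""
    scores = weighted_scores(metrics, weights)
    return max(scores, key=scores.get)


# Hypothetical metric values for alternatives A, B, and C (in millions).
METRICS = {
    "annual_ridership": {"A": 480.0, "B": 500.0, "C": 460.0},
    "annual_revenue": {"A": 580.0, "B": 560.0, "C": 600.0},
}

# Two hypothetical weightings reflecting different priorities.
RIDERSHIP_FOCUS = {"annual_ridership": 0.6, "annual_revenue": 0.4}
REVENUE_FOCUS = {"annual_ridership": 0.4, "annual_revenue": 0.6}

print(best_alternative(METRICS, RIDERSHIP_FOCUS))
print(best_alternative(METRICS, REVENUE_FOCUS))
```

With these invented numbers, the ridership-focused weights favor alternative B and the revenue-focused weights favor alternative C, which illustrates how the choice of metrics and weights can determine the outcome.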

Even where there is agreement about long-term objectives, there can be conflicting short-term objectives that implicitly reflect different theories of change. For example, expansion in transit system service and use may be a unifying objective, but there is often disagreement about whether the necessary funding for service expansion should come primarily from increased fare revenues or primarily from external sources.

Service Drivers

Service drivers are changes to transit operations or service, such as a new bus or rail line. Introduction of a new mode is particularly likely to trigger review and revision of fare policies since it forces decisions about fare collection, pricing relative to existing modes (including transfer policies), and potentially sources of revenue to cover new operating and maintenance costs. As discussed in Zureiqat (2008) and Halvorsen (2015), agencies may also turn to fare policy for demand management, such as reducing peak period crowding or increasing ridership on under-utilized capacity. In this case, fares are used intentionally as a behavioral incentive (not only a revenue source).

Technology Drivers

Finally, technology drivers are planned or potential new technologies for fare payment/collection, validation, or enforcement. New technologies open new possibilities for fare structures, affect the cost of implementing (and reversing) fare changes, and require revenue to operate and maintain. Payment and wireless communication technologies have improved dramatically in the last 20 years, so implementation of modern fare collection systems is a major fare policy driver at many agencies including the CTA (which replaced its system in 2013) and the MBTA (which recently awarded a contract to replace its system in the coming years). Even if modern technologies are in place, fare policy changes may be motivated by integration of those technologies across multiple transit agencies (as currently at Metrolinx in the Toronto region) 9 or with other transportation services (such as Metro Bike Share in ) 10.

9 http://www.metrolinx.com/en/regionalplanning/fareintegration/default.aspx
10 http://www.bicycletransit.com/los-angeles/

3.1.2 Functions

Distinct from drivers are functional elements of fare policy, which make up the road map or rules by which fare policy operates. There are at least four inter-related functional elements of transit fare policy that are identified in Fleishman (1996), Jain (2011), and Hong (2006):

1. Fare structure
2. Fare levels
3. Product distribution and fare collection
4. Fare validation and enforcement

Fare Structure

Used variously in the literature to refer to the whole of fare policy, pricing rules, and fare levels, “fare structure” is used in this thesis to mean the rules by which different prices and fares can be calculated. Fare structures are commonly defined by at least four different kinds of features, sometimes leading to a very high number of possible fares:

1. Features of an individual, such as age, disability, school enrollment, occupation (e.g. military or public servants), income, and employment status;
2. Features of a trip, such as mode, transfers (either within or across agencies), travel distance or zone crossings, line or route, origin or destination, time of day, day of week, and trip duration;
3. Fare product “tariff” or payment structure, such as pay-per-use (also called single-ride or pay-as-you-go), multi-ride / limited-use, unlimited-use pass (of varying durations), tapered (separate boarding charge and time or distance fee), and fare capping; 11, 12 and
4. Payment or validation medium, such as cash, ticket, transit card, , or mobile device.

These features are sometimes framed in other categories, such as fare products, transfer policies, and discounting. However, these are all cross-cutting issues that should not be confused with the fundamental components of pricing rules:

• The fare products that are offered to transit customers relate closely to the entirety of an agency’s fare structure. The attributes of fare products can both reflect and determine any combination of the different features of fare structure. For example, if fare structure is not based on distance or zone, then there are no fare products with distance or zone as an attribute. However, some elements of fare structure may not be evident from looking at fare products; for example, a customer using a single fare product may still pay different fares depending on the type of trip taken.
• Fare integration cuts across the features of fare structure described above. From the customer perspective, a trip that uses multiple transit services, crosses transit agency service areas, or combines public transit and private transportation services is still a single trip with an associated total fare or price (based on the combined fare structures of all the service operators). In light of this perspective, fare integration projects aim to simplify and rationalize these combined fare structures using consistent fare media and pricing rules across services and organizations. Again, this could take place across services or modes within an agency (such as transfer policies between bus and rail service within an agency like the MBTA or CTA), across different transit agencies (like the effort to integrate nine different operators around Toronto), 13 or between transit and private transportation services. There are often substantial technological and institutional obstacles to integration, and it can result in major changes to fare structure or effective pricing strategies.
• Pricing rules such as unlimited-use passes, transfer policies, or student pricing are often framed as discounts, concessions, exceptions, or rewards relative to "base" pricing. For example, the CTA historically provided bonus value when customers loaded more than a certain dollar amount of stored value onto a transit card, and BART piloted the “BART Perks” rewards program from Aug 2016 to Feb 2017 (Greene-Roesel et al. 2018). While the framing of these pricing rules and the increasing ability to target promotions to specific customers have important political and behavioral consequences, discounts and rewards are still fundamentally pricing rules.

11 The term “tariff” is adopted in this thesis to refer to these payment structures because it is a general term for pricing without any other conflicting connotations in American English. Several of the tariffs listed can be understood as variations on a “two-part tariff,” i.e. different combinations of a lump-sum fee and a per-unit charge.
12 Note that fare product eligibility, validity, and availability are also often defined by features of individuals, features of trips, and medium.

Transit agencies have employed a wide variety of different rules for pricing. Fleishman (1996, 2003) provide many examples of U.S. transit agencies, and current fare structures at many large transit agencies in the U.S. and around the world are described in recent reports for Metrolinx in Toronto (Metrolinx 2015) and Translink in Vancouver (Lipscombe 2016; Translink 2016). (Gwilliam (2008) also identifies a wide range of fare structures.) Translink (2016) shows substantial variation in whether or how major transit systems use distance, mode or service type, time of day, and transfers to define their fare structures; however, nearly all agencies offer a combination of pay-per-use (or pay-as-you-go) and pass products, with a few offering fare capping. (Under capping, customers are charged pay-per-use fares until they reach a daily, weekly, or monthly “cap” in total fares; all subsequent rides within the applicable time period have zero marginal cost.) This motivates the emphasis throughout this thesis on understanding and accounting for customer fare product choice (and specifically the choice between pay-per-use fares and passes). Hong (2006) offers many examples of less common and innovative fare structures and structure variations made possible by smartcards, including line-restricted passes, free zones, airport-specific fares, time-specific fares for students, ride- and time-limited passes (e.g. 30 days or 50 rides, whichever comes sooner), non-consecutive day passes, price capping, pricing fare media, add-value bonuses, frequent ride bonuses / discounts, return journey discounts, parking discounts (for park-and-ride), and origin-destination-specific discounts.
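The fare capping rule described in the parenthetical above can be sketched as follows. This is a minimal illustration rather than any agency's actual implementation: it assumes a single capping period, works in integer cents to avoid rounding problems, and assumes that the ride that crosses the cap is charged only the remaining amount. The fare and cap values in the usage lines are hypothetical.

```python
def capped_charges(fares_cents, cap_cents):
    """Amount charged for each ride in one capping period, in cents.

    Rides are charged the pay-per-use fare until cumulative charges reach
    the cap; every later ride in the period is charged nothing.
    """
    charges = []
    charged_so_far = 0
    for fare in fares_cents:
        remaining = max(cap_cents - charged_so_far, 0)
        charge = min(fare, remaining)  # assumed partial charge at the cap
        charges.append(charge)
        charged_so_far += charge
    return charges


# Hypothetical example: five $2.50 rides in one day under a $10.00 daily cap.
print(capped_charges([250] * 5, 1000))

# Hypothetical example: four $2.50 rides under a $6.00 cap.
print(capped_charges([250] * 4, 600))
```

In the first example the fifth ride is free because the cap was reached on the fourth ride; in the second, the third ride is charged only the $1.00 needed to reach the cap. From the customer's perspective, capping behaves like pay-per-use below the cap and like a pass above it.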

Any transit agency fare structure (along with fare levels) reflects a set of pricing strategies. These strategies could be explicit and intentional, motivated by presumed connections between the selected pricing rules and broader fare policy objectives (discussed above under drivers of fare policy). Strategies could also be implicit and perhaps even unknown – the unintended (but still present) implications of fare structure for customer behaviors. Pricing strategies are discussed at length in Section 3.2.

Fare Levels

Fare structure defines the possibilities for assigning different prices, and fare levels simply set those prices. As a result, prices can take many different forms. In combination with the underlying fare structure, fare levels have behavioral implications for transit ridership. The level and structure of fares relative to non-transit travel alternatives will affect choice of travel mode, and the relative fares of different transit options (such as different modes and different tariffs) will affect purchase decisions and ridership frequency for those who do choose to use transit. These behavioral implications of fare levels are a focus of this thesis.

13 http://www.metrolinx.com/en/regionalplanning/fareintegration/default.aspx

Fare levels are changed much more frequently than fare structure. Substantially altering fare structure sometimes requires new technology and often involves significant costs for implementation and explanation to customers. For example, implementation of a distance-based fare structure requires technological capacity to validate trip destinations as well as origins (such as “tap-in, tap-out” AFC systems with validators at station exits as well as entrances). Changing fare levels is easier to communicate to the public, and in modern fare collection systems it only requires a central software change to implement. These more frequent opportunities to modify fares – and the questions they raise for transit operators – provide motivation for the focus of this thesis on incremental fare changes rather than major changes to fare structure.

Product Distribution and Fare Collection

Fare products can be distributed and payments collected through many different “sale channels.” These typically include fare vending machines (FVMs) and ticket offices or windows, but can also include point-of-entry or onboard sales (such as cash payment at vehicle fareboxes or ticket sales by commuter rail conductors), retail stores and other businesses, mail ordering via phone or web site, sale of digital or account-based products on web sites or mobile apps, grade schools, universities, social service organizations, and employers. Certain sale channels allow subscription for automatically-recurring payments and automatic distribution of fare products. For example, many employers in major U.S. cities offer automatic payment for transit fare products via payroll deductions to take advantage of federal and state tax benefits. Modern AFC systems such as the Ventra system in Chicago also provide options for recurring purchase of stored transit value or fare products onto an account or transit card via automatic credit card payments. Typically, product distribution and payment happen at the same time, but not always. For example, semester passes for university students may be sold at the beginning of a term but distributed monthly to students. Alternatively, payment could be collected or returned retroactively, as in the case of Transport for London “best value” refunds and the AccessMIT employee pay-per-use agreement with the MBTA (Rosenfield 2018).

The distribution of fare products and collection of payments relate closely to other elements of fare policy. Distribution channels typically require some infrastructure and are often limited to certain fare media. Cash fares are often paid on boarding a vehicle, making product distribution, fare collection, and fare validation practically one and the same.

However, distribution and collection can also play distinctive roles in price differentiation (or at least, incidentally, customer segmentation) and the salience of fare levels. Some sale channels allow for verification of customer identity or customer attributes, which is essential for offering group-specific products like senior or student passes. Sale channels also less explicitly target different groups of customers via variation in the products that they offer and the convenience with which customers can access them (such as a mobile app); in theory, this allows products offered only on specific sale channels to be designed and priced for a specific market. Sale channels also naturally affect the salience of fare levels. Different payment processes make customers more or less acutely aware of the cost of their transit use. Automatically-recurring payments are particularly likely to reduce the salience of fare levels by putting them out of sight and out of mind. Chapter 5 explores the behavioral implications and strategic opportunities of pass sale channels at the CTA and MBTA.

Validation and Enforcement

Fare validation rules prescribe the way that customers prove they have paid for the services they use. Fleishman (1996) identifies three main types of validation:

• barrier, such as a fare gate or a turnstile restricting access to a station or platform;

• point-of-entry, where fare payment is checked (and potentially processed) upon boarding a vehicle, such as using a bus farebox; and

• proof of payment, where customers produce evidence of payment upon request (such as a transit card loaded with an appropriate fare product or a paper receipt). This requires a program of fare inspection to deter fare evasion.

In each case, validation can be performed automatically or manually. For example, point-of-entry validation on a bus could be performed visually by the driver or automatically through an interaction with a fare validator. Fare evasion or non-payment could occur for many different reasons depending on the approach to fare validation, so enforcement policies and violation penalties are tailored to specific implementations.

Along with fare product distribution and fare collection, the fare validation process has an important impact on the salience of fare levels. Validation could provide a pointed reminder of the expense of transit travel (such as displaying a flashing red reduction in stored value on a transit smart card) or no feedback at all about expense (such as a smartcard validator without a display). These design factors will affect customers’ sensitivity to fares and fare changes.

3.1.3 Infrastructure

If functions define the rules by which fare policy operates, infrastructure is the implementation or realization of those rules. Fare policy infrastructure mediates the exchange of information, fare media, fare products, and money between customers and transit agencies. While physical infrastructure comes to mind first, digital infrastructure (such as transaction management or web applications) is equally important. Possible fare policy infrastructure includes fare vending machines, fareboxes, fare gates, fare validators, transit vehicles, transit drivers, transit police, billboards and posters, ticket offices, fare servers, snail mail, convenience stores, employers, schools, social service agencies, email, mobile applications, web sites, mobile devices, smart cards, tickets, and credit cards.

This infrastructure is often collectively called a “fare collection system,” though it actually cuts across all of the different functions of fare policy – structure, levels, distribution, collection, validation, and enforcement. Transit agencies use a wide variety of fare technologies, and every implementation is different. However, it is helpful to draw a few general distinctions:

• Non-electronic versus electronic. In non-electronic systems, payments and validation are performed using cash, tokens, and tickets. Electronic systems are able to use a wider variety of fare media (such as magnetic-stripe tickets or contactless bank cards), and validation can occur electronically rather than mechanically (such as a token in a turnstile) or visually (requiring staff).

• Closed-loop versus open-loop payment / card-based versus account-based. Within electronic systems, there are two general approaches. A closed-loop system requires that customers use payment and validation media that are specific to the transit system (and not valid anywhere else). Closed-loop systems are also typically card-based, in that information such as fare products and stored transit value "live" on the agency-specific fare media (such as smartcards or magnetic-stripe tickets). In open-loop systems, customers are able to use either system-specific fare media or general-use media, such as contactless bank cards. Open-loop systems are account-based; multiple cards or devices can be registered to a single customer account, and any of the different media can serve as a token to identify, access, and modify the customer account.
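The account-based arrangement described above can be illustrated with a minimal Python sketch in which stored value lives on a central account and any registered medium simply identifies that account. The class, method, and identifier names (`Account`, `AccountRegistry`, `register_token`, `tap`, and the token strings) are hypothetical and are not drawn from any agency's or vendor's system.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Central record holding value and products (hypothetical structure)."""
    account_id: str
    stored_value_cents: int = 0
    products: list = field(default_factory=list)


class AccountRegistry:
    """Maps any registered token (card, phone, bank card) to one account."""

    def __init__(self):
        self._accounts = {}
        self._token_to_account = {}

    def open_account(self, account_id):
        self._accounts[account_id] = Account(account_id)
        return self._accounts[account_id]

    def register_token(self, token_id, account_id):
        # Several media can point at the same account.
        self._token_to_account[token_id] = account_id

    def tap(self, token_id, fare_cents):
        """Deduct a pay-per-use fare from the account behind the token."""
        account = self._accounts[self._token_to_account[token_id]]
        if account.stored_value_cents < fare_cents:
            return False  # insufficient value; nothing is deducted
        account.stored_value_cents -= fare_cents
        return True


registry = AccountRegistry()
acct = registry.open_account("A1")
acct.stored_value_cents = 500
registry.register_token("transit-card-123", "A1")
registry.register_token("bank-card-456", "A1")
registry.tap("transit-card-123", 225)
registry.tap("bank-card-456", 225)
print(acct.stored_value_cents)  # 50
```

In a card-based design, by contrast, the balance would be written to each card, so the two media in this example would carry separate balances.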

All major transit agencies in the U.S. (and many smaller ones) moved to electronic fare collection systems in the 1990s and 2000s. In recent years, agencies have started replacing earlier electronic systems with open-payment, account-based systems. Advantages of account-based systems include flexible acceptance of different payment methods, the ability for customers to update accounts in different and more convenient ways, improved ability to integrate payment across transportation services, and typically lower costs to modify fare rules (Kocur 2015). To the degree that they reduce the convenience of paying in cash, open payment systems require special equity considerations for customers without bank cards or (Brakewood and Kocur 2013). The improved convenience of payment (such as the option to make automatic, recurring purchases using an associated credit card) also affects the salience of fares and the relative attractiveness of different fare products (such as passes versus pay-per-use fares).

One important aspect of fare collection systems is the data that is generated on individual-level sales and ridership behavior. This automated fare collection (AFC) data is used to inform a wide range of operations, planning, and policy decisions. It is essential for agencies to retain ownership of this data even if implementation and operation of fare collection systems are contracted to a third party.

3.2 Pricing Theory and Strategy

3.2.1 “Optimal” Transit Pricing and Complications

Since the 1970s and 1980s, there has been a reasonably active academic literature on the "optimal" pricing of public transit in terms of economic efficiency. See Gwilliam (2008) for a good overview. The basic starting point in microeconomic theory is that an efficient outcome is reached when the cost to a customer of taking an additional trip is equal to the social marginal cost of that trip. This allows potential riders to “see” the full social cost of getting from A to B on transit so they can make the socially optimal decision among their travel alternatives.

It is helpful to think of the social marginal cost of transit trips in three components:

1. Agency cost, specifically the long-run marginal cost to the agency of providing additional transit service (vehicles and runs) to serve additional trips. Note that in the short run, with fixed service, the marginal cost to the agency of serving an additional trip is essentially zero. In the long run, additional ridership (particularly at peak load points) is assumed to affect service frequencies and capacity; however, this may not apply to marginal ridership on services that are operating under capacity.

2. Private generalized cost of travel, which is primarily the cost of the user’s travel time (adjusted for comfort and productivity).

3. External costs and benefits, including any impacts of an additional trip on other transit riders or on costs that are external to the transit system.

These components of social marginal cost imply an optimal transit fare. Transit users already experience the cost of their own travel time, so the optimal fare should equal the sum of the other two components of social marginal cost, agency cost and external net costs. If these two components are not incorporated into fares, they will not be experienced directly by the rider.
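The fare rule implied by this decomposition can be written compactly. The notation below is introduced here only for illustration and is not taken from the cited literature: $p^{*}$ is the optimal fare, $MC_{\text{agency}}$ the long-run marginal agency cost, $g$ the rider's private generalized cost, and $MEC$ and $MEB$ the marginal external cost and benefit of an additional trip.

```latex
\begin{align}
  SMC &= MC_{\text{agency}} + g + (MEC - MEB) \\
  p^{*} + g = SMC \quad &\Longrightarrow \quad p^{*} = MC_{\text{agency}} + (MEC - MEB)
\end{align}
```

Because the rider already bears $g$ directly, it drops out of the fare. A net external benefit ($MEB > MEC$) pushes the optimal fare below marginal agency cost.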

Careful consideration of these two cost components suggests several reasons that transit fares should be subsidized to maximize social welfare, but it also shows the difficulty in actually computing the “optimal” transit fare. One important aspect of agency costs is that transit systems experience economies of scale, meaning that the marginal cost of serving more trips decreases as total ridership and system size grow. As seen in other settings with economies of scale, this means that agency marginal costs may be below average costs and a subsidy may be needed to maximize welfare while avoiding bankruptcy. Agency costs are straightforward to estimate (at least in theory), so more attention has been given to understanding the potential external costs or benefits of marginal transit ridership and service that should affect transit pricing:

• The “Mohring effect,” also referred to as user economies of scale. In the long run, additional passengers lead to increased transit service frequency; this results in reduced wait times for all riders, not only the marginal rider, suggesting that fares should be reduced to account for this external benefit. Alternatively, from the agency perspective, the long-run marginal cost of providing service should be adjusted to account for the travel time savings of more frequent service for existing passengers (not only new ones); as with economies of scale, this implies that fares should optimally be set below the level needed to cover agency average costs, requiring an additional subsidy. The implied subsidy may be different for high-frequency, high-utilization services versus low-frequency, low-load services, though the logic is similar (Mohring 1972; Turvey and Mohring 1975; Jansson 1979).

• Crowding and delay. The authors who identified the Mohring effect were also concerned with the short-run impact of additional ridership given a fixed level of service. Each additional passenger boarding any given vehicle slows the boarding and alighting process and increases crowding, causing delay and discomfort to other passengers. While the Mohring effect suggests the need for a transit fare subsidy in the long run, this short-run external cost suggests that fares on crowded services (such as during peak periods or the congested portion of a route) should be higher than fares on services with low load factors (Tirachini, Hensher, and Rose 2014); however, this might not hold given different utilization and rider access behaviors across services (Jansson 1979).

• Correcting automobile externalities. In the U.S., there is considerable evidence that motor vehicle users do not face the full marginal social cost of their auto use, and that auto use imposes substantial external costs in the form of congestion, safety and health costs, and environmental damage. (See, e.g., Delucci (2004).) The most direct solution to this problem is to introduce corrective pricing on motor vehicles, such as congestion pricing (Basso and Jara-Díaz 2012). But a second-best approach would be to subsidize transit fares to recover an optimal mode split with motor vehicles.

While this theory is foundational to public policy arguments for continued subsidization of transit service, it has very rarely been applied directly to practical decisions about fare policy at transit agencies for several reasons. First, external costs and benefits are very difficult to estimate, especially since optimal fares will vary by setting (mode, time period, etc.). Two notable attempts to estimate optimal fare levels are Nelson et al. (2007) and Parry and Small (2009). Nelson et al. (2007) simulate the metro-wide travel time impacts of alternative levels of transit service in Washington D.C. (adjusting for transit crowding effects) and find that the transit system, WMATA, is close to optimal size given its fare levels and cost recovery ratio of 59%. Parry and Small (2009) develop an analytical model incorporating the Mohring effect, transit crowding, and motor vehicle externalities. They find optimal subsidies of well over 50% of operating costs across modes and time periods in Washington, D.C., Los Angeles, and London.

Another obstacle to applying this theory is that fare policy decisions are not based solely on economic efficiency. Rather, they are made based on a variety of objectives, including equity, simplicity or comprehensibility of fares, and political feasibility. Given the need to weigh all these considerations, it is no wonder that transit agencies do not expend the considerable effort needed to estimate optimal fares based on a single objective (efficiency).

There have been only a handful of more practical economics studies applying this theory to decisions about specific fare structures and fare products. Vickrey (1980) details the implications of marginal cost pricing for different trip settings and emphasizes the efficiency of charging higher fares on crowded trips where additional passengers impose costs on others; however, he acknowledges the rider acceptance and political feasibility challenges of highly differentiated fares that reflect variation in external costs. Carbajo (1988) developed an analytical model for pricing unlimited-use passes relative to single-ride fares based on either welfare maximization or profit maximization. The analytical results for profit maximization show that the pass price should be set “such that its proportionate deviation above the opportunity cost of providing the travelcard [pass] to the marginal consumer is inversely proportional to the elasticity of travelcard participation.” A numerical example suggests that consumer surplus may be highest without passes, but agency profits (and possibly total welfare) can be higher when passes are offered and single-ride fares are increased. Two recent studies have built on this model. Jara-Díaz, Cruz, and Casanova (2016) critique and extend Carbajo’s model to allow for differences in customer income effects and car ownership; the authors apply their model to Santiago in Chile, illustrating the importance of this customer heterogeneity. Hörcher, Graham, and Anderson (2018) also develop a related analytical model that incorporates the external cost of transit crowding. Customers using unlimited-use pass products experience no marginal fare regardless of whether they ride on crowded services, which runs contrary to marginal social cost pricing. 
Partially as a result of this crowding externality, the authors show that offering pass products at attractive prices maximizes agency profits, but it only maximizes social welfare if the agency is operating under a sufficiently stringent financial constraint; in essence, the motivation for offering passes is to generate additional agency revenue rather than to optimize economic efficiency. (Their model does not explicitly address passes and pricing in the context of motor vehicle externalities.)

55

The intuitions and behavioral dynamics identified in this literature are valuable for developing fare policy. However, there are obstacles to applying most of this literature to practical decisions about fare structures and levels – how to design a new fare product, modify relative product pricing, or marginally adjust fare levels. Simulation models like Nelson et al. (2007) are difficult to develop. Analytical models like Carbajo (1988) that are focused on theoretical rigor and optimality may be inaccessible to transit analysts who must implement models using available transit data and explain model intuition and results to decision makers. As a result of these obstacles, agencies typically look to other sources for guidance.

3.2.2 Pricing Strategies Applied to Transit

A summary of pricing strategies can help to clarify the purpose or function of different fare product attributes.

It is helpful to note that the “product” being consumed by a transit customer is not a fare product or even transit per se, but rather transportation services in general. There are many different ways that people access the various alternatives within the market, and access is distinguished from consumption; the point of sale (obtaining access to a service) can be different than the point of use (consumption of a service). In the case of transit, transportation services are accessed by purchasing a fare product, just as a movie theater is accessed by purchasing a ticket. Fare products are then validated when the service is actually used.

As public-interest organizations that are both subsidized and operating under financial constraints, transit agencies are in an unusual position to think about pricing strategy for fare products. On one hand, there is a desire to set prices in a way that somehow reflects the cost of providing different services. Pricing with reference to operating costs has the potential to improve economic efficiency by communicating the relative costs of different services to customers, and it has an appealing sense of fairness. On the other hand, pricing provides an opportunity for transit agencies to generate fare revenues and to serve other objectives like providing baseline accessibility to all members of the public. The different defining features of fare products allow for agencies to differentiate prices on the basis of either costs or these other objectives (and often a combination of both).

While differentiating fares based on the cost of providing service is a simple and familiar idea, the range of pricing strategies available to achieve other objectives is less familiar to public agencies. There are a number of different strategies suited for different circumstances, but the overall objective is the same – to charge different prices to different groups of customers based on characteristics of the customers (rather than the cost of providing service). The following sections describe two broad groups of pricing strategies and how they relate to fare product attributes: price discrimination and product differentiation. Names of specific strategies vary, but they are drawn in part from Phillips (2005).

Price Discrimination

Fare products can be used to charge different customers different prices for access to the exact same transportation services. This happens in two related ways.

First, a fare product could be offered to only customers who can prove that they have a certain characteristic (called group pricing). For example, a fare product for students or seniors could require an ID for purchase. The same idea applies to frequency of travel, though without any external enforcement. If an agency wishes to offer a pass product for frequent travelers (either because they have a lower average willingness-to-pay per trip or in order to build loyalty with regular customers), it will definitively separate customers into those who expect to travel more and less than the “multiple” between the single-ride fare and the pass price. (As discussed below, pass products also involve self-selection of a different kind.) While different “tariff types” such as passes and pay-as-you-go products can be used to differentiate customer groups based on their travel patterns, they also in turn affect customer travel patterns by changing the perceived or actual marginal cost of travel.
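The “multiple” mentioned above is simply the pass price divided by the single-ride fare, and it marks the ride count at which a cost-minimizing customer would switch products. The short sketch below makes the arithmetic explicit. The prices are hypothetical values chosen for illustration, and the function names are not from any agency tool.

```python
import math


def pass_multiple(pass_price, single_ride_fare):
    """Number of single rides whose total cost equals the pass price."""
    return pass_price / single_ride_fare


def cheaper_product(expected_rides, pass_price, single_ride_fare):
    """Return the lower-cost option for a customer's expected ride count."""
    pay_per_use_cost = expected_rides * single_ride_fare
    return "pass" if pay_per_use_cost > pass_price else "pay-per-use"


# Hypothetical prices, chosen only for illustration.
PASS_PRICE = 90.00
SINGLE_FARE = 2.50

multiple = pass_multiple(PASS_PRICE, SINGLE_FARE)
print(multiple)                  # 36.0
print(math.floor(multiple) + 1)  # 37: first ride count where the pass is strictly cheaper
print(cheaper_product(30, PASS_PRICE, SINGLE_FARE))  # pay-per-use
print(cheaper_product(44, PASS_PRICE, SINGLE_FARE))  # pass
```

This treats product choice as pure cost minimization over a known ride count. As the surrounding text notes, actual choices also reflect uncertainty about future travel and the convenience of each product.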

Second, a fare product could be offered only to customers who obtain it through a certain program or distribution mechanism, such as products purchased through employers using pre-tax payroll deductions (called channel pricing).

Price discrimination is the most direct and targeted pricing strategy, since it clearly splits customers based on some known characteristic. However, these strategies are also generally the hardest strategies to implement since they require some sort of group membership verification and may be perceived as unfair by the public.

Product Differentiation

Fare products can also be differentiated by type and quality of the services which can be accessed. Transportation is a highly heterogeneous good; people travel at different times, on different routes, and with different levels of speed, comfort, and convenience. Rather than charging every individual a different price based on all the particularities of their trip, transit fare products are defined by a few trip attributes and are valid on all trips with those attributes. Additionally, different purchase and validation processes for the fare products themselves make travel more or less convenient. A few examples of attributes used to differentiate the access provided by transit fare products are mode validity, distance or zone validity, time of day validity, and fare product medium (cash, ticket, card, , etc.).

One approach to product differentiation is horizontal differentiation, which distinguishes products on attributes other than objective quality. One example is differentiating fares based on geographic coverage – charging different fares for trips that have a similar level of service but use different routes or origins or destinations; for instance, a fare premium could be charged at rail stops serving an airport or on special trains that serve sporting events. Charging different fares for the same service at different times (peak and off-peak, daytime and late-night, weekday and weekend) might also reflect horizontal differentiation. This kind of differentiation allows agencies to target products to the widely varying needs and preferences of specific groups of customers and charge each group a different price.

Vertical differentiation (or product versioning) instead distinguishes products on factors that result in clear differences in quality perceived by most or all customers, such as speed or comfort in travel. This allows a seller to charge a higher price for a “premium” service to those who are willing to pay, even if the two services have a similar operating cost. (Charging only the cost differential would merely reflect cost-based pricing.)

Finally, self-selection differentiates products based on the value different customers place on convenience (specifically, charging less for products that involve some inconvenience or flexibility). For example, an agency could differentiate prices by time of day to allow customers who place a lower value on their time or have a more flexible schedule to obtain a discount for traveling during unpopular times. The value of convenience is related to (though not synonymous with) price sensitivity, so this allows agencies to charge lower prices to customers with lower willingness-to-pay. (This could be viewed as a case of vertical differentiation.) Pass products reflect a mix of both group pricing (discussed above) and self-selection based on a different kind of convenience; passes will be attractive to customers who both belong to the group of frequent travelers and are willing to tolerate the inconvenience of pre-payment in order to receive the benefits of a pass (a per-ride discount and the convenience of infrequent payment).

“Deep Discounting”

Self-selection was a pillar of a transit pricing strategy developed in the late 1980s called “deep discounting.” As set forward in Oram (1990), this strategy involves a combination of “market segmentation, fare prepayment, direct marketing and aggressive promotion” with the goal of increasing both ridership and revenue (or at least increasing one without negatively affecting the other). Per-ride fares are increased, and pre-paid, multi-ride or unlimited-use fare products are offered at a significant discount (at least 25% off) relative to the new cash or pay-per-use fare levels. Different customer segments then self-select into fare products on the basis of their frequency of transit travel and their sensitivity to cost. Price-sensitive customers choose pre-paid, discounted fare products (which increases their ridership), while price-insensitive customers may choose to continue paying per-ride fares at the higher fare level (with only modest impact on their ridership); cost increases are thereby effectively concentrated on customers with low sensitivity to price, and any ridership losses are offset or overcome by induced ridership from customers that adopt pre-payment methods (Fleishman 1996). Aggressive, targeted marketing ensures that each customer segment is aware of the relevant pre-paid, discounted fare product available to them.
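The mechanism described above can be made concrete with a small numerical sketch. It assumes a constant-elasticity ridership response, which is a common simplification and not a method specified in Oram (1990). The fares, segment sizes, and elasticities are hypothetical illustration values.

```python
def rides_after_change(base_rides, old_price, new_price, elasticity):
    """Constant-elasticity response: rides scale with (new/old) ** elasticity."""
    return base_rides * (new_price / old_price) ** elasticity


OLD_FARE = 2.00       # per-ride fare before the change (hypothetical)
NEW_CASH_FARE = 2.40  # increased pay-per-use fare (hypothetical)
PREPAID_FARE = 1.80   # per-ride prepaid price, 25% below the new cash fare

# Hypothetical segments: (base rides, elasticity, per-ride price after change)
segments = {
    "price_sensitive_prepaid": (1000, -0.7, PREPAID_FARE),
    "price_insensitive_cash": (1000, -0.2, NEW_CASH_FARE),
}

base_rides = sum(rides for rides, _, _ in segments.values())
base_revenue = base_rides * OLD_FARE

new_rides = 0.0
new_revenue = 0.0
for name, (rides, elasticity, price) in segments.items():
    seg_rides = rides_after_change(rides, OLD_FARE, price, elasticity)
    new_rides += seg_rides
    new_revenue += seg_rides * price
    print(f"{name}: {seg_rides:.0f} rides")

print(f"rides: {base_rides} -> {new_rides:.0f}")
print(f"revenue: {base_revenue:.0f} -> {new_revenue:.0f}")
```

With these particular inputs both ridership and revenue rise, which is the outcome deep discounting aims for. The result depends entirely on the assumed elasticities and on every price-sensitive customer actually switching to the prepaid product.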

This strategy offers the possibility of mitigating ridership losses following fare increases (or the possibility of increasing ridership without losing revenue), and shifting customers to pre-payment has other financial and operational benefits for transit agencies. Case studies of the impacts of deep discounting at U.S. transit agencies are reviewed in Fleishman (1996) and McCollum and Pratt (2004). Pre-payment methods such as pass products were widely adopted by transit agencies in the 1990s and 2000s, and while impacts are difficult to isolate it appears that many experienced some of the intended benefits – positive impacts on revenue with modest losses (and occasional gains) in ridership.

However, any pricing strategy based on self-selection has a few inherent drawbacks or challenges related to factors other than frequency of travel that affect customers’ fare product choices. First, deep discounting complicates the fare structure, requiring that customers be aware of and compare the different payment options available to them; this places a premium on effective marketing of fare products (Fleishman 1996). Second, customers’ fare product choices depend in part on the relative convenience of using pre-paid or pay-as-you-go options, which may assist or undermine the goals of deep discounting (McCollum and Pratt 2004); for example, if passes or multi-ride tickets are difficult to purchase, even price-sensitive customers will be less likely to switch to pre-paid options (as desired). Finally, some lower-income customers will be deterred from purchasing discounted multi-ride or pass products because of the larger up-front cost, creating regressive outcomes in which higher-income riders pay less for transit (Hickey, Lu, and Reddy 2010; Verbich and El-Geneidy 2017). These inequities are perhaps the greatest difficulty in reaping the benefits of deep discounting (or pass products in general). Options to address these inequities, such as fare capping, tend to undermine the self-selection mechanism that drives the ridership and revenue benefits of pass products and deep discounting.

In its focus on pricing of passes relative to pay-per-use fares at the MBTA and CTA, this thesis follows directly from this literature on deep discounting. Transit agencies like the MBTA and CTA that adopted a deep discounting strategy and introduced pass products in the past must decide whether to increase, sustain, or decrease discounts every time they make an incremental change in fare levels or fare structure. This thesis puts forward a framework and demonstrates tools for analyzing those incremental policy decisions in the present-day context, which includes longer-term experience with pass products and access to automated fare collection data.

Changes in Strategy

Adoption of a new pricing strategy requires some kind of change to fare levels or fare structure. For example, initial adoption of a deep discounting strategy could require introduction of a new multi-ride or unlimited-use fare product and possibly an increase in pay-per-use fare levels. McCollum and Pratt (2004) organize their review of customer responses to fare policy according to five different types of changes:

1. Change in fare levels in the same proportion for all fare products

2. Change in relative fare levels of different fare products (such as a change in price for one product while leaving other products unchanged)

3. Introduction or removal of a fare product

4. Change in “fare structure basis” (such as switching from flat fares to distance-based fares)

5. Elimination of fares (allowing free transit travel)
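The difference between the first two change types in this list can be seen in how they move the pass “multiple” (pass price divided by single-ride fare). The following sketch uses a hypothetical fare table; the product names, prices, and function names are illustrative assumptions only.

```python
# Hypothetical fare table (product -> price in dollars).
fares = {"single_ride": 2.00, "7_day_pass": 20.00, "monthly_pass": 80.00}


def proportional_change(table, pct):
    """Type 1: change every product's price by the same percentage."""
    return {k: round(v * (1 + pct), 2) for k, v in table.items()}


def relative_change(table, product, pct):
    """Type 2: change one product's price and leave the others unchanged."""
    updated = dict(table)
    updated[product] = round(table[product] * (1 + pct), 2)
    return updated


def pass_multiples(table):
    """Each pass price expressed as a number of single rides."""
    return {k: round(v / table["single_ride"], 2)
            for k, v in table.items() if k != "single_ride"}


print(pass_multiples(fares))
print(pass_multiples(proportional_change(fares, 0.10)))
print(pass_multiples(relative_change(fares, "monthly_pass", 0.10)))
```

A proportional change leaves every multiple unchanged, while a relative change to the monthly pass raises only that pass's multiple, altering its attractiveness relative to pay-per-use.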

Note that these different types of changes are mechanisms for implementing a fare strategy. They do not represent strategies on their own; one type of change could potentially be used in service of several different pricing strategies, and the particular change that is needed depends on current fare structure and levels.

3.3 Exploratory Analysis of Fare-Related Behavior Using Automated Transit Data

This section provides examples from the literature of two ways in which automated transit data can be used for descriptive and exploratory analysis related to fare policy. The first might be called “business intelligence,” or using straightforward (though possibly computationally intensive) data summaries and visualizations to inform transit agency decisions. The second is sometimes termed “data mining,” referring to use of more involved data processing techniques such as statistical modeling or machine learning algorithms. (These definitions follow the distinction described in Shmueli et al. (2017).)

3.3.1 Business Intelligence

AFC data is used regularly by transit agencies to summarize sales and ridership on different fare products. However, there has not been much published work on using AFC data to explore connections between fare products and ridership patterns. Stuntz, Attanucci, and Salvucci (2017) demonstrated how simple summaries of AFC data broken out by fare product attributes can illuminate the pricing strategies inherent in a transit agency's fare structure. That work included an interactive web-based summary tool built using the JavaScript library dc.js (built on d3.js and crossfilter.js). Click-and-drag filtering of aggregated AFC data on both ride attributes (date, time, day of week, and mode) and fare product attributes (tariff, medium, and user type) allowed for drill-down exploration of cross-sectional ridership patterns by fare product at the MBTA, as illustrated in Figure 3-3 and Figure 3-4. Lathia and Capra (2011) similarly focus on fare policy incentives and clearly visualize the relationship between fare product attributes (primarily different tariff types) and usage patterns (time and frequency of use) using AFC data from Transport for London.
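The filter-then-summarize pattern behind this kind of drill-down tool can be sketched in a few lines of Python. This is not the dc.js implementation referenced above. It is a stand-alone illustration using only the standard library, and the record layout, field names, and sample data are hypothetical.

```python
from collections import Counter

# Hypothetical validation records: (hour of day, day type, tariff, mode).
validations = [
    (8, "weekday", "monthly_pass", "subway"),
    (8, "weekday", "monthly_pass", "bus"),
    (9, "weekday", "pay_per_use", "subway"),
    (17, "weekday", "monthly_pass", "subway"),
    (13, "weekend", "7_day_pass", "bus"),
    (14, "weekend", "pay_per_use", "subway"),
    (8, "weekday", "7_day_pass", "bus"),
]

FIELDS = ("hour", "day_type", "tariff", "mode")


def summarize(records, group_by, **filters):
    """Count validations by one field after filtering on any other fields."""
    counts = Counter()
    idx = FIELDS.index(group_by)
    for rec in records:
        row = dict(zip(FIELDS, rec))
        if all(row[key] == value for key, value in filters.items()):
            counts[rec[idx]] += 1
    return counts


# Weekday validations broken out by tariff type.
by_tariff = summarize(validations, "tariff", day_type="weekday")
print(dict(by_tariff))

# Time-of-day profile for monthly pass validations only.
by_hour = summarize(validations, "hour", tariff="monthly_pass")
print(dict(by_hour))
```

Each call corresponds to one filter selection in an interactive dashboard: a filter is applied on some attributes and the remaining validations are counted along another.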

Figure 3-3: MBTA Bus and Subway Validations Dashboard, October 2015

Source: MBTA AFC

Figure 3-4: Time Distribution of MBTA Bus and Subway Validations by Time Period and Pass Type, October 2015

[Figure panels: 7-day Pass, Monthly Pass; Weekday, Weekend]

Source: MBTA AFC

There are also few descriptive or exploratory studies taking advantage of the longitudinal nature of AFC data – the traceability of transit cards or accounts over time. In Chan et al. (2018), CTA staff analyzed a net decline in ridership from May 2016 to May 2017 by tracing individual accounts in CTA AFC data. They segmented AFC accounts on frequency of ridership in May of each year and then described how both cards and total ridership shifted across segments between the two years. The ability to trace transit accounts over time allowed them to attribute the majority of the net change in ridership to a decline in ridership frequency for returning customers (still active in the AFC data, but at lower ridership levels) rather than customers leaving the system entirely (which would have resulted in more cards disappearing completely from the AFC data). While this was not a direct application to fare policy, it is similar in its timeframe and raises general issues that apply to fare policy analysis. Like ridership trends, fare policy trends and changes often need to be evaluated over long periods of time (multiple years) rather than using cross-sections or short periods; however, long-term analyses that take advantage of longitudinal, card-level AFC data are complicated by computational demands and the need to carefully account for turnover in cards. (The authors found that 70% of full fare cards at the CTA are used for 12 months or less.) These issues are encountered in Chapter 6, where similar attention is paid to card traceability and turnover. The authors also briefly discuss potential application of longitudinal AFC data to analysis of fare products and sale channels, finding that customer fare product choices and ridership patterns differ depending on the sale channels used to make purchases. These results are discussed in more detail in Chapter 5.

3.3.2 Data Mining

For applications that relate or connect to fare policy, the primary data mining technique used in recent studies is clustering algorithms. These are “unsupervised” machine learning techniques, meaning that there is no particular outcome or class that is being predicted. Rather, given a set of data points each with a set of attributes/variables/features, these algorithms use some measure of similarity to assign the data points to groups (such that data within any group is all similar in some sense). (See Anil K. Jain (2010) or A. K. Jain, Murty, and Flynn (1999) for a review of clustering techniques.)

The primary usefulness of clustering for fare policy analysis is customer segmentation. In the following sections, customer segmentation is defined and examples of cluster-based segmentation are described.

Customer Segmentation

Customer segmentation is one particularly important application of AFC data to fare policy analysis (and other planning issues), but its meaning is somewhat nuanced. In a handbook for transit market segmentation, TCRP 36 defines market segmentation as “the identification of groups of customers – or market segments – that have similarities in characteristics or similarities in needs who are likely to exhibit similar purchase behavior and/or responses to changes in the marketing mix” (Elmore-Yalch 1998). Mechanically, market segments could be any groups of customers, and the groups could be defined by any method whatsoever; segmentation is not in itself a method of analysis. However, as the TCRP definition hints, segmentation is meant to identify not just any arbitrary or convenient set of groups but rather meaningful groups.

What exactly does that mean? Market segments should be groups for which you expect to see behavioral differences in a particular application or area of study. Segments, then, are defined by characteristics that are observed, but they are also expected to behave differently in other respects. For example, imagine a transit agency is considering changing fares and wishes to understand the potential impacts on customer behavior. Grouping customers by mode of travel (rail riders and bus riders) would be a segmentation if those two groups were expected to have different sensitivities to fare changes. By contrast, grouping customers by eye color or birth month would not be a segmentation, because there is no reason to believe that the groups are meaningfully different in their sensitivity to fare policy changes.

How, then, is a market segmentation developed? TCRP 36 describes two approaches to transitioning from many groups of riders (e.g. in summaries and classifications) to a limited number of rider segments used to determine strategy (Elmore-Yalch 1998). One approach is to use a priori assumptions about the behavioral differences between groups. This is generally easiest, but has some obvious drawbacks: it depends on good judgment and luck, and it does not necessarily respond to changes in the market over time. The other approach is post hoc segmentation, which uses empirical information (survey or otherwise) to indicate groups that are behaviorally different (across groups). Continuing the example above, if a prior fare change were studied, it might be observed that rail riders did in fact respond differently than bus riders to the change. As mentioned above, market segmentation is not a method of analysis, but actually the result of an analysis; for a priori segmentation that analysis is based on prior knowledge and reasoning, and under post hoc segmentation that analysis is more empirical or data-driven.

After being defined, a market segmentation becomes a tool for further descriptive analysis as well as for modeling. Measures of behavior might be summarized or tabulated separately for each market segment; for example, Hickey, Lu, and Reddy (2010) use transit service geography as a proxy for rider minority and low-income status to segment a descriptive analysis of fare product choices and fare change impacts at the NYCTA. Similarly, market segments might be distinguished in prediction models so that their predicted behaviors are allowed to differ from each other, such as differentiation of pass-holders and pay-per-use riders in the MBTA’s elasticity spreadsheet model for evaluating fare change scenarios (Andrews and Demchur 2016). In both cases, segmentation is often most useful when there is a small or limited number of market segments, allowing for information to be disaggregated in a meaningful way but still absorbed by an analyst or policy maker.

Cluster-Based Segmentation

Having clarified the meaning and purpose of segmentation, the specific role and value of clustering techniques for segmentation is evident. There is no guarantee that the grouping generated by a clustering algorithm is meaningful for a particular application, so clustering methods do not automatically produce segmentations. However, clustering does create a new dimension of information (cluster membership) that differentiates customers in a new and succinct way. Clusters then simply become another option for defining a segmentation, either a priori or post hoc.

Basu (2018), the most rigorous application of clustering to AFC data to date, uses k-means clustering as a tool to facilitate customer segmentation and eventually personalized marketing and communication with transit users. He describes six steps in using AFC data for cluster-based segmentation of transit users on temporal and spatial ridership patterns:

1. AFC data (origin, destination, and time)
2. Feature creation (temporal trip intensity and spatial coverage)
3. Data treatment (dimension reduction via principal component analysis and scaling)
4. Clustering
5. Validation (usability and generalizability)
6. Production

Feature creation, data treatment, and validation are identified as the highest-effort steps in the process. He uses k-means++ clustering to develop short-term spatial and temporal segments oriented toward applications in personalized information provision as well as long-term segments suitable for event analyses. Short-term segments are based on the concentration of recent trips in different zones and at different times of day, while long-term segments are based on measures of trip frequency, various spatial features, and first and last trip times for users who traveled at least half of the weekdays in the previous month. Basu’s long-term clustering is most relevant to fare policy analysis. He finds eight stable, long-term segments of transit users that vary in their trip frequency and spatial and temporal travel characteristics. These segments concisely differentiate long-distance travelers, infrequent travelers, transit commuters, evening riders, and other groups. These differences capture variation in travel frequency and are clearly correlated with different trip purposes and the relative convenience and cost of alternative travel modes. As a result, the segments could be expected to have different preferences over fare product options and different sensitivities to average and marginal costs; this would likely make them useful for fare policy analysis. Before-and-after fare change evaluations or fare product choice models could be segmented to estimate different behavioral parameters for the different segments, and these parameters could be used to segment fare policy scenario modeling. Fare policy analyses are often segmented by fare product, but this tends to lump together very large groups of transit users or trips (such as all pay-per-use riders); cluster-based segmentation could be used to divide and differentiate customers within these groups, who might respond differently to fare changes.
For example, the Appendix presents initial clustering of users at the MBTA based on Basu’s long-term temporal features; while resulting cluster membership is strongly correlated with fare product choice, the clusters still differentiate groups of users within each fare product. This initial analysis at the MBTA supports the promise of cluster-based segmentation for fare policy analysis at any agency with AFC data.
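As a rough illustration of the scaling and clustering steps described above, the following sketch standardizes a small set of card-level features and groups the cards with k-means using k-means++ seeding. It is a minimal sketch, not Basu's implementation: the two features, the six example cards, and the function names are hypothetical, the principal component step is omitted, and the code assumes the data contain at least k distinct points.

```python
import math
import random


def standardize(rows):
    """Scale each feature to zero mean and unit variance (constant columns map to 0)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp(points, k, seed=0, max_iter=100):
    """Return a cluster label for each point (k-means with k-means++ seeding)."""
    rng = random.Random(seed)
    # Seeding: the first center is drawn uniformly; each later center is drawn
    # with probability proportional to its squared distance to the nearest
    # center chosen so far.
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(v) / len(members) for v in zip(*members)]
    return labels


# Hypothetical card-level features: [trips per month, share of trips in AM peak]
cards = [
    [42, 0.80], [40, 0.75], [44, 0.85],  # frequent, peak-oriented cards
    [4, 0.10], [6, 0.20], [5, 0.15],     # infrequent, off-peak cards
]
labels = kmeans_pp(standardize(cards), k=2)
```

Cluster labels produced this way carry no meaning on their own; as discussed above, they become a segmentation only if the resulting groups are expected (or shown) to behave differently with respect to fares.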

Other recent studies using clustering of AFC data and AFC-based travel metrics include Goulet-Langlois (2015), Halvorsen (2015), Ortega-Tong (2013), and Kieu, Bhaskar, and Chung (2014). (An earlier study, Krizek and El-Geneidy (2007), clustered transit users on travel characteristics, but used survey data rather than AFC data.) Ortega-Tong (2013) and Kieu, Bhaskar, and Chung (2014) both discuss patterns in fare product use across identified clusters. Halvorsen (2015) characterizes the resulting clusters as customer types and uses the clusters to segment an analysis of the Early Bird fare discount pilot in Hong Kong. Goulet-Langlois (2015) uses transit fare class (group pricing) only as a proxy for demographic information.

3.4 Demand Modeling and Fare Scenario Prediction

The literature on travel demand modeling is vast. This section focuses on demand modeling approaches and techniques that have been used to analyze transit fare change scenarios.

As will be discussed more in Chapter 4, scenario modeling requires both estimation and prediction – estimation of behavioral parameters, and a procedure to apply those parameters to a specific baseline and scenario. In some cases estimation and prediction are neatly integrated. More often, prediction is distinct from estimation and depends on assumptions or estimated parameters as exogenous inputs. Any particular study or technique might only address estimation or prediction (but not both); for example, a study may estimate own-price elasticity for a specific transit fare product without providing a prediction procedure that could account for changes in fares for other products or other modes.

This section begins by reviewing different prediction procedures and the parameters that they depend on to predict the ridership and revenue impacts of transit fare changes. It then turns to estimation of the different behavioral parameters and the specific modeling approaches used at the MBTA and the CTA.

3.4.1 Prediction Procedures

Zureiqat (2008) identifies several types of models that have been used for fare policy analysis, which are rearranged slightly here into four general prediction procedures:

1. Four-step and mode choice models
2. Multiplicative “spreadsheet models”
3. Ticket type choice models
4. Discrete-continuous models

These approaches are discussed below, with particular attention to the varying level of abstraction used to represent transit fare structure and individual behavior. What are the individual behaviors that matter for fare changes, and how might a prediction procedure abstract away from them? Consider a natural transit agency question about a potential fare change: What will be the impact on ridership or sales for a particular fare product? There are three (simultaneously-determined) individual behavior changes in response to a fare change that would affect product-level ridership or sales:

1. Change in total travel demand, or “induced ridership.” An individual may take fewer or more total trips.
2. Change in mode choice. An individual may switch their mode of travel for certain trips to or from transit.
3. Change in fare product choice. An individual may switch the transit fare product that they use for their transit travel.

Modeling all three behaviors explicitly would be complicated, requiring simultaneous consideration of all modes and all fare product options for individual decision makers. In order to avoid explicitly predicting mode choices, it is common to use elasticities and other parameters that describe the marginal impact of a fare change on transit alone, assuming that the attributes of all other modes are held constant (i.e. given the current cost, convenience, etc. of using other modes). This can capture the aggregate effect of induced ridership and mode switching in a single parameter; however, if in reality the characteristics of other modes are not constant but are changing over time (e.g. if the price of ride-hailing services is falling), then either this parameter would need to be updated or additional parameters would be needed (such as a cross-price elasticity with respect to the price of ride-hailing).

In order to further avoid explicitly predicting fare product choices, one could similarly use parameters like own-price elasticities that describe the impact of raising a fare product’s price holding the prices of all other fare products constant; this could capture the impact of both mode choice (people switching between transit and non-transit modes) and fare product choice (people switching between transit fare products) in a single parameter. However, this is rarely desirable for transit fare policy scenario analysis, since multiple fare product prices are typically changed at the same time. An alternative and more common simplification is to use parameters like conditional elasticities that hold fare product choices constant (rather than holding the attributes of all fare products constant); these parameters reflect the marginal impact of mode switching and induced demand, but assume no switching between fare products. This more closely resembles many actual fare change scenarios where prices of alternative fare product options are changed by similar proportions, in a way that is unlikely to affect fare product choice. However, if the price of an alternative fare product were to change in a different proportion, a conditional elasticity would not predict the resulting impact of switching between fare products; additional parameters would once again be needed.

Four-Step and Mode Choice Models

As noted in Zureiqat (2008), traditional 4-step transportation planning models are much broader in scope than transit ridership and revenues (and certainly transit fare policy). These models are typically calibrated to regional travel survey data and aim to predict an aggregate cross-section of all travel demand in a region using four steps: trip generation, trip distribution, modal split, and route assignment. Four-step models explicitly represent mode choice (both transit and non-transit modes) typically using logit mode choice utility parameters. Four-step models can also include factors that allow for manipulation of “external factors” like regional demographics.

Four-step or mode choice model calculations are typically performed using highly aggregated trip data for all modes and fairly abstract representations of transit systems. As a result, the inputs, methodologies, and results are usually not at a scale or level of specificity that is relevant to specific transit policy questions (Boyle 2006). Four-step and mode choice models often use crude simplifications of transit fare structure, such as a single average fare for each transit mode. This still allows for high-level evaluation of total transit ridership in a multi-modal context, but it has several weaknesses for transit fare policy: the models cannot capture differences in fare sensitivity and fare change impacts for different transit fare products or market segments, and they can generate misleading revenue predictions if relative product prices or transit fare structure were to change (Wardman and Toner 2003).

Four-step models could be useful for fare policy evaluation in certain contexts, however. As part of its transit fare integration planning effort in the Toronto region, consultants at Steer Davies Gleave working with Metrolinx effectively extracted the mode choice step from Toronto’s regional 4-step travel demand model and added a more granular representation of different transit operators and services in the region (Steer Davies Gleave 2017). However, due to the aggregate nature of the model, they were unable to represent transfer pricing, and they used average fares for each operator/service; they note that their future work should include a fare product choice model. McElroy (2009) also used a regional household travel survey in Toronto to estimate a nested logit mode choice model, but added a “pass” or “no pass” choice within the transit nest. He notes that the significance of auto ownership in explaining use of a transit pass varies by geography, suggesting that transit passes are complements to car ownership for some individuals and substitutes for others.

Instead of applying a four-step model, Sound Transit uses an incremental logit ridership forecasting methodology that starts from detailed baseline transit ridership information and then borrows logit mode choice model coefficients from the regional four-step travel demand model to incrementally adjust transit ridership under alternative scenarios (Sound Transit 2015). (In contrast to this “incremental” approach, standard multinomial and nested logit models are “synthetic” in that they generate predictions for both the baseline and a scenario; the predicted baseline is a non-intuitive idea and presents challenges in matching the observed baseline for detailed segments. For more on the incremental logit model as a solution to these problems, see e.g. Kumar (1980), Koppelman (1983), and Bates, Ashley, and Hyman (1987).) This approach allows for more realism in the representation of transit fare structure, though the Sound Transit model appears to be used primarily to evaluate potential new transit services (in which case fares are held constant across scenarios).
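The pivoting idea behind an incremental logit model can be sketched in a few lines. The example below is an illustration only and is not Sound Transit's procedure: it applies the standard incremental (pivot-point) logit formula, in which each observed baseline share is reweighted by the exponential of the change in that alternative's utility. The mode shares, the fare coefficient, and the fare change are hypothetical values.

```python
import math


def incremental_logit(base_shares, delta_utilities):
    """Pivot observed baseline shares using changes in utility.

    P'_j = P_j * exp(dV_j) / sum_k [ P_k * exp(dV_k) ]
    Alternatives missing from delta_utilities are treated as unchanged.
    """
    weighted = {alt: share * math.exp(delta_utilities.get(alt, 0.0))
                for alt, share in base_shares.items()}
    total = sum(weighted.values())
    return {alt: w / total for alt, w in weighted.items()}


# Hypothetical observed baseline mode shares for one market segment.
base = {"transit": 0.30, "auto": 0.60, "other": 0.10}

# Hypothetical fare coefficient (utility per dollar), assumed to be borrowed
# from a regional mode choice model, and a hypothetical fare increase.
FARE_COEF = -0.4
fare_increase = 0.25  # dollars

scenario = incremental_logit(base, {"transit": FARE_COEF * fare_increase})
```

Because the calculation starts from observed shares, a scenario with no utility changes reproduces the baseline exactly, which is the property that distinguishes the incremental form from a synthetic one.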

Multiplicative “Spreadsheet Models”

So-called “spreadsheet models” that predict ridership and revenue using multiplicative factors are a straightforward and widely-used approach to modeling changes in transit fares. Borrowing terminology from mode choice modeling, spreadsheet models can also be divided into “incremental” procedures that predict changes in demand from an observed baseline and “synthetic” procedures that predict levels of demand (both in the baseline and in scenarios).

“Incremental” Spreadsheet Models

The incremental approach is usually called an elasticity spreadsheet model. A price elasticity approximately represents the percent change in demand resulting from a one percent increase in price (see footnote 14). Changes in ridership for each segment of trips or customers ($i$) are calculated directly based on the change in fare and the fare elasticity that apply to the segment. Scenario revenue for each segment is simply predicted ridership for the segment multiplied by the applicable fare (or average fare) for the segment.

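\[
\mathit{Ridership}_{Scn,i} - \mathit{Ridership}_{Base,i} = \mathit{Ridership}_{Base,i} \times \left( \frac{\mathit{fare}_{Scn,i} - \mathit{fare}_{Base,i}}{\mathit{fare}_{Base,i}} \right) \times \mathit{elasticity}_i
\]

\[
\mathit{Revenue}_{Scn,i} = \mathit{Ridership}_{Scn,i} \times \mathit{fare}_{Scn,i}
\]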

Elasticity spreadsheet models are highly flexible in representing details of transit fare structures and different segments of ridership. If transit ridership or sales totals are available at the level of specific fare rules (as they typically are with AFC data), then the appropriate change in fare can be applied to each group of rides separately. Different elasticities can also be applied flexibly to different segments. Data on non-transit modes is not needed.
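The incremental calculation can be illustrated with a short sketch. This is not any agency's actual model: the segment names, ridership totals, fares, and elasticities below are hypothetical, and each segment is treated independently (no switching between fare products).

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    base_ridership: float  # observed baseline rides
    base_fare: float       # baseline fare (or average fare) per ride
    scenario_fare: float   # proposed fare per ride
    elasticity: float      # fare elasticity assumed for this segment


def predict(segment):
    """Return (scenario ridership, scenario revenue) for one segment."""
    pct_fare_change = (segment.scenario_fare - segment.base_fare) / segment.base_fare
    ridership = segment.base_ridership * (1 + pct_fare_change * segment.elasticity)
    revenue = ridership * segment.scenario_fare
    return ridership, revenue


# Hypothetical segments and parameters, for illustration only.
segments = [
    Segment("bus pay-per-use", 1_000_000, 2.00, 2.20, -0.25),
    Segment("rail pay-per-use", 2_000_000, 2.50, 2.75, -0.30),
]

results = {s.name: predict(s) for s in segments}
total_revenue = sum(revenue for _, revenue in results.values())
```

In the first hypothetical segment, a 10% fare increase with an elasticity of -0.25 reduces ridership by 2.5%, while revenue rises because the higher fare more than offsets the lost rides.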

However, this simplicity and flexibility come with two main disadvantages. First, elasticities do not capture the impact of changes in other modes (such as increasing gas prices or decreasing ride-hailing fares). Spreadsheet models can account for these changes by introducing cross elasticities, which would measure the sensitivity of transit fare product demand to a change in a particular attribute of another mode (such as price); however, these parameters should ideally be updated with each change and may be difficult to estimate. Additionally, they might be inaccurate depending on the relative values in a specific scenario; for example, a reduction in ride-hailing fares from above bus fares to below bus fares would have a disproportionately larger impact on mode choice than a smaller reduction that still left ride-hailing fares significantly above bus fares.

14 For example, an elasticity of -0.5 means that a 1% increase in price would produce a -0.5% change in demand.


Second, elasticities used in spreadsheet models are conditional on fare product choice, meaning that they do not capture any impacts of fare product switching. This could also in theory be corrected by introducing cross-price elasticities or diversion factors. Similar to elasticities, however, the correct values for these parameters would depend on the relative price levels of all “competing” fare products before and after a fare change; for example, if an agency charges higher fares for cash than for smart cards, a fare change that moves the cash fare closer to the card fare would have a smaller proportional impact on cash use than a fare change that moves the cash fare lower than the card fare. Parameters to approximate switching between passes and pay-per-use are further complicated by the relationship between pass prices and pay-per-use fares; the prices and fares cannot be compared directly to see which is more or less expensive, since the expense of using pay-per-use depends on a customer’s transit ridership level.

In practice, incremental elasticity spreadsheet models have been used for fare policy analysis at many agencies. While not focused exclusively on fare policy analysis, Boyle (2006) surveyed 36 transit agencies in the U.S. and found that elasticity spreadsheet models were the most common quantitative method for forecasting the ridership impacts of service and fare changes. In some cases, efforts have been made to address the disadvantages of these models described above. Elasticity spreadsheet models at SEPTA, NYCTA, and the MBTA (discussed at the end of this section) all incorporate diversion factors to capture switching between different fare products when the ratio of their prices changes (Hickey 2005; Andrews and Demchur 2016). Transport for London has also used elasticity spreadsheet models for bus and Underground, accounting for switching between different ticket types using cross-price elasticities. These TfL models are described in Zureiqat (2008). Another recent example is Borjian, Schabas, and Segal (2017), which develops an elasticity spreadsheet methodology to analyze fare integration in the Toronto metropolitan area. The authors acknowledge that different origin-destination markets in the Toronto area will likely have different sensitivities to fare integration based on how competitive transit is with other modes. Instead of estimating different elasticities for each market or directly modeling mode choice, the authors note that the fare elasticity of the probability of an individual choosing to use transit with respect to transit fare in a logit mode choice model is linear in both 1) the baseline fare level and 2) the probability that the individual will not choose transit at baseline fare levels. Based on this observation, they scale elasticities for each market based on the fare level and transit mode share in the market, relative to region-wide averages.
This allows fare elasticity to vary across different fare levels and different levels of transit competitiveness (while still maintaining the same region-wide average peak and off-peak elasticities).
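The linearity noted above follows from the standard logit form. As a brief sketch, assume the utility of transit $V_T$ is linear in the fare with coefficient $\beta_{fare}$ (the symbols here are introduced for illustration and are not taken from the cited study):

\[
P_T = \frac{e^{V_T}}{\sum_k e^{V_k}}, \qquad \frac{\partial P_T}{\partial \mathit{fare}} = \beta_{fare} \, P_T \, (1 - P_T)
\]

\[
\varepsilon_T = \frac{\partial P_T}{\partial \mathit{fare}} \cdot \frac{\mathit{fare}}{P_T} = \beta_{fare} \cdot \mathit{fare} \cdot (1 - P_T)
\]

With $\beta_{fare}$ fixed, the elasticity is proportional to the fare and to the probability of not choosing transit. One scaling consistent with this result, shown only as an example and not necessarily the authors' exact normalization, sets the elasticity in market $m$ to

\[
\varepsilon_m = \bar{\varepsilon} \times \frac{\mathit{fare}_m \, (1 - s_m)}{\overline{\mathit{fare}} \, (1 - \bar{s})},
\]

where $s_m$ is the transit mode share in market $m$ and bars denote region-wide averages.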

“Synthetic” Spreadsheet Models

“Synthetic” (i.e. non-incremental) spreadsheet models are typically time series models of demand, often taking the following form:

\[
\log(\mathit{demand}_{it}) = \alpha_i + \mathbf{x}'_{it}\boldsymbol{\beta} + \gamma \, \mathit{fare}_{it},
\]

where

$\mathit{demand}_{it}$ = ridership or sales in market $i$ (e.g. a particular fare product or trip type) at time $t$

$\mathit{fare}_{it}$ = the transit fare in market $i$ at time $t$

$\mathbf{x}'_{it}$ = other drivers of demand

$\alpha_i$ = a market-specific constant term representing average demand not explained by fares and the other drivers of demand included in the formula.

These formulas calculate demand levels for both the present and a potential future scenario based on current and assumed future values of fare levels and other modeled drivers of demand. Synthetic models require at least one additional parameter relative to incremental models – some form of catch-all constant term like $\alpha_i$ – and in practice they often include a variety of other control variables (each with a coefficient parameter). As a result, these formulas must be estimated using aggregate time series regression models in the same form as the prediction formula.
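To make the contrast with the incremental approach concrete, the sketch below evaluates a log-linear demand formula of this form at baseline and scenario fares. All coefficients and driver values are hypothetical placeholders rather than estimates from any agency; in practice they would come from a time series regression.

```python
import math


def predict_demand(alpha, beta, x, gamma, fare):
    """Demand level implied by log(demand) = alpha + x'beta + gamma * fare."""
    log_demand = alpha + sum(b * v for b, v in zip(beta, x)) + gamma * fare
    return math.exp(log_demand)


# Hypothetical coefficients for a single market.
alpha = 11.0          # market-specific constant
beta = [0.8, 0.05]    # coefficients on the other drivers of demand
gamma = -0.15         # fare coefficient (per dollar of fare)

# Hypothetical drivers: [employment index, gas price in dollars per gallon].
x = [1.0, 3.0]

baseline = predict_demand(alpha, beta, x, gamma, fare=2.00)
scenario = predict_demand(alpha, beta, x, gamma, fare=2.25)

# With the other drivers held fixed, the proportional change depends only on
# gamma and the fare change, so the constant and controls cancel out.
pct_change = scenario / baseline - 1

# In this semi-log form the point elasticity at a given fare is gamma * fare.
elasticity_at_baseline = gamma * 2.00
```

Both the baseline and the scenario are predicted here, which is what makes the model synthetic: the predicted baseline need not match observed ridership unless the formula fits well.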

This use of the same formula and the same variables or features for estimation and prediction has some advantages. There is no need to find and rely on estimates from prior studies (which may be in different contexts). Additionally, factors other than fares can (and should) be included in the analysis. This might be especially useful for scenarios involving multiple future policy changes (such as increased fares and improved service), and inclusion of variables related to “competing” modes (such as parking costs or roadway congestion) can control for changes in those modes. It also makes predictions for specific future time periods. This allows adjustment for any included “external” factors that affect both baseline and scenario demand (such as the effect of gas prices or population growth), and lagged fares could be included in the formula to differentiate short-term and long-term impacts of fare changes.

However, it has some drawbacks. A separate formula must be estimated for each “market” (group of customers or trips), which typically limits the level of analysis to a few large markets such as major fare products or transit modes. This makes it difficult to reflect the complexities of actual fare structures and often results in undesirable approximations; for example, to avoid estimating separate models for different fare products, average fare level per trip is often used as an explanatory variable (for all trips and products within a given market). Similar to incremental elasticity models, it is also difficult to adequately account for switching between fare products (though cross-price elasticities can be estimated and applied to each market to mitigate this problem). Additionally, the use of long-term time series to estimate fare parameters means that predictions will be slow to adjust to substantial changes in customer preferences or introduction of new services over time (such as growth in ride hailing). The need for integrated estimation also has downsides, and it is not always possible since it requires sufficient variation in fares and consistent data on control variables over an extended period of time (or across different agencies).

One example of a “synthetic” spreadsheet model is the model used in recent decades by WMATA in D.C. to forecast ridership and revenue under both fare and service changes. WMATA’s time series regression model was originally developed in 1998 and updated over time by Cambridge Systematics (most recently in 2010) (WMATA 2009; Cambridge Systematics 2010). Four formulas were developed and applied – weekday bus, weekend bus, weekday rail, and weekend rail – each including variables related to demographics, tourism, fares and service, external factors, and seasonality. WMATA reported that the model’s accuracy began to decline in 2013 and speculated that the model formulas based on long-term historical patterns were not adequately accounting for more recent changes (WMATA 2016). Note that aggregate time series models are often used merely to estimate fare elasticities for an incremental elasticity spreadsheet model. One example of this approach is the CTA’s fare analysis in 2012, described at the end of this chapter.


It is also possible to use a differenced version of the demand formula above to calculate changes in demand based on changes in fares and other factors:

\[
\Delta \log(\mathit{demand}_{it}) = \Delta\mathbf{x}'_{it}\boldsymbol{\beta} + \gamma \, \Delta \mathit{fare}_{it}
\]

This may be a useful transformation for addressing non-stationarity in demand, which can lead to estimation of spurious correlations. As with the non-differenced formula, this type of time series model can either be used for direct “synthetic” forecasting or can be used to estimate elasticities for use in an “incremental” spreadsheet model. Transport for London (TfL) has used this type of formula to estimate models of fare revenue, with annual differencing of monthly observations of “fare indices” and control variables. In Jain (2011), two formulas are estimated – one for bus and one for Underground – that make use of multiple own-mode price variables and other-mode price variables to estimate several different kinds of fare elasticities (short- and long-term own-fare, cross-mode fare, and conditional) using a single time series model. The resulting formulas were used to “predict” the impacts of past fare policy changes (controlling for other factors).
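The differenced form can be obtained by subtracting the level formula at an earlier period from the formula at time $t$. As a short worked step, with the lag $s$ introduced here for illustration (for example, $s = 12$ for annual differencing of monthly observations):

\[
\log(\mathit{demand}_{it}) - \log(\mathit{demand}_{i,t-s}) = (\alpha_i - \alpha_i) + (\mathbf{x}_{it} - \mathbf{x}_{i,t-s})'\boldsymbol{\beta} + \gamma \, (\mathit{fare}_{it} - \mathit{fare}_{i,t-s})
\]

The market-specific constant cancels, so the differenced model explains changes in log demand using only changes in fares and in the other drivers, with the same coefficients $\boldsymbol{\beta}$ and $\gamma$ as the level formula.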

Ticket Type Choice Models

Fare product or ticket type choice models use characteristics of fare products and transit customers to predict the market share of different fare products (within transit modes). Choice models have as their level of observation an individual decision-maker. Under a basic multinomial logit formulation, a separate formula for each fare product $j$ is used to calculate the utility of that product to individual $i$, and the resulting utilities ($V_{ij}$) are combined to calculate probabilities that the individual will choose any particular fare product:

\[
V_{ij} = \alpha_j + \mathbf{x}'_{i}\boldsymbol{\beta}_{j}
\]

\[
P_i(j) = \frac{e^{V_{ij}}}{\sum_{k=1}^{J} e^{V_{ik}}}
\]

For the purpose of fare policy analysis, the most important characteristic of fare products (in $\mathbf{x}'_{i}\boldsymbol{\beta}_{j}$) is typically some form of product price or travel cost. Constant terms (such as $\alpha_j$) capture inherent preference for one product over another (assuming cost and other modeled attributes were constant). Segment-level or system-wide product market shares are aggregated from individual choices.

This kind of choice model allows detailed representations of an agency’s fare products; attributes of a fare product can be parameterized, or their effect can be lumped into alternative-specific constants or nest scale factors. In contrast to mode choice models that do not include fare product choice at all and multiplicative spreadsheet models that can only include average marginal tendency to shift between fare products, choice models also provide a relatively realistic representation of actual individual choice behavior. Customers are assigned different product choice probabilities based on the relative attractiveness of each fare product (even in scenarios where the relative prices of products are changing in complex ways).

Fare product choice models have several limitations, however. In order to represent fare product choice with a level of realism and flexibility, they must be estimated with and applied to individual-level data that includes transit travel demand over some relevant choice horizon (such as a week or a month). This is essential because customers consider the total cost of fare products when they make a fare product choice; the total cost of a pass is simply the pass price, but the total cost of a pay-per-use option depends on how many rides a customer expects to take (which varies across individuals). In the past, this required passenger surveys; however, as demonstrated in Zureiqat (2008) and Chapter 6 of this thesis, AFC data at most large transit agencies is also a source of this individual-level information. Another downside to fare product choice models is that they do not capture the effect of fare changes on mode choice and induced ridership. Under a product choice model, individuals are reallocated among fare products but never join or leave transit entirely, and transit travel demand is not adjusted based on the product that is selected. (In reality, customers are expected to ride more frequently if they select a zero-marginal-cost pass than if they were to ride pay-per-use.)
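The role of expected ridership in the total cost comparison can be seen in a small multinomial logit sketch with two products. This is an illustration under stated assumptions, not an estimated model: the pass price, pay-per-use fare, cost coefficient, and pass constant are hypothetical, and expected rides are taken as fixed (consistent with the limitation described above).

```python
import math


def choice_probabilities(utilities):
    """Multinomial logit: P(j) = exp(V_j) / sum_k exp(V_k)."""
    v_max = max(utilities.values())  # subtract the max for numerical stability
    exp_v = {j: math.exp(v - v_max) for j, v in utilities.items()}
    total = sum(exp_v.values())
    return {j: e / total for j, e in exp_v.items()}


def product_utilities(expected_rides, pass_price, ride_fare, cost_coef, pass_constant):
    """Utility of each product based on its total cost over the choice horizon."""
    return {
        # The total cost of a pass is simply its price.
        "pass": pass_constant + cost_coef * pass_price,
        # The total cost of pay-per-use depends on expected rides.
        "pay_per_use": cost_coef * ride_fare * expected_rides,
    }


# Hypothetical fare structure and parameters.
PASS_PRICE = 80.0    # monthly pass price in dollars
RIDE_FARE = 2.0      # pay-per-use fare in dollars
COST_COEF = -0.1     # utility per dollar of total cost
PASS_CONSTANT = 0.0  # alternative-specific constant for the pass

light = choice_probabilities(
    product_utilities(20, PASS_PRICE, RIDE_FARE, COST_COEF, PASS_CONSTANT))
even = choice_probabilities(
    product_utilities(40, PASS_PRICE, RIDE_FARE, COST_COEF, PASS_CONSTANT))
heavy = choice_probabilities(
    product_utilities(60, PASS_PRICE, RIDE_FARE, COST_COEF, PASS_CONSTANT))
```

With these placeholder values the two products cost the same at 40 rides per month, so a customer at that ridership level is equally likely to choose either product, while customers below or above it lean toward pay-per-use or the pass respectively. System-wide product shares would be obtained by averaging such probabilities over all individuals in the data.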

Transit fare product choice models are rare in practice. Zureiqat (2008) describes three relevant academic studies using heteroskedastic extreme value logit models, all from the late 1990s (Hensher and King 1998; Hensher 1998; Taplin, Hensher, and Smith 1999). There are only a handful of historical examples of actively-used U.S. transit system ticket type choice models – a Dallas Area Rapid Transit (DART) model and a Chicago Transit Authority (CTA) model, both estimated using survey data (Fleishman 1996). The CTA choice model is part of a larger discrete-continuous prediction procedure, mentioned in the next section and described in detail at the end of this chapter. Brakewood and Kocur (2011) estimate fare-related choice models at Transport for London (TfL) and the CTA, but their models are focused on choices between fare media rather than choices between different payment structures (with different associated costs).

Antos and Eichler (2016) used a less formal approach to predict adoption and revenue from potential new pass options at WMATA in D.C. The authors conducted a customer survey that included measures of travel frequency and stated interest in purchasing a new pass. The stated interest in the survey was converted directly into estimated choice probabilities for customers with different travel profiles, and these probabilities were then applied to all users in the system’s AFC data.

Discrete-Continuous Models

Discrete-continuous models combine prediction of discrete fare product choices (the focus of ticket type choice models described above) with prediction of continuous ridership or sales frequency (the focus of multiplicative spreadsheet models and, indirectly, mode choice models). The motivation is to combine the primary benefits of these two different modeling approaches: Fare product choice models are able to account for fare-change-induced switching between fare products more effectively than cross-elasticities or diversion factors in “continuous” elasticity spreadsheets and linear regression models, while continuous models can additionally capture the effect of mode choice and trip generation (through elasticities). Combining the two also allows agencies to model the relationship between fare product choice and ridership frequency, which is easily overlooked in a choice-only or elasticity-only model. Different fare product tariffs or payment structures result in different marginal fares; the most common example of this is period passes versus pay-per-use fares, though it is also true for more complex fare structures like two-part tariffs or capping. As a result, fare product and ridership choices are interdependent; customers choose fare products based on their expected ridership, and their product choice then affects their ridership frequency. For example, a customer with high expected demand may choose to purchase a monthly pass, and the zero-marginal-cost nature of the monthly pass then actually increases the customer’s ridership level.


While discrete-continuous models of ridership and revenue can take quite different forms, there are some common challenges. First, the data requirements are substantial, since the discrete step (some form of fare product choice model) relies on data for individual customers for both estimation and prediction. Representative samples or segments of transit users from either surveys or AFC data can be used to reduce data collection requirements and computational demands; however, this often means that fare product choice models are estimated using one dataset and applied to a different dataset (which can cause issues). Second, additional behavioral parameters are required relative to either a product choice model or a continuous elasticity model on its own. The discrete choice parameters typically need to be estimated, while elasticities and other continuous demand adjustment parameters can often be borrowed from prior studies. Additional parameters that connect the discrete and continuous steps of these prediction procedures – describing the impact of fare product choice on travel frequency – are non-standard and also likely need to be estimated.

Discrete-continuous models are quite common for describing other behaviors like car ownership (a discrete choice) and car use (a continuous choice), but only four examples were found applying this model structure to prediction of transit ridership and revenue.15

The most theoretically rigorous example is Zureiqat (2008), which uses individual-level AFC panel data to estimate a discrete-continuous econometric model derived from individual utility theory described in Ben-Akiva and Lerman (1985) and Train (1993). In the discrete step, riders choose among fare products based on both previous choices and the expected cost of future travel under each choice. In the continuous step, riders choose travel frequency based on previous ridership and the marginal cost under their chosen fare product, as well as a selectivity bias adjustment term derived from the discrete step. The estimated product choice and ride frequency formulas are then used to simulate individual decisions into the future under alternative fare policy scenarios; system-wide results are aggregated from the simulated individuals. As noted in Zureiqat (2008), this approach could be extended to a fully-discrete model of both fare product choice and “portfolios” of transit travel (comparable to Train, McFadden, and Ben-Akiva (1987)). Zureiqat’s fare product choice model (using only AFC data) is the foundation of the CTA fare product choice model estimated in Chapter 6. His modeling approach is described in more detail and compared to the approach of this thesis in Chapter 7.

A less rigorous but conceptually similar approach is to use individual-level data (from a survey or AFC) to estimate a discrete fare product choice model and to combine the results with an aggregated elasticity “spreadsheet model.” This approach has been applied at the CTA since the late 1980s. In the discrete step, a fare product choice model was estimated from a specialized customer survey. The resulting logit formulas were applied to individuals or groups in a system-wide customer survey, which served as “representative individuals” for the entire customer population (covering a range of characteristics such as ridership frequency, car ownership, and rail versus bus use); the results for each individual under different fare change scenarios were scaled up based on survey sampling weights to find system-wide fare product market shares. (The use of “representative individuals” for similar groups of customers avoids the need for data-intensive simulation of fare product choices for each individual customer, as in Zureiqat (2008).) In the continuous step, changes in effective cost were found for each “representative individual” based on changes in fare product choices between a baseline and a fare change scenario. Elasticities were then applied to adjust ridership and sales based on this change in costs, in order to account for switching with other modes (as in an elasticity spreadsheet model). Additionally, ridership was adjusted for customers that switched between pay-per-use fares and passes, since the zero-marginal-cost nature of passes causes customers to ride more frequently. This approach and prior CTA models are discussed in more detail at the end of this chapter. This is the modeling approach adopted in Chapter 7 of this thesis.

15 A recent ticket sales model at Metro-North Railroad in New York may have also used a discrete-continuous modeling approach, but details about the model were not publicly available (Steer Davies Gleave n.d.).

Weesie et al. (2009) followed the same general approach as prior models at the CTA, developing a “Tariff Structure Evaluation Model” to predict rail ridership and revenue in the Netherlands under alternative fare structures. The authors were particularly interested in the impact of fare structure on the time of travel (to reduce peak demand). In the discrete step, the authors conducted an online, stated-preference survey that collected information about a recent trip (including time flexibility) and responses to a few hypothetical fare product choices. The survey results were used to estimate two choice models, one for fare product selection and one for other ridership choices (primarily time of travel). Similar to CTA fare models, the estimated choice formulas were applied to a sample of customers from a separate survey, and the sample results were scaled up to the system-wide level using sample expansion factors. Some of the explanatory variables in the stated-preference survey and the choice models were not available for the sample of customers to whom the choice formulas were applied, so Monte Carlo simulation was used to assign values to these variables (with each model run using different random draws from specified variable distributions). In the continuous step, after a fare product and time of travel had been selected, fare elasticities by trip purpose were used to adjust ridership, sales, and revenue (capturing the effect of switching to or from other modes of travel).

Harris, Thomas, and Boyle (1999) present a model at the Metropolitan Atlanta Rapid Transit Authority (MARTA) that follows a similar logic – predicting fare product choices and then applying elasticity adjustments in a traditional spreadsheet model – but uses a more informal method for calculating fare product shares. They define K-factors for each pair of fare products, which represent the ratio of ridership market shares on the products if the cost of the two were equal. These factors can conveniently be calculated directly from baseline information, by multiplying the ratio of ridership on the two products and the relative fares or costs of the two products (to normalize for fare differences). New ridership ratios under alternative fares can then be calculated by multiplying the K-factors with the scenario fare levels, and market shares for each fare product are calculated from the product pair ratios so as to sum to 100%. This calculation roughly corresponds to the logic of a logit fare product choice model, with inherent preferences between products represented in the K-factor (like an alternative-specific constant), and preferences over relative costs captured by the ratio in product costs (like including weekly cost in systematic utility).

3.4.2 Behavioral Parameters

The prediction procedures discussed above use a variety of different behavioral parameters to model the impacts of incremental fare policy changes. Important types of parameters include:

• Own-price elasticities measure the sensitivity of ridership or sales for a particular fare or fare product to changes in fare level: $\varepsilon = \frac{\%\ \text{change in ridership}}{\%\ \text{change in fare}}$. Elasticities can be measured and applied in multiple ways – point, arc, midpoint or linear, and shrinkage ratio – depending on the magnitude of the change in fare and assumptions about the shape of the demand curve (see, e.g., Balcombe et al. (2004)). As a reminder, own-price elasticities are often measured conditionally; they can represent price sensitivity conditional on equal proportional change in the prices of all fare products (perhaps the most common),16 conditional on non-zero transit travel, and/or conditional on customer fare product choice (i.e. not capturing the effect of fare product switching). If the elasticity for a fare product is not measured conditionally, it reflects both mode switching and fare product switching. Elasticities are used in multiplicative spreadsheet models.
• Cross-price elasticities measure sensitivity of ridership or sales of one product to changes in the price of a different product: $\varepsilon_{a,b} = \frac{\%\ \text{change in ridership}_a}{\%\ \text{change in fare}_b}$. Cross-price elasticities can be used to adjust for fare product switching in multiplicative spreadsheet models; the amount of switching is linear with the percent change in the price of the “competing” product.
• Diversion factors refer to two different parameters. By one definition derived from demand theory, a diversion factor represents the portion of the change in demand for one good following a price change that is diverted from another good (Balcombe et al. 2004).17 There is a direct relationship between these diversion factors, own-price elasticities, and market shares ($\delta_{a,b} = -\frac{\varepsilon_{a,b} \times \text{market share}_a}{\varepsilon_a \times \text{market share}_b}$), which makes these factors useful for estimating cross-elasticities (as in Wardman and Toner (2003), following Dodgson (1986)). Another definition of “diversion factor” created by SEPTA and adopted by NYCTA and the MBTA is the percent change in ridership for a product divided by the percent change in the ratio of fares for that product and a “competing” product: $\delta_{a,b} = \frac{\text{ridership}_{a,t=1} / \text{ridership}_{a,t=0} - 1}{\left(\frac{\text{fare}_{a,t=1}}{\text{fare}_{a,t=0}}\right) / \left(\frac{\text{fare}_{b,t=1}}{\text{fare}_{b,t=0}}\right) - 1}$ (Hickey 2005; Bench 2006). This is a sort of relative cross-elasticity and can be applied directly in multiplicative spreadsheet models to adjust for fare product switching; the amount of switching varies linearly with the change in the ratio of fares (regardless of the starting or ending levels of the fares).
• Choice utility parameters are used to calculate the individual-level utility of fare product alternatives. Their specific form varies. They are used in fare product choice models to capture switching between products resulting from fare policy changes. These choice models can be used on their own or as the discrete step in discrete-continuous models.
• Induced ride factors relate to the connection between fare product choice and ridership or sales frequency, representing the proportional change in individual-level ridership attributable to a switch in fare product. For example, an induced ride factor of 1.1 for a pass relative to pay-per-use would mean that a customer using a pass will take 10% more rides than if the same customer had chosen to ride pay-per-use. Induced ride factors can be used to adjust ridership after any modeled fare product switching in multiplicative spreadsheet models (or the continuous step of discrete-continuous models).
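As a rough illustration of how three of these parameters are computed, the sketch below works through a cross-price elasticity, the ratio-based (SEPTA-style) diversion factor, and an induced ride factor. All function names and input values are hypothetical and chosen only to make the arithmetic easy to follow; they are not estimates from any agency.

```python
def pct_change(before, after):
    """Proportional change between two levels (0.10 means +10%)."""
    return after / before - 1.0


def cross_price_elasticity(rides_a_before, rides_a_after, fare_b_before, fare_b_after):
    """% change in ridership on product a per % change in the fare of product b."""
    return pct_change(rides_a_before, rides_a_after) / pct_change(fare_b_before, fare_b_after)


def ratio_diversion_factor(rides_a, fare_a, fare_b):
    """Ratio-based diversion factor; each argument is a (before, after) pair."""
    ratio_before = fare_a[0] / fare_b[0]
    ratio_after = fare_a[1] / fare_b[1]
    return pct_change(rides_a[0], rides_a[1]) / pct_change(ratio_before, ratio_after)


def rides_after_switch(pay_per_use_rides, induced_ride_factor):
    """Rides taken by a customer who switches from pay-per-use to a pass."""
    return pay_per_use_rides * induced_ride_factor


# Hypothetical scenario: the pass price rises 10% (80 -> 88) while the
# pay-per-use fare stays at 2.00; pass ridership falls 4% and pay-per-use
# ridership rises 2%.
eps_ppu_pass = cross_price_elasticity(1000.0, 1020.0, 80.0, 88.0)
delta_pass_ppu = ratio_diversion_factor((500.0, 480.0), (80.0, 88.0), (2.0, 2.0))
print(round(eps_ppu_pass, 3), round(delta_pass_ppu, 3))

# Induced ride factor of 1.1: 40 pay-per-use rides become about 44 pass rides.
print(round(rides_after_switch(40, 1.1), 2))
```

The ratio-based factor depends only on the change in relative prices, so the same value would result if both fares were scaled by a common amount.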

The values for these parameters must either be taken from prior studies or estimated before prediction procedures can be applied. This section provides a quick and partial overview of sources in the literature and estimation methods for different parameters, with a focus on capturing potential switching between transit fare products.

16 In this case, the elasticity is the sum of unconditional own-price elasticity and cross-price elasticities with respect to the price of all other fare products.
17 For example, consider an agency with two transit fare products, a pass and pay-per-use. If the price of pay-per-use is increased, pay-per-use ridership would go down due to both mode switching and switching to passes. A 0.2 diversion factor would imply that 20% of the reduction in pay-per-use ridership was diverted to the pass (rather than lost to other modes).

Estimates in the Literature

In some cases, values for behavioral parameters can be drawn from prior academic and industry studies.

Estimates of transit own-price elasticities have been compiled in many review studies. Major recent reviews include McCollum and Pratt (2004) focused primarily on the U.S. and Balcombe et al. (2004) (summarized in Paulley et al. (2006)) focused on the U.K. Todd Litman also curates a compilation and review of fare elasticity estimates, echoing many of the same patterns (Litman 2004, 2017). Short-run, system-wide fare elasticities in recent reviews generally average around -0.4, but with wide variation in particular estimates; long-run elasticities are expected to be two to three times higher. These studies also highlight significant variation in transit elasticities along many dimensions. Larger fare increases and increases from higher baseline fares may have higher elasticities, suggesting that changes in the competitiveness of transit with other modes of travel (and the shape of the transit demand curve) are non-linear in transit fares (Paulley et al. 2006). This suggests a potential strategy of smaller, more routine fare increases instead of larger and less frequent changes, though this has not been a subject of much study. Bus trips are generally found to be twice as price-sensitive as heavy rail or subway trips (McCollum and Pratt 2004); this is unsurprising due to the generally lower speed, reliability, and comfort of buses relative to rail, especially when buses travel in congested areas or when bus trips involve transfers. Off-peak trips are likewise generally twice as sensitive to fare changes as peak-period trips, likely reflecting the different purposes of trips in peak times (largely non-discretionary work and school trips) and off-peak times (McCollum and Pratt 2004). (Pay-per-use elasticities at the MBTA are estimated separately for bus and rail trips and peak and off-peak trips in Chapter 6 of this thesis.) Rider characteristics such as car ownership, age, and income also affect elasticities.
The effect of income on fare elasticity is inconsistent across studies, likely due to two opposite forces – higher-income riders are more likely to own cars (making them more price-sensitive), but transit fares are also a smaller share of their total income and expenditures (making them less price-sensitive) (C. Miller and Savage 2017).

Hensher (2008) provides a more critical meta-analysis of transit fare elasticities in the literature; one noteworthy caution relevant to this thesis is that elasticity estimates can vary depending on whether studies use average fare or fare for a particular fare product tariff or payment structure. Paulley et al. (2006) also note that the impact of ticket type on elasticity is not clear and may depend heavily on relative prices and other fare product attributes.

Fare integration or altering transfer policies is one distinctive type of fare policy change that has received relatively little study. Borjian, Schabas, and Segal (2017) note that the limited published analyses available have found that the ridership and revenue impacts of integrating fares or providing free transfers have typically been more positive than expected. One case study highlighted in McCollum and Pratt (2004) is the introduction of free transfers between bus and rail at the New York City Transit Authority in 1997. Weekday subway trips rose 6.6% and bus trips rose 26% from 1996 to 1998; while this change was surely helped by other changes in fare policy – a new electronic fare collection system and introduction of multi-ride discounts and unlimited-use passes – free transfers seem likely to be the main driver of the large increase in bus use. The review and case studies in Miller et al. (2005) and Sharaby and Shiftan (2012) also suggest potentially large ridership gains from free transfer policies and fare integration.


The above review studies also include some cross-elasticities between transit modes (bus versus rail) and different transit trip types (such as peak versus off-peak trips). However, as noted in Gwilliam (2008), cross-elasticities between alternative transit fare products are uncommon. McCollum and Pratt conclude that “…more research is needed to investigate both market segment elasticities per se and the cross-elasticity of demand among different fare types. Available evidence suggests that shifts among fare types can be substantial” (McCollum and Pratt 2004). The multiplicative diversion factors used by NYCTA and the MBTA to account for shifts in elasticity spreadsheet models were based loosely on experience at other agencies and professional judgment, not estimated rigorously (Hickey 2005).

Along the same lines, there have been very few studies estimating fare product choice models with utility parameters. The studies of fare product choice models and discrete-continuous models identified earlier each estimated some variation on a multinomial logit model (Hensher and King 1998; Hensher 1998; Taplin, Hensher, and Smith 1999; Fleishman, Koppelman, and Schofer 1991; Multisystems, Inc 2000; Cambridge Systematics 2010; Weesie et al. 2009; Zureiqat 2008). It is more difficult, however, to apply the model parameters estimated in these studies to new fare policy analyses at different agencies. Utility scale in multinomial logit choice models (the units of the model parameters) is confounded with the parameter estimates, so parameters cannot be easily mixed and matched from prior studies.
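The scale issue can be seen in the standard multinomial logit probability. In the generic sketch below, the notation is an assumption made for illustration and is not taken from the cited studies: $\mu$ is the scale of the error term, $\beta$ is the vector of utility coefficients, $x_{in}$ is the vector of attributes of alternative $i$ for individual $n$, and $C_n$ is that individual's choice set.

```latex
% Generic multinomial logit with scale parameter \mu (notation is illustrative)
P_n(i) = \frac{\exp(\mu\, \beta^{\top} x_{in})}{\sum_{j \in C_n} \exp(\mu\, \beta^{\top} x_{jn})}

% For any c > 0, replacing (\mu, \beta) with (\mu / c,\; c\beta) leaves P_n(i) unchanged,
% so only the product \mu\beta is identified. Ratios of coefficients are scale-free:
\frac{\mu \beta_k}{\mu \beta_{\text{cost}}} = \frac{\beta_k}{\beta_{\text{cost}}}
```

Because the estimated coefficients absorb a scale that reflects the unobserved variation in each dataset, individual coefficients from two studies are not directly comparable, although coefficient ratios may still be.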

Induced ride factors are non-standard parameters, since most fare models do not account in any way for the impact of fare product choice on ridership. The CTA estimated factors internally using customer surveys after introduction of a 7-day pass in December 1998. Upper bounds of induced ride factors are estimated using AFC data in Chapter 6 of this thesis.

Methods for Estimation: Event Analysis at One Agency

In some cases, behavioral parameters are specific to a particular prediction procedure or transit agency and cannot be borrowed from the literature. In these cases, parameters must be estimated either using a separate model or as part of an integrated approach to estimation and prediction.

Methods for estimating demand models in general are beyond the scope here; Balcombe et al. (2004) provides a good overview of different stated preference and revealed preference options. This section focuses more narrowly on a practical question for transit agencies needing to implement prediction models that include some accounting for fare product choice: how can parameters be estimated using only recent experience at a single agency and taking advantage of existing AFC data? The large majority of elasticity and cross-elasticity estimates in the literature are derived from long time-series models, often across multiple transit agencies. What if an agency wished to learn from only recent years and perhaps only one or two recent fare changes?

A Starting Point: Before-and-After Methods for Elasticity Estimation

The simplest approach is to make before-and-after comparisons of ridership and revenue, estimating the elasticity for any given fare product or ride type as the observed percent change in ridership divided by the percent change in fare. For example, this approach has been used to recalibrate or adjust elasticities in spreadsheet models at MARTA and the MBTA following fare changes (Harris, Thomas, and Boyle 1999; Andrews and Demchur 2016). There are three limitations to this approach:

1. First, it can only be used to estimate short-run elasticities, since making simple before-and-after comparisons over longer periods of time would confound the effects of a fare change with other events (Balcombe et al. 2004); this is a general limitation of using only recent fare change experience to estimate parameters (not specific to before-and-after analysis).
2. Second, it does not control for other factors affecting demand; this is less of a problem for short-term analyses than long-term ones, but estimates can still be biased by concurrent changes or trends in transit service or external factors.
3. Third, it does not account for or estimate switching between fare products. If, on the one hand, all fares were changed proportionally and little switching was expected, then before-and-after estimation would be reasonable but cross-elasticities could not be estimated. If, on the other hand, prices were changed by different proportions across alternative products, then elasticities and cross-elasticities are confounded; the before-and-after changes for a particular fare product would reflect a mix of mode switching and fare product switching, so interpreting them as elasticities could be highly inaccurate.
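A before-and-after elasticity calculation can be sketched as follows, using common textbook definitions of the shrinkage ratio, midpoint, and log-arc forms. The ridership and fare values are hypothetical, and the calculation is subject to all three limitations listed above: it controls for nothing and ignores fare product switching.

```python
import math


def shrinkage_ratio_elasticity(q0, q1, f0, f1):
    """Percent change in ridership over percent change in fare, both from the base level."""
    return ((q1 - q0) / q0) / ((f1 - f0) / f0)


def midpoint_elasticity(q0, q1, f0, f1):
    """Linear arc elasticity: changes measured relative to before/after averages."""
    dq = (q1 - q0) / ((q0 + q1) / 2.0)
    df = (f1 - f0) / ((f0 + f1) / 2.0)
    return dq / df


def log_arc_elasticity(q0, q1, f0, f1):
    """Arc elasticity implied by a constant-elasticity demand curve."""
    return math.log(q1 / q0) / math.log(f1 / f0)


# Hypothetical fare change: fare rises from 2.00 to 2.20 (+10%) and annual
# ridership falls from 1,000,000 to 970,000 (-3%).
before_rides, after_rides = 1_000_000, 970_000
before_fare, after_fare = 2.00, 2.20

for estimator in (shrinkage_ratio_elasticity, midpoint_elasticity, log_arc_elasticity):
    value = estimator(before_rides, after_rides, before_fare, after_fare)
    print(estimator.__name__, round(value, 3))
```

For a fare change of this size the three forms give similar values; the differences grow with the magnitude of the fare change.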

Controlling for Unobserved Factors in Elasticity Estimation

The second issue with simple before-and-after studies – controlling for other factors that may be affecting demand – is potentially addressed by the short time period of analysis, by using control groups, or by estimating pre-period time trends. First, before-and-after studies use short time frames (often two years), and many factors that would need to be included in a long time series model (such as transit service levels, gas prices, and population) are unlikely to change significantly over such a short time period. This would not be a valid argument if there were concurrent events or sizeable trends affecting ridership, such as cuts in transit service or sudden growth in a travel alternative like ride hailing. Second, statistical and econometric methods for program evaluation suggest the use of control groups to remove baseline trends. Imbens and Wooldridge (2009) provides an extensive review. One class of methods assumes “unconfoundedness,” or the availability of sufficient controls to estimate the effect of a treatment; this is almost never applicable to transit fare changes at a single agency, since changes are applied equally to all customers at the same time. Other methods relax or drop the assumption of unconfoundedness. Perhaps the most relevant are difference-in-differences models, which use repeated cross-sections or panel data and compare the average before-and-after change for a treatment group with the average change for a control group.18 This approach is also problematic in a few ways. The most natural choice of a control group is one or more other transit agencies, which are unlikely to satisfy the necessary parallel trend assumption.
The unit of observation in the transit fare policy application is either the whole transit agency (aggregate ridership or sales) or an individual customer; using agency-level observations results in a very small sample, and it is difficult to obtain individual-level panel data for multiple agencies. Abadie, Diamond, and Hainmueller (2010) offers an interesting generalization of difference-in-differences, synthetic control, that allows unobserved differences between treatment and control to vary over time (relaxing the parallel trend assumption). Synthetic control is geared toward comparative case studies and can be used to estimate a treatment effect on a single series (such as aggregate transit agency ridership over time) using a synthetic counterfactual constructed from many potential control groups (e.g., aggregate ridership at other transit agencies); however, using only recent, agency-level observations would still result in a small sample and (one expects) weak inference. In sum, availability of controls is a significant challenge for short-term transit fare change evaluation. A third option is to use aggregate time series or panels (using individual-level AFC data) that are long enough to allow for estimation of prior trends. If at least two years of data are available before a fare change, use of differencing or fixed effects with time trends can mitigate bias from external factors; in essence, shifts in the year prior to the fare change can be used as a control for shifts in the year after the fare change. Including trends in estimation models accounts for any stable process that is driving changes in transit ridership, such as steady population growth or steady growth of ride-hailing services; however, it still would not control for sharp changes in other factors affecting transit demand. This is the approach taken in Pincus (2014) to estimate elasticities from the 2012 fare change at the MBTA, and it is applied in Chapter 6 of this thesis.

18 Difference-in-differences models are easily extended to multiple control groups and multiple time periods.
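The idea of using the pre-period shift as a control can be illustrated with a deliberately simple aggregate calculation. This is a sketch under stated assumptions, not the fixed-effects specification used in Pincus (2014) or in Chapter 6: it uses hypothetical annual ridership totals, assumes that the growth observed in the year before the fare change would have continued unchanged, and attributes the remaining shift to the fare change.

```python
import math


def naive_elasticity(rides_year_before, rides_year_after, fare_before, fare_after):
    """Before-and-after log elasticity with no adjustment for trends."""
    return math.log(rides_year_after / rides_year_before) / math.log(fare_after / fare_before)


def trend_adjusted_elasticity(rides_two_years_before, rides_year_before,
                              rides_year_after, fare_before, fare_after):
    """Log elasticity after subtracting the pre-period growth rate as a baseline trend."""
    pre_growth = math.log(rides_year_before / rides_two_years_before)   # control shift
    post_growth = math.log(rides_year_after / rides_year_before)        # observed shift
    fare_effect = post_growth - pre_growth                              # shift net of trend
    return fare_effect / math.log(fare_after / fare_before)


# Hypothetical series (millions of rides): ridership was growing 2% per year,
# then fell 2% in the year after a 10% fare increase.
rides = {"t-2": 100.0, "t-1": 102.0, "t+1": 99.96}

naive = naive_elasticity(rides["t-1"], rides["t+1"], 2.00, 2.20)
adjusted = trend_adjusted_elasticity(rides["t-2"], rides["t-1"], rides["t+1"], 2.00, 2.20)
print(round(naive, 3), round(adjusted, 3))
```

In this example the naive estimate understates price sensitivity because part of the fare effect is masked by underlying growth; with a declining pre-period trend the bias would run the other way.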

Another potential way to avoid the confounding effect of changes in external factors is to estimate elasticities using cross-sectional data. Using a cross-section of agencies is problematic for reasons discussed above, but a cross-section within a single agency could also be used. For example, Iseki, Liu, and Knaap (2015) estimated fare elasticity as one parameter in a regression model where the dependent variable was total passenger-miles traveled for a particular origin-destination pair on the WMATA rail system. Unfortunately, however, this method only produced a fare elasticity estimate because WMATA uses distance-based fares (creating variation in fares across O-Ds). Origin-destination demand models may have many other applications, but they would not produce fare elasticity estimates under a flat fare structure. WMATA also had one dominant payment structure (pay-per-use); the same method would not necessarily apply to multiple tariff options.

Fare Product Switching

The third issue with simple before-and-after studies – inability to control for and estimate fare product switching – could be addressed in a few different ways.

One potential option is aggregate time series regression models with cross-elasticities to account for fare product choice. Of transit studies, Wardman and Toner (2003) goes the farthest to incorporate fare product choice into time series modeling, demonstrating estimation of a complete system of demand equations (one for each fare product, including cross-elasticities) using aggregate time series data on ticket sales in Britain.19 Beyond transit studies, the problem of pricing multiple transit fare products is related to more general problems in economics such as differentiated product demand and multi-product monopolistic pricing. Several methods such as the Rotterdam model and Almost Ideal Demand System (AIDS) have been widely applied to estimate demand systems with a complete set of price elasticities and cross-elasticities to account for switching between goods or products (Deaton and Muellbauer 1980). There are two potential difficulties applying these aggregate time series models to demand for transit fare products at a single agency over a short time period. One is that models of conditional demand systems depend on “separability” of consumer preferences – that demand for a certain category of goods (such as meats) can be modeled on its own without considering demand for all other goods; it is doubtful that transit demand is separable from travel demand more generally, but market-specific data on multi-modal travel demand (including cars) is generally not available. Another issue is that these models require substantial price variation over time and across products to identify parameters, which usually does not exist at transit agencies over short time periods (or even long ones).

Another potential approach is to use panel data methods and condition elasticity estimation on fare product choice. One example is the frequency regression estimation step in Zureiqat (2008), which estimates a separate linear model for each fare product conditional on individual-level fare product choice in the earlier discrete step. Chapter 6 of this thesis adopts this approach to estimate pay-per-use elasticities at the MBTA, restricting analysis to individual cards in AFC data using pay-per-use both before and after the fare change. This approach allows estimation of elasticities in the presence of fare product switching, but it does not estimate any parameters related to fare product choice.

19 The authors note that several studies have estimated time series models for different fare products separately, typically not including terms for cross-elasticities.

Another option is discrete choice modeling, which does the opposite – estimating fare product choice utility parameters, but providing no help with elasticity estimation. Cross-sectional choice models take advantage of variation in individual transit travel demand, which affects average costs under different tariffs or payment structures. Panel data choice models additionally take advantage of variation in fares and prices over time. Examples of fare product choice model estimation using both surveys and AFC data were discussed above, and a choice model is estimated using CTA AFC data in Chapter 6 (Hensher and King 1998; Hensher 1998; Taplin, Hensher, and Smith 1999; Fleishman, Koppelman, and Schofer 1991; Multisystems, Inc 2000; Cambridge Systematics 2010; Weesie et al. 2009; Zureiqat 2008).
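To make the role of individual travel demand concrete, the sketch below applies a binary logit formula to customers with different expected ridership. The alternative-specific constant, the cost coefficient, and the prices are hypothetical values chosen for illustration; they are not the parameters estimated in Chapter 6 or in any of the cited studies.

```python
import math

# Hypothetical utility parameters (assumptions, not estimates from any study).
ASC_PASS = 0.5      # alternative-specific constant for the pass
BETA_COST = -0.08   # utility per dollar of monthly cost


def logit_shares(utilities):
    """Multinomial logit choice probabilities from systematic utilities."""
    max_u = max(utilities.values())   # subtract the max for numerical stability
    exp_u = {name: math.exp(u - max_u) for name, u in utilities.items()}
    total = sum(exp_u.values())
    return {name: value / total for name, value in exp_u.items()}


def product_utilities(expected_rides, fare, pass_price):
    """Systematic utility of each fare product given expected monthly rides."""
    return {
        "pay_per_use": BETA_COST * fare * expected_rides,   # cost varies with rides
        "pass": ASC_PASS + BETA_COST * pass_price,          # cost is the pass price
    }


# Pass choice probability rises with expected ridership at fixed prices.
for rides in (20, 40, 60):
    shares = logit_shares(product_utilities(rides, fare=2.25, pass_price=84.50))
    print(rides, round(shares["pass"], 3))
```

Even with fares held fixed, the variation in expected rides across customers changes the relative cost of the two products, which is the cross-sectional variation that such models use to identify the cost coefficient.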

Finally, elasticities and product choice parameters could be estimated across products in an integrated fashion using discrete-continuous panel data models. Zureiqat (2008) is perhaps the only study to apply this approach to transit fare policy using AFC panel data, but the transit setting is very similar to many applications in the marketing literature. For example, Narayanan, Chintagunta, and Miravete (2007) and Lambrecht, Seim, and Skiera (2007) estimate discrete-continuous models for complex payment structures (two- and three-part tariffs, respectively) in phone and internet markets. One challenge in applying these methods to transit fare policy is that they still require enough variation in prices over time to identify elasticities (and, if desired, alternative-specific cost coefficients in the choice model); prices change much more frequently in many private marketing contexts than at public transit agencies. The elasticities that are estimated using Zureiqat’s method are also conditional on non-zero transit use, which could understate the impact of mode switching when applied to prediction; it is possible that marketing models such as Nair, Dubé, and Chintagunta (2005) that include purchase incidence as well as product and quantity choices would be able to address this problem. This issue is discussed further in Chapter 7.

3.4.3 Fare Modeling at the CTA and MBTA

Having reviewed fare prediction procedures and parameter estimation in general, this section introduces modeling work that has been done in the past at the two case study agencies.

CTA Fare Modeling

In 1989, the CTA used a grant from the Urban Mass Transportation Administration (later Federal Transit Administration) to perform a comprehensive evaluation of alternative fare strategies, which is described in Fleishman, Koppelman, and Schofer (1991). A substantial component of this project was development of a ridership and revenue model that could predict the impacts of shifting to a “deep discounting” strategy with more favorable pass multiples (among other fare change scenarios). The model that was developed combined prediction of fare product choice and prediction of ridership frequency. After evaluating different scenarios, the CTA adopted a form of “deep discounting” by increasing cash fares and decreasing token fares (including 10-packs of tokens); the change resulted in increased revenue and no change in ridership the following year (Fleishman 1996).

This initial CTA prediction model was subsequently updated and used for fare policy analyses in 1995, 1998, and 2005. The methodology used in 1998 and documented in Multisystems, Inc (2000) has been described in some detail in both Hong (2006) and Zureiqat (2008). The essence of the prediction procedure can be described as five different sequential calculations:

1. Baseline ridership data. Define a segmentation of transit users, differentiating segments by average frequency of travel and other user characteristics, and calculate or estimate baseline monthly trips by trip type (a portfolio of rides) for an average user in each segment.
2. Fare product allocation. Allocate the users in each segment to specific fare products in both the baseline and a fare change scenario. This is done by performing the following calculations twice – once under baseline fare products and prices, and once under scenario products and prices:
   a. For the average user within each segment, calculate the monthly cost of the user’s portfolio of rides under each fare product option.
   b. Apply a multinomial logit formula to the average user in each segment (based on user characteristics and monthly costs under each fare product) to allocate riders within the segment to each fare product alternative.
3. Change in user costs in the scenario. Using the results – segments of transit users allocated to different fare products in both the baseline and a fare change scenario – calculate the average percent change in monthly cost (relative to the baseline) for users in each segment-fare product combination. This change in cost is a weighted average of changes for two groups:
   a. Switchers -- the average change in monthly cost for users who switched fare products (based on the cost of the fare product selected in the baseline and the cost of the fare product selected in the scenario), and
   b. Non-switchers -- the average change in monthly cost for users who did not switch fare products (based on the cost of the selected fare product under baseline and scenario prices).
4. Elasticity adjustments. Using the percent change in costs for each segment-fare product combination, adjust ridership and revenue by applying fare elasticities. (Elasticities can vary across user characteristics, fare products, and trip types.)
5. Induced ridership adjustments. For users predicted to switch between products, apply inflators or factors that adjust ridership for the impacts of using a particular fare product; for example, ridership would be scaled up for users switching from paying cash to using a pass.
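To make steps 2 through 4 of the procedure above concrete, the following is a minimal Python sketch for a single user segment choosing between two products. It is an illustration under stated assumptions, not the CTA model itself: the function names, the cost-only utility, the cost coefficient, the fares, the elasticities, and the rule that only net switching occurs between the two products are all hypothetical. Step 5 (induced ridership inflators) is omitted.

```python
import math

def product_costs(trips, fares):
    """Step 2a: monthly cost of the average user's rides under each product."""
    return {"pay_per_use": trips * fares["per_ride"], "pass": fares["pass_price"]}

def logit_shares(costs, cost_coef):
    """Step 2b: multinomial logit shares, with utility depending on cost only."""
    exp_u = {p: math.exp(cost_coef * c) for p, c in costs.items()}
    total = sum(exp_u.values())
    return {p: e / total for p, e in exp_u.items()}

def cost_changes(base_costs, scn_costs, base_shares, scn_shares):
    """Step 3: average percent cost change for users of each scenario product.

    Assumes exactly two products and net switching only, so the product that
    gains share receives all of its new users from the other product.
    """
    changes = {}
    for prod in scn_shares:
        other = next(p for p in scn_shares if p != prod)
        stay = min(base_shares[prod], scn_shares[prod])   # non-switchers
        switch_in = scn_shares[prod] - stay               # switchers
        stay_change = scn_costs[prod] / base_costs[prod] - 1
        switch_change = scn_costs[prod] / base_costs[other] - 1
        changes[prod] = (stay * stay_change + switch_in * switch_change) / scn_shares[prod]
    return changes

# Hypothetical inputs for one segment (not values from any CTA study).
USERS = 1000
TRIPS = 40                                     # monthly trips of the average user
COST_COEF = -0.05                              # logit cost coefficient
BASE_FARES = {"per_ride": 2.00, "pass_price": 75.00}
SCN_FARES = {"per_ride": 2.00, "pass_price": 90.00}
ELASTICITY = {"pay_per_use": -0.3, "pass": -0.2}

base_costs = product_costs(TRIPS, BASE_FARES)
scn_costs = product_costs(TRIPS, SCN_FARES)
base_shares = logit_shares(base_costs, COST_COEF)
scn_shares = logit_shares(scn_costs, COST_COEF)
changes = cost_changes(base_costs, scn_costs, base_shares, scn_shares)

# Step 4: elasticity adjustment for each segment-product combination.
scn_rides = {
    p: USERS * scn_shares[p] * TRIPS * (1 + changes[p] * ELASTICITY[p])
    for p in scn_shares
}
```

In this sketch the pass price increase lowers the pass share, continuing pass users face the full 20% cost increase, and the pay-per-use group faces a smaller average increase because it blends unaffected non-switchers with former pass users.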

This prediction procedure required three different kinds of parameters – a logit choice model formula, fare elasticities, and induced ridership inflators. The 1998 study described in Multisystems, Inc (2000) drew elasticities and induced ridership factors from previous studies. A nested multinomial logit fare product choice model was estimated using results of a mail-out stated preference survey. The survey collected general demographic information (including gender and car ownership) and asked respondents to make fare product selections under four different pricing scenarios (with different scenarios presented to different respondents).

The model also requires baseline ridership information, including any user characteristics that are included in the fare product choice model. In 1998, the logit choice model depended on car ownership; as a result, ridership by trip type for each user group had to be estimated by scaling system-wide customer survey data (including questions on car ownership) to match system-wide total ridership. Baseline allocation of transit users to fare products using the logit fare product choice formula did not perfectly match actual baseline ridership (in part because of fare policy changes between the survey and the model development), so the logit formula parameters were manually adjusted to better conform to actual ridership.

In 2012, the CTA worked with Cambridge Systematics to develop a new set of fare modeling tools (Cambridge Systematics 2012). The general approach to the prediction procedure was similar, combining allocation of riders to fare products using a logit formula and then adjusting resulting ridership and revenue using elasticities. However, the specifics of the calculations for the final three “steps” were very different from prior CTA fare models (performed at a higher level of aggregation):

1. Baseline ridership data. Total future baseline ridership in four categories (weekday bus, weekday rail, weekend bus, and weekend rail) is projected using a time series formula.
2. Fare product allocation.
   a. The projected future baseline ridership is divided among fare categories by applying a logit fare product choice formula with baseline fares to riders who responded to a CTA Customer Satisfaction Survey (scaling up to total projected baseline ridership using survey probability weights).
   b. Projected future scenario ridership is similarly divided into fare categories, only using scenario fares and prices.
3. Change in user costs in the scenario. Under the fare product allocations developed above, the average system-wide user cost per linked trip – total revenue across all products divided by total linked trips across all products (a single number) – is calculated for the baseline and the scenario.
4. Elasticity adjustments. A ridership-weighted average elasticity is calculated from elasticities for four categories of ridership (weekday bus, weekday rail, weekend bus, and weekend rail). This one average elasticity is applied to the percent change in system-wide cost per linked trip (across all fare products) to find the percent change in system-wide ridership. This percent change is applied to total baseline ridership to find the total change in ridership in the scenario. This total change in ridership is allocated to fare products based on their predicted scenario allocations (i.e. based on shares of total ridership in the scenario); finally, scenario revenue for each fare product is assumed to change by the same percentage as ridership for the fare product.
5. Induced ridership adjustments. No adjustments are made for induced ridership due to customers switching between passes and pay-per-use fare products.

There are two clear issues with this modeling approach. First, the application of system-wide elasticities to system-wide changes in average fares does not allow for elasticities to vary across fare products (such as 7-day passes versus 30-day passes), and it effectively assumes that all customers experience the same percent change in actual costs. At the level of an individual customer, this produces results that are nonsensical, such as elasticity adjustments for continuing users of fare products that did not change in price. Similarly, if the impact of a fare change scenario were concentrated on a subset of customers with above-average elasticity, this would overstate ridership and revenue by effectively diluting both the change in cost and the elasticity; an example of this problem is shown in Table 3-2. Second, the omission of induced ridership adjustments assumes that fare product choice has no impact on ridership level. A customer switching between a pass and pay-per-use – between zero-marginal-cost ridership and paying a per-ride fare – is assumed to maintain the same level of ridership as before (apart from a generic elasticity adjustment applied equally to all fare products). This would tend to overstate ridership (or understate ridership losses) in scenarios that involved significant switching from passes to pay-per-use, such as the January 2013 fare change. Prior CTA models avoided these two issues by performing adjustments on disaggregated data, allowing elasticity and induced ridership adjustments to vary appropriately across products, users (those who switched fare products and those who did not), and trip types.

Table 3-2: Example of Elasticity Adjustment Logic in CTA Fare Models

                                Baseline                 % Change in      Elasticity-Adjusted
                                Ridership    Elasticity  Per-Ride Fare    Ridership
CTA 2005 and Earlier Methodology
  Segment A                     50           -0.5        0%               50
  Segment B                     50           -1          +50%             25
  Total                         100                                       75
CTA 2012 Methodology
  Segment A                     50           -0.5        0%
  Segment B                     50           -1          +50%
  Total                         100                                       81.25
  Ridership-Weighted Average                 -0.75       +25%
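The arithmetic behind Table 3-2 can be reproduced in a few lines. The Python sketch below uses the segment values from the table; the function names are illustrative only, and the aggregate version simply applies the ridership-weighted average elasticity to the ridership-weighted average fare change, as the table does.

```python
# Segment values taken from Table 3-2.
segments = [
    {"name": "A", "riders": 50, "elasticity": -0.5, "fare_change": 0.0},
    {"name": "B", "riders": 50, "elasticity": -1.0, "fare_change": 0.5},
]

def disaggregate_ridership(segments):
    """2005-and-earlier logic: adjust each segment with its own elasticity."""
    return sum(
        s["riders"] * (1 + s["fare_change"] * s["elasticity"]) for s in segments
    )

def aggregate_ridership(segments):
    """2012 logic: one average elasticity applied to one average fare change."""
    total = sum(s["riders"] for s in segments)
    avg_elasticity = sum(s["riders"] * s["elasticity"] for s in segments) / total
    avg_change = sum(s["riders"] * s["fare_change"] for s in segments) / total
    return total * (1 + avg_change * avg_elasticity)

disaggregate_total = disaggregate_ridership(segments)  # 75.0
aggregate_total = aggregate_ridership(segments)        # 81.25
```

Averaging dilutes both the +50% fare change (to +25%) and the elasticity of -1 (to -0.75), so the aggregate calculation predicts a loss of 18.75 riders where the disaggregate calculation predicts a loss of 25.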

The 2012 model under-predicted the very substantial switching from passes to pay-per-use that occurred following the 2013 fare change, suggesting potential issues with the product choice modeling methodology; however, it is difficult to attribute these prediction errors to specific sources. Like prior CTA models, the 2012 prediction procedure relied on a fare product choice model estimated using responses from a stated preference customer survey; all surveys can suffer from sampling issues, and stated preferences often include response bias. Additionally, restrictions were placed on the fare media options that were presented to survey respondents based on the general assumption that customers using non-account-based versions of fare products (such as pay-per-use on a Transit Card or a regular 7-day pass) would be unlikely to switch to account-based fare products (such as Chicago Card Plus pay-per-use or 30-day pass alternatives); it is hard to know the implication of these restrictions on the choice model estimates. The choice model was then applied to respondents in a CTA Customer Satisfaction Survey, and the results were scaled using ridership weights for each survey respondent; survey sampling and these scaling weights could also introduce error in the fare product choice predictions.

A final potential issue with the 2012 modeling was use of average fares to estimate elasticity parameters (which were then used to model fare change scenarios). Aggregate linear time series models were used to estimate elasticities for four different categories of ridership: weekday bus, weekday rail, weekend bus, and weekend rail. Log-transformed monthly total ridership for each category from 2000 to 2010 was regressed on monthly observations of explanatory variables including Chicago employment, U-Pass enrollment, service hours, gas prices, downtown parking costs, precipitation, monthly seasonality, and system-wide average fare per linked trip (across all fare products and trip types). The estimated coefficient for the fare variable in each model is interpreted as the fare elasticity for that category of ridership. In 2012, historical monthly average fare per linked trip was calculated as an input to the regression models in four steps:

1. Historical monthly ridership was tabulated by mode, fare product, and transfer sequence (whether a boarding was a transfer or the beginning of a new trip).
2. For each pay-per-use fare product, monthly revenues were calculated by multiplying ridership of each type by the applicable fares in a given month.
3. For each pass product, revenues were calculated in two steps. First, the average fare per linked trip in each month was calculated by dividing the pass price in that month by a single, time-invariant estimate of the number of linked trips per pass. Second, total revenue was calculated by multiplying this average fare by the total number of linked trips on the pass.
4. Finally, average fare per linked trip was calculated as total monthly revenue divided by total monthly linked trips (across all fare products).
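The four steps above can be sketched as a single function. This is a rough Python illustration, not the consultant's implementation: the data layout, the function name, and every number are hypothetical, and the sketch treats non-transfer boardings as linked trips.

```python
def monthly_average_fare(ppu_boardings, ppu_fares, pass_linked_trips,
                         pass_prices, trips_per_pass):
    """Average fare per linked trip for one month.

    ppu_boardings / ppu_fares: keyed by (mode, is_transfer).
    pass_linked_trips / pass_prices / trips_per_pass: keyed by pass product.
    trips_per_pass is the single, time-invariant estimate described in step 3.
    """
    # Step 2: pay-per-use revenue is boardings times the applicable fare.
    ppu_revenue = sum(n * ppu_fares[key] for key, n in ppu_boardings.items())
    # A linked trip starts with a non-transfer boarding.
    ppu_linked = sum(n for (_, is_transfer), n in ppu_boardings.items()
                     if not is_transfer)
    # Step 3: imputed pass fare per linked trip, times linked trips on the pass.
    pass_revenue = sum(
        pass_prices[p] / trips_per_pass[p] * n
        for p, n in pass_linked_trips.items()
    )
    # Step 4: total revenue over total linked trips.
    total_linked = ppu_linked + sum(pass_linked_trips.values())
    return (ppu_revenue + pass_revenue) / total_linked

# Hypothetical month of data.
ppu_boardings = {("bus", False): 1000, ("rail", False): 2000, ("bus", True): 500}
ppu_fares = {("bus", False): 2.00, ("rail", False): 2.25, ("bus", True): 0.25}
pass_linked_trips = {"30-day": 3000}
trips_per_pass = {"30-day": 50.0}

avg_fare_before = monthly_average_fare(
    ppu_boardings, ppu_fares, pass_linked_trips, {"30-day": 100.0}, trips_per_pass)
avg_fare_after = monthly_average_fare(
    ppu_boardings, ppu_fares, pass_linked_trips, {"30-day": 120.0}, trips_per_pass)
```

Because `trips_per_pass` is held fixed, the imputed pass fare in this sketch rises in exact proportion to the pass price, regardless of how actual use per pass responds.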

As explained earlier, time series regression models often rely on some measure of average fare. At agencies like the CTA offering different fare payment structures (passes and pay-per-use), this is a crude approximation. Ridership decisions are based not only on average fare but also on marginal fare; transit users will tend to ride more if they are using a zero-marginal-cost pass instead of paying for each ride. Additionally, transit users choose between passes and pay-per-use options based on the total cost of taking their transit rides under each alternative. As a result, the average or median number of linked trips on any given pass should be expected to change over time as fare levels and pass multiples change; if the price of a pass goes up and pay-per-use fares do not, average ridership on that pass will go up as pass-holders with lower ridership frequency switch to less costly pay-per-use alternatives.
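The self-selection effect described above can be illustrated with a toy calculation. The Python sketch below assumes purely cost-minimizing riders whose trip counts do not change, which is a deliberate simplification; the rider distribution and prices are hypothetical.

```python
def average_trips_on_pass(monthly_trips, per_ride_fare, pass_price):
    """Mean monthly trips among riders for whom the pass is the cheaper option."""
    pass_holders = [t for t in monthly_trips if t * per_ride_fare > pass_price]
    if not pass_holders:
        return None
    return sum(pass_holders) / len(pass_holders)

# Hypothetical riders taking 20, 25, ..., 60 trips per month.
monthly_trips = list(range(20, 61, 5))
PER_RIDE_FARE = 2.00

trips_per_pass_before = average_trips_on_pass(monthly_trips, PER_RIDE_FARE, 75.00)
trips_per_pass_after = average_trips_on_pass(monthly_trips, PER_RIDE_FARE, 90.00)
```

Raising the pass price from 75 to 90 moves the break-even point from 37.5 to 45 trips, so the riders taking 40 and 45 trips leave the pass and the average trips per remaining pass rise from 50 to 55. A time-invariant trips-per-pass estimate would miss this shift.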

MBTA Fare Modeling

Prior to a fare change in 2004, the MBTA and the Central Transportation Planning Staff (CTPS) of the Boston MPO developed a spreadsheet model to predict the ridership and revenue impacts of potential fare and service changes (Bench 2006). The model, later named the Fare Elasticity, Ridership, and Revenue Estimation Tool (FERRET), has been updated and used to analyze fare change scenarios prior to each MBTA fare change since 2004. It was described most recently in Andrews and Demchur (2016) and has previously been reviewed in Hong (2006).

FERRET is a traditional elasticity spreadsheet model with a few modifications. The core logic of the FERRET model is simple:

1. Observed baseline ridership from AFC and other sources is aggregated by fare product (combinations of tariff, user type, medium, and commuter rail zone) and by service or ride type (bus type, rapid transit station, commuter rail zone, and parking garage). Baseline sales are also totaled for each pass product.
2. Each combination of product and ride type is assigned a baseline fare (the actual fare at the time of analysis), a scenario fare, and a fare elasticity.
3. For each product and ride type, the percent change in per-ride fare or pass price from the baseline to the scenario is multiplied by the elasticity to find the percent change in ridership, which is applied to calculate scenario ridership levels (a “shrinkage ratio” formula):

$$\text{ridership}_{Scn} = \text{ridership}_{Base} \times \left(1 + \left(\frac{\text{fare}_{Scn}}{\text{fare}_{Base}} - 1\right) \times \text{elasticity}\right)$$

4. Diversion factors are assigned to a pair of product tariffs (pay-per-use and monthly passes), a pair of product media (CharlieTicket and CharlieCard), and a pair of services (bus and rapid transit). These factors describe the share of rides that switch across the pair when there is a 1% change in the ratio of their fare levels. These induced ride factors are used to further adjust ridership according to the following formula:

$$\text{Rides Diverted}_{A \to B} = \left(\frac{\text{fare}_{A,Scn} / \text{fare}_{A,Base}}{\text{fare}_{B,Scn} / \text{fare}_{B,Base}} - 1\right) \times \text{diversion factor} \times \text{ridership}_{A}$$

5. Scenario revenue for each pay-per-use product and ride type is calculated by multiplying scenario ridership and fare. Scenario revenue for each pass product is calculated by dividing total scenario ridership on the pass by the baseline average rides per pass and multiplying by the scenario pass price.

$$\text{revenue}_{PPU,Scn} = \text{ridership}_{PPU,Scn} \times \text{fare}_{PPU,Scn}$$

$$\text{revenue}_{pass,Scn} = \frac{\text{ridership}_{pass,Scn}}{\text{rides\_per\_pass}_{pass,Base}} \times \text{price}_{pass,Scn}$$

A separate module of the FERRET model is used to perform Title VI equity analyses of fare and service change scenarios.
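The core calculations in steps 3 through 5 above can be sketched in Python as follows. This is an illustrative reading of the formulas, not the FERRET spreadsheet itself: the function names and all input values are hypothetical, and the source does not specify whether diversion is applied to baseline or elasticity-adjusted ridership (baseline ridership is assumed here).

```python
def scenario_ridership(ridership_base, fare_base, fare_scn, elasticity):
    """Step 3: shrinkage-ratio elasticity adjustment."""
    return ridership_base * (1 + (fare_scn / fare_base - 1) * elasticity)

def rides_diverted(fare_a_base, fare_a_scn, fare_b_base, fare_b_scn,
                   diversion_factor, ridership_a):
    """Step 4: rides shifting from product A to product B.

    Driven by the change in the ratio of A's fare to B's fare.
    """
    ratio_change = (fare_a_scn / fare_a_base) / (fare_b_scn / fare_b_base) - 1
    return ratio_change * diversion_factor * ridership_a

def pay_per_use_revenue(ridership_scn, fare_scn):
    """Step 5: pay-per-use revenue is scenario rides times scenario fare."""
    return ridership_scn * fare_scn

def pass_revenue(ridership_scn, rides_per_pass_base, price_scn):
    """Step 5: pass revenue holds baseline rides per pass fixed."""
    return ridership_scn / rides_per_pass_base * price_scn

# Hypothetical example: a 10% pay-per-use fare increase with elasticity -0.25.
ppu_rides_scn = scenario_ridership(1000, 2.00, 2.20, -0.25)   # about 975 rides

# Hypothetical example: pass price rises 20%, pay-per-use fare is unchanged.
diverted = rides_diverted(75.00, 90.00, 2.00, 2.00, 0.05, 10000)  # about 100 rides

# Hypothetical pass revenue: 9,000 scenario rides at 50 baseline rides per pass.
pass_rev_scn = pass_revenue(9000, 50.0, 90.00)
```

With a diversion factor of 0.05, a 20% change in the fare ratio shifts only 1% of rides between products in this example.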

The main behavioral parameters in the FERRET model are elasticities and diversion factors. Elasticities for the model were initially estimated using simple before-and-after analysis of the 2004 fare change, estimates at peer agencies (including NYCTA, the CTA, and TTC in Toronto), and surveys in the academic literature. Elasticities have been incrementally adjusted based on before-and-after analysis of subsequent fare changes. Based on observed changes following the 2012 fare change, the elasticity for bus pay-per-use was increased in magnitude and elasticities for bus and rapid transit passes were decreased in magnitude (Andrews and Demchur 2016). (Pincus (2014) also estimated elasticities from the 2012 fare change using both before-and-after calculations and panel data regression modeling.) The diversion factors in FERRET were adapted from values in a spreadsheet model at NYCTA, which in turn were based loosely on peer agency experience and professional judgment; “conservative” (small) diversion factors were selected, even though NYCTA found that the diversion factors in their model were too low and should have been doubled to better predict substantial shifts in product market shares following their 2003 fare change (Hickey 2005; Bench 2006). The diversion factors in FERRET have not been updated over time.

Like other multiplicative spreadsheet models, FERRET has the advantage of being simple yet flexible. It is easy to understand and explain, but it can assume different elasticities and predict impacts separately for every fare product and ride type at the MBTA. Note, however, that only a few different elasticities are actually assumed; the disaggregate calculations in the FERRET model are primarily useful for summarizing results in different ways and generally do not improve the model’s precision or accuracy. Figure 3-5 shows revenue and ridership for different categories of fare products and ride types that are disaggregated in the FERRET model, plotted against assumed “mid” elasticity values used for analysis of the MBTA’s eventual 2016 fare change. While elasticities are differentiated for many smaller categories, 85% of total baseline revenue and 77% of baseline ridership fall in just five major categories with different elasticities: monthly and 7-day bus and rapid transit passes (with a FERRET “mid” elasticity assumption of -0.15), full-fare bus or subway rides (-0.25), full-fare surface light rail rides (-0.3), commuter rail passes (-0.1), and commuter rail single-ride tickets (-0.2) (Andrews and Demchur 2016). Without much loss, FERRET’s predictions could be reduced to five simple calculations: elasticity adjustments to total ridership and revenue for these five categories of fare products and trip types. The FERRET model could potentially be improved by further differentiating elasticities (and diversion factors) within these large categories. For example, analyses in Pincus (2014), Kamfonik (2013), and Chapters 5 and 6 of this thesis suggest that Monthly LinkPasses should be differentiated by sale channel, since Corporate LinkPasses have been much less sensitive to past fare changes than non-Corporate passes. Elasticities for single-ride fare products could also be differentiated by time of day. Additionally, AFC data on rider travel patterns could be clustered to produce new segmentation options within major fare products (such as the Monthly LinkPass or full-fare pay-per-use); cluster-based segmentation using AFC data is described in Basu (2018), and the Appendix demonstrates the applicability of this method to the MBTA by clustering users on temporal travel features.

Figure 3-5: MBTA FERRET Model Baseline (FY15) Revenue and Ridership by Assumed “Mid” Elasticity

Source: Andrews and Demchur (2016), MBTA FERRET model

One weakness of FERRET is its diversion factors, which shift ridership between fare products based on changes in fare ratios. This is not unlike how elasticities crudely shift ridership between driving and transit, and it could be effective at predicting product shifts;[20] however, the specific diversion factor values in FERRET do not have a strong empirical basis and have not been updated as MBTA fares have changed over time. FERRET’s very low “cash-pass” diversion factors (0.05) predict little product switching even under substantial changes in relative product prices, which likely contributed to overprediction of LinkPass revenue for the MBTA’s 2016 fare change (discussed more in Chapter 7). Additionally, Hong (2006) notes that using a more detailed method like a logit fare product choice model would allow FERRET to better incorporate new fare media and fare products (since it would generate a synthetic baseline of fare product market shares rather than making incremental adjustments to market shares for existing products).

3.4.4 Summary and Typology

The different kinds of prediction procedures and parameter estimation methods reviewed in this section vary in the level of abstraction used to represent transit fare structure and individual behavior. This suggests an analogy to the three levels of traffic simulation models:

• Microscopic models can represent detailed fare policy and individual-level behaviors. This is particularly useful for disaggregate analysis (by fare product, ride type, customer segment, etc.) of a detailed fare policy decision (such as pass multiples, mode or time-based fare differentiation, or transfer policies); however, it has significant data and computational requirements and may be infeasible for system-wide analysis.
• Mesoscopic models represent some elements of fare policy and group-level behaviors (such as behavior for a representative sample of individuals). This is useful for analyzing broad fare policy decisions that still depend on some particularities of fare structure or individual decision making (such as setting pass multiples or comprehensive forecasting with some disaggregation by fare product, ride type, or customer segment). It requires custom processing of individual-level data, but model calculations can be performed on aggregations.
• Macroscopic models only represent very general elements of fare policy and model average behaviors without describing the context of individual fare choices. This is ideal for high-level or long-term analyses of system-wide demand, especially at agencies with very simple fare structures. It requires only aggregate information on ridership and sales.

Each level of modeling includes a combination of estimation/calibration methods and forecasting/simulation/prediction methods. The lines between the levels are, naturally, blurry. For example, mode choice models typically have very abstract representations of transit fare policy, but they represent individual-level choice situations.

These different levels of modeling are complementary. Each has different direct policy applications on its own, or they can be used in combination; more detailed “microscopic” or “mesoscopic” models can be used to estimate parameters or develop insights and intuitions that can be applied to simpler and more abstract prediction models. Elasticities could be estimated using a discrete-continuous model that controls for the effect of fare product choice, but the resulting estimates could be applied in a simple multiplicative spreadsheet model. Likewise, fare product choice utility parameters can be estimated using data on individual customers, but the resulting choice formula could be applied to a group-level choice model using “representative individuals” for different customer segments. As another example, the clear importance of distinguishing marginal and average fares in individual-level product choice modeling could inform specification of an aggregate time series regression model; instead of using only average fare level, perhaps some sort of interaction with pass market share could better describe the behavioral impact of fares even in an aggregate analysis.

20 As discussed earlier, diversion factors (like elasticities and cross-elasticities) describe sensitivity to price changes at a specific place on the demand curve; they are conditional on specific price levels. As price levels change over time, fare products become more or less competitive with each other, requiring that diversion factors be updated.

Figure 3-6 provides a subjective summary of the different prediction and estimation methods reviewed in this section, plotted according to the detail with which they represent fare policy and individual behavior.

Figure 3-6: Typology of Parameter Estimation and Scenario Modeling Methods


4 A Framework for Incremental Transit Fare Policy Analysis

This chapter proposes a three-step procedural framework bridging the literatures reviewed in Chapter 3 that transit agencies could use to organize ongoing analysis of fare structures and fare levels:

1. Identify current pricing strategies

2. Describe and segment transit use

3. Model demand by fare structure and market segment

The first two steps develop understanding of the current situation -- the fare structure and levels that are in place, the pricing strategies that they imply, and the ways that those strategies are revealed in customer behavior. The third step formalizes the impacts of fares on customer trip making decisions in order to make quantitative predictions about specific changes in fare structure and levels. The chapter ends with an extended modeling example that develops basic intuition for the applications in Chapters 5 through 7.

4.1 Step 1: Identify Current Pricing Strategies

The first step to analyzing incremental changes in fare policy is understanding the pricing strategies that are explicit or implicit in the current fare structure and levels. (As a reminder, "fare structure" in this thesis means all of the rules by which fares and prices are calculated.) This requires a clear picture of the different dimensions used to differentiate fares and prices.

A simple way to begin is to define existing fare “products” by their attributes. This is useful because fare product attributes both reflect and determine fare structure; for example, if an agency’s fare structure is not based on distance, then no fare products will have distance as an attribute. As an illustration, the MBTA offers over 150 distinct fare products; however, for practical purposes these products are defined or distinguished by only a handful of different attributes (identified in Chapter 2): user type, service validity, tariff, and medium. A full-price MBTA Monthly LinkPass could be defined as a monthly pass (tariff), valid on Rapid Transit and Local Bus (validity), available to any customer (user type) on either a CharlieCard or CharlieTicket (medium).
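One way to make this attribute-based definition operational is to store each fare product as a record with one field per attribute. The Python sketch below is only an illustration: the class, field, and helper names are hypothetical, and the single example encodes the full-price Monthly LinkPass as described above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FareProduct:
    """A fare product defined by the four attributes identified in Chapter 2."""
    name: str
    user_type: str
    service_validity: tuple
    tariff: str
    media: tuple

monthly_linkpass = FareProduct(
    name="Monthly LinkPass",
    user_type="any customer",
    service_validity=("Rapid Transit", "Local Bus"),
    tariff="monthly pass",
    media=("CharlieCard", "CharlieTicket"),
)

def dimension_values(products, attribute):
    """Distinct values taken by one attribute across a list of products."""
    return {getattr(p, attribute) for p in products}

tariffs_in_use = dimension_values([monthly_linkpass], "tariff")
```

Applied to a full product catalog, a helper like `dimension_values` would list the values in use along each dimension of the fare structure, and an attribute that takes a single value for every product is not being used to differentiate products.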

Existing fare "products" may, however, have additional rules about how fares and prices are calculated. For example, the MBTA pay-as-you-go or pay-per-use CharlieCard "product" has different fares for bus rides and rapid transit rides, and rules related to step-up transfer fares. The trip attributes that determine these marginal fare levels – trip mode and transfer type – are additional dimensions of the MBTA’s fare structure (even though they are not used to define a "product" per se).

Once the different dimensions of a fare structure are identified, the pricing strategies associated with each dimension can be identified. A range of different pricing strategies was described in Chapter 3. Continuing the example of the MBTA, user type, tariff, service validity, transfer type, and medium dimensions each have associated pricing strategies:


User Type: Discounted MBTA fare products are available to university students, grade school students, youths, seniors, and people with disabilities. Free service is provided to children, the blind, and service officials (such as police). MBTA user types reflect the use of group pricing to charge lower fares to certain groups of riders based either on willingness-to-pay (e.g. parents with young children or college students) or on equity objectives (such as people with disabilities).

Service Validity: MBTA fare products are valid on different combinations of local bus, express bus, rapid transit, commuter rail (by zone), and ferry services (by route). Many (though not all) MBTA fare products are valid on all lower-priced services. Because of this practice, products are generally named or “branded” by the highest-value service (which can serve as a proxy for overall service validity).

The differentiation of fare products based on service validity reflects a combination of cost-based pricing (modes with higher fares generally have higher operational costs) and product differentiation. Fare products can be differentiated horizontally to the degree that different services are relevant to different customers based on their geography; this targeting could potentially serve revenue objectives (if service geography is correlated with willingness-to-pay, such as with suburban commuter rail) or equity objectives (if service geography is correlated with target populations, such as low-income urban residents). Fare products are also differentiated vertically wherever service is redundant, for example on corridors with both commuter rail and heavy rail service. If fare differentials between redundant services are greater than the difference in operating costs, this vertical differentiation can increase revenue by offering a “premium” ride at an increased fare to customers with a higher willingness-to-pay.

Tariff: MBTA fare products include “pay-as-you-go” (PAYG) products (like cash and stored-value smartcards), limited-use products (such as round-trip or 10-ride commuter rail tickets), and unlimited-use pass products (1-day, 7-day, and calendar month). These different tariffs reflect a combination of both group pricing based on frequency of travel and self-selection based on the inconvenience of pre-payment (or, conversely, the convenience of infrequent payment). The unlimited-use products offered by the MBTA provide a per-ride discount for regular commuters or other frequent travelers who use transit, which builds loyalty with a core group of customers and encourages use of transit for non-work travel as well. The option of either monthly or 7-day passes for the same customers reflects the tradeoff between group pricing and self-selection; 7-day passes attract some less-regular travelers (such as tourists) who might not respond to a loyalty discount, but they also capture additional regular commuters who find monthly pre-payment to be too burdensome.

Medium: The MBTA offers fare products on several different media – cash, paper magnetic stripe tickets (CharlieTickets), other paper tickets (commuter rail and ferry), smartcards (CharlieCards), and mobile devices (mTicket). Cash fares are generally higher than ticket fares, and ticket fares higher than smartcard fares. This difference across media primarily reflects cost-based pricing; the operational cost to the MBTA is higher for cash and ticket use than for smartcard use.

Transfer type: MBTA pass-holders are able to transfer for free between any services covered by their passes. For transfer trips on pay-as-you-go fare products, the MBTA effectively offers free transfers across local bus and rapid transit services. In practice, riders are charged the highest fare for any single leg or “tap” on their trip; for example, a rail-to-bus transfer is free (since the rail ride had a higher fare than the bus ride), but there is a “step-up” fare for a bus-to-rail transfer equal to the difference between the bus fare and the rail fare. This free transfer policy is best seen as the default – a choice not to differentiate prices between transfer trips and single-seat trips. At agencies that do not provide free transfers, the price of a transfer could reflect cost-based pricing if transfer trips are longer distances, and it could reflect vertical differentiation if transfer trips provide a higher quality service than trips without transfers (for example, if a bus transfer is used to replace a first- or last-mile walk and provide nearly door-to-door service). However, vertical differentiation often favors inexpensive or free transfers, since transfer trips are clearly lower quality than a one-seat trip with the same origin and destination; transfers are often made because direct transit service is not available, and they have more circuitous routes and unpredictable journey times than one-seat trips.
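The step-up rule described above amounts to charging, at each tap, only the amount needed to bring the running total up to the highest single-leg fare so far. The Python sketch below illustrates that logic; the function name and fare values are hypothetical, and real transfer rules (time windows, limits on the number of transfers) are not modeled.

```python
def tap_charges(leg_fares):
    """Charge at each tap so the trip total equals the highest single-leg fare."""
    charges = []
    paid_so_far = 0.0
    for fare in leg_fares:
        step_up = max(0.0, fare - paid_so_far)  # never refund a higher earlier fare
        charges.append(step_up)
        paid_so_far += step_up
    return charges

# Hypothetical fares for illustration only.
BUS_FARE = 2.00
RAIL_FARE = 2.50

bus_to_rail = tap_charges([BUS_FARE, RAIL_FARE])   # [2.0, 0.5]
rail_to_bus = tap_charges([RAIL_FARE, BUS_FARE])   # [2.5, 0.0]
```

Both trips cost the rail fare in total, so under this rule the order of the legs affects only when the charge is collected, not how much is paid.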

Beyond fare structure, it is also critically important to identify strategic implications of fare product distribution. Some sale channels have little practical distinction from a customer perspective (such as a ticket window versus a ticket vending machine at ). Others, however, are associated with meaningful differences in access and prices. As a notable example, the MBTA Corporate Program allows employees at participating companies to purchase a number of pass products pre-tax through payroll deductions, resulting in a lower effective price. These discounts are federal tax expenditures, not prices set by MBTA; however, the MBTA can take strategic advantage of the difference in effective prices. The Corporate Program is functionally group pricing that provides discounts and convenience to employed people who frequently use transit, attracting new riders and building loyalty within this group of customers. Whether this is considered a separate pass product or merely a different distribution channel, it plays a major role in fare policy at the MBTA. Sale channels are the focus of Chapter 5.

4.2 Step 2: Describe and Segment Transit Use

The pricing strategies identified in an agency’s fare structure suggest hypotheses or expectations about how customers will choose and use fare products; for example, customers might be expected to self-select into pass products when their transit use is above the pass “multiple.” The second step uses summaries to explore these supposed connections between pricing strategy and actual behavior. While any data on customer behavior could be used, AFC data is a natural starting point since it is free (once in place) and has good coverage.

Just as the dimensions of fare structure point to pricing strategies, these dimensions can be used to organize AFC data and to explore connections between pricing strategies and behavior. Using the example above, customers’ frequency of use can be described across different tariffs to see if frequent users have self-selected into pass products. This kind of “business intelligence” analysis was discussed in Chapter 3, and it is used throughout this thesis. One side benefit of summarizing transit ridership and revenue by the different dimensions of fare structure is identification of gaps in automated data. If it is difficult to describe ridership at the level of fare rules, then it will be difficult to predict the impacts of changing those fare rules. Similarly, if ridership cannot be described along dimensions of proposed or potential fare structures, then it will be difficult to make informed decisions about those fare structures. For example, if a transit agency is considering introducing distance-based fares but cannot easily describe its current ridership by trip distance, its ability to predict the impacts of that change in fare structure will be limited.
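One way to check whether frequent users have self-selected into passes is to summarize weekly trip frequency by tariff and compare it with the pass multiple. The sketch below is a hypothetical illustration: the record layout, the tariff labels, the sample values, and the pass multiple of 9 are all assumptions.

```python
from statistics import median


def summarize_by_tariff(customer_weeks, pass_multiple):
    """Summarize weekly trips by tariff.

    customer_weeks: iterable of (tariff, weekly_trips) pairs.
    """
    groups = {}
    for tariff, trips in customer_weeks:
        groups.setdefault(tariff, []).append(trips)

    summary = {}
    for tariff, trips in groups.items():
        at_or_above = sum(1 for t in trips if t >= pass_multiple)
        summary[tariff] = {
            "customer_weeks": len(trips),
            "median_trips": median(trips),
            "share_at_or_above_multiple": at_or_above / len(trips),
        }
    return summary


# Hypothetical customer-weeks, not real AFC data.
sample = [
    ("pay_per_use", 2), ("pay_per_use", 3), ("pay_per_use", 9), ("pay_per_use", 1),
    ("pass", 12), ("pass", 10), ("pass", 6), ("pass", 14),
]
summary = summarize_by_tariff(sample, pass_multiple=9)
```

If self-selection is working as expected, the share of customer-weeks at or above the multiple should be much higher for pass users than for pay-per-use users.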

While AFC typically contains useful fields describing fare products and trips, extensive cleaning is often required to make use of AFC data for fare policy analysis. Three cleaning steps were particularly useful for the analyses in this thesis; some had already been performed at the CTA and MBTA for other applications, and others were performed in the course of this research. First, fare product definition tables were created to describe each fare product using a set of standardized fare structure dimensions (as discussed earlier). Joining these tables to AFC-encoded fare product names or numbers facilitated summaries along those dimensions. Second, AFC validations for each card or ticket or account were processed to identify linked trips — one or more validations that were part of the same journey. Transit user purchase and travel decisions are driven mostly by attributes of trips, not trip segments, and any transfer pricing policies depend on trip-level information. Third, “use value” was calculated for all fare products. “Use value” is used in this thesis to describe the cost of transit trips under pay-per-use fare rules (regardless of what fare product was actually used for the trips). Use value describes the cost that is actually incurred for pay-per-use trips, and it is useful for describing and comparing use of pass products.
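The second and third cleaning steps can be illustrated with a simplified sketch. It assumes a fixed 120-minute linking window measured from the first tap of a trip and a free-transfer rule that charges the highest leg fare; both rules, the fare values, and the function names are assumptions for illustration and do not describe the linking procedures actually used at the CTA or MBTA.

```python
TRANSFER_WINDOW_MIN = 120                # assumed linking window
PPU_FARE = {"bus": 1.70, "rail": 2.25}   # assumed pay-per-use fares


def link_trips(validations):
    """Group one card's validations into linked trips.

    validations: list of (minutes_since_midnight, mode) tuples.
    A validation joins the current trip if it falls within the transfer
    window of that trip's first tap; otherwise it starts a new trip.
    """
    trips = []
    for minute, mode in sorted(validations):
        if trips and minute - trips[-1][0][0] <= TRANSFER_WINDOW_MIN:
            trips[-1].append((minute, mode))
        else:
            trips.append([(minute, mode)])
    return trips


def use_value(validations):
    """Cost of the card's travel if every linked trip were paid per use."""
    total = 0.0
    for trip in link_trips(validations):
        # Free transfers: each linked trip costs its highest leg fare.
        total += max(PPU_FARE[mode] for _, mode in trip)
    return round(total, 2)


# One hypothetical day on a pass: bus-to-rail in the morning, rail-to-bus home.
taps = [(480, "bus"), (500, "rail"), (1020, "rail"), (1045, "bus")]
pass_use_value = use_value(taps)
```

Summing use value over a pass's validity period and comparing it with the pass price indicates whether the pass-holder saved money relative to pay-per-use.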

Summaries of ridership and revenue across different dimensions of fare structure and other elements of pricing strategy will naturally suggest certain segmentations – meaningful groupings of customers and trips that are likely to respond in different ways to fare policy changes. The fare product attributes (such as user type, tariff, and fare medium) and trip type attributes (such as mode or time of day) are often readily available and simple to work with in AFC data, providing many easy options to define segmentations. Once defined, the fare product choices, purchases, and ridership patterns of different segments can be described and compared to further explore relationships between fare strategy and behavior.

The ease of using fare product attributes or basic trip attributes for segmentation does come with a risk; as discussed in Chapter 3, segmentations are not simply convenient groupings of customers or trips, but rather meaningful groupings that reflect differences in relevant behaviors. Ridership could be segmented by weekday because it is easy to extract from AFC timestamps, but there may be little difference between the fare-related behaviors of customers who travel on a Tuesday versus those who travel on a Wednesday. Any feature describing customer behavior can be used to define a segmentation, and important features may need to be developed specifically for fare policy analysis. For example, the MBTA AFC system does not record pass sales for certain sale channels, including the Corporate Program. In order to segment pass-holders by participation in the Corporate Program in Chapters 5 and 6, historical Corporate pass sales were identified using other sources and joined to AFC data. This turns out to be a very meaningful distinction between pass-holders for fare policy analysis at the MBTA. As discussed in Chapter 3, clustering provides opportunities to succinctly describe transit customer travel patterns in richer ways than simple summary metrics like frequency of ridership. For example, the Appendix presents clusters of MBTA transit cards based on temporal travel features; these clusters could potentially be used to segment customers for fare policy analysis.


4.3 Step 3: Model Demand by Fare Structure and Market Segment

Finally, key fare-related behaviors can be modeled to predict how customers would respond to particular changes in fare policy. Fare models are all abstractions (or as statistician George Box famously put it, “all models are wrong, but some are useful”); fare models cannot represent every detail of an agency’s fare structure and cannot describe every fare-related behavior for every customer, so choices must be made about what to model and how to model it. These modeling choices can be organized around four questions discussed in the sections below:

1. Fare policy scenarios. What are the desired applications of the model (i.e. what range of fare policy scenarios will be explored)?
2. Theory of behavior. What are the key customer behaviors that need to be represented in the model?
3. Prediction procedure mechanics. How will these key behaviors be represented in a prediction procedure?
4. Parameter estimation. How will the behavioral parameters in the model be determined?

4.3.1 Guiding Questions

Fare Policy Scenarios

A single decision maker might begin by articulating important objectives for a fare change, selecting a general fare strategy to meet or improve those objectives, and then specifying a range of scenarios that implement that strategy. Reality is not so simple or linear; as discussed in Chapter 1, different stakeholders and analysts will have different priorities, favor different strategies, and imagine different scenarios. While consensus on objectives is unlikely, it is important to identify the key objectives of different stakeholders – such as system-wide ridership and revenue, equity or impacts on particular customer groups or populations, or simplicity – so that model outputs can include metrics to evaluate performance with respect to those objectives. Similarly, consensus on pricing strategies (cost-based pricing, group or channel pricing, horizontal or vertical differentiation, etc.) is unlikely even with common objectives; however, it is important to limit the scope of potential scenarios where possible to reduce the complexity required in a fare model.

Theory of Behavior

Modeling choices are driven by theories about how people make fare-related decisions. A fare model cannot represent all the complexities of these decisions, but the exploration of connections between fare strategies and transit customer behavior in the earlier steps can be used to decide which behaviors and which elements of fare structure are most important to include in a fare model. For example, the theory of behavior behind a simple elasticity spreadsheet model is that marginal changes in customers’ use of any particular fare product are driven by changes in that product’s price or fare level.

Beyond the factors that drive customer decisions, another aspect of a theory of behavior is customer heterogeneity. Building directly on the earlier steps, a segmentation – a grouping of customers who are likely to respond differently to fare policy changes – can be specified to selectively differentiate the behaviors in a fare model across customers. For example, changes in a product’s price might be the driving force behind customer choices about the product, but the particular sensitivity to price may vary across different groups of customers.
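A minimal sketch of this idea in a spreadsheet-style elasticity calculation follows. The segment names, ridership levels, average fares, and the 10% fare increase are hypothetical; the elasticities of -0.25 and -0.15 echo the agency assumptions cited in the abstract and are used here only for illustration.

```python
def apply_fare_change(segments, pct_change):
    """Scale each segment's ridership linearly with the percent fare change.

    segments: dict of segment name -> {"rides", "fare", "elasticity"}.
    pct_change: fractional fare change (0.10 means a 10% increase).
    """
    results = {}
    for name, seg in segments.items():
        rides = seg["rides"] * (1 + seg["elasticity"] * pct_change)
        fare = seg["fare"] * (1 + pct_change)  # average revenue per ride
        results[name] = {"rides": rides, "revenue": rides * fare}
    return results


# Hypothetical segments with different price sensitivity.
segments = {
    "pay_per_use": {"rides": 1000.0, "fare": 2.00, "elasticity": -0.25},
    "pass": {"rides": 2000.0, "fare": 1.50, "elasticity": -0.15},
}
after = apply_fare_change(segments, 0.10)
```

Because the segments have different elasticities, the same percent fare change produces different ridership losses and revenue gains in each group.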

Prediction Procedure Mechanics

The key behaviors that are identified then have to be translated into a concrete set of calculations. The primary consideration here is the desired level of abstraction – how precisely or accurately or individually to represent the selected behaviors. This choice is constrained in part by available data, which once again builds on previous steps; exploration of the relationship between fare structure and customer behaviors will have revealed gaps or oddities in AFC and other data that will prevent certain calculations or procedures. For example, if transit card turnover is very high, a procedure that relies on long, balanced panel data series in fare product and ridership choices would not be feasible. Another constraint on design of model calculations is the ability to estimate the behavioral parameters that are used in the calculations (as described in the next section); if a parameter cannot be estimated using readily available data and there are no estimates that can be borrowed from prior studies, then different parameters will need to be used. Other considerations include simplicity and flexibility of the calculations. Simplicity and clarity may be particularly important if the model methodology needs to be explained and defended to agency leadership or the public. Flexibility is especially desirable if the scope of model scenarios or available data and parameters are likely to change during present analyses, or if the model will be used for future analyses of unknown scope.

Parameter Estimation

As discussed in Chapter 3, some fare models use integrated methods for both parameter estimation and prediction while other prediction procedures require use of separate methods to estimate parameters. Parameter estimation always requires data with variation in the factors that are assumed to drive fare change impacts, so data availability is once again a significant constraint. AFC data provides some opportunities to estimate parameters using the experience of a single agency; it always contains variation in fare product choices and ridership – across customers at any one point in time, within customers over time, and system-wide over time – and may or may not contain useful variation in fares and prices (depending on the timing of recent fare changes). AFC data can correspondingly be formatted as individual cross-sectional data, panel data, or aggregated time series data to fit the requirements of a particular estimation method. General considerations for using AFC data are discussed at length in Chapter 6.
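The three data formats can be illustrated with a small sketch that reshapes the same weekly records three ways. The record layout, card identifiers, and product labels are hypothetical and chosen only to show the idea.

```python
from collections import defaultdict

# Hypothetical weekly summaries: (card_id, week, fare_product, rides).
records = [
    ("A", 1, "ppu", 4), ("A", 2, "pass_7day", 11),
    ("B", 1, "pass_7day", 12), ("B", 2, "pass_7day", 10),
    ("C", 2, "ppu", 2),
]


def cross_section(records, week):
    """Variation across customers at one point in time."""
    return {card: (product, rides)
            for card, wk, product, rides in records if wk == week}


def panel(records):
    """Variation within each customer over time."""
    by_card = defaultdict(dict)
    for card, wk, product, rides in records:
        by_card[card][wk] = (product, rides)
    return dict(by_card)


def time_series(records):
    """System-wide ridership over time."""
    totals = defaultdict(int)
    for _, wk, _, rides in records:
        totals[wk] += rides
    return dict(sorted(totals.items()))
```

Note that card C appears only in week 2, so the panel is unbalanced; high card turnover of this kind limits the estimation methods that can be applied.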

If parameters cannot be estimated for a particular modeling exercise, existing estimates can sometimes be drawn from prior studies in academia or the transit industry. Opportunities to use existing estimates are much greater when the behavioral parameters used for prediction take standard forms, such as fare elasticities.

4.3.2 A Conceptual Example: Pass Pricing

An example application of these four guiding questions clarifies how they relate. The example is also designed as a simplified version of the fare modeling approach developed for the CTA and MBTA case studies in this thesis; it anticipates the parameter estimation and prediction modeling work in Chapters 6 and 7, and it develops several useful intuitions about pass pricing.


Guiding Questions

First, what are the fare policy scenarios in the example? Suppose that a transit agency has two primary objectives for fare policy – increasing revenue and increasing ridership. The agency currently offers customers a single tariff option – flat, pay-per-use fares that are differentiated by transit mode – but wishes to explore self-selection of frequent transit users as a new pricing strategy. Specifically, they want to model the potential impacts of introducing a weekly pass product without changing pay-per-use fares. The range of possible modeling scenarios is limited by focusing on two simple decisions – whether to offer a pass and where to price it.

Second, what is the theory of behavior? Under the current fare structure, the agency assumes that the impact of changing fares for one of the modes (e.g. bus) simply depends on the relative change in the fare for that mode. However, the theory of behavior used for modeling has to sufficiently describe customer decision making under both current and potential future fare structures – in this case, both pay-per-use and a new weekly pass. Under a fare structure that includes both pay-per-use and pass options, the prediction procedures reviewed in Chapter 3 and the case studies explored in this thesis suggest that there are three key behaviors or customer decisions that drive the impacts of fare changes:

1. Which fare product to choose? Or, from the agency’s perspective, how many customers will switch from pay-per-use to the new pass?
2. What additional rides to take with a pass? For existing customers that do switch to the new pass, how will their ridership change?
3. Whether and how much to ride? For potential transit customers that currently use other modes, how many would purchase the new pass and start using transit? (If pay-per-use fares were changing, this would also include consideration of current customers’ decisions about whether to continue using transit and how frequently to ride.)

This boiled-down theory of fare-related behavior is informed by both existing and proposed pricing strategies. At an agency that did not currently offer a pass option and did not wish to introduce one, the first two behaviors or questions would be irrelevant. Figure 4-1 illustrates how these three key behaviors relate; while it is convenient to describe them sequentially or chronologically, they are interrelated and may all be decided simultaneously.


Figure 4-1: Three Key Fare-Related Customer Behaviors

Third, how can these three behaviors be represented in a set of calculations to predict the impacts of a new pass? Chapters 6 and 7 of this thesis identify three parameters corresponding to these behaviors that can form the basis of a prediction procedure:

1. Fare product choice logit utility parameters, specifically parameters representing sensitivity of product choices to: a) expected weekly cost of transit travel, and b) a customer’s inherent preference for one product over another.
2. Induced ride factors, multiplicative factors describing the relative increase in ridership when a customer switches from riding transit with pay-per-use fares to riding with a zero-marginal-cost pass.
3. Elasticities, additional multiplicative factors that scale ridership on a fare product up or down linearly with the percent change in price for that fare product (capturing both wholesale switching between transit and other modes as well as changes in ridership frequency for continuing transit users).

These parameters can be combined sequentially in calculations that first allocate customers to different fare products (pass or pay-per-use) using a logit fare product choice formula and then scale ridership and revenue using induced ride factors and elasticities. This general prediction procedure (presented in Chapter 7) approximates complex, simultaneous customer decisions using calculations that are relatively intuitive, simple to explain, easy to implement in standard spreadsheet software, and flexible to the addition of arbitrary customer segmentation.

Fourth, how can these parameters be estimated? Chapter 6 demonstrates that fare product choice models and induced ride factors can be estimated or approximated from cross-sectional AFC data at agencies offering both passes and pay-per-use fares; however, in this contrived situation that would not be possible because there is no history of observed customer choices between passes and pay-per-use. Customer surveys would need to be conducted to estimate these parameters, or they would need to be drawn from prior studies. Similarly, pass and pay-per-use price elasticities could be estimated using AFC panel data (as in Chapter 6) or aggregate AFC time series data, or they could be drawn from prior studies (as in Chapter 7).

Note that parameter estimation in this conceptual example is difficult because the example introduces an entirely new fare product. This thesis primarily focuses on analysis of incremental fare policy changes, rather than major changes like new products; however, this example of offering a new pass product was selected for its usefulness in building intuition about pass pricing (in the next section).

Numerical Results and Implications A simple model can be used to examine the shape of ridership and revenue impacts for introducing a pass at different prices. This model logic relates closely to the CTA fare change scenario model presented in Chapter 7 as well as some of the theory on pass pricing developed in Carbajo (1988); however, it is highly simplified in order to illustrate some basic lessons.

The assumed distribution of ridership – the frequency of customer-weeks at each level of transit ridership – is adapted from Ventra data for major, account-based fare products at the CTA and shown in Figure 4-2. The ridership distribution follows the expected bimodal pattern; many active customers take a small number of trips, and other transit users follow a distribution centered on “commuter” travel frequencies. For convenience, the pay-per-use fare is assumed to be $1 per trip and is never changed in this example analysis; weekly ridership is the same as the weekly cost of using pay-per-use.

Figure 4-2: Assumed Weekly Trip Frequency Distribution for Pass Pricing Model Example

Source: Based on CTA Ventra data for full-fare, account-based pay-per-use, 3-day, 7-day, and 30-day pass. Notes: This distribution of ridership frequency is merely illustrative. It is based on data that included use of pass products, without any adjustment for induced ridership on passes. The underlying data also excluded single-use tickets and cash for convenience. Including these products would create a large spike at low trip frequencies, which would make passes look more favorable; however, it may be appropriate to exclude these products since customers using single-ride tickets and cash might not be in the market for pass products.


There are four behavioral parameters in the simple model. Two describe the choice between fare products using a logit formula, where the systematic utilities are:

\[ V_{PPU} = \beta \cdot (\text{weekly use value}) \]

\[ V_{Pass} = \alpha + \beta \cdot (\text{weekly pass price}) \]

The parameter β describes sensitivity to weekly costs, and α describes inherent preference for passes relative to pay-per-use. An elasticity parameter, ε, captures the marginal impact of changes in transit costs on mode choice (switching between transit and, e.g., driving). Finally, an induced ride factor, f, describes the additional rides taken when a customer uses a pass instead of pay-per-use. Some arbitrary (but perhaps not unreasonable) initial parameter assumptions are shown in Table 4-1, and sensitivities are examined later.

Table 4-1: Initial Parameter Assumptions for Pass Pricing Model Example

Model Parameter                                     Initial Assumption
Product choice model weekly cost coefficient (β)    -0.3
Product choice model pass constant (α)              0
Price elasticity (ε)                                -0.5
Induced ride factor (f)                             1.1

When a pass is introduced at a certain price P, some existing customers will shift from pay-per-use to the pass. The probability of switching depends on the customer’s transit use and is calculated for customers with a given level of transit use using the logit choice formula:

\[ P(\text{pass}) = \frac{e^{V_{Pass}}}{e^{V_{PPU}} + e^{V_{Pass}}}. \]
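As a worked illustration of the logit formula, assume the cost term in the pay-per-use utility is the customer's weekly use value u and the cost term in the pass utility is the weekly pass price P (the symbols u and P are introduced here for convenience). Dividing the numerator and denominator by the pass term gives

\[ P(\text{pass}) = \frac{1}{1 + e^{V_{PPU} - V_{Pass}}} = \frac{1}{1 + e^{-\alpha - \beta (P - u)}}. \]

Under the initial assumptions in Table 4-1 (α = 0, β = -0.3), a customer whose use value equals the pass price (u = P) is indifferent, with P(pass) = 0.5. A customer whose use value is $5 above the pass price has P(pass) = 1 / (1 + e^{-1.5}) ≈ 0.82, while a customer whose use value is $5 below the pass price still “buys up” with P(pass) = 1 / (1 + e^{1.5}) ≈ 0.18.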

Those customers that do switch to the pass ride more frequently (scaled up by the induced ride factor, f) because they are now riding with zero marginal cost. Finally, the new pass attracts some users from other modes (such as driving alone). This choice depends on the travel frequency distribution and preferences of non-transit users. It is approximated by applying the elasticity (ε) to the number of new pass purchasers based on the average percent reduction in cost for those pass purchasers (ignoring those who chose to pay more for a pass). Since the travel frequency distribution of these mode-switchers is unknown, their ridership is assumed to follow the same distribution as the existing pay-per-use customers who switched to a pass.

On one extreme, if the price of a new pass is too high then no existing customers will choose to switch from pay-per-use, and no drivers will choose to switch to transit because of the pass. As the price is lowered, however, pass adoption begins to increase. Consider the incremental impacts on ridership and revenue when the potential price of a new pass is lowered from P to P-1, which are all determined by the distribution of ridership frequency and the behavioral parameters shown above in Figure 4-2 and Table 4-1:


1. Existing pass purchasers (i.e. those who would purchase a pass at price P, including former users of other modes) continue purchasing a pass but pay less, lowering revenue but not affecting ridership.
2. Some existing pay-per-use customers with use value below the new pass price (P-1) will “buy up,” switching to the pass even though it will cost them more than pay-per-use. This choice could be a result of error in the customers’ predictions about future ridership frequency, or it could be driven by non-cost considerations such as the convenience of a pass. These shifts increase revenue and ridership (relative to these customers’ prior spending and ridership on pay-per-use).
3. Other existing pay-per-use customers with use value above the new pass price (P-1) will also switch to the pass to realize new cost savings. These shifts lower revenue but increase ridership (relative to these customers’ prior transit activity).
4. Finally, some additional users of other modes will switch to transit because there is a cheaper pass option. These shifts are entirely new ridership and revenue.

With this logic in place, Figure 4-3 shows the percent change in total ridership and revenue from introducing a new pass at different weekly pass “multiples” (the number of trips needed to break even on a pass). Since the pay-per-use fare is fixed at $1 per trip, the pass “multiple” is also the weekly pass price. As expected, ridership grows at an increasing rate as the pass price is lowered. Revenue increases initially as the pass price is lowered, but eventually the incremental revenue loss (from existing and new high-frequency pass purchasers) is greater than the incremental revenue gains (from pay-per-use customers who “buy up” into the pass and customers who switch from other modes).
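The model logic can be sketched in a few lines of code. This is one possible reading of the description above, not the calculation behind Figure 4-3: the ridership distribution is hypothetical (it is not the Ventra-based distribution in Figure 4-2), the pass utility is assumed to use the weekly pass price as its cost term, and the mode-switch step applies the elasticity to the summed percent cost reduction of switchers who save money. Results from this sketch should therefore not be expected to reproduce the figures reported in this section.

```python
import math


def pass_scenario(dist, price, alpha=0.0, beta=-0.3, eps=-0.5, f=1.1, fare=1.0):
    """Percent change in (ridership, revenue) from adding a weekly pass.

    dist maps weekly pay-per-use trips to the number of customer-weeks.
    """
    base_rides = sum(u * n for u, n in dist.items())
    base_revenue = fare * base_rides

    switchers = 0.0       # customers moving from pay-per-use to the pass
    switcher_rides = 0.0  # their rides after the induced ride factor
    stay_rides = 0.0      # rides by customers who remain on pay-per-use
    cost_response = 0.0   # summed percent cost change for switchers who save
    for u, n in dist.items():
        use_value = fare * u
        # Logit choice: P(pass) = 1 / (1 + exp(V_PPU - V_Pass))
        p_pass = 1.0 / (1.0 + math.exp(beta * use_value - (alpha + beta * price)))
        switchers += n * p_pass
        switcher_rides += n * p_pass * u * f
        stay_rides += n * (1.0 - p_pass) * u
        if use_value > price:  # ignore customers who pay more for the pass
            cost_response += n * p_pass * (price - use_value) / use_value

    # Riders attracted from other modes, assumed to ride like the average switcher.
    new_riders = eps * cost_response
    rides_per_pass = switcher_rides / switchers if switchers > 0 else 0.0
    rides = stay_rides + switcher_rides + new_riders * rides_per_pass
    revenue = fare * stay_rides + price * (switchers + new_riders)
    return 100 * (rides / base_rides - 1), 100 * (revenue / base_revenue - 1)


def revenue_maximizing_price(dist, prices, **params):
    """Grid search for the pass price with the largest revenue change."""
    return max(prices, key=lambda p: pass_scenario(dist, p, **params)[1])


# Hypothetical bimodal distribution of weekly trips (customer-weeks).
DIST = {1: 300, 2: 250, 4: 150, 6: 100, 8: 120, 10: 200, 12: 150, 14: 80, 18: 30}
grid = [m / 2 for m in range(8, 61)]  # candidate prices from $4.00 to $30.00
best = revenue_maximizing_price(DIST, grid)
```

Because the pay-per-use fare defaults to $1 per trip, each candidate price is also the pass multiple. Calling pass_scenario over the whole grid would trace out curves of the same kind as those plotted in Figure 4-3.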

Figure 4-3: Revenue and Ridership Changes Relative to Pay-Per-Use-Only Baseline for Alternative Pass “Multiples” (Under Initial Parameter Assumptions)

There are a few basic lessons for pass pricing in this one example:


• Use of passes can potentially increase revenue and ridership. The transit agency experience summarized in McCollum and Pratt (2004) suggested that “When an unlimited ride pass is introduced for the first time and without an overall fare increase, revenue loss relative to not having the pass almost always occurs.” However, this is clearly not a theoretical inevitability, and it may have resulted from agencies prioritizing ridership gains and setting low initial pass multiples. Antos and Eichler (2016) reached a similar conclusion in their study of potential new passes at WMATA.
• There is a revenue-maximizing pass price that should never be exceeded. The revenue-maximizing weekly pass multiple under the initial parameter assumptions is 13.6 with a resulting revenue increase of 7.5% and ridership increase of 3.3%. Ridership increases monotonically as pass prices decrease, so it would never make sense to price a pass any higher than this revenue-maximizing point.
• Considering ridership, pass prices should be set below the revenue-maximizing price. Revenue is relatively flat around the revenue-maximizing price, so there is an exceptional ridership “return” or “bang for the buck” from forgoing a modest amount of revenue and pricing a pass lower than this point. Accepting a revenue gain 2.5 percentage points lower (a 5% overall revenue gain) at a pass multiple of 10.3 nearly doubles the ridership increase to +6.3%, and a revenue-neutral pass product at a weekly pass multiple of 8.3 would increase ridership by 9.4%.

The above results are for a single set of parameter assumptions. What if those parameters are different? Figure 4-4 shows the revenue-maximizing weekly pass multiple (solid black lines) and the resulting percent change in revenue (blue lines) as the parameters are altered. As explained above, the “optimal” pass price accounting for ridership objectives is likely below the revenue-maximizing one; the dotted black lines show the pass multiple that would achieve 5 percentage points lower gain in revenue (relative to the ), and the shows the resulting boost in ridership from accepting the lower revenue level. A few patterns are worth noting:

• In general, the revenue-maximizing pass multiple is insensitive to a wide range of plausible parameter values; it is primarily determined by the distribution of ridership frequency.
• Maximum potential revenue, however, varies widely with product choice parameters (α and β).
• The ridership gain from accepting less-than-maximal revenue is higher when elasticities and induced ride factors are higher; lower pass prices are more advantageous in a more competitive environment.


Figure 4-4: Sensitivity of Ridership, Revenue, and “Optimal” Pass Multiples to Parameter Assumptions

Note: In each sensitivity graph, all other parameters are held constant at initial assumed values.

While there are good reasons to estimate these parameters, there are also strategies by which transit agencies can alter the parameters. An agency looking to increase revenue could either raise pass prices toward the revenue-maximizing level (which decreases ridership) or could influence behavior in other ways (potentially increasing ridership at the same time). Cost sensitivity of product choice (β) and mode choice (ε) both affect potential revenue levels, and both are affected by marketing and price salience. Targeted marketing can increase awareness of transit options and potential cost savings, and price salience can be influenced by design of agency sale channels and customer communications (such as saved payment methods and options for automatic purchase renewal). Inherent preferences for passes versus pay-per-use (α) also have a large impact on potential revenue and could similarly be altered by making purchase of passes more convenient relative to pay-per-use or by enhancing the value of a pass in other ways. For example, one promising option is bundling transit passes with discounts on other services like bike share or ride hailing. This would make passes more attractive to all existing pay-per-use customers (a net ridership and revenue win within a range of reasonable pass prices), and perhaps more importantly it would attract new customers from other modes.

The other key input to the simple model is the distribution of ridership frequency. “Optimal” pass multiples (by whatever definition) are determined in large part by this distribution; in the extreme, a transit agency with all infrequent riders would find it advantageous to price a pass much lower than an agency with all frequent riders. While this can be measured more directly by a transit agency using AFC data, it may vary in three ways. First, different agencies may have different ridership frequency distributions depending on the quality, coverage, and cost of transit service; availability of alternatives; and local demographics and culture. McCollum and Pratt (2004) show wide variation in ride frequency across nine U.S. transit agencies. Second, frequency of use is likely to vary across customer segments with different fare-related behaviors. For instance, McCollum and Pratt (2004) also show variation in trip frequency across modes within transit agencies, and transit mode use may be related to product preferences or cost sensitivity. (See the CTA product choice model presented in Chapter 6.) As another example, commute-only transit users may have a frequency distribution concentrated around eight to ten trips per week and may be insensitive to transit costs, while users who depend on transit for both work and non-work trips may have a higher frequency distribution and greater sensitivity to costs. In these cases, it could be important to evaluate each market segment separately rather than aggregating or averaging, and it may suggest potential for multiple pass or multi-ride products targeting different markets. This is a limitation of the simple example shown here, which cannot be applied directly to an agency like the CTA or MBTA offering multiple pass products to different groups of customers. Third, frequency of use may vary over time. For example, if the advent of ride hailing (and, later, autonomous vehicles) causes users to divide their current transit travel between transit and car, the frequency distribution would be lowered; in this case, pass multiples would need to be lowered over time to maintain the same balance between revenue and ridership gains.

This example demonstrates the complexity of making fare policy decisions under a structure with both passes and pay-per-use fare products; there are no simple rules of thumb for pricing passes, even in a simple model. However, it does provide some intuition and highlights opportunities to grow ridership and revenue using both strategic pass pricing and policies that influence different fare-related customer behaviors.

4.4 Conclusions

This chapter proposes a loose three-step process by which agencies can improve analytical capabilities related to changes in fare structure and fare levels. The first step – identifying current pricing strategies – is an easy way for agencies to organize and motivate their efforts, and the process should be accessible to any transit agency that maintains an automated, transaction-based fare collection system. A simple modeling example focused on a widely-relevant fare policy question – how to price pass products relative to pay-per-use – sets the stage for the applications in the following chapters.

The three-step process in this chapter does have two important limitations. It is focused on incremental changes to existing fare products and does not necessarily apply to fundamental changes in fare structure (such as totally new products or fare integration with new services). Relatedly, it is limited by data availability. AFC data might not include important components of existing fare structure or pricing strategy; for example, MBTA AFC data does not include information on pass sale channels or ridership on the commuter rail system. Data may also be unavailable for components of potential future pricing strategy or attributes of possible new fare products. The process proposed here helps to identify these data gaps, but additional data fusion or stated preference customer survey work may be needed.


5 Describing the Role of Pass Sale Channels

5.1 Potential Significance of Pass Sale Channels and Empirical Questions

The sale channel by which passes are sold and distributed could potentially be significant for transit agencies in several different regards. This section discusses three fare policy opportunities presented by pass sale channels – accessibility and equity, economic efficiency and behavioral targeting, and capturing tax benefits – and identifies empirical questions to explore whether these opportunities are being realized at the MBTA and the CTA. The objective is to learn whether pass sale channels provide any new information about customers (that was not already known from fare products). Does the sale channel used by a customer tell us anything about their demographics, likely fare product choices, purchase patterns, or ridership behaviors?

5.1.1 Accessibility and Equity

Perhaps the most direct benefit of offering multiple sale channels is that customers are given options and can purchase passes using whichever channel is most convenient for them. For example, fare vending machines are most commonly located in rail stations and may not be easily accessible to bus-only customers; however, bus-only customers may be able to conveniently purchase passes sold at nearby retail stores. In reality, though, some sale channels are completely inaccessible to some customers (i.e., they are not real options). Mobile apps such as the Ventra app in Chicago and the mTicket app for commuter rail in Boston depend on smartphone ownership and use. Similarly, employer-based pass purchase programs like the Transit Benefit Program (or “Pre-Paid Benefits”) in Chicago and the Corporate Program in Boston are only available to employees at companies that have enrolled in the programs.

The actual use of a variety of sale channels, as shown in Table 2-4 and Table 2-7, already provides evidence that different sale channels reach different customers (as opposed to simply providing multiple options for all customers). To further develop this point, this chapter asks whether the customers that purchase passes through different sale channels exhibit different characteristics. Are different sale channels used by different types of customers in terms of their fare product purchases and their travel behavior? The use of a sale channel by a group of customers indicates that the sale channel is more convenient or less costly to use (in a holistic sense) than alternative sale channels, given current accessibility; however, this could mean either that sale channels are well-tailored to meet the needs and preferences of the customer groups, or that the customer group has poor access to alternative sale channels that would be preferred if they were more accessible.

5.1.2 Economic Efficiency and Behavioral Targeting

Sale channels provide opportunities for price discrimination via "channel pricing" -- charging different prices for products obtained by different channels. It was hypothesized above that different sale channels are used by different groups of customers. If the differences between those groups affect their willingness-to-pay for transit in general or for passes specifically (such as variation in the value they place on convenience), then agencies can capture additional surplus by pricing the same pass (or slight variations) differently across sale channels. By this narrow logic, higher prices should be charged through sale channels reaching customers who have a higher willingness-to-pay (i.e. who are less sensitive to price).

Beyond setting different prices, sale channels offer other opportunities to shape customer behaviors. Agencies might use different sale channels to influence the salience of pass prices (as opposed to reaching customers that already have different sensitivity to price). For example, offering auto-renewal of pass purchases makes the price of subsequent pass purchases less salient, since customers can “set it and forget it.” Sale channels also may play a role in forming habits of transit ridership. For example, a convenient sale channel for college and university students may establish routines of purchase and zero-marginal-cost ridership that continue after students graduate.

These opportunities lead to an empirical question about the MBTA and the CTA: Is there evidence that groups of customers preferring or using different sale channels have different willingness to pay for passes or experience different price salience? In light of any differences in sensitivity to price, are the agencies pricing products differently across sale channels to capture additional surplus?

5.1.3 Capturing Tax Benefits Available Through Employers

In the U.S., one specific sale channel, employers, presents an opportunity for transit agencies to benefit from federal and state tax benefits for commuting expenses. At the federal level, transit purchases up to a fixed amount ($255 for 2017) made through employers can be excluded from employee wages (effectively exempting them from federal income taxes).21 This preferential tax treatment is currently part of U.S. federal tax law on fringe benefits, so employees can only receive this federal tax benefit if pre-tax transit purchases are offered by their employer.22 At the state level, tax treatment of transit expenses varies; for example, Massachusetts offers the same employee fringe benefit (pre-tax purchase of passes through payroll deduction at participating employers), but also allows deduction of eligible commuting expenses (including MBTA passes) over $150 and up to $750 that are not deducted from wages (e.g., if an employer did not offer a transit benefit).23 The details and implications of these tax benefits are somewhat complicated, so they are described here as general background.

There are many benefits of this favorable tax treatment and the purchase of transit fare products via direct payroll deduction, which are described in detail in Kamfonik (2013). If an employee pays for their transit use via payroll deduction, then the employee saves federal (and perhaps state) income and payroll taxes on the value of their transit purchases, and their employer saves its portion of payroll taxes. Once an employer payroll system is in place to facilitate these pre-tax transit benefits, it is also relatively easy for employers to subsidize or fully cover the cost of their employees’ transit use as an employee benefit. This is even better for employees than an equal raise in pay (since the employee is not taxed); employers were

21 https://www.irs.gov/pub/irs-pdf/p15b.pdf 22 Employees working for employers that do not offer pre-tax transit purchases are not eligible for the federal tax benefit. There is currently no option for these employees to purchase transit passes on their own and later deduct the expense on their personal federal tax returns. 23 https://www.mass.gov/service-details/learn-about-the-commuter-deduction

previously allowed to deduct these transit purchases as a business expense, though that deduction is no longer allowed beginning in 2018.24 Regardless of who pays, transit agencies receive full retail price for their fare products, the cost of fare collection may be reduced, and the effective discount received by customers should boost ridership and revenue. Moreover, employees sign up for payroll deductions in advance and typically on an automatically-recurring basis; the convenience of these automatic purchases (or the inconvenience of canceling them) contributes to more consistent employee transit purchases even if their transit ridership varies month to month or declines. Tax benefits and employer subsidies reduce effective transit prices and boost revenue, providing additional funding beyond limited state sources like sales tax revenue (and, in the case of the federal tax benefits, from outside of the state entirely). By increasing use of zero-marginal-cost passes, they also boost transit ridership – the simplest proxy for all of the external, societal benefits of transit (reduced congestion and pollution, economic development, etc.) and therefore the political currency that ensures continued public subsidies for transit.

In the past, this benefit was limited to transit pass products purchased directly from transit agencies; stored transit value for pay-per-use ridership was not tax deductible. This made pass products more attractive relative to pay-per-use fares by effectively lowering pass prices for employees (thereby lowering the pass “multiple” – the number of rides a customer would need to take to “break even” on a pass purchase). Given the many potential strategic advantages of pass products to transit agencies – upfront revenue, lower fare collection costs, loyalty from frequent riders, encouraging habits of zero-marginal-cost ridership, lower sensitivity to fare changes – this favorable tax treatment for passes was a boon for transit agencies. But as described in IRS Revenue Ruling 2014-32, since 2012 these payroll deductions can be applied either to direct purchase of any transit fare products or transit value from a transit agency (such as a pre-loaded transit card or value loaded to a customer account) or to terminal-restricted debit cards that can only be used at transit agency sales points (such as fare vending machines).25 Mechanically, there are two primary ways these purchases take place: 1) an employer participates in a transit agency employer purchase program offering direct sale of fare products or transit value to the employer, or 2) a third-party employee benefits administrator hired by a company (such as WageWorks or Benefit Strategies) either participates in the transit agency purchase program on behalf of the company or provides terminal-restricted debit cards to employees. Regardless of the mechanics, if both stored transit value and pass products are made available pre-tax at equal convenience to employees (such as through terminal-restricted debit cards), then the tax treatment will not favor passes relative to pay-per-use ridership; however, all of the other benefits of the employer sale channel discussed above still apply.
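The effect of pre-tax purchase on the pass multiple can be sketched with simple arithmetic. The prices, the 30% combined marginal tax rate, and the function below are hypothetical assumptions for illustration only; actual tax savings depend on each employee's federal, state, and payroll tax situation.

```python
def pass_multiple(pass_price, single_fare, tax_rate=0.0,
                  pass_pretax=False, fare_pretax=False):
    """Rides needed to break even on a pass, using effective (after-tax) prices.

    tax_rate: assumed combined marginal tax rate on wages, in [0, 1).
    pass_pretax / fare_pretax: whether each product can be bought pre-tax.
    """
    if single_fare <= 0:
        raise ValueError("single_fare must be positive")
    if not 0.0 <= tax_rate < 1.0:
        raise ValueError("tax_rate must be in [0, 1)")
    discount = 1.0 - tax_rate
    effective_pass = pass_price * (discount if pass_pretax else 1.0)
    effective_fare = single_fare * (discount if fare_pretax else 1.0)
    return effective_pass / effective_fare


# Hypothetical prices: a 100.00 monthly pass and a 2.50 pay-per-use fare.
no_benefit = pass_multiple(100.0, 2.50)
pass_only = pass_multiple(100.0, 2.50, tax_rate=0.30, pass_pretax=True)
both = pass_multiple(100.0, 2.50, tax_rate=0.30,
                     pass_pretax=True, fare_pretax=True)
print(no_benefit, pass_only, both)
```

Under these assumptions, the multiple falls only when passes alone are eligible; when stored value is equally eligible, the discount applies to both prices and the multiple returns to its original level.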

As with any subsidy, the beneficiaries of this federal tax expenditure are determined in part by the relative elasticities of supply and demand for passes. This is true regardless of the fact that employees and employers actually receive the subsidy (not the transit agency). It is instructive to illustrate the situation using simple supply and demand curves. In this case, the relevant market is employees purchasing transit passes. Introduction of a pre-tax employer sale channel shifts demand for passes out for two reasons. First, the lower price draws some employees from other commute modes (like cars). Second, if only passes are eligible for pre-tax purchase, then they become more attractive relative to pay-per-use ridership, causing some existing transit customers to switch from pay-per-use to a pass. Short-run supply

24 https://www.nctr.usf.edu/programs/clearinghouse/commutebenefits/ 25 https://www.irs.gov/pub/irs-drop/rr-14-32.pdf

of passes by the transit agency is perfectly elastic at the retail price of a pass. The shift in demand and perfectly elastic supply result in an increase in customer surplus, summarized in Figure 5-1.

Figure 5-1: Employee Market for Passes With Introducing a Pre-Tax Employer-Based Pass Program (Impact on Employees)

Consider, now, the perspective of the transit agency. Under perfect competition (prices set at marginal cost) and perfectly elastic supply, the surplus generated by a subsidy (here, the tax benefit) would accrue entirely to consumers; however, transit pricing is not competitive, so retail pass prices are distinct from agency marginal costs. The marginal cost of an employer-based pass to the transit agency is either a) whatever fares the customer would have paid if a pre-tax employer pass were not available (i.e. the opportunity cost for existing transit customers), or b) the marginal cost of supplying additional transit service, for customers that would have used a non-transit mode in the absence of a pre-tax employer pass. Restricting focus to the opportunity cost for existing transit customers, how does this marginal opportunity cost compare to the price of an employer-based transit pass? There are reasons to suspect that it is below the price. Customers at the “top” of the demand curve (i.e. who value an employer-based pass much more than the pass price) would likely purchase passes even if they were not available through the employer, but the subscription nature of the employer sale channel makes it likely that they will purchase passes even in months with lower ridership. As we move down the demand curve, we pick up customers with lower and lower expected transit use. If only passes are offered pre-tax, then some of these lower-use customers will have switched from riding pay-per-use to purchasing a pre-tax pass; the agency opportunity cost for these customers is lower (oddly, downward-sloping) since they had lower spending before switching to a pass. If both passes and stored value are available pre-tax at the same convenience, then the agency marginal cost curve would remain flat, and the increase in pass sales would be somewhat reduced (since, as discussed above, the shift in demand would be smaller). Figure 5-2

summarizes the surplus captured by the transit agency from selling passes to employees, both before and after introduction of tax benefits. The tax benefit lowers the effective price of a transit pass, making it attractive to additional company employees with lower average transit use; as more employees purchase the pass, the transit agency continues to receive the full retail pass price but bears an even lower marginal opportunity cost.

Figure 5-2: Employee Market for Passes With Introducing a Pre-Tax Employer-Based Pass Program (Impact on Transit Agency)
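The agency-side logic can be sketched numerically for existing transit customers only. The pass price and prior spending figures below are hypothetical assumptions, and the sketch ignores the marginal cost of serving customers drawn from other modes.

```python
def agency_gain_per_customer(pass_price, prior_spending):
    """Agency revenue gain for each existing customer who switches to a pass.

    prior_spending: what each customer would have paid without the pre-tax
    employer pass (the agency's opportunity cost), in the same period.
    """
    return [pass_price - spent for spent in prior_spending]


# Hypothetical monthly figures, ordered from the "top" of the demand curve
# (would have bought a pass anyway) down to lower-use customers who
# previously rode pay-per-use.
PASS_PRICE = 100.0
prior_spending = [100.0, 90.0, 75.0, 60.0]

gains = agency_gain_per_customer(PASS_PRICE, prior_spending)
total_gain = sum(gains)
print(gains, total_gain)
```

In this toy example the gain per customer grows as prior spending falls, which corresponds to the downward-sloping opportunity cost described in the text.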

As described above, there are additional important benefits to transit agencies of the employer sale channel that are not explicit or not reflected in the diagram. First, employees who enroll in automatic pass purchases through their employers are less likely to modify or promptly cancel their “subscription” when their demand changes (such as during a month with a school vacation); the automatic renewal lowers the price salience as discussed above under behavioral targeting. This contributes to the lower average “use value” of passes purchased through employers, which translates to lower marginal costs for the transit agency. Second, while the graphs describe “surplus” in terms of fare revenue, the additional pass use under employer pass programs also boosts ridership; ridership is essential to agencies’ bottom line, since it justifies ongoing public subsidy of transit. Third, while this analysis focuses on employees and the transit agency, recall that employers also benefit from offering transit passes pre-tax, both via payroll tax savings and by competing for employees. This creates an incentive for employers to further subsidize employee transit passes, which enhances the transit agency benefits described above.

There are many interesting questions related to employer sale channels and tax-preferred transit expenses. Only one empirical question is considered here: Does AFC data provide evidence about the magnitude of the transit agency surplus captured as a result of the federal tax incentives and employer pass sales?


5.2 Pass Sale Channels at the MBTA and CTA

The MBTA and CTA both use many different sale channels for passes and other fare products. Table 5-1 lists these sale channels by type.

Table 5-1: Sale Channels at the MBTA and CTA Sale Channel Type MBTA Sale Channel Names CTA Sale Channel Names Ticket Vending Machine (TVM), Vending Machine Fare Vending Machine (FVM) Ventra Vending Machine (VVM) Bus and Green Line Farebox, Onboard Bus Farebox Commuter Rail Conductor Ticket Office Machine (TOM) / Ticket Office Sales Office Terminal (SOT) / Ventra Customer Service Center Ticket Booths / CharlieCard Store Phone Call Center Web Site Online / MyCharlie Web Site Patron Website Mobile App mTicket (commuter rail only) Mobile Ventra Retail Locations Retail Sales Locations Retailers / Retail Network MyCharlie Recurring Pass Automatic Renewal Threshold Autoload Program Bulk Orders Group Orders Program Group Sales Program Pre-Paid Benefits (PPB) / Transit Employers Corporate Pass Program Benefits Program Schools Student Pass Program Student Reduced Fare Semester Pass Program Universities UPass Program Mobility Pass Program Other Distribution Youth Pass Program Distributor Order, RTA Partnerships Sources: MBTA Accounting, CTA Ventra

The focus of this chapter is on pass sale channels; some channels such as fareboxes do not offer pass purchases. This analysis is also limited to full-fare passes, both for convenience and because full-fare products generate a large majority of fare revenue; sale channels dedicated to discounted fare products, such as student and youth pass programs, are not analyzed.

5.3 Prior Work on Pass Sale Channels

The literature related to pass sale channels has focused on employers and the commuter benefits programs that they provide to employees. TCRP reports in 2003, 2005, and 2010 evaluated the impacts of commuter benefits programs and provided guidance to transportation agencies on improving the effectiveness of these employer programs (ICF Consulting 2003, 2005; Kuzmyak, Evans, and Pratt 2010). See Bueno et al. (2017) for a recent review of this literature.

The most pertinent study to the questions in this chapter is Kamfonik (2013), which described and analyzed the Corporate Pass Program. The Corporate Program facilitates pre-tax purchase of MBTA passes via employee payroll deductions at participating employers. Kamfonik found that Corporate passes had lower use and were cancelled less frequently than other passes; 4-6% of Corporate Monthly LinkPasses had no use at all. She estimated that Corporate Monthly LinkPasses alone contributed approximately $4.4 million annually in revenue that would not be collected if the Corporate Pass Program did not exist, and that the program generated $10 million in additional revenue across all pass types (including commuter rail); this calculation is described and repeated later in this chapter. Kamfonik also observed that Corporate passes exhibited a lower sensitivity than other passes to the MBTA’s 2012 fare change. Finally, she conducted a survey of participating employers and explored relationships between employer characteristics and employee participation rates in the Corporate Pass Program. This study provided the foundations for recommended improvements and expansions of the MBTA Corporate Program in Filler (2015) and Dawson (2018).

Another study relevant to MBTA pass sale channels, Rosenfield (2018), explores the impacts of the AccessMIT commuter benefits program at the Massachusetts Institute of Technology (MIT). One element of this program is provision of free transit access to all of its faculty and staff through an MBTA smartcard chip in MIT employee ID cards; MIT is then billed retroactively by the MBTA only for actual use of the cards (based on pay-per-use fares), making it financially feasible for MIT to provide transit access to all employees (even those who ride too little to justify purchase of a monthly pass). This creative extension of zero-marginal-cost transit access and employer subsidies grew out of MIT participation in the MBTA Corporate Pass Program, and it was only realized after years of collaboration between MIT and the MBTA and a successful pilot program at MIT. This experience highlights the need for long-term strategies that strengthen institutional relationships.

Beyond research on employer transit benefit programs (of which employer pass sales are one part), there has been little research on differences in transit user behavior across pass sale channels or on the strategic potential of pass sale channels for transit agencies.

5.4 Using AFC Data to Describe the Role of Pass Sale Channels at the CTA and MBTA

5.4.1 Accessibility and Equity

As discussed above, the empirical question about accessibility and equity is a descriptive one – whether customers using different pass sale channels have different sets of characteristics. AFC data does not provide information on customer demographics or travel on non-transit modes, but it does allow description of cards and accounts in terms of their fare product purchases and their transit travel behaviors.

At both the CTA and MBTA, substantial variation is observed in the pass types sold through each sale channel. Table 5-2 shows the revenue shares for MBTA full-fare monthly and 7-day pass types by sale channels in FY2017. By far the largest two sale channels are the Corporate Program and fare vending machines (FVMs), which account for 51% and 30% of the MBTA’s full-fare pass revenue (respectively). (Recall that many passes purchased at fare vending machines are also purchased using pre-tax debit cards provided by employers; it is not currently possible to distinguish these purchases within the FVM sale channel.) Ticket offices and the mTicket app have much lower total sales but represent a sizeable share of commuter rail and commuter boat pass sales. Online, Retail, and Semester Pass sale channels represent small shares of pass revenue. Looking across pass types, the vast majority of pass revenue comes from monthly LinkPasses (rapid transit) and monthly commuter rail passes, followed by 7-day LinkPasses.

Table 5-2: MBTA Full-Fare Pass Sales by Sale Channel, FY2017 (July 2016 – June 2017)

Column notes: "LinkPass" passes are valid on Bus & Rapid Transit; Commuter Rail and Commuter Boat passes also include Bus & Rapid Transit (RT); Inner and Outer Express refer to Express Bus passes.

What Passes are Purchased at Each Sale Channel? (Pass Sale Channel Composition by Pass Type)

Sale Channel  Monthly   7-Day     Monthly   Monthly   Monthly   Monthly   Monthly   Total     Total
              LinkPass  LinkPass  Local     Inner     Outer     Commuter  Commuter  (%)       ($MM)
                                  Bus       Express   Express   Rail      Boat
Corporate     45%       -         1%        2%        <1%       49%       2%        100%      189.3
FVM           48%       41%       2%        2%        <1%       7%        <1%       100%      112.4
TOM/SOT       6%        2%        <1%       1%        <1%       90%       <1%       100%      23.9
mTicket       -         -         -         -         -         96%       4%        100%      16.4
Retail        34%       37%       10%       2%        <1%       16%       <1%       100%      10.3
Online        53%       -         3%        <1%       <1%       43%       <1%       100%      9.4
Semester      77%       -         2%        <1%       <1%       20%       <1%       100%      7.8
Other         62%       9%        -         -         -         30%       -         100%      0.8
Overall (%)   42%       13%       2%        2%        <1%       39%       1%
Total ($MM)   155.6     49.9      6.3       6.4       1.6       146.0     4.5

Where is Each Pass Type Purchased? (Pass Type Composition by Pass Sale Channel)

Sale Channel  Monthly   7-Day     Monthly   Monthly   Monthly   Monthly   Monthly   Overall   Total
              LinkPass  LinkPass  Local     Inner     Outer     Commuter  Commuter  (%)       ($MM)
                                  Bus       Express   Express   Rail      Boat
Corporate     55%       -         42%       57%       50%       64%       83%       51%       189.3
FVM           35%       92%       34%       34%       38%       5%        <1%       30%       112.4
TOM/SOT       <1%       <1%       1%        4%        7%        15%       2%        6%        23.9
mTicket       -         -         -         -         -         11%       14%       4%        16.4
Retail        2%        8%        17%       3%        4%        1%        <1%       3%        10.3
Online        3%        -         4%        <1%       1%        3%        <1%       3%        9.4
Semester      4%        -         2%        <1%       <1%       1%        <1%       2%        7.8
Other         <1%       <1%       -         -         -         <1%       -         <1%       0.8
Total (%)     100%      100%      100%      100%      100%      100%      100%
Total ($MM)   155.6     49.9      6.3       6.4       1.6       146.0     4.5

Source: MBTA Accounting
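The two panels of Table 5-2 are row and column percentages of the same revenue matrix. A minimal sketch of that calculation follows; the channel names, pass types, and revenue figures are hypothetical placeholders rather than the MBTA values, and the function names are assumptions introduced for illustration.

```python
def row_shares(revenue):
    """For each sale channel, the share of its revenue from each pass type."""
    shares = {}
    for channel, by_pass in revenue.items():
        total = sum(by_pass.values())
        shares[channel] = {
            p: (v / total if total else 0.0) for p, v in by_pass.items()
        }
    return shares


def column_shares(revenue):
    """For each pass type, the share of its revenue from each sale channel."""
    totals = {}
    for by_pass in revenue.values():
        for p, v in by_pass.items():
            totals[p] = totals.get(p, 0.0) + v
    return {
        channel: {
            p: (v / totals[p] if totals[p] else 0.0) for p, v in by_pass.items()
        }
        for channel, by_pass in revenue.items()
    }


# Hypothetical revenue ($MM) by sale channel and pass type.
revenue = {
    "Employer": {"Monthly": 60.0, "7-Day": 0.0},
    "Vending": {"Monthly": 30.0, "7-Day": 30.0},
    "Retail": {"Monthly": 10.0, "7-Day": 10.0},
}

by_channel = row_shares(revenue)     # "What passes are purchased at each channel?"
by_pass = column_shares(revenue)     # "Where is each pass type purchased?"
print(by_channel["Vending"], by_pass["Vending"])
```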

For the most part, Rapid Transit and bus passes are distributed similarly across the different sale channels; however, a lower share of Local Bus passes is sold through the Corporate Program, and a larger share is sold in the Retail Network. This could indicate that a smaller share of Local Bus pass-holders is employed, that fewer sign up for their employers’ pre-tax transit benefits, or that their employers are less likely to offer pre-tax transit passes.26 Relative to rapid transit and local bus, a higher share of commuter rail and commuter boat passes are sold through the Corporate Program, ticket offices, and the mTicket

26 Regardless of the reason, it seems likely that lower Corporate Program enrollment causes some Local Bus pass-holders to use FVMs and some to use the Retail Network. FVMs are primarily located at rapid transit stations or major bus terminals, so customers using only Local Bus would otherwise be expected to have lower FVM use.

app (which only offers those products). This could have the reverse explanation from Local Bus pass-holders – a larger share are employees at companies offering pre-tax passes – but it is also likely a function of visual validation on these modes and general customer preference for digital tickets or CharlieCards over CharlieTickets.27 Seven-day passes are not available through the Corporate Program or the MBTA web site, and they are sold mostly at FVMs but also disproportionately through the Retail Network (suggesting higher bus-only ridership).

Taken together, sale channels are clearly correlated with fare product choices (and the different customer characteristics implied by fare products). A random customer in the Corporate Program or at a ticket window is more likely to use commuter rail or commuter boat. FVM customers are more likely to purchase 7-day passes, and Retail Network customers are more likely to purchase Local Bus or 7-day passes (all relative to overall shares of revenue for each pass type).

AFC data also allows examination of whether fare products are used differently across sale channels. As a demonstration, the figures below show how AFC use frequency, time of use, and transit mode for the monthly LinkPass vary by the sale channel that was used for purchase. (Note that it was not possible to distinguish Online sales from other sale channels for this analysis, so they are included in the Other category.) Figure 5-3 shows significant variation in the median frequency with which LinkPasses are used across sale channels. Corporate and Other (mostly Online) passes are used least frequently, and passes sold on the Retail Network are used most frequently. This variation is the focus of the next section on economic efficiency and behavioral targeting.

Figure 5-3: MBTA Monthly LinkPass Frequency of Use (Taps), October 2016

Sources: MBTA AFC, MBTA pass sale program administrative data Notes: Only includes LinkPasses that were used in Oct. 2016 (excludes LinkPasses with zero use). Outliers are not shown.
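A comparison of this kind reduces to grouping passes by sale channel and summarizing monthly taps. The sketch below assumes a simplified input of (sale channel, taps in month) pairs with hypothetical values; it is not the actual MBTA data structure or processing code. It mirrors the note above by excluding passes with zero use.

```python
from collections import defaultdict
from statistics import median


def median_taps_by_channel(pass_records):
    """Median monthly taps per pass for each sale channel.

    pass_records: iterable of (sale_channel, taps_in_month) pairs.
    Passes with zero taps are excluded, as in the figure notes.
    """
    taps = defaultdict(list)
    for channel, n_taps in pass_records:
        if n_taps > 0:
            taps[channel].append(n_taps)
    return {channel: median(values) for channel, values in taps.items()}


# Hypothetical records for illustration only.
records = [
    ("Corporate", 0), ("Corporate", 38), ("Corporate", 42),
    ("Retail", 60), ("Retail", 70), ("Retail", 95),
]

medians = median_taps_by_channel(records)
print(medians)
```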

LinkPasses also vary in the times of day that they are used. Figure 5-4 shows that passes sold through the Corporate Program are the most peaked in their use, followed by passes sold in the FVM and

27 Commuter rail and boat passes are only available as CharlieTickets at FVMs, since a pass on a regular CharlieCard could not be validated visually by a conductor. They are offered as digital tickets on the mTicket app and CharlieCards (printed with the current month) through the Corporate Program.


Other/Online sale channels. Semester and Retail Network passes are least peaked and have higher evening and late night ridership.

Figure 5-4: MBTA Monthly LinkPass Time of Use, October 2016

Sources: MBTA AFC, MBTA pass sale program administrative data Notes: All days of the week are combined

Finally, Figure 5-5 shows variation in the percentage of LinkPass taps that are made at a gate or a farebox, as proxies for rail and bus ridership. Corporate, FVM, TOM/SOT, and Other/Online LinkPasses are used more at gates, while Retail Network and Semester passes have a higher share of use on fareboxes (which includes the surface Green Line and Mattapan Trolley light rail).

Figure 5-5: MBTA Monthly LinkPass Transit Mode Shares, October 2016

Sources: MBTA AFC, MBTA pass sale program administrative data

Some of the differences across sale channels at the MBTA are also observed at the CTA. Table 5-3 summarizes the revenue share by sale channel for CTA’s full-fare pass types in 2017. The CTA has only two major pass types that account for over 90% of full-fare pass revenue -- a 30-Day Pass valid on CTA rail, CTA bus, and Pace bus (the suburban bus operator), and a 7-day pass valid on the CTA rail and bus. These two passes have different primary sale channels; about 70% of 30-day passes are sold through the Pre-Paid Benefits Program and online channels (mobile app, auto load, and web site), while about


70% of 7-day passes are sold at Ticket Vending Machines and in retail stores. This indicates that the CTA's sale channels reach different groups of customers. It also suggests that 7-day passes and 30-day passes represent different markets for the CTA rather than merely alternatives for each customer.

Table 5-3: CTA Full-Fare Pass Sales by Sale Channel, 2017

What Passes are Purchased at Each Sale Channel? (Pass Sale Channel Composition by Pass Type)

Sale Channel        CTA/Pace     30-Day Metra  CTA          CTA/Pace     CTA          Total     Total
                    30-Day Pass  Link-Up Pass  7-Day Pass   7-Day Pass   3-Day Pass   (%)       ($MM)
TVM                 45%          -             41%          7%           6%           100%      39.7
PPB                 100%         -             -            -            -            100%      33.6
Retailers           30%          -             58%          11%          1%           100%      31.9
Mobile Ventra       62%          2%            30%          5%           1%           100%      20.8
Threshold Autoload  97%          -             3%           <1%          <1%          100%      17.8
Patron Website      67%          <1%           27%          5%           2%           100%      9.8
Distributor Order   -            34%           65%          <1%          -            100%      9.3
Other               50%          <1%           33%          15%          2%           100%      1.9
Overall (%)         60%          2%            31%          5%           2%
Total ($MM)         98.8         3.6           50.9         8.1          3.5

Where is Each Pass Type Purchased? (Pass Type Composition by Pass Sale Channel)

Sale Channel        CTA/Pace     30-Day Metra  CTA          CTA/Pace     CTA          Overall   Total
                    30-Day Pass  Link-Up Pass  7-Day Pass   7-Day Pass   3-Day Pass   (%)       ($MM)
TVM                 18%          -             32%          35%          72%          24%       39.7
PPB                 34%          -             -            -            -            20%       33.6
Retailers           10%          -             36%          41%          13%          19%       31.9
Mobile Ventra       13%          11%           12%          12%          8%           13%       20.8
Threshold Autoload  17%          -             1%           <1%          <1%          11%       17.8
Patron Website      7%           <1%           5%           6%           5%           6%        9.8
Distributor Order   -            89%           12%          <1%          -            6%        9.3
Other               <1%          <1%           1%           4%           1%           1%        1.0
Total (%)           100%         100%          100%         100%         100%
Total ($MM)         98.8         3.6           50.9         8.1          3.5

Source: CTA Ventra

As at the MBTA, the use of CTA passes sold through different sale channels provides further evidence that sale channels capture behaviorally distinct groups of customers. Figure 5-6 and Figure 5-7 show that passes sold pre-tax through payroll deductions ("Pre-Paid Benefits"/PPB) and passes sold online (via mobile app, auto load, and web site) are used less frequently and more at peak times than passes sold at fare/ticket vending machines (FVM/TVM) and on the retail network. As at the MBTA, Figure 5-8 shows that passes sold in the retail network have the highest share of rides on bus. These patterns hold for both 7-day passes and 30-day passes.

Incidental to the main focus of this chapter, comparing the two pass types to each other also highlights some interesting distinctions in use. Seven-day passes have much higher tap frequency than 30-day passes on a weekly basis (partly a function of taking more transfer trips, which have multiple taps). Seven-day passes also have a less peaked time distribution, and they rely more on bus service than 30-day passes.

Figure 5-6: CTA Pass Frequency of Use (Taps), October 2017

Source: CTA Ventra Notes: Horizontal axes on the two graphs aligned such that 30-day pass taps equal 30/7 times 7-day pass taps (for comparability across pass types).


Figure 5-7: CTA Pass Time of Use, October 2017

Source: CTA Ventra Notes: All days of the week are combined

Figure 5-8: CTA Pass Transit Mode Shares, October 2017

Source: CTA Ventra

5.4.2 Economic Efficiency and Behavioral Targeting

The previous section shows that sale channels capture behaviorally different groups of customers, based on behaviors that can be observed directly using AFC data. In the context of pass pricing, the most important behavioral differences are willingness to pay (WTP) for passes and price salience, which cannot be observed directly. However, two observations suggest that WTP and price salience do vary systematically across sale channels: variation in “use value” distributions (related to frequency of use) and different observed responses to fare changes.

Use Value Distributions

Frequency of transit ridership was seen earlier to vary systematically across sale channels within the same pass type. Frequency of use is closely connected to WTP for passes. As a customer’s expected frequency of transit use increases, the cost of taking their expected trips using pay-per-use increases; the cost of taking a set of trips using pay-per-use fares is referred to as the “use value” of those trips. As a customer’s weekly or monthly use value increases, a fixed-price pass becomes more attractive relative to pay-per-use. In other terms, a customer cares about the expected “opportunity cost” of buying a pass -- the pass price minus the cost of taking their expected rides using pay-per-use (i.e. the pass price minus the expected “use value”), or how much money they could expect to save by riding pay-per-use rather than buying a pass.

With these concepts in mind, WTP for a pass could vary across groups of customers in two ways. First, the opportunity cost of a pass could vary. If one group of customers has higher use value, the opportunity cost of an unlimited-use pass will be lower and they will be willing to pay more for the pass (on average). Alternatively, if the effective price of a pass is lower for one group of customers (such as customers who can purchase a pass pre-tax through payroll deduction), the opportunity cost of purchasing a pass will also be lower for that group. Second, sensitivity to opportunity cost could vary across groups of customers; at any given expected “use value” and pass price, willingness to pay or the probability of selecting a pass over pay-per-use could vary. For example, between otherwise similar groups of customers, a group that places greater value on the convenience of a pass will be willing to pay more on average than a group that places less value on that convenience. Alternatively, a group that pays little attention to pass prices may be more likely to purchase a pass (or to continue purchasing passes) than another group with similar use value. Both of these sources of variation in WTP for passes could be important for a targeted pricing strategy.
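As a rough illustration of these definitions, the sketch below computes use value, opportunity cost, and the effect of a pre-tax purchase on the effective pass price. The $84.50 pass price is the MBTA Monthly LinkPass price cited in this chapter; the $2.25 fare, the ride counts, and the 25% marginal tax rate are hypothetical values chosen only for illustration, and the function names are not from the thesis.

```python
def use_value(rides: int, fare_per_ride: float) -> float:
    """Cost of taking the given rides at pay-per-use fares."""
    return rides * fare_per_ride


def opportunity_cost(pass_price: float, rides: int, fare_per_ride: float) -> float:
    """Pass price minus use value; a negative value means the pass saves money."""
    return pass_price - use_value(rides, fare_per_ride)


def effective_price(pass_price: float, marginal_tax_rate: float) -> float:
    """Out-of-pocket cost of a pass bought with pre-tax dollars."""
    return pass_price * (1.0 - marginal_tax_rate)


PASS_PRICE = 84.50  # MBTA Monthly LinkPass price cited in this chapter
FARE = 2.25         # hypothetical pay-per-use fare (assumption)
TAX_RATE = 0.25     # hypothetical combined marginal tax rate (assumption)

for rides in (30, 40):
    retail = opportunity_cost(PASS_PRICE, rides, FARE)
    pretax = opportunity_cost(effective_price(PASS_PRICE, TAX_RATE), rides, FARE)
    print(f"{rides} rides: retail {retail:+.2f}, pre-tax {pretax:+.2f}")
```

Under these hypothetical inputs, a customer expecting 30 rides would pay a premium for a retail pass, while the same pass bought pre-tax costs less than pay-per-use. This corresponds to the first source of variation described above (a lower opportunity cost); the second source, sensitivity to a given opportunity cost, is not modeled here.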

Unfortunately, this analysis does not distinguish between these two sources of variation in WTP – variation in the opportunity cost of passes and variation in sensitivity to pass costs. This chapter looks only at pass use across sale channels. (Expected use value is not observed in AFC data, but actual use value can serve as a proxy.) Passes on their own can only show the distribution of use value conditional on choosing a pass. Differences in these pass distributions across sale channels could be driven by a combination of different opportunity costs and different cost sensitivities. One possible extension of this analysis is to assign pay-per-use customers (or "customer-weeks") to sale channels in order to observe a more complete sale channel distribution of use value (with pass “market shares” at each level of use value); this would be more difficult than identifying the sale channel of any particular pass. Chapters 6 and 7 present complete use value distributions across fare products, but they do not differentiate by sale channel.

Nevertheless, it is still informative to look at how pass use value varies across sale channels. Figure 5-9 and Figure 5-10 show use value distributions for the Monthly LinkPass at the MBTA and the 7-day and 30-day passes at the CTA, with pass prices overlaid as black vertical lines. As with earlier plots of tap frequency, passes sold through pre-tax payroll deduction via employers and through online sale channels have considerably lower use value distributions than passes sold in retail stores. The breakdown of online sales at the CTA shows that autoload passes have use value similar to employer-based pass programs, while other mobile app and web site sales fall in the middle. Autoload customers at the CTA and the MBTA (in the Other/Online category) likely include many employees who are receiving pre-tax debit cards through third-party benefits administrators and using them to automatically purchase passes (rather than making direct pre-tax pass purchases). Passes sold at TVMs fall in the middle of the use value distributions at the MBTA, but at the CTA they have a similar distribution to the retail network.

Figure 5-9: Use Value of MBTA Monthly LinkPasses Sold in October 2016

Sources: MBTA AFC, MBTA pass sale program administrative data Notes: Includes passes that are sold but never used (use value = $0)

Figure 5-10: Use Value of CTA Passes Sold in October 2017

Source: CTA Ventra Notes: Black lines show pre-2018 pass prices. For comparability across graphs, the axis for the 30-day pass graph is scaled by a factor of 30/7 from the 7-day pass graph; the price of a 7-day pass is higher per day/week/month than the price of a 30-day pass. Charts include only full-fare, account-based passes. CTA passes are for rolling periods and are activated on first use, so passes that have not yet been used or were still in use at the time of analysis were excluded (about 1% of 7-day passes and 3% of 30-day passes).

As discussed above, these differences in use value across sale channels at both the CTA and MBTA could result from several different factors:

• Different underlying demand for different groups of customers. For example, customers using employer-based pass sale programs might use transit primarily for commute trips, while customers using the retail network might rely on transit for both work and non-work travel (giving them a higher distribution of transit travel frequency).
• Different effective prices in different sale channels. Employer-based passes are tax preferred and sometimes subsidized by employers, and some customers in other sale channels also purchase passes pre-tax using prepaid debit cards.
• Differences in price salience. Employer-based pass programs and automatic purchase renewal through online sale channels make pass prices less salient, while at TVMs and retail stores prices are visible and decisions are likely more deliberate.
• Different preferences over upfront costs and convenience for different groups of customers. Sale channels likely influence use value distributions through price salience, but there is also self-selection; for example, customers who have lower use value but benefit from the convenience of automatically-recurring pass purchases may be willing to pay more than their use value for that convenience.

The simple conclusion across all of these factors is that WTP for passes does vary systematically across sale channels. Testing for the specific sources of that variation is left to future work.

Response to Fare Changes

A second way to look for variation in WTP across pass sale channels using AFC data is to observe changes in pass sales around fare changes. As discussed at length in Chapter 6, changes in total sales following fare changes are the net effect of both shifting to other modes (typically captured in a fare elasticity parameter) and switching to other fare products. Chapter 6 presents an attempt to differentiate those changes following the July 2016 fare change at the MBTA in order to estimate elasticities for MBTA Corporate and Non-Corporate Monthly LinkPasses. However, both mode switching and fare product switching reflect WTP for passes; this section simply summarizes net changes as evidence that WTP varies across pass sale channels. Similar to the use value analysis above, variation in the impacts of a fare change across sale channels could stem from either different changes in costs or different sensitivity to changes in cost. These are not distinguished here. The two sources of variation are represented more explicitly in the fare change scenario prediction procedure presented in Chapter 7 (using empirical distributions of use value across all fare products and assumed differences in fare elasticities), but they are not differentiated by sale channel.

The MBTA increased pay-per-use fares and pass prices on July 1, 2016. As described in Chapter 2, Monthly and 7-Day LinkPass prices were increased by about 12%, while pay-per-use fares on CharlieCards were increased by about 6%. Figure 5-11 shows year-over-year percent changes in MBTA Monthly LinkPass sales for the Corporate Pass Program and all other sale channels, both before and after the fare change. (Inconsistencies in MBTA accounting data make it difficult to disaggregate non-Corporate sale channels.) Sales in both the Corporate Program and other sale channels appear to drop following the fare change, but the reduction is much larger for Non-Corporate sale channels. The reduction in Corporate Program passes beginning in September 2016 is largely explained by MIT employees moving from the Corporate Program to the new Mobility (Universal) Pass Program (rather than a response to the fare change); summaries of pass sales in Chapter 6 adjust for this change. Regardless, Corporate LinkPasses appear to be much less sensitive to price changes than other sale channels (combined).
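The year-over-year metric plotted in Figure 5-11 compares each month's sales with the same calendar month one year earlier, which removes most seasonal variation. A minimal sketch of that calculation follows; the function name and the sales counts are hypothetical and are not MBTA data.

```python
def year_over_year_change(sales_by_month: dict[str, float]) -> dict[str, float]:
    """Percent change in sales relative to the same month one year earlier.

    Keys are "YYYY-MM" strings; months without a prior-year value are skipped.
    """
    changes = {}
    for month, sales in sales_by_month.items():
        year, mm = month.split("-")
        prior = sales_by_month.get(f"{int(year) - 1}-{mm}")
        if prior:  # skip missing or zero prior-year sales
            changes[month] = 100.0 * (sales / prior - 1.0)
    return changes


# Hypothetical monthly pass counts (assumption, for illustration only)
sales = {"2015-08": 50_000, "2016-08": 47_500}
print(year_over_year_change(sales))
```

Applying the same function separately to each sale channel gives one series per channel, which is how differences in response to a fare change can be compared.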

Figure 5-11: Relative Change in MBTA Monthly LinkPass Sales by Sale Channel, FY16-17

Source: MBTA Accounting

The CTA increased pay-per-use fares by $0.25 (11-12%) and increased 30-day pass prices by $5 (5%) on January 7, 2018. The impact on 30-day pass sales was uncertain in advance; some customers would likely stop purchasing passes due to the higher price, but other customers would switch from pay-per-use to the 30-day pass (since the pass “multiple” went down in spite of the price increase). Seven-day pass sales were expected to increase as customers switched away from pay-per-use toward 7-day passes.
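The pass “multiple” is the pass price divided by the pay-per-use fare, i.e. the number of rides at which a pass breaks even. The sketch below illustrates why the multiple fell even though the pass price rose. The $100 pre-2018 pass price and the $5 and $0.25 increases come from this chapter; the $2.25 starting fare is an assumption, chosen because a $0.25 increase on that fare is about 11%.

```python
def pass_multiple(pass_price: float, fare: float) -> float:
    """Number of pay-per-use rides whose total cost equals the pass price."""
    return pass_price / fare


OLD_PASS_PRICE = 100.00  # pre-2018 CTA 30-day pass price (from this chapter)
OLD_FARE = 2.25          # assumed pre-2018 pay-per-use fare (assumption)

before = pass_multiple(OLD_PASS_PRICE, OLD_FARE)
after = pass_multiple(OLD_PASS_PRICE + 5.00, OLD_FARE + 0.25)
print(f"multiple before: {before:.1f}, after: {after:.1f}")
```

Because the fare rose proportionally more than the pass price, fewer rides are needed for the pass to break even, which is why some pay-per-use customers could be expected to switch to the 30-day pass.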

Figure 5-12 shows year-over-year percent changes in CTA 7-day and 30-day pass sales by pass sale channel for six months before the fare change and three months after the fare change. (Since the fare change took effect on January 7th, January 2018 was only partially affected.) It is likely too early for the full effect of the fare change to be visible, but there appears to be a small uptick in 7-day pass sales relative to trends before the fare change across all sale channels. The largest relative increases are in online sale channels – autoload, the Ventra web site, and the Ventra mobile app. This is somewhat surprising, given that previous summaries suggest customers using these online sale channels have a higher willingness-to-pay for pass products (associated with lower sensitivity to transit costs in general); however, these online sale channels have relatively low volumes of 7-day pass sales (especially autoload). There are no obvious patterns yet in 30-day pass sales following the fare change; an uptick in autoload passes seemed to reverse in March 2018, and a dip in web site sales in January and February may have been caused by customers shifting their purchases up to December (before the fare change).

Figure 5-12: Relative Change in CTA Pass Sales by Sale Channel, July 2017 – March 2018

Source: CTA Ventra Notes: Graph includes only sales of full-fare passes. Seven-day passes include account-based CTA 7-day passes and CTA/Pace 7-day passes. Seven-day pass tickets (roughly 10% of full-fare 7-day pass sales) are excluded; these tickets use separate group and bulk sale channels and have high variability in month-to-month sales.

The July 2016 fare change at the MBTA shows a clear difference in price sensitivity of Monthly LinkPasses across sale channels. So far, CTA sale channels have not differed substantially in their response to the January 2018 fare change; however, later summaries may tell a different story once the fare change has had its full effect.

Agency Strategy

Use value distributions and sales around fare changes at the CTA and MBTA show that WTP for passes varies systematically across sale channels. There are at least two ways in which transit agencies might logically use this variation to capture greater surplus.

1. First, pass prices could be set higher in sale channels with higher WTP (“channel pricing”). This would be most effective if the sale channel itself were thought to affect WTP or price salience (as opposed to merely correlating with them) or if it were difficult for customers to switch sale channels to avoid a price premium. Channel pricing would effectively use sale channels to define new fare products at different prices, which could clearly have important implications for equity (and Title VI review).
2. Second, sale channels with higher WTP could be promoted. This would only be appropriate if (as above) the sale channel itself affected WTP or price salience, and if it were relatively easy for customers to switch sale channels or adopt a new sale channel. Promotion could both expand pass sales and mitigate pass-holder attrition by making purchase decisions less sensitive to variation in monthly travel demand or changes in fare policy.

These two strategies might conflict with each other; the first would increase prices on sale channels with higher WTP, and the second might even lower prices on the same sale channels to promote their use.

Current fare structures at the MBTA and the CTA generally do not differentiate prices across sale channels. The closest example is at the MBTA, where commuter rail passes on the mTicket app are sold at a $10 discount; however, the mTicket commuter rail passes do not include travel on bus and subway. Depending on average bus and rail ridership of mTicket users, this discount could reflect price discrimination.

Both agencies do, however, offer employer-based pass sale channels, which have lower effective prices than other sale channels because passes can be purchased pre-tax through payroll deductions. Earlier summaries of use value and sales suggest that customers buying passes through these employer-based sale channels have a higher WTP than other pass purchasers. This might suggest charging these customers a higher pass price, but instead they are charged a lower effective price. This makes sense for a few reasons. Agencies do not control the tax benefits offered by the federal and state government, and they receive full retail price from these tax-preferred passes. Additionally, the advance commitment and automatic renewal of pass purchases through payroll deduction likely decreases price salience; these programs might be worth promoting through price discounts in order to secure more stable pass purchase behavior and shield the agency from adverse impacts of future fare changes. Finally, customers who use the employer-based sale channels likely have easy access to other sale channels, so charging a price premium might be ineffective (and, on the flip side, a price discount may be critical to entice customers from other sale channels).

Both the MBTA and the CTA have made improvements to online sales channels in recent years, including launch and improvement of the CTA Ventra app and the MBTA mTicket app (for commuter rail). These improvements have attracted additional customers to those sale channels. This is consistent with the promotion strategy described above; use value distributions and fare changes suggest that customers using online sale channels have higher WTP, and it seems likely that these channels contribute to lower price salience (especially automatically-renewing purchases). However, neither the CTA nor the MBTA has done much targeted promotion of employer-based pass sale channels in recent years. These employer programs appear similar to online channels in terms of customer WTP; tax benefits lower the effective price of passes in these programs, employer subsidies at some companies lower prices even more, and the use of automatically-recurring payroll deductions for pass purchases is sure to reduce price salience. A recent Pioneer Institute report similarly argued for aggressive promotion of the MBTA Corporate Pass Program and provided ideas for marketing the program (Dawson 2018). Similar promotion efforts could be adopted by the CTA.

5.4.3 Capturing Tax Benefits Available Through Employers

This section describes the contribution of employer-based pass sale channels by estimating the agency revenue that would be lost if the channels were eliminated. This also suggests the potential benefit of promoting and expanding these programs.

A simple calculation from Kamfonik (2013) is used to estimate revenue attributable to the existence of the employer-based pass sale channels. The logic is as follows:

1. Customers purchasing passes from an employer-based sale channel are divided into two groups:
   A. Those who have use value below the retail pass price
   B. Those who have use value above the retail pass price
2. For group B with use value above the pass price, elimination of the employer sale channel (and their ability to purchase passes pre-tax) will cause them either to purchase the same pass through another sale channel or to switch to pay-per-use fares (which become relatively more attractive without the tax-preferred treatment of the monthly pass).28 Those who continue purchasing passes through another sale channel do not affect agency revenue at all. Those who switch to pay-per-use are likely to generate additional fare revenue, since they appeared to be saving money by purchasing a pass. To be conservative, this additional revenue is ignored; it is assumed that group B all continues purchasing passes and has no impact on revenue.
3. For group A with use value below the pass price, elimination of the employer sale channel will likewise cause some customers to continue buying passes through other sale channels and some customers to switch to pay-per-use. However, in this case those who switch to pay-per-use generate less revenue than when they had purchased a pass. To estimate the number of customers who switch to pay-per-use, it is assumed that customers who continue purchasing passes through other channels will now exhibit the same conditional cumulative distribution of use value as all other pass-holders -- that is, that $P(\text{use value} \le \text{pass price} \mid \text{pass chosen})$ will now be the same for former employer-based pass purchasers and all other pass purchasers. Customers in group A are moved from passes to pay-per-use until this is achieved. To simplify this calculation even further, the difference in the share of passes with use value below the pass price for non-employer pass sales and employer pass sales can be calculated and then multiplied by the total number of employer-based passes (group A + group B); this is conservative, since additional passes would need to switch to pay-per-use to equalize the conditional probability.
4. Finally, the revenue that would be lost due to elimination of the employer-based sale channel is calculated by multiplying the number of customers in group A that switch to pay-per-use by the average loss per pass, i.e. the retail pass price minus the average use value for customers in group A (as in Table 5-4 and Table 5-5). This assumes that customers within group A are equally likely to switch to pay-per-use, regardless of their particular use value.
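The simplified form of this calculation can be sketched in a few lines. The function and argument names below are illustrative rather than taken from Kamfonik (2013) or from this thesis; the inputs are the pass counts, pass prices, and group A average use values reported in Table 5-4 and Table 5-5.

```python
def employer_channel_revenue(
    emp_below: int, emp_above: int,
    other_below: int, other_above: int,
    pass_price: float, emp_avg_use_value_below: float,
) -> dict[str, float]:
    """Simplified estimate of revenue attributable to an employer sale channel.

    *_below / *_above are pass counts with use value at or below / above the
    retail pass price, for employer-based passes and for all other passes.
    """
    emp_total = emp_below + emp_above
    other_total = other_below + other_above
    # Step 3 (simplified): gap in P(use value <= pass price | pass chosen)
    share_gap = emp_below / emp_total - other_below / other_total
    passes_switching = share_gap * emp_total  # passes moved to pay-per-use
    # Step 4: each switcher now pays their use value instead of the pass price
    loss_per_pass = pass_price - emp_avg_use_value_below
    return {
        "share_switching": share_gap,
        "passes_switching": passes_switching,
        "loss_per_pass": loss_per_pass,
        "monthly_revenue_loss": passes_switching * loss_per_pass,
    }


# Inputs as reported in Table 5-4 (MBTA Corporate LinkPasses, October 2016)
mbta = employer_channel_revenue(54_095, 28_549, 33_801, 44_265, 84.50, 46.07)
# Inputs as reported in Table 5-5 (CTA Pre-Paid Benefits, sold October 2017)
cta = employer_channel_revenue(23_639, 4_878, 30_616, 26_464, 100.00, 70.16)
print(round(mbta["monthly_revenue_loss"]), round(cta["monthly_revenue_loss"]))
```

Because the published tables round the average use value to the cent, results from this sketch should match the reported totals only to within small rounding differences. Group B does not enter the revenue term, reflecting the conservative assumption in step 2.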

This simple calculation has a few drawbacks. It would be more sound to equate $P(\text{pass chosen} \mid \text{use value} \le \text{pass price})$ between all customers with the option to purchase an employer-based pass and all other customers (rather than looking only at customers who actually chose to purchase passes), but unfortunately it is not possible, using only AFC data, to identify transit customers who had access to an employer-based pass and chose not to purchase it. Equating $P(\text{use value} \le \text{pass price} \mid \text{pass chosen})$ for employer-based passes and all other passes is more straightforward but runs the risk of misattributing differences in the underlying distribution of use value for the two groups to direct impacts of the employer-based sale channel (chiefly the lower effective price of a pre-tax pass and the lower price salience of pre-scheduled and automatically-recurring purchases). This would lead to overestimation of the revenue attributable to the employer-based sale channel. In the other direction, this method assumes that all surplus from an employer-based pass program is generated by attracting existing customers from pay-per-use to a pass; as discussed at the beginning of this chapter, customers also may switch to transit from other modes due to the lower effective price of pre-tax passes. As noted in Kamfonik (2013), customers switching from a pass to pay-per-use would also have reduced ridership once faced with marginal fares (called the “induced ridership” effect in this thesis); this benefit of employer-based sale channels is not included in the calculation.

28 Recall the use value distributions presented earlier in Figure 5-9 and Figure 5-10; customer choices are not determined solely by weekly or monthly cost, and many customers with use value above the pass price still choose to ride pay-per-use.

That said, this calculation does provide a quick, initial estimate of the value of a particular pass sale program. Table 5-4 and Table 5-5 show the calculation for both MBTA Corporate LinkPasses and CTA Pre-Paid Benefits 30-Day Passes.

Based on pass sales and use in October 2016, MBTA Corporate LinkPasses are estimated to contribute about $704,000 per month ($8.4 million per year) in revenue that would not be collected if the Corporate Program did not exist. This is about 58% higher than the same calculation performed in Kamfonik (2013) using data for May 2012 due to intervening changes in fares (which altered the share of passes with use value below the LinkPass price and the loss per pass if a customer were to switch to pay-per-use).29 The MBTA also sells commuter rail passes and bus passes through the Corporate Program; however, bus passes are a very small share of revenue, and the use value of commuter rail passes cannot be calculated from AFC records.30 For a very rough sense of scale, the MBTA sold 31,120 Corporate commuter rail passes and 19,278 non-Corporate commuter rail passes in October 2016. If the results for the LinkPass are applied to these Corporate commuter rail passes (22% would switch to pay-per-use if the Corporate Program were eliminated at a revenue loss of $38.43 per pass), then Corporate commuter rail passes would contribute an additional $263,000 per month or $3.2 million per year. (The actual contribution would depend on distributions of use value and pass prices for each commuter rail zone.)
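The comparisons in this paragraph follow from simple arithmetic on figures reported in this chapter (footnote 29 and Table 5-4). They are restated here as a check, with rounding as shown:

```latex
\begin{align*}
\text{Kamfonik (2013), rescaled to current price:}\quad
  & \$311{,}805 \times \frac{\$84.50}{\$59} \approx \$446{,}568 \text{ per month} \\
\text{Ratio of current to rescaled estimate:}\quad
  & \frac{\$703{,}726}{\$446{,}568} \approx 1.58 \\
\text{Annual LinkPass contribution:}\quad
  & \$703{,}726 \times 12 \approx \$8.4 \text{ million} \\
\text{Corporate commuter rail (rough):}\quad
  & 31{,}120 \times 0.22 \times \$38.43 \approx \$263{,}000 \text{ per month}
\end{align*}
```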

Increased use of zero-marginal-cost passes through the Corporate Program boosts ridership too, though the increase in ridership for customers that would otherwise use pay-per-use is likely much smaller than the increase in revenue. (The relatively low ridership of these customers compared to other LinkPass holders increases the revenue benefit of the pass but reduces the likely ridership impact.) In October 2016, average ridership on LinkPasses with use value less than the LinkPass price was 24.9 validations (AFC taps). For the estimated 22% of Corporate LinkPass purchasers who would ride pay-per-use if the Corporate Program did not exist, some of this ridership is attributable to using a zero-marginal-cost pass and would disappear if the same customers switched to pay-per-use. If it were assumed that pass use increases ridership by at least 5% on average (which seems conservative based on findings in Chapter 6), the total ridership impact would be at least 22,000 rides per month or 261,000 rides each year.31 The Corporate Program also increases MBTA commuter rail ridership, though the impact is more difficult to estimate because commuter rail ridership is not captured by the AFC system.
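The expression in footnote 31 can be read as follows. This is a sketch; the symbols $N$, $r$, and $g$ are introduced here for illustration and do not appear in the thesis. If holding a pass raises a customer's ridership by a share $g$ relative to pay-per-use, then observed pass ridership is $(1+g)$ times pay-per-use ridership, and the induced portion of observed ridership is:

```latex
\begin{align*}
\Delta R &= N \, r \left(1 - \frac{1}{1+g}\right) \\
         &= 18{,}312 \times 24.9 \times \left(1 - \frac{1}{1.05}\right)
          \approx 21{,}700 \text{ rides per month}
\end{align*}
```

Here $N$ is the estimated number of Corporate LinkPasses that would switch to pay-per-use, $r$ is average monthly validations on passes with use value below the pass price, and $g = 0.05$ is the assumed ridership increase. With the rounded inputs shown, the result is about 21,700; the 21,730 reported in footnote 31 presumably reflects an unrounded ridership average. Multiplying by 12 gives roughly 261,000 rides per year.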

29 Kamfonik (2013) estimated additional revenue of $311,805 per month at pass prices of $59, which is $446,568 per month after scaling to the current LinkPass price of $84.50.

30 Commuter rail validation is performed visually by conductors. Some commuter rail passes are used on the AFC system (bus and rapid transit); however, this is not a good reflection of variation in total use value across sale channels.

31 18,312 * 24.9 * (1 – 1/1.05) = 21,730

Table 5-4: Monthly Revenue Loss if MBTA Corporate Monthly LinkPasses Were Eliminated (October 2016)

                                           A) Use Value   B) Use Value
                                           <= $84.50      > $84.50       Total
Corporate LinkPasses
  Number of Passes                         54,095         28,549         82,644
  % of Passes                              65%            35%            100%
  Average Use Value                        $46.07
Non-Corporate LinkPasses
  Number of Passes                         33,801         44,265         78,066
  % of Passes                              43%            57%            100%
Corporate LinkPasses Switching to PPU
  % of Total Corporate Passes              22%
  Number of Passes                         18,312
  Loss per Pass ($84.50 - Avg Use Value)   $38.43
  Total Monthly Revenue Loss               $703,726

Sources and notes: Only includes full-fare LinkPasses. The total number of Corporate and non-Corporate passes sold in October 2016 is from MBTA accounting data. The share of passes with use value below the pass price is from analysis of MBTA AFC and sales data for October 2016, including passes that were sold but never used on the AFC system in October 2016 (5.6% of Corporate LinkPasses and 0.7% of non-Corporate LinkPasses).

The same revenue calculation for CTA 30-day passes sold in October 2017 gives an estimate of $249,000 per month ($3.0 million per year) in revenue that would not be collected if Pre-Paid Benefits 30-Day Passes were not available. While this is an important contribution, it is less than half the value at the MBTA. The CTA has lower overall pass sales than the MBTA, due in large part to much higher pass multiples for CTA 30- and 7-day passes than for MBTA passes (see Table 2-3, Table 2-6, and Figure 2-20). The CTA also sells a lower share of its passes through employers than the MBTA; while neither agency has aggressively expanded their employer programs in recent years, the MBTA still benefits from marketing of the Corporate Program in the 1980s and 90s (Kamfonik 2013).

Table 5-5: Monthly Revenue Loss if CTA Pre-Paid Benefits 30-Day Passes Were Eliminated (Passes Sold in October 2017)

                                           A) Use Value   B) Use Value
                                           <= $100        > $100         Total
PPB 30-Day Passes
  Number of Passes                         23,639         4,878          28,517
  % of Passes                              83%            17%            100%
  Average Use Value                        $70.16
Non-PPB 30-Day Passes
  Number of Passes                         30,616         26,464         57,080
  % of Passes                              54%            46%            100%
PPB 30-Day Passes Switching to PPU
  % of Total PPB Passes                    29%
  Number of Passes                         8,343
  Loss per Pass ($100 - Avg Use Value)     $29.84
  Total Monthly Revenue Loss               $248,988

Sources and notes: Only includes full-fare 30-Day Passes. The percentages of passes with use value below the pass price exclude about 6% of PPB passes and about 1% of non-PPB passes that had not yet been used at the time of analysis. These passes are then included in the total number of passes based on these percentages; this conservatively assumes that they will eventually be used with the same use value distribution as the rest of the 30-day passes.

At both the CTA and the MBTA, the estimated loss from eliminating employer-based sale channels is substantial. Even given the potential issues with this simple calculation, this result suggests strong potential to capture additional revenue if more customers are attracted into employer-based passes from pay-per-use travel. Unlike fare increases, which create a tension between revenue and ridership, expanding employer pass programs also increases ridership by providing more transit users with zero-marginal-cost passes.

More generally, this result highlights the stark differences in willingness-to-pay across sale channels, which could inform broader marketing and fare policy strategies. At both the CTA and MBTA, the majority of employer-based passes have use value below the pass price. One of the drivers of this pattern is the favorable tax treatment, which makes passes worthwhile to customers with lower use value. The most obvious way to take advantage of this is to grow direct employer-based pass sales; however, there are also other ways to provide additional customers with access to pre-tax passes. Many of the passes purchased through non-employer sale channels are also purchased by employees using pre-tax dollars; instead of using a direct pass sale program, these employees use pre-paid debit cards from third-party benefits administrators to purchase passes through other channels such as ticket vending machines and online autorenewal (linked to the prepaid debit card). Kamfonik (2013) estimated that 6-17% of total monthly pass sales (15-40% of non-employer-based passes) at the MBTA were purchased in this manner. These pre-tax pass purchases are not attributable to any transit agency program, but they likely contribute additional revenue that would not be collected if this tax benefit were not available. Any marketing, programs, regulatory changes, or policy changes that similarly increase customer access to pre-tax transit passes would likely generate additional revenue. Some ideas to expand this access are described in the next section.

5.5 Conclusions and Implications

In this chapter, sales and use of passes were described and compared across different sale channels at the MBTA and the CTA. There were several notable patterns common to both agencies. Passes sold through employer programs (paid pre-tax via employee payroll deduction) had the lowest average ridership of all sale channels, followed closely by passes sold online with automatic renewal. (These automatically-renewing passes likely include some employees who receive pre-tax debit cards through their employers.) Passes sold at retail locations had the highest average use, followed by ticket vending machines. The "use value" distributions of passes in these sale channels show this variation in pass use relative to the retail pass price, and passes in the MBTA's employer-based pass program were much less sensitive to a recent fare change than passes in other sale channels. This variation in customers’ willingness to pay for passes could be explained by a combination of 1) differences in the populations that use the various sale channels and 2) the direct impact of sale channels -- particularly the reduced effective price of employer-based passes (due to tax benefits and employer subsidies) and reduced price salience associated with advance and automatically recurring purchases.

Regardless of the specific mix of causes, variation in willingness to pay across pass sale channels provides strategic opportunities for transit agencies to advance their objectives. Using the calculation presented in Kamfonik (2013), employer-based sales of monthly passes at the MBTA and CTA are estimated to contribute over $11 million and $3 million (respectively) in annual revenue that would not be collected if those programs did not exist, and they increase ridership at the same time by providing additional transit users with zero-marginal-cost passes. Importantly, this revenue and ridership derive from tax benefits and employer subsidies that are outside of normal state-level transit funding sources (chiefly sales taxes). While the existing benefit of these programs is substantial, only about one third of CTA 30-day passes are purchased through the Pre-Paid Benefits program, and the MBTA Corporate Pass Program has not grown in step with Boston employment. Expansion of employer programs and other sale channels with tax benefits and automatic purchase renewal could contribute even more revenue and ridership while addressing inequities in access to transit tax benefits; Kamfonik (2013) found that well under 50% of employees in the MBTA’s service area worked for employers who provided access to the Corporate Pass Program (only 45% of employees working within 0.5 miles of an MBTA rapid transit station and 8% of employees working within 0.25 miles of an MBTA bus station). These potential revenue, ridership, and equity benefits are often ignored in the context of short-term budgetary needs, since program expansion or creation requires longer-term institutional strengthening; however, it could be a useful complement to transit fare increases.

Drawing on recommendations for the MBTA Corporate Pass Program in Kamfonik (2013), Filler (2015), and Dawson (2018), there are at least four mechanisms by which transit agencies could expand employer-based pass sale programs:

• Increase employer enrollment through marketing and program support.
• Increase employee participation rates by working with employers enrolled in the program.
• Encourage employer subsidy of employee passes through discounts or special payment programs like AccessMIT.
• Advocate for policy changes that improve employer participation or expand access to pre-tax passes. Employers of a certain size could be required to offer transit benefits. Federal law could be amended to treat all monthly passes as tax-advantaged (regardless of whether they were purchased through an employer). State tax policies could similarly be improved to increase eligibility for tax deduction of transit expenses. And transit agencies could use accounts in automated fare collection systems to document purchases and provide tax reporting in order to help transit users take advantage of available tax benefits. Expanding access to tax benefits may be particularly important as contract workers with limited access to employer-based “fringe” benefits represent a larger and larger share of U.S. employment.

Other preferred sale channels could also be improved to attract customers and further reduce the salience of pass prices. For example, automatic pass renewal could be made more convenient and accessible. As mentioned earlier, it seems likely that some customers using automatic pass renewal are using pre-paid, pre-tax debit cards issued by their employers; a convenient pass renewal option allows agencies to augment the benefit of pre-tax transit spending for employees who elect to use debit cards rather than direct pass purchases. The Ventra system and Ventra mobile app have expanded use of this sale channel at the CTA; however, pass autoload could also be added as an option at vending machines and retail store sales devices (for customers who have already registered their account).

Sale channels could also be used to expand advertising of transit in general and preferred fare products (like passes). States could require any retail stores that sell lottery tickets to market and sell transit fare products, improving access in neighborhoods without transit stations or terminals. Additionally, personalized information could encourage customers to purchase passes through preferred sale channels. Vending machines and other sale channels could provide recommended or repeat-purchase options based on a customer’s account; for example, customers who purchased a pass or rode frequently in the past month could be presented with a one-click pass purchase option. Similarly, customers with registered accounts could be sent email reminders to renew their pass purchase (if their current pass is about to expire) or to try using a pass (if they are a frequent rider).


6 Estimating Behavioral Parameters of Fare Product Purchase and Use

6.1 Introduction

As discussed in Chapter 4, there are several key customer behaviors that need to be understood on some level in order to predict the impacts of a fare change:

• Induced ridership. When a transit agency offers both unlimited-use pass products and pay-per-use fares, the zero-marginal-cost nature of the pass products will affect a customer’s ridership on a pass relative to the same customer’s ridership using pay-per-use. Policies that result in customers switching from pay-per-use to pass products will tend to increase ridership, and vice versa.
• Fare product choice. Customers select from among the fare products that are offered based on different attributes of fare products that are available to them. One important product attribute for a customer is the cost of each product at the customer’s desired level of ridership. Policies that change prices or other fare product attributes will cause some customers to switch to different fare products.
• Elasticity. Having identified their preferred fare product, customers decide when, how, and how often to ride on transit. Ridership decisions are based in large part on service availability and quality, but they are also affected by the marginal financial cost of riding. Policies affecting pay-per-use fares (at the level of individual rides or trips) will impact every decision to ride transit. Policies that change pass prices generally do not affect marginal fares (which are $0 regardless of the pass price), but they impact customer decisions about whether to purchase a pass versus either using another fare product or using non-transit modes of travel; pass purchases in turn affect ridership (at the level of the term of the pass, such as a week or a month).

The MBTA and the CTA offer both unlimited use passes and pay-per-use fares, so all three of these key customer behaviors are at play when fare policies are changed.

The magnitude of these three key customer behaviors, however, is an empirical question. The focus of this chapter is identifying opportunities to observe these behaviors using AFC data and demonstrating methods for quantifying them using the MBTA and the CTA as case studies. The ultimate goal of quantifying these behaviors is to predict the potential impacts of future incremental changes in fare structure and fare levels. (Chapter 7 presents a framework for combining the three behaviors into one prediction model.) As such, this chapter attempts to quantify each behavior in the form of concise parameters that can be applied to alternative fare policy scenarios. While there are different options for the specific form these parameters might take, the following are selected in order to maintain a simple model structure (presented in Chapter 7):

1. Induced ride factors: Multiplicative factors describing the average ridership impact of zero-marginal-cost passes relative to pay-per-use fares. For example, an overall induced ride factor of 1.1 means that a customer who switches from riding pay-per-use to purchasing a pass product will take 1.1 times as many transit rides as before on average.
2. Fare product choice utility parameters: Parameters used to calculate the relative preference of a customer for a fare product (relative to all other fare products) based on fare product and customer attributes (chiefly prices and estimated travel demand). These parameters can be used to predict the probability of fare product choices of specific customers or the market shares of different fare products overall.
3. Price elasticities: Multiplicative factors that describe average sensitivity to changes in prices or fares. For example, an elasticity of -0.3 indicates that a 10% increase in fares will reduce ridership by 3% on average (or in the aggregate).
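As a minimal sketch of how two of these parameters would be applied, the following Python snippet works through the numeric examples given in the list above. The function names and the baseline ridership figures are illustrative assumptions rather than values from the case studies, and the elasticity is applied as a simple proportional approximation.

```python
def apply_induced_ride_factor(pay_per_use_rides, factor):
    """Expected rides after a customer switches from pay-per-use to a pass."""
    return pay_per_use_rides * factor


def apply_elasticity(base_rides, pct_fare_change, elasticity):
    """Expected rides after a fare change, using a simple proportional rule.

    pct_fare_change is a fraction (0.10 for a 10% increase).
    """
    return base_rides * (1.0 + elasticity * pct_fare_change)


# Hypothetical customer taking 10 rides per week on pay-per-use.
rides_on_pass = apply_induced_ride_factor(10.0, 1.1)

# Hypothetical market of 1,000 rides facing a 10% fare increase.
rides_after_increase = apply_elasticity(1000.0, 0.10, -0.3)

print(round(rides_on_pass, 2))         # 11.0
print(round(rides_after_increase, 2))  # 970.0
```

The fare product choice utility parameters are omitted from this sketch because their functional form is not specified at this point in the text.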

The following section describes the general empirical setting for estimating these parameters using AFC data. One case study is then presented for estimating each parameter type at either the CTA or the MBTA; induced ride factors and fare product choice utility parameters are estimated at the CTA, and price elasticities are estimated at the MBTA. (Estimation of the complete set of parameters at each agency is left to future work.)

6.2 Empirical Setting

6.2.1 AFC Data and Variation in Fares and Fare Products

Passively-collected AFC data is “long” in that it captures ridership information for every card and ticket in a transit system (apart from non-interaction) over long periods of time. However, it is “narrow” in that it records a limited set of information about fare products and transit rides and typically does not include demographic information about customers (such as home address), data on customer activity outside of the transit system, or data on “external factors” such as weather or gas prices. Based on this scope of information captured in AFC data, the data can be organized in several different formats for empirical modeling of customer behaviors related to fare policy – cross-sectional data, aggregated time series data, or panel data.

Depending on the format selected, AFC data contains variation in fares, fare product choices, and ridership that are needed to learn about fare-related customer behavior. At any one point in time, there is variation in transit travel demand across customers. One result of this variation is that different individuals will see different effective fare product prices, which may lead them to different purchase decisions; for example, the monthly cost of riding pay-as-you-go will be higher for a customer with greater transit travel needs. On the flip side, there is also random (at least from the researcher’s perspective) variation in fare product choices, meaning that customers with the same level of transit demand may choose to use different fare products; as a result, otherwise similar customers may experience different marginal costs of ridership. Over time, AFC data captures variations within any particular transit card or account. Individual customers may change their fare product purchase decisions and ridership decisions day to day, week to week, or month to month. These changes result from a combination of unobserved individual-level factors (such as a change in underlying travel demand after moving home locations), system-wide policies (such as changes in service or fares), and other factors “external” to the transit system (such as weather or gas prices).


This variation within a transit card or account over time presents two challenges referred to as two different types of “churn.” The first is churn between fare products. From any one time period to the next, customers can be observed switching between different fare products even in the absence of any known changes in policy, service, or “external” factors; as discussed above, this is driven primarily by unobserved variation in individual-level factors. The second is churn in cards or accounts. From any time period to the next, individuals appear and disappear entirely from the AFC data. This appearance and disappearance could result from cards being lost or stolen and replaced, from customers switching between multiple cards or fare media, or from customers joining or leaving the system entirely.

6.2.2 AFC Data Formats

The following sections discuss advantages and disadvantages of the three potential formats of AFC data – cross-sectional, time series, and panel data – for estimating fare policy parameters.

Individual Cross-Sectional Data

AFC data can be used as a cross-section for any defined time period, such as a day, week, month, or year. This captures variation in attributes and choices across customers, such as transit ridership and resulting weekly cost under a selected fare product, without requiring any special attention to churn between fare products or churn in cards (since cards are not traced over time). The sample size of customers is large, yet it is still computationally feasible to quickly perform computations on any single month of AFC data (at least at the MBTA and CTA given current data infrastructure).

There are two primary downsides to cross-sectional data for this research. First, it does not provide any variation within customers; without a good instrumental variable (which has not been identified), it offers no way to separate the impact of transit ridership level on fare product choice from the simultaneous impact of fare product choice on ridership (i.e. induced ride factors). Second and related, it cannot be used to evaluate the impacts of changes in fares over time.

Aggregated Time Series Data

AFC data can alternatively be aggregated into time series data – effectively repeated cross-sections. This is still computationally cheap and still nets out the impact of churn in fare products and cards, but it also captures exogenous variation in fare levels and other fare policies.

The challenges of aggregated time series data in a single-agency context are sample size and limited variation. With a single, aggregated observation in each time period (such as weeks or months), a longer time period must be used to get a sufficient sample size for necessary controls (e.g. seasonality) and any statistical inference. Also, fare levels and fare structures change infrequently, which limits the variation needed to estimate a model. Using a relatively short time series to study a single fare change event requires strong assumptions about “baseline” or “but-for” trends. Using a longer time series across multiple fare changes adds variation, but it requires better controls for external factors changing over time and does not adapt quickly to rapidly-changing circumstances (such as new availability of ride-hailing services). Finally, abstracting from individual-level data removes options for studying the relationship between ridership levels and fare product choice; aggregated time series deal in overall totals and averages, which do not realistically portray customer choice situations.


Panel Data

The third general format for AFC data is a panel structure, tracing individual customers over time. This preserves a large sample size and maximizes variation in the dataset – both across customers and over time (including individual choices and system-wide fare policy changes). The combination of realistic individual-level choice settings and exogenous variation over time presents opportunities to parse out the relationship between fare product choice and ridership levels.

Panel AFC data has its own challenges, however. As with aggregated time series data, infrequent fare policy changes limit the exogenous variation that could be used to identify panel data models. Panel data is also computationally intensive relative to cross-sectional data and aggregated time series data, requiring processing of individual-level information over extended periods of time. Without aggregation, any panel study of fare product choice or ridership additionally requires careful treatment of switching between fare products and churn or turnover in cards. There may be significant “baseline” churn that needs to be controlled or differenced out of any product-level analysis, and selecting any group of customers in a pre-period may result in reversion to the mean in the post-period.
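The reversion to the mean noted above can be illustrated with a small simulation. This is a sketch under stated assumptions: every customer has a fixed underlying demand, observed weekly ridership is that demand plus independent noise, and nothing changes between the two periods. The selection threshold of 9 rides and all distribution parameters are hypothetical.

```python
import random
from statistics import mean

random.seed(0)

N_CUSTOMERS = 20000
THRESHOLD = 9.0  # hypothetical ridership level used to select the "high" group

pre, post = [], []
for _ in range(N_CUSTOMERS):
    demand = random.uniform(2.0, 14.0)                      # stable underlying demand
    pre.append(max(0.0, demand + random.gauss(0.0, 3.0)))   # observed pre-period rides
    post.append(max(0.0, demand + random.gauss(0.0, 3.0)))  # observed post-period rides

# Select customers on their pre-period ridership only.
selected = [i for i in range(N_CUSTOMERS) if pre[i] >= THRESHOLD]

pre_mean = mean(pre[i] for i in selected)
post_mean = mean(post[i] for i in selected)

print(f"selected customers: {len(selected)}")
print(f"pre-period mean:  {pre_mean:.2f}")
print(f"post-period mean: {post_mean:.2f}")
```

No fare change occurs in this simulation, yet the selected group's post-period mean falls below its pre-period mean. In real data, that decline could be mistaken for a fare effect.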

As an example of this last challenge, it would be a mistake to look at cards that used a pass before a fare change, identify the subset that switched from a pass to pay-per-use after the fare change, and attribute all of the switching to the fare change. Much of the switching could be typical churn in fare products. One reason that this churn occurs is random variation in individuals’ travel demand levels over time. Many customers who purchased a pass before a fare change started with ridership that is higher than the overall average and will tend to revert somewhat to lower ridership over time (even in the absence of any fare changes or other external events); as their average ridership declines, they will be more likely to select pay-per-use instead of a pass product.

6.3 Estimation of Induced Ride Factors at the CTA

This section demonstrates the use of AFC data to estimate upper bounds on induced ride factors at the CTA.

6.3.1 Motivation and Methodology

Induced ride factors measure the degree to which fare product choice – specifically the choice between an unlimited-use pass product and a pay-per-use fare product – affects ridership frequency. This effect cannot be observed directly, because fare product choice and ridership level are simultaneous decisions and causation runs in both directions; choice of a fare product is partly determined by expectations about frequency of future transit use, yet once a choice has been made between purchasing a pass or riding pay-per-use, ridership decisions will be affected by the marginal cost of each ride under the selected fare product. As a result, naïve comparisons of ridership between pass-holders and pay-per-use riders are unhelpful.
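To see why a naïve comparison is unhelpful, consider the following sketch. It assumes a known "true" induced ride factor of 1.1 and a simple rule in which customers buy a pass whenever their underlying demand reaches a hypothetical pass multiple of 9 rides per week. Both the rule and all values are illustrative assumptions, not estimates.

```python
import random
from statistics import mean

random.seed(1)

TRUE_INDUCED_FACTOR = 1.1  # assumed causal effect of holding a pass
PASS_MULTIPLE = 9.0        # hypothetical break-even rides per week

pass_rides, ppu_rides = [], []
for _ in range(10000):
    demand = random.uniform(1.0, 20.0)  # underlying weekly travel demand
    if demand >= PASS_MULTIPLE:
        # High-demand customers self-select into passes, then ride a bit more.
        pass_rides.append(demand * TRUE_INDUCED_FACTOR)
    else:
        ppu_rides.append(demand)

naive_ratio = mean(pass_rides) / mean(ppu_rides)
print(f"naive pass / pay-per-use ridership ratio: {naive_ratio:.2f}")
print(f"true induced ride factor:                 {TRUE_INDUCED_FACTOR:.2f}")
```

The naïve ratio is several times larger than the assumed causal factor, because most of the gap reflects who chooses a pass rather than what the pass does to ridership.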

Estimation of induced ride factors, then, would require an exogenous “shock” to fare product choices – a change that would affect decisions between passes and pay-per-use without affecting underlying travel demand. Ideally, this would take the form of a randomized control trial in which a random subset of customers are required to switch from a pass to pay-per-use (or vice versa) after a baseline period, and a control group is required to continue using the same product. A second-best quasi-experiment might be a targeted promotion (of either passes or pay-per-use) or a change in system-wide fares, both of which would affect fare product choice without affecting underlying travel demand. However, even if there were such an event and customers were observed switching fare products after the event, it is not clear how to separate cards that switched fare products as a result of the exogenous shock from cards that switched as part of “normal” churn in fare products; as discussed above, there is seemingly-random switching between fare products that occurs even in the absence of a targeted promotion or a fare change.

Since randomized or quasi-random experiments present significant challenges, account filtering is instead used to estimate an upper bound on induced ride factors using AFC data (in the absence of any exogenous shocks). In order to control for time-invariant individual-level factors, the analysis uses a panel of AFC accounts that switched between using a pass product and using pay-per-use during the study period. The logic behind the account filtering is that two customer behaviors observed in the panel are inversely correlated with unobserved changes in individual-level travel demand. Setting minimum thresholds on these two behaviors should then limit changes in demand within the filtered panel of AFC accounts:

1. Minimum average weekly ridership across both pass products and pay-per-use. If an AFC account consistently has ridership near or above the pass “multiple” (the number of rides needed for total pay-per-use fares to equal the pass price), then expected ridership and resulting fare product costs are less likely to be a driving factor in the choice between fare products. When ridership is above some minimum threshold at all times, randomness will play a larger role in fare product choices relative to expected ridership (and resulting product costs). For example, imagine two accounts are observed using pay-per-use, one with ridership somewhat below the pass multiple and another riding slightly above the pass multiple. If both accounts are later observed increasing their ridership by the same amount and purchasing a pass, it is not known whether and to what degree an underlying change in travel demand caused the change in fare product choice (versus randomness or something unobserved); however, it seems less likely that an increase in demand caused the change in fare product for the account that started above the pass multiple.
2. Minimum number of switches between a pass and pay-per-use during the study period. A single, persistent switch by an AFC account from one product to the other seems more likely to be driven by a major one-time shift in underlying travel demand (such as changing work locations or purchasing a car) than multiple switches. Even if a customer switches between a pass and pay-per-use frequently based on smaller variation in the customer’s self-predicted future ridership, the error in those predictions may play a large role in the product choices.

It may seem appealing to also set a maximum threshold on the difference between pass and pay-per-use ridership for any individual AFC account; after all, there may be some a priori expectations about the potential range of induced ride factor values (probably greater than 1 and less than 2), so it would seem natural to remove AFC accounts that are observed with ridership changes in the wrong direction or seemingly too large. This would be a mistake. The goal is to measure induced ridership using precisely these changes in ridership, so imposing a restriction on that value assumes a result. In addition, any prior expectations about the potential range of induced ride factors are focused on average or median effects, but there may be larger impacts for some individual customers; removing large values in one direction or the other would bias the result.


The customer behaviors described above are examined in a complete panel of accounts to select cutoffs and filter to a final set of accounts. An estimated upper bound on the induced ride factor is based on the median percentage difference between average weekly ridership on a pass and average weekly ridership on pay-per-use across the final panel of accounts. This represents an upper bound because account filtering can only partially remove the impact of ridership on fare product choice (the reverse causality), which increases the observed difference in ridership levels between passes and pay-per-use. Induced ride factors can be estimated separately for users of different pass types and for different trip types.

6.3.2 Data

An AFC panel of CTA Ventra accounts is observed over one year from June 2016 through May 2017. At the start, this panel contains 2,025,705 accounts that only used four full-fare, account-based fare products for the entire year: pay-per-use, 3-day pass, 7-day pass, and 30-day pass. Measures of ridership are aggregated to the weekly level within each account, labeling each week of ridership with the fare product that was used (or “Multiple” if multiple products were used). Eight different measures of ridership are tracked:

1. total linked trips (counting a trip with a transfer as a single trip)
2. rail rides
3. bus rides
4. one-seat trips (without transfers)
5. transfer trips
6. peak period trips (7-9am and 4-6pm on weekdays)
7. off-peak trips (all other times)
8. weekend trips

The number of times each account changed fare product types between pay-per-use and any pass product (3-, 7-, or 30-day) is then observed across weeks. In order to control for time-invariant individual-level factors, at least one switch between pay-per-use and a pass for each account is required. This limits the panel to 164,812 accounts. As discussed in Chapter 5, there are important differences in purchase and ridership behaviors between 30-day pass-holders and 7-day pass-holders at the CTA, so it is desirable to analyze these two pass types separately. This results in two overlapping samples: 72,645 Ventra accounts that were observed switching at least once between a 30-day pass and pay-per-use, and 81,704 accounts that were observed switching between a 7-day pass and pay-per-use. (While use of 3-day passes is allowed in the panel, 3-day passes are not analyzed separately.)
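The switch count described above could be computed from an account's sequence of weekly product labels roughly as follows. This is a sketch only: the label strings, the function name, and in particular the treatment of "Multiple" weeks (skipped here rather than counted as a switch) are assumptions, since the text does not specify how such weeks were handled.

```python
PASS_PRODUCTS = {"3-day", "7-day", "30-day"}


def count_switches(weekly_products):
    """Count changes between pay-per-use and any pass across an account's weeks.

    Assumption for this sketch: weeks labeled "Multiple" (or anything else
    unrecognized) are skipped rather than treated as a switch.
    """
    switches = 0
    previous = None
    for product in weekly_products:
        if product == "pay-per-use":
            current = "ppu"
        elif product in PASS_PRODUCTS:
            current = "pass"
        else:
            continue
        if previous is not None and current != previous:
            switches += 1
        previous = current
    return switches


# Hypothetical account history, one label per week of use.
weeks = ["pay-per-use", "pay-per-use", "30-day", "30-day",
         "Multiple", "pay-per-use", "7-day"]
print(count_switches(weeks))  # 3
```

Moving between two pass types (for example, from a 7-day pass to a 30-day pass) is not counted, consistent with the text's definition of a switch as a change between pay-per-use and any pass product.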

For these two samples, individual-level percentage differences in average weekly ridership are calculated between weeks when the pass type of interest was used (either 30-day or 7-day) and weeks when pay-per-use was used. This produces a distribution of account-level percentage differences between ridership on a pass and ridership on pay-per-use, which is the basis of the analysis.

6.3.3 Analysis and Results

Analysis and results are presented in detail for 30-day passes versus pay-per-use, and in summary for 7-day passes versus pay-per-use.


The analysis begins with 72,645 Ventra accounts that switched at least once between a 30-day pass and pay-per-use (in either direction) during the period of study. Figure 6-1 and Figure 6-2 show the distribution of the difference and percentage difference in average weekly ridership during weeks when a pass was used relative to weeks when pay-per-use was used for this sample of Ventra accounts. The median percentage difference of +68% is the initial upper bound on the induced ridership effect, meaning that use of a pass is expected to boost ridership no more than 68% relative to pay-per-use; this is so high a bound as to be useless, since it is expected that most of the difference in ridership was not caused by choice of a pass but rather the other way around.

Figure 6-1: Difference in Average Weekly Trips Within Account (30-Day Pass minus Pay-Per-Use)

Source: CTA Ventra, Jun 2016 – May 2017

Figure 6-2: Percent Difference in Average Weekly Trips Within Account (30-Day Pass Relative to Pay-Per-Use)

Source: CTA Ventra, Jun 2016 – May 2017 Notes: Range of horizontal axis restricted to -100% to +500%. Bin size = 5%.

To reduce the likely impact of changes in demand on switching, accounts are filtered based on minimum average weekly ridership (i.e., the lesser of average weekly ridership for 30-day pass and for pay-per-use). Figure 6-3 shows boxplots of percentage difference in ridership for different values of minimum weekly ridership. At low values, the range of the percentage difference in ridership is very large (since small changes on a small denominator result in large percentage changes). Focusing on medians avoids problems with these extreme values. There is a large decline in the ridership difference between 30-day pass and pay-per-use as minimum ridership is increased (moving to the right). A cutoff of 8 minimum average weekly trips is selected, since the median appears to stabilize beyond this point.

Figure 6-3: Filtering Accounts on Minimum Average Weekly Ridership

Sources: CTA Ventra data, Jun 2016 – May 2017 Notes: Range of vertical axis restricted to -50% to +300% and horizontal axis restricted to <15. Outliers are not shown.

Accounts are similarly filtered based on the number of times they switched between pay-per-use and any pass product (in either direction). Figure 6-4 shows boxplots of percentage difference in ridership for accounts observed switching different numbers of times during the study period. Once again focusing on medians, there is a drop moving from one switch to two switches, and then a relatively tight range on the median for more than two switches. A cutoff of at least two switches is selected.


Figure 6-4: Filtering Accounts on Number of Switches Between Pay-Per-Use and Pass

Sources: CTA Ventra data, Jun 2016 – May 2017. Notes: Range of vertical axis restricted to -50% to +300%. Outliers are not shown.

Finally, the sample is restricted to only accounts with at least four weeks of use on each product (30-day pass and pay-per-use) to mitigate the impact of variation in individual weekly ridership. The final panel contains 4,163 accounts that pass the following three cutoffs:

1. At least 8 rides per week on both pass and pay-per-use
2. At least 2 switches between 30-day pass and pay-per-use (in either direction) in the year
3. At least 4 weeks of use on each product
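The three cutoffs and the median-based upper bound can be sketched in a few lines of Python. The account records below are invented for illustration, and the dictionary layout and function names are assumptions. Only the cutoff values and the use of the median percentage difference come from the text.

```python
from statistics import mean, median

MIN_WEEKLY_RIDES = 8  # cutoff 1: average weekly rides on both pass and pay-per-use
MIN_SWITCHES = 2      # cutoff 2: switches between pass and pay-per-use
MIN_WEEKS = 4         # cutoff 3: weeks of use on each product


def pct_difference(pass_weeks, ppu_weeks):
    """Percent difference in average weekly rides, pass relative to pay-per-use."""
    return (mean(pass_weeks) / mean(ppu_weeks) - 1.0) * 100.0


def passes_cutoffs(account):
    """Return True if an account satisfies all three cutoffs."""
    pass_weeks, ppu_weeks = account["pass_weeks"], account["ppu_weeks"]
    if len(pass_weeks) < MIN_WEEKS or len(ppu_weeks) < MIN_WEEKS:
        return False
    if account["switches"] < MIN_SWITCHES:
        return False
    # The lesser of the two averages must meet the ridership threshold.
    return min(mean(pass_weeks), mean(ppu_weeks)) >= MIN_WEEKLY_RIDES


# Hypothetical accounts: weekly ride counts by product, plus number of switches.
accounts = [
    {"pass_weeks": [11, 12, 10, 11], "ppu_weeks": [10, 10, 9, 11], "switches": 2},
    {"pass_weeks": [12, 12, 13, 11], "ppu_weeks": [10, 10, 10, 10], "switches": 3},
    {"pass_weeks": [9, 9, 9, 9], "ppu_weeks": [9, 10, 9, 8], "switches": 4},
    {"pass_weeks": [14, 15, 13, 14], "ppu_weeks": [3, 4, 2, 3], "switches": 1},  # fails 1 and 2
    {"pass_weeks": [10, 10], "ppu_weeks": [9, 9, 9, 9], "switches": 2},          # fails 3
]

final_panel = [a for a in accounts if passes_cutoffs(a)]
upper_bound_pct = median(
    pct_difference(a["pass_weeks"], a["ppu_weeks"]) for a in final_panel
)
print(len(final_panel), round(upper_bound_pct, 1))  # 3 10.0
```

In this toy panel, three accounts survive the filters and the median difference of +10% would be read as an upper-bound induced ride factor of 1.10.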

Figure 6-5 shows the distribution of the percentage difference between 30-day pass ridership and pay-per-use ridership for this final sample, which has a median of +11%. This result is interpreted as an upper bound: On “average,” the ridership impact of using a pass (with zero marginal cost) rather than pay-per-use is no more than 11% (equivalent to a multiplicative induced ride factor of 1.11).


Figure 6-5: Percent Difference in Average Weekly Trips Within Account After Filtering (30-Day Pass Relative to Pay-Per-Use)

Sources: CTA Ventra data, Jun 2016 – May 2017. Notes: Bin size = 2%.

The above result is for total trips. Figure 6-6 uses the same sample of accounts to look at the seven different trip or ride types. The +11% median value in Figure 6-5 is shown in red at the left. The other trip types are colored in groups by mode, transfers, and time period. The results form an interesting pattern. The median ridership difference is higher for bus rides than rail rides, higher for transfer trips than one-seat trips, and similar across time periods. As a reminder, these are differences within individual accounts in the sample; for example, the median account took 21% more bus rides in weeks when they used a pass than they took in weeks when they rode pay-per-use. These results suggest that bus ridership and transfer trips are particularly sensitive to the fare product being used (pass or pay-per-use) and should have larger induced ride factors. These trips are more likely to decline than other trip types if a customer switches from a pass to pay-per-use, and more likely to increase if a customer switches from pay-per-use to pass.


Figure 6-6: Percent Difference in Average Weekly Trips by Trip Type (30-Day Pass Relative to Pay-Per-Use)

Sources: CTA Ventra data, Jun 2016 – May 2017 Notes: Outliers not shown. For each trip type (other than total), some customers took zero trips of that trip type while using a 30-day pass, while using pay-per-use, or both. These customers with zero use are included in the distributions at extremes. Customers with zero rides on pay-per-use and positive ridership on a 30-day pass for a given trip type have an infinite percent difference in ridership, which is replaced with 200% (well above the median). Customers with zero rides on both pay-per-use and pass for a given trip type have an undefined percent difference, which is replaced with 0%. These data modifications would bias average differences, but should retain the correct median (as shown in the boxplots).
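The zero-use replacements described in the notes above can be sketched as follows. The replacement values (200% and 0%) are taken from the notes. The function name and the ridership pairs are hypothetical.

```python
from statistics import median

CAP_PCT = 200.0  # replacement for infinite differences, as described in the notes


def pct_diff_with_replacement(pass_rides, ppu_rides):
    """Percent difference (pass relative to pay-per-use) with zero-use handling."""
    if ppu_rides == 0 and pass_rides == 0:
        return 0.0       # undefined difference replaced with 0%
    if ppu_rides == 0:
        return CAP_PCT   # infinite difference replaced with 200%
    return (pass_rides / ppu_rides - 1.0) * 100.0


# Hypothetical average weekly bus rides (pass, pay-per-use) for five accounts.
bus_rides = [(6.0, 5.0), (0.0, 0.0), (3.0, 0.0), (4.4, 4.0), (2.0, 4.0)]

diffs = [pct_diff_with_replacement(p, u) for p, u in bus_rides]
print([round(d, 1) for d in diffs])  # [20.0, 0.0, 200.0, 10.0, -50.0]
print(round(median(diffs), 1))       # 10.0
```

Any replacement value above the median would leave the median unchanged for the infinite case, whereas the mean of these values would depend directly on the arbitrary choice of 200%.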

Thus far, the analysis has compared use of 30-day passes and pay-per-use. The same methodology is also applied to compare 7-day passes and pay-per-use. Examination of minimum average weekly ridership across fare products and number of times switching between fare products revealed similar patterns and suggested use of the same cutoffs as for the 30-day pass analysis. Application of the cutoffs gave a sample of 2,159 Ventra accounts with average weekly ridership of at least 8 during both 7-day pass weeks and pay-per-use weeks, at least 2 switches between 7-day pass and pay-per-use (in either direction) in the year, and at least 4 weeks of use on each product.

The results for 7-day passes, both for total trips and specific trip types, are shown in Figure 6-7. The median for all trips is +21% (corresponding to an upper bound induced ride factor of 1.21), about twice as high as the value for 30-day pass-holders. The patterns across trip types are similar to the 30-day pass analysis above – bus rides higher than rail rides and transfer trips higher than one-seat trips – but all the values are higher. Customers in the sample who are in the 7-day pass market are more sensitive to the marginal cost implications of the fare product they are using than customers in the 30-day pass market, suggesting that they should be assigned a higher induced ride factor.


Figure 6-7: Percent Difference in Average Weekly Trips by Trip Type (7-Day Pass Relative to Pay-Per-Use)

Sources: CTA Ventra data, Jun 2016 – May 2017 Notes: Outliers not shown. For each trip type (other than total), some customers took zero trips of that trip type while using a 7-day pass, while using pay-per-use, or both. These customers with zero use are included in the distributions at extremes. Customers with zero rides on pay-per-use and positive ridership on a 7-day pass for a given trip type have an infinite percent difference in ridership, which is replaced with 200% (well above the median). Customers with zero rides on both pay-per-use and pass for a given trip type have an undefined percent difference, which is replaced with 0%. These data modifications would bias average differences, but should retain the correct median (as shown in the boxplots).

6.4 Estimation of Elasticities at the MBTA

6.4.1 Motivation

Elasticities are the most familiar parameter used to predict the potential impacts of a change in fares. They have a simple and intuitive interpretation – you simply multiply the percent change in fare by the elasticity to get the predicted percent change in ridership – and they have been estimated empirically both in academia and throughout the transit industry.

As discussed in Chapter 3, the most common method for estimating transit fare elasticities appears to be aggregated time series models – either across a number of transit agencies (typically performed in academia) or across multiple fare changes at a single agency (often performed internally for regular ridership and revenue forecasting applications). These time series models can be estimated for all fare products together using average fares, or they can be estimated independently for different fare products or trip types. For a transit agency trying to learn from its own experience, this approach can work well, particularly if there are frequent fare changes and the relative prices of different fare products are stable. In fact, even considering a short time period around a single fare change at a single transit agency, a simple before-and-after analysis – the percent change in ridership from before to after the fare change – could be a reasonable basis for estimating an elasticity if there is not a general trend in ridership or any other major changes affecting ridership.

In many cases, however, this straightforward approach is not viable. Recent fare changes at the MBTA and the CTA are two examples. The MBTA changed most of its prices and fare levels in July 2016. It had been two years since the previous fare change in July 2014, and fares and prices changed quite differently across fare products; notably, the price of monthly passes for bus and subway service increased by 13% while the single-ride fare for subway increased only 7%. The CTA changed prices and fares in January 2018 for the first time in five years. (The previous fare change in January 2013 came shortly before roll-out of a new fare collection system.) By 2018, ridership was trending down on the bus network and rail ridership had also started to dip. As at the MBTA, prices and fares were changed differently for different fare products: 7-day passes were unchanged, 30-day pass prices went up 5%, and single-ride rail fares went up 11%. Underlying trends in ridership (such as the recent decline in bus ridership at agencies around the country) and customer switching between passes and pay-per-use fares complicate simple before-and-after comparisons. Using a typical aggregated time series model could control for a baseline trend in ridership (which should not be attributed to a fare change), but it does not adjust for fare product switching in a product-level analysis. In these types of settings – long gaps between fare changes, underlying trends in ridership, and different price changes for different fare products – how can transit agencies make use of available AFC data to estimate elasticities and learn from their fare change experiences?

A pair of approaches is demonstrated to estimate elasticities with AFC data, using the MBTA July 2016 fare change as a case study. The first approach is focused on pay-per-use fares, and the second is focused on pass products.

6.4.2 Panel Model for Estimating Pay-Per-Use Ridership

Methodology

As discussed above, an aggregated time series of total pay-per-use ridership spanning the MBTA fare change would lump together all three of the fare-related customer behaviors that need to be distinguished and quantified:

1. Elasticity. It is expected that continuing pay-per-use customers reduced their ridership as a result of increased fares.
2. Fare product choice. It is expected that there was a net shift away from pass products and into pay-per-use after the relative price of passes increased.
3. Induced ridership. It is expected that the additional customers who switched from passes to pay-per-use reduced their ridership once they experienced a marginal fare for their transit trips.

While there are strong a priori expectations about the directions of these individual behavioral effects, it is difficult to hypothesize about the net impact of the three together on aggregated pay-per-use ridership.


To narrow in on pay-per-use elasticity, the panel nature of AFC data and the marginal fare paid for each ride can be leveraged. A panel of MBTA CharlieCards (interpreted loosely as customers) that are active before and after the fare change using only pay-per-use fares (i.e. never purchasing a pass) is selected. This use of a fixed panel of cards that ride exclusively using pay-per-use removes any impacts of fare product choice and induced ridership; this analysis looks only at cards that did not switch fare products, and as a result none of the cards experienced an induced ridership effect. Within this panel, all customers experienced the same change in pay-per-use fares and could adjust their ridership behavior accordingly.

A fixed effects regression model is specified with the following structure:

\[
\log(\mathit{avg\_daily\_ridership})_{it} = \alpha_i + \mathbf{x}'_{it}\boldsymbol{\beta} + \gamma \cdot 1(\mathit{fare\_change})_t + u_{it},
\]

where

\(i = 1, \dots, n\) (individual CharlieCards); \(t = 1, \dots, T\) (time periods, either months or years)

\(\alpha_i\) = card-specific constant terms

\(\mathbf{x}'_{it}\) = other control variables

\(1(\mathit{fare\_change})_t\) = dummy variable equal to 0 in time periods before the fare change and 1 in time periods after the fare change

Breaking this model down in more detail:

• The dependent variable, \(\log(\mathit{avg\_daily\_ridership})_{it}\), is the log of observed average daily ridership for a specific card (\(i\)) during a particular time period (\(t\)).

• The card-specific constant terms, \(\alpha_i\), capture unobserved heterogeneity across cards that does not vary over time; in effect, these terms remove average ridership for each individual card (conditional on other control variables). This focuses the results on changes in ridership rather than on levels of ridership (which vary across cards).

• Specifications are tested with several different \(\mathbf{x}'_{it}\), which are combinations of control variables to account for seasonality and trends in ridership (in the absence of a fare change). Overall and card-specific effects for months of the year are tested separately; while average daily ridership already accounts for different numbers of days in each month, ridership might be expected to fall in some months and rise in others due to factors such as weather (an overall effect) or vacation patterns (more of a card-specific effect). Overall and card-specific time trends (both linear and quadratic) were also tested; if either all cards or a specific card had a downward trend in ridership before the fare change, a time trend avoids attributing that prior trend to the fare change.

• Having controlled for individual “fixed effects” and (depending on the specification) seasonality and time trends, \(\gamma\) represents the average impact of the fare change on log ridership. Because a log transform of ridership is used as the dependent variable, \(100\gamma\) can be interpreted as the approximate percentage impact of the fare change on ridership. Estimates of this parameter, \(\hat{\gamma}\), provide estimates of fare elasticity: elasticity = (% change in ridership) / (% change in fare) = \(100\hat{\gamma}\) / (% change in fare).
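The structure of this estimator can be sketched in Python with NumPy on a synthetic balanced panel. This is only an illustration under stated assumptions, not the thesis's estimation code (which used R): the panel size, the noise levels, and the "true" effect of -0.05 are invented, and the control variables \(\mathbf{x}'_{it}\) are omitted. Card fixed effects are handled with the within transformation, that is, subtracting each card's own mean before running least squares on the fare change dummy.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cards, n_periods = 500, 36
true_gamma = -0.05                   # assumed effect on log ridership
pct_fare_change = 2.25 / 2.10 - 1    # pay-per-use rail fare change cited in the text

card = np.repeat(np.arange(n_cards), n_periods)
period = np.tile(np.arange(n_periods), n_cards)
fare_change = (period >= 24).astype(float)   # dummy: 1 in the last 12 periods

alpha = rng.normal(0.0, 1.0, n_cards)        # card-specific constants
noise = rng.normal(0.0, 0.1, card.size)
log_riders = alpha[card] + true_gamma * fare_change + noise

def demean_by_card(values):
    """Subtract each card's own mean (the 'within' transformation)."""
    card_means = np.bincount(card, weights=values) / np.bincount(card)
    return values - card_means[card]

y = demean_by_card(log_riders)
d = demean_by_card(fare_change)
gamma_hat = float(d @ y / (d @ d))           # OLS slope on demeaned data
elasticity = gamma_hat / pct_fare_change     # both changes as proportions

print(round(gamma_hat, 3), round(elasticity, 2))
```

Adding seasonality or trend controls would mean including more columns in the regression after the same demeaning step; the sketch keeps only the fare change dummy to make the fixed effects logic visible.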

Note that this analysis does not include other time-period-level control variables for specific “external factors” that might affect ridership, such as transit service levels, weather, employment, and gas prices.


These controls would typically be included in a longer aggregate time series regression model, but they are omitted here for a few reasons. First, one goal of this analysis is to demonstrate elasticity estimation using AFC data on its own, without requiring that additional information be gathered. Second, the time period of analysis is relatively short since it focuses on a single fare change event; a longer time series would be needed to estimate and distinguish the relationships between variables like gas prices or population and transit ridership, and these factors may have insignificant impacts on ridership over the short time period of analysis. Third, the effect of any steady changes in “external” or non-fare factors can be controlled for by including time trends in the panel model. All that said, it is important to consider the external factors that may have driven changes in ridership during the study period, especially in interpreting estimated pre-fare-change time trends and deciding whether they represent a reasonable baseline assumption. In some cases, this is the best information that can be hoped for; any sudden external changes that coincided with the fare change will be difficult or impossible to separate out in a regression model, and data for other important factors may not be available (such as the growth of ride-hailing ridership over time).

A potential downside to this methodology is selection bias. This analysis focuses on a panel of CharlieCards meeting certain criteria – using only full-fare pay-per-use and active for a certain number of months. Will the results for this panel be generalizable to the pay-per-use riders that were excluded, including customers riding in only a few months, customers paying discounted fares, customers who switched between fare products, and customers using CharlieTickets?

Another limitation is that the panel regression models provide estimates of pay-per-use elasticity conditional on non-zero transit ridership. This analysis focuses on active transit riders to avoid complications of churn or turnover in cards; however, if a fare change caused some pay-per-use customers to leave the system entirely (e.g. switching to a car commute), this behavior would not be captured in the elasticity estimates. Applying the resulting elasticities to aggregate pay-per-use ridership assumes that the impact on net churn in cards is proportionally the same as the conditional elasticity.

Data

There are AFC records for 5.3 million CharlieCards that were active at any time during a three-year period between July 2014 and June 2017 (a total of 31.2 million active card-months). This analysis is limited to the subset of those cards that used only full-fare pay-per-use over the entire study period – about 2.7 million cards and 15.5 million active card-months.

Average daily ridership (all on pay-per-use) is calculated on a monthly basis for each card in the panel using five measures of ridership:

1. Total linked trips32
2. Weekday peak trips
3. Weekend trips
4. Trips beginning with a tap at a gate (mostly rail trips)
5. Trips beginning with a tap at a farebox (mostly bus trips)

32 Total linked trips are divided by (weekdays + 0.5*weekend-days) for each month to roughly adjust for different numbers of weekdays and weekend days in each month. Weekend days have roughly half the total ridership of weekdays at the MBTA.
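The day-count adjustment in footnote 32 can be sketched in Python using the standard `calendar` module. The function names are hypothetical, and the sketch assumes that holidays are treated like any other weekday, which the footnote does not specify.

```python
import calendar

def weighted_days(year, month, weekend_weight=0.5):
    """Weekdays plus half-weighted weekend days in a calendar month."""
    n_days = calendar.monthrange(year, month)[1]
    weekdays = sum(1 for day in range(1, n_days + 1)
                   if calendar.weekday(year, month, day) < 5)
    weekend_days = n_days - weekdays
    return weekdays + weekend_weight * weekend_days

def avg_daily_ridership(linked_trips, year, month):
    """Monthly linked trips divided by the weighted number of days."""
    return linked_trips / weighted_days(year, month)

# July 2016 has 21 weekdays and 10 weekend days: 21 + 0.5 * 10 = 26.0
print(weighted_days(2016, 7))
```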


Note that AFC data is recorded at the level of “taps” or fare stages, not linked trips; an existing data cleaning process for other AFC research applications was used to roll up to the level of linked trips (i.e. grouping transfer rides together as a single trip).

In order to have a sufficient number of observations for each card in the regression analysis, two subsets of cards in this panel are selected:

1. Cards that were active in at least 18 months (196,638 cards and 4,798,324 active card-months)
2. Cards that were active in all 36 months (8,062 cards and 290,232 active card-months)
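The two selection rules can be sketched in Python as follows. The record layout (card ID, month index, fare product), the product labels, and the function name `select_panels` are all assumptions for illustration; the actual MBTA AFC schema is not described in the text.

```python
from collections import defaultdict

def select_panels(records, n_months=36, min_months=18):
    """Return (cards active in >= min_months, cards active in all n_months),
    restricted to cards that only ever used pay-per-use."""
    months = defaultdict(set)
    products = defaultdict(set)
    for card_id, month, product in records:
        months[card_id].add(month)
        products[card_id].add(product)
    ppu_only = {c for c, p in products.items() if p == {"pay-per-use"}}
    at_least = {c for c in ppu_only if len(months[c]) >= min_months}
    every = {c for c in ppu_only if len(months[c]) == n_months}
    return at_least, every

# Hypothetical toy records: (card_id, month index 0..35, fare product).
records = (
    [("A", m, "pay-per-use") for m in range(36)]      # active every month
    + [("B", m, "pay-per-use") for m in range(20)]    # active in 20 months
    + [("C", m, "pay-per-use") for m in range(5)]     # too few active months
    + [("D", m, "monthly pass") for m in range(36)]   # purchased passes
)
at_least_18, all_36 = select_panels(records)
print(sorted(at_least_18), sorted(all_36))
```

Note that the second subset is contained in the first, so cards active in all 36 months also count toward the "at least 18 months" group.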

Figure 6-8 shows the distributions of average daily ridership for the entire panel and these two subsets. Relative to all pay-per-use only cards, those that are active in at least 18 months have similar but somewhat higher ridership levels. Cards active in all 36 months have a much higher ridership distribution, suggesting that this subset of 8,062 cards may not be representative of CharlieCards with low or infrequent ridership.

Figure 6-8: Distribution of Tap Frequency for Pay-Per-Use Panel and Panel Subsets

Sources: MBTA AFC

The subset of cards that are active in all 36 months can also be analyzed at the annual level. (Annual analysis is not straightforward with the other subset because cards are active for different numbers of months in different years.) Figure 6-9 shows the distribution of log average daily ridership by fiscal year during the study period for this subset of cards that were active in each month. It appears that there was a small but visible downward shift in ridership in FY2017 (which began July 1, 2016, the same time as the fare change). The aim of the regression analysis is to quantify that shift and to see if it is still present after accounting for individual fixed effects, seasonality, and time trends.


Figure 6-9: Distribution of Tap Frequency by Fiscal Year (Pay-Per-Use Cards Active in All 36 Months)

Sources: MBTA AFC

Analysis and Results

This section presents the results of different regression specifications for the two subsets of CharlieCards described earlier. The focus of discussion is on estimates of \(\hat{\gamma}\) – the coefficient on the fare change dummy variable, which represents the average impact of the fare change on log ridership.

Panel of Pay-Per-Use CharlieCards Active in All 36 Months

As discussed above, the subset of cards in the panel that are active in all 36 months (July 2014 through June 2017) can be analyzed in two different ways – either on an annual basis over three fiscal years, or on a monthly basis.

The annual analysis avoids seasonality in monthly ridership, but the small number of time period observations (T=3) limits the controls that can be included for time trends. Three different annual specifications are tested:

1. No controls. For reference, a model is run with no controls, only a dummy variable for the fare change (equal to 1 in FY17). The resulting \(\hat{\gamma}\) is simply the difference in average log ridership in the panel between FY2017 and the previous two years.
2. Card fixed effects (“FE”). Including card fixed effects controls for the average ridership level of each card, so the resulting \(\hat{\gamma}\) is the average card-level change in log ridership in FY2017 relative to the previous two years (the average of FY15 and FY16).
3. Card FE and Card Time Trend (Linear). Adding card time trends essentially draws a line through the FY15 and FY16 observations for each card. The resulting \(\hat{\gamma}\) measures the average card-level difference between that trend and log ridership in FY2017.
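With only three annual observations per card, the point estimates described in specifications 2 and 3 reduce to simple card-level differences, which can be sketched directly in Python. The three cards and their log ridership values below are invented for illustration, and the sketch reproduces only the point estimates as described in the list, not the standard errors from the regressions.

```python
from statistics import mean

# Hypothetical log average daily ridership per card: (FY15, FY16, FY17).
cards = [(0.50, 0.48, 0.42), (1.10, 1.10, 1.04), (-0.20, -0.26, -0.30)]

# Card fixed effects: FY17 minus the card's own FY15-FY16 average.
gamma_fe = mean(y17 - (y15 + y16) / 2 for y15, y16, y17 in cards)

# Card FE plus card linear trend: FY17 minus the line through FY15 and FY16,
# extrapolated one year ahead (2 * FY16 - FY15).
gamma_trend = mean(y17 - (2 * y16 - y15) for y15, y16, y17 in cards)

print(round(gamma_fe, 3), round(gamma_trend, 3))
```

In this toy panel the third card was already declining before FY17, so the trend specification attributes less of its drop to the fare change than the fixed effects specification does.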


Figure 6-10 plots the results for these three specifications, estimated separately for the five journey types described earlier.33 The scale on the top plot is the \(\hat{\gamma}\) value, which measures the percentage impact of the fare change. The bottom plot shows the same results, but the scale is divided by the percentage change in CharlieCard pay-per-use rail fares ($2.25/$2.10-1 ≈ 7.1%) to give an estimated pay-per-use elasticity. The three specifications give similar results for the model of total linked trips – a percentage impact on ridership of roughly -5%, which implies an elasticity of about -0.7.
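The arithmetic behind the rescaled axis can be checked in a few lines of Python. The fares are those quoted above; the value of -0.05 is the rounded "roughly -5%" impact from the text, so the result is approximate.

```python
old_fare, new_fare = 2.10, 2.25
pct_fare_change = new_fare / old_fare - 1   # proportional change in fare
gamma_hat = -0.05                           # rounded impact on log ridership
elasticity = gamma_hat / pct_fare_change

print(f"{pct_fare_change:.1%}", round(elasticity, 1))
```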

Figure 6-10: Regression Results for Annual Analysis of Pay-Per-Use Cards Active in All 36 Months

Sources: MBTA AFC Notes: The top plot shows 95% confidence intervals for the estimated coefficient on the fare change dummy variable (\(\hat{\gamma}\)). In the bottom plot, the vertical scale is divided by the percentage change in CharlieCard pay-per-use rail fares in the July 2016 fare change ($2.25/$2.10-1 ≈ 7.1%) to give an estimated pay-per-use elasticity.

Results are also similar across specifications and a bit lower in magnitude for weekday peak trips and trips beginning at a gate. This is consistent with the expectation that peak trips and rail trips are less sensitive to fares than other trip types. However, the specifications give very different results for weekend trips and farebox trips (largely bus) depending on whether card time trends are included; without time trends, the estimated elasticity is very high, and when the time trends are included there is no detectable impact of the fare change. Figure 6-11 suggests the reason is that card-level decline in weekend and farebox ridership in the panel was similar from FY15 to FY16 (before the fare change) and from FY16 to FY17. The regression with time trends assumes that the prior-year trend would have continued even if there had not been a fare change; whether that assumption is reasonable is a matter of judgment. It seems likely that elasticities for these trip types are actually somewhere in the middle – larger in magnitude than the elasticities for peak and rail trips, but probably still “inelastic” (smaller than 1 in magnitude).

33 All regressions were estimated using the plm and felm functions in the plm and lfe packages (respectively) developed for the open source software R.

Figure 6-11: Median of Average Daily Trips (Demeaned at Card Level) Shows Smooth Trend for Weekend and Farebox Trips

Source: MBTA AFC Notes: Monthly card observations of average daily trips were demeaned using the relevant card fiscal year average. February 2015 was removed from the calculations due to severe winter storms.

Analysis of the same panel at the monthly level gives similar results. Note that February 2015 was removed from the monthly analysis due to severe winter storms. Results are presented for four different monthly specifications – the same three as before, plus a specification with both general calendar month dummy variables and card time trends:

1. No controls.
2. Card fixed effects (“FE”).
3. Card FE and Card Time Trend (Linear). The card time trend for the monthly analysis is a trend through the FY15 and FY16 monthly observations for each card.
4. Card FE, Month Dummies, and Card Time Trend (Linear). Including dummy variables for the calendar months (Jan, Feb, etc.) controls for system-wide seasonality such as typical weather, holidays, etc.

Figure 6-12 displays the results for the monthly level. As expected, the results are generally similar to the annual analysis of the same panel subset. Elasticity estimates for total trips are mostly around -0.8, and somewhat lower for weekday peak trips and “gate” (rail) trips. As before, the specifications without card time trends suggest that weekend and “farebox” (bus) trips have higher elasticities than other trip types; however, including card time trends and calendar month dummies reduces any fare change impact. Again, it is left up to interpretation whether the downward trend in weekend and bus ridership would have continued in the absence of the fare change.


Figure 6-12: Regression Results for Monthly Analysis of Pay-Per-Use Cards Active in All 36 Months

Sources: MBTA AFC Notes: The top plot shows 95% confidence intervals for the estimated coefficient on the fare change dummy variable (\(\hat{\gamma}\)). In the bottom plot, the vertical scale is divided by the percentage change in CharlieCard pay-per-use rail fares in the July 2016 fare change ($2.25/$2.10-1 ≈ 7.1%) to give an estimated pay-per-use elasticity.

Panel of Pay-Per-Use CharlieCards Active in At Least 18 Months

The same regression specifications are estimated using the second panel subset – pay-per-use CharlieCards that were active in at least 18 months. For these cards, the dataset only includes months with at least one trip, so the resulting elasticities are conditional on a card being active in a given month.34 It was not feasible to estimate regression models on the entire panel subset, so a random sample of 10,000 cards (about 5% of the subset, with 244,421 active card-months) was selected. As with the previous monthly analysis, February 2015 was removed due to severe winter storms.

Figure 6-13 shows the results for this sample of 10,000 cards that were active in at least 18 months. The resulting estimates of fare sensitivity are larger in magnitude than results from the smaller panel of cards active in all 36 months, but they follow a similar pattern across trip types. The implied elasticity for total trips is estimated between -1 and -2; results are somewhat lower in magnitude for weekday peak trips and trips starting at gates, and estimates for weekend trips and trips beginning at fareboxes are reduced dramatically with the inclusion of card time trends and calendar month dummies.

34 The same models were also estimated on a modified dataset where card-months without any AFC activity were added with ridership set equal to zero, but the results were highly variable. The appropriate model for predicting active versus inactive months is not a linear regression but rather some kind of binary choice model; this would complement the continuous linear regression model of ridership level conditional on being active. Additionally, inactivity on a card might result from a customer using multiple cards rather than reducing their transit use.

Figure 6-13: Regression Results for Pay-Per-Use Cards Active in At Least 18 Months

Sources: MBTA AFC Notes: The top plot shows 95% confidence intervals for the estimated coefficient on the fare change dummy variable (\(\hat{\gamma}\)). In the bottom plot, the vertical scale is divided by the percentage change in CharlieCard pay-per-use rail fares in the July 2016 fare change ($2.25/$2.10-1 ≈ 7.1%) to give an estimated pay-per-use elasticity.

This similarity in the shape of results across the two panel subsets is encouraging, since the panel of cards active in at least 18 months is more representative of the universe of pay-per-use CharlieCards. However, the magnitude of implied elasticities from the panel of cards active in at least 18 months (greater than 1 in magnitude overall) is surprising; both conventional wisdom and empirical results from earlier MBTA fare changes and other transit systems suggest that transit travel is price inelastic in the short run (elasticities smaller than 1 in magnitude). For example, the MBTA FERRET model currently assumes a “base” or middle-of-the-road elasticity of -0.25 for full-fare, pay-per-use bus and heavy rail rides.

Between the two panel subsets, the results from the smaller subset of cards that were active in all 36 months of the study period are preferred. While this subset was less representative of all pay-per-use customers (tending to ride transit more frequently), the results followed a similar pattern to the more representative subset of cards active in at least 18 months, and the magnitude of implied elasticities accorded better with broader empirical evidence on fare elasticities. Those estimates still suggest that pay-per-use elasticities are considerably larger in magnitude than currently assumed in the MBTA FERRET model – perhaps -0.7 overall, smaller in magnitude than -0.7 for peak and rail ridership, and (assuming consistency with elasticity patterns in the academic literature) larger in magnitude than -0.7 for off-peak and bus ridership.


6.4.3 “Forensic Bounding” of Pass Elasticities

Methodology

In Section 6.4.2, fare elasticities were estimated for pay-per-use ridership using a panel of CharlieCards. This section estimates elasticities for sales of the MBTA’s major rapid transit pass products, the monthly LinkPass and 7-day LinkPass. As discussed in Chapter 5, it is important to distinguish between Corporate LinkPasses (sold through employers) and non-Corporate LinkPasses (sold at FVMs and through other sale channels). Unfortunately, the AFC panel approach is not feasible for estimating pass elasticities. For pay-per-use ridership, each ride is an individual “sale” that could be influenced by the pay-per-use fare. By contrast, the marginal cost of ridership on a LinkPass is zero both before and after the fare change; so, if a panel of pass-holders were selected, they would experience no variation in marginal fares and their ridership would not be expected to change. Instead, elasticities for pass products communicate the impact of fare changes on pass sales (and relate only indirectly to pass ridership).

To study changes in pass sales, it is essential to look at all pass-holders. This includes pass-holders who switch between purchasing passes and riding pay-per-use and pass-holders who appear or disappear in the AFC records (such as a customer who purchases a pass before the fare change but is never observed in AFC after the fare change). These are the two types of churn discussed at the beginning of this chapter – churn in fare products and churn in cards. As discussed earlier, problems with this churn can sometimes be avoided by estimating aggregate time series models of total pass sales; this assumes that typical switching between fare products and normal card turnover net out to zero impact on total pass sales. However, since MBTA pass prices were raised proportionally more than pay-per-use fares, some switching away from passes toward pay-per-use ridership is expected. It is not possible to separate that switching from the elasticity effect in an aggregate time series analysis.

Instead, an alternative approach described as “forensic bounding” is demonstrated. This approach uses individual-level information on changes in sales and ridership from a comprehensive panel of cards and tickets to partially remove the effects of fare product switching from an aggregate analysis of changes in sales.

Step 1: Calculate changes in fare product sales and ridership for each individual card and ticket

The approach begins by looking at individual-level sales (for pass products) and ridership (for passes and pay-per-use) by fare product in two time periods (t0 and t1). Since it begins at the level of individuals, changes in sales and ridership by fare product can be calculated for each card and ticket. Two types or cases of changes are distinguished:

1. Switching (“traceable” cards). The first case is a card or ticket that is observed in both time periods. In this case, changes in sales and ridership on any particular fare product could be described as a combination of switching between fare products and changing level of demand. If the term of a pass product matches the time period of analysis (e.g. a monthly pass and a month-long time period), then all changes in pass sales represent only “switching.” For example, suppose a card is active in two different months (i.e. has positive ridership in both months) but only purchases a monthly pass in the first month. This card has a change of -1 in monthly pass sales, and this change must involve switching from the monthly pass to one or more other fare products.35
2. Appearance and disappearance (“untraceable” cards and tickets). The second case is a card or ticket that is only active (i.e. only appears in AFC data) in one of the two time periods. In this case, changes in sales and ridership can be described as appearance (for a card that was inactive in t0 but active in t1) or disappearance (for cards active in t0 but inactive in t1). Cards could appear and disappear for many different reasons – a customer using multiple cards at different times, a lost or replacement card for an existing customer, a customer using disposable fare media such as a ticket or a time-limited card, or a customer entering the system for the first time or leaving entirely. In some cases disappearance and appearance of cards is actually continued use of the same fare product (e.g. if a customer buys a monthly pass on a new ticket every month) or switching between fare products (e.g. if a customer uses an employer-provided pass one month and then uses pay-per-use on a personal card the next month). But from what can be observed in AFC, it is only known that the card appeared or disappeared.

Table 6-1 illustrates what individual-level changes might look like and how they would be categorized. Changes for cards 1 and 3 are categorized as switching because the total ridership for those cards (across all fare products, including pay-per-use) is greater than zero in both t0 and t1. Changes for cards 2 and 4 are categorized as appearance and disappearance because they are only active (positive total ridership) in one of the two time periods.

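Table 6-1: Example Classification of Individual Changes in Sales and Ridership

| Card | Fare Product | Sales in t0 | Sales in t1 | Change in Sales | Ridership in t0 | Ridership in t1 | Change in Ridership | Type of Change |
|---|---|---|---|---|---|---|---|---|
| 1 | Monthly Pass | 1 | 0 | -1 | 35 | 0 | -35 | Switch |
| 1 | Pay-per-use | N/A | N/A | N/A | 0 | 20 | +20 | Switch |
| 2 | Monthly Pass | 0 | 1 | +1 | 0 | 42 | +42 | Appear |
| 3 | Monthly Pass | 0 | 1 | +1 | 0 | 33 | +33 | Switch |
| 3 | 7-day Pass | 1 | 0 | -1 | 15 | 0 | -15 | Switch |
| 3 | Pay-per-use | N/A | N/A | N/A | 7 | 4 | -3 | Switch |
| 4 | 7-day Pass | 2 | 0 | -2 | 18 | 0 | -18 | Disappear |
| 4 | Pay-per-use | N/A | N/A | N/A | 4 | 0 | -4 | Disappear |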
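The classification rule used in Table 6-1 can be sketched in Python. The ridership values are copied from the table; the function name `classify` and the tuple layout are assumptions for illustration. A card is labeled by whether its total ridership across all fare products is positive in t0, in t1, or in both.

```python
# Rows of Table 6-1: (card, fare product, ridership in t0, ridership in t1).
rows = [
    (1, "Monthly Pass", 35, 0), (1, "Pay-per-use", 0, 20),
    (2, "Monthly Pass", 0, 42),
    (3, "Monthly Pass", 0, 33), (3, "7-day Pass", 15, 0), (3, "Pay-per-use", 7, 4),
    (4, "7-day Pass", 18, 0), (4, "Pay-per-use", 4, 0),
]

def classify(rows):
    """Label each card by whether it is active (total ridership > 0) in t0 and t1."""
    totals = {}
    for card, _product, r0, r1 in rows:
        t0, t1 = totals.get(card, (0, 0))
        totals[card] = (t0 + r0, t1 + r1)
    labels = {}
    for card, (t0, t1) in totals.items():
        if t0 > 0 and t1 > 0:
            labels[card] = "Switch"       # traceable: seen in both periods
        elif t1 > 0:
            labels[card] = "Appear"       # only seen in t1
        elif t0 > 0:
            labels[card] = "Disappear"    # only seen in t0
    return labels

print(classify(rows))
```

Cards with no ridership in either period never enter the AFC records for these periods, so the sketch simply leaves them unlabeled.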

Step 2: Aggregate individual-level changes into total “net traceable switching” and “net churn in cards” for pass products

With these individual-level changes in hand, the aggregated change in pass sales from t0 to t1 can be calculated for each type of individual-level change – switching versus appearance and disappearance of cards. The sum of all switching is referred to as net traceable switching and the sum of all appearance and disappearance is called net churn in cards. Table 6-2 shows an example aggregation of the individual-level data in Table 6-1.

35 The situation is more complicated for 7-day passes. For example, if instead a customer purchased four 7-day passes in one month and two 7-day passes in a later month, this could reflect either switching to another fare product, scaling back 7-day pass purchases without switching over to a different fare product, or a combination. Likewise, changes in pay-per-use ridership could reflect either switching or changes in frequency on the same fare product. No attempt is made here to distinguish these sub-cases; customers can use multiple fare products in a single time period and can change their ridership across time periods, which complicates efforts to directly connect a change in one product to a change in another product.

Table 6-2: Example Aggregation of Individual-Level Changes in Sales and Ridership

| Fare Product | Net Traceable Switching | Net Churn in Cards | Total Net Change |
|---|---|---|---|
| Monthly Pass | -1 (Card 1) + 1 (Card 3) = 0 | +1 (Card 2) | +1 |
| 7-day Pass | -1 (Card 3) | -2 (Card 4) | -3 |
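The aggregation in Table 6-2 can be reproduced with a short Python sketch. The sales changes and change types are taken from the pass rows of Table 6-1; the function name `aggregate` and the dictionary keys are assumptions for illustration.

```python
from collections import defaultdict

# Pass rows of Table 6-1: (card, pass product, change in sales, type of change).
changes = [
    (1, "Monthly Pass", -1, "Switch"),
    (2, "Monthly Pass", +1, "Appear"),
    (3, "Monthly Pass", +1, "Switch"),
    (3, "7-day Pass", -1, "Switch"),
    (4, "7-day Pass", -2, "Disappear"),
]

def aggregate(changes):
    """Sum sales changes into net traceable switching and net churn in cards."""
    totals = defaultdict(lambda: {"switching": 0, "churn": 0})
    for _card, product, delta, kind in changes:
        key = "switching" if kind == "Switch" else "churn"
        totals[product][key] += delta
    return dict(totals)

result = aggregate(changes)
for product, parts in result.items():
    # Total net change is the sum of the two components.
    print(product, parts, parts["switching"] + parts["churn"])
```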

Separating these two components of net changes is useful for estimating pass elasticities, since the elasticity effect – reduction in sales unrelated to fare product choice – only affects pass sales through net churn in cards.36,37 If net churn in cards is observed across time periods that span a fare change, this should include the elasticity effect of the fare change on pass sales while partially excluding the effects of fare product switching. As discussed above, net churn in cards can additionally include customers who are, in reality, switching between fare products (such as discarding a disposable ticket for one fare product and purchasing a different product on another card or ticket); this introduces a positive or negative bias to any estimates of elasticity based on net churn in cards.

Step 3: Remove baseline net churn in cards and calculate elasticity

Before estimating pass elasticities from net churn in cards, it is important to specify a baseline or counterfactual level of net churn in cards. Individual-level changes are occurring all the time, even in the absence of any new policies such as fare changes. This generates a baseline level of net traceable switching between fare products and net churn in cards between any two time periods. The baseline level of net churn could be non-zero and could have a non-zero trend. This baseline net churn should be estimated from time periods that do not contain a fare change, and the baseline level should be removed before attributing net churn to an elasticity effect from a fare change. For example, what would be the interpretation of a positive net churn in cards from a month before a fare change to the same month after the fare change (a year later)? This could be the expected baseline level of net churn in cards (even if there were not a fare change), it could be higher-than-expected net churn in cards, or it could be lower-than-expected net churn in cards; the baseline level of net churn in cards would need to be estimated and removed to even know the direction of the impact.38 As in the panel analysis of pay-per-use riders, use of a short time period for analysis and estimation of a pre-fare-change time trend mitigate potential bias from non-fare factors (which are not included explicitly in the calculations). For example, changes in transit service that occurred well before the study period would not affect the results, and steady changes in local population would in theory be captured by the pre-fare-change time trend in net churn.

36 This is only strictly true if the time period of analysis corresponds to the term of the pass product under study. For example, it is true for a monthly analysis of monthly passes; however, in a monthly analysis of 7-day passes, the elasticity effect could appear as a change in sales frequency without switching between fare products.

37 Note again that this is not true for pay-per-use ridership, where individual ridership levels can increase or decrease as a response to a fare change; however, it does hold for ridership on zero-marginal-cost passes, where elasticities are not expected to affect ridership conditional on pass purchase.

38 Note that baseline net churn in cards for a particular pass type could be positive even if the total net change in pass sales is zero, since baseline net traceable switching might be negative. For example, if the “customer life cycle” had a tendency to join the system on a pass, switch to pay-per-use, and eventually leave the system, then baseline net churn in cards on the pass would be positive and baseline net traceable switching on the pass would be negative.


Once baseline net churn in cards is removed, the remaining net churn in cards for a particular pass product can be divided by its total sales in the first time period. This gives the percent change in sales due to net churn in cards beyond what is normally expected. Dividing this percent change in sales for a pass by the percent change in the pass price gives an estimate of the pass elasticity.
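The calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `implied_pass_elasticity` and its argument names are hypothetical, and the example inputs are the non-Corporate Monthly LinkPass values reported in Table 6-3 under the FY16 average baseline.

```python
def implied_pass_elasticity(net_churn, baseline_net_churn,
                            first_period_sales, pct_price_change):
    """Elasticity implied by net churn in cards beyond an assumed baseline."""
    # Net churn in cards that is not explained by the baseline
    beyond_baseline = net_churn - baseline_net_churn
    # Express it as a share of sales in the first time period
    pct_change_in_sales = beyond_baseline / first_period_sales
    # Ratio of percent change in sales to percent change in price
    return pct_change_in_sales / pct_price_change


# Non-Corporate Monthly LinkPass, FY16 average baseline (values from Table 6-3)
elasticity = implied_pass_elasticity(
    net_churn=-67_862,
    baseline_net_churn=9_848,
    first_period_sales=1_042_860,
    pct_price_change=0.127,
)
print(round(elasticity, 2))  # -0.59
```

The result depends directly on the baseline that is subtracted, which is why the choice of baseline receives so much attention in this section.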

Step 4: Adjust elasticities based on prior knowledge of pay-per-use elasticity

This approach cannot effectively separate the impacts of elasticity and fare product choice for pay-per-use ridership. As mentioned above, “traceable” changes in pay-per-use ridership could reflect either fare product switching (relating to fare product choice) or changes in demand level on the same fare product (part of the elasticity effect); subtracting net traceable switching does not help to narrow in on elasticity. This is why the earlier part of this section used a different method (a pay-per-use-only panel regression model) to separately estimate pay-per-use elasticities.

However, if there is any prior knowledge of pay-per-use elasticities, they can potentially be used to adjust pass elasticities under this method. Consider the net change in pay-per-use ridership from before a fare change to after the fare change. If a pay-per-use elasticity is assumed, this implies an elasticity effect, which can be subtracted out of the observed net change in ridership. Similarly, any baseline trend in pay-per-use ridership can be removed. The remaining net change in pay-per-use ridership represents any net fare product switching and induced ridership related to the fare change. This switching should, ideally, be equal and opposite to the net traceable switching for passes that was calculated earlier (assuming that pay-per-use and passes are the only two fare product options, and after adjustments for induced ridership on passes). If it is not equal, it could be assumed that the difference is caused by net switching that was not traceable in AFC and thus was categorized as net churn in passes. This difference could then be subtracted from the net churn in cards for passes (since it relates to fare product choice rather than elasticities), and the pass elasticities could be recalculated. This “scaling” of sorts ensures that the combined impact of the pass elasticities and assumed pay-per-use elasticity are consistent with the overall net change in ridership across all fare products (as always, net of any assumed baseline adjustments).

Data

This approach is applied to the MBTA surrounding the July 2016 fare change. The objective is to estimate elasticities for three major full-fare pass products – the Corporate LinkPass, non-Corporate LinkPass, and 7-day LinkPass – so analysis is limited to those products and full-fare pay-per-use.

This approach requires individual-level data on sales (for passes) and ridership (for both passes and pay-per-use) for both a period of time around the fare change and a baseline period before the fare change. Three fiscal years are used as the study period, two before and one after the fare change (July 2014 through June 2017).

AFC contains ridership data for CharlieCards and CharlieTickets, and it contains sale transactions for 7-day LinkPasses and most non-Corporate monthly LinkPasses. However, sales of monthly LinkPasses through the Corporate Pass Program and Semester Pass Program are recorded in a separate administrative database (not in the AFC system). Individual-level sales information from the administrative database is combined with the ridership data in AFC.

There are several data challenges in distinguishing Corporate and non-Corporate Monthly LinkPasses for the purpose of this analysis. One is the AccessMIT program, which provides free transit to MIT employees that is billed to MIT on a pay-per-use basis. The program began rolling out at the same time as the MBTA fare change (July 2016, at scale in Sep. 2016). In the MBTA AFC and pass sales data, this change effectively added over 6,000 cards that appear to have monthly LinkPasses but do not represent sales (they are “AccessMIT passes” or “Mobility passes” billed to MIT retroactively based on actual use). It also effectively replaced about 3,000 Corporate LinkPasses that were previously purchased by MIT employees (subsidized by MIT). AccessMIT passes are removed entirely from the analysis, and observed net churn in Corporate LinkPasses is adjusted to remove the disappearance of the 3,000 MIT Corporate LinkPasses (which was not an impact of the fare change). Another challenge is Semester passes. Ideally, Semester passes could be removed from non-Corporate LinkPasses, since they are a substantively different product (priced at a discount and distributed through an exclusive sale channel). However, it is not currently possible to distinguish Semester passes from other non-Corporate LinkPasses in AFC prior to Spring 2015; since July 2014 through June 2015 is used to calculate a baseline of YOY changes, Semester passes are left in non-Corporate LinkPasses for consistency. This should not have a dramatic effect on results (given the relatively small volume of Semester passes), but it adds noise to non-Corporate Monthly LinkPass churn.

Changes in 7-day pass distribution at the same time as the fare change also limit options for analyzing 7-day passes. 7-day passes became available to load on CharlieCards at FVMs beginning April-May 2016; previously, customers could only load 7-day passes on CharlieCards at ticket windows, so 7-day passes were primarily sold on CharlieTickets. CharlieTickets are not traceable year-over-year, so this results in a small baseline of net traceable switching for 7-day passes prior to the fare change. Much of the apparent negative net churn in cards and positive net switching following the fare change likely reflects customers switching from 7-day CharlieTickets to loading 7-day passes on existing CharlieCards. This severely limits the ability to estimate elasticities for 7-day passes.

Figure 6-14 shows simple sums of ridership and sales on the four products selected for analysis from July 2014 through June 2017, using the individual-level dataset that was developed.39 The left graphs show levels of sales and ridership, and the right graphs show monthly YOY percentage change in sales and ridership. Boston’s string of severe winter storms in February 2015 (“Snowmageddon”) caused an exceptional drop in ridership on all fare products and sales of 7-day passes, which is most visible as a large YOY increase in ridership in February 2016 (relative to 2015); it did not appear to affect monthly pass sales, so February 2015 is left in the analysis of monthly LinkPass sales. For monthly LinkPasses, there are noticeable breaks in trend following the fare change in July 2016. Non-Corporate LinkPass sales and ridership appear to drop over 10% within a couple months of the fare change and remain lower than the previous year throughout FY17. Corporate LinkPasses had modest growth in sales in FY16 (relative to FY15), but sales gradually slowed after the fare change and ended FY17 lower than the previous year. There is not a clear pattern in 7-day LinkPass sales and ridership, though they do appear to lag prior year performance at the end of FY2017. (Aggregate pay-per-use ridership is presented only for reference and scale, since it is not the focus of this pass analysis. It does not present an obvious pattern, and it is only used at the end of the analysis in combination with an assumed pay-per-use elasticity to adjust / scale pass elasticities.)

39 Aggregate sales numbers developed from the card- and ticket-level data do not perfectly match the total sales levels recorded by MBTA accounting, which could be a result of the data fusion process (working with sales at the individual level); however, they do follow a very similar pattern. Since the data are primarily used to show changes rather than levels, these differences should not materially affect the results in this section.


Figure 6-14: Monthly Sales and Ridership on Selected MBTA Fare Products, FY 2015-2017


Sources: MBTA AFC, MBTA pass sale program administrative data Notes: February 2015 had noticeable impact on ridership for all products and sales for 7-day passes; removed from analysis accordingly. Average daily taps are normalized by “transit ridership days” (=1 for non-holiday weekday, 0.5 for holidays and weekend days).
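The “transit ridership days” normalization described in the note can be sketched as follows. The function names and the holiday set are hypothetical; which dates the MBTA treats as holidays is not specified here, so July 4, 2016 is used purely as an example.

```python
from datetime import date, timedelta


def ridership_days(start, end, holidays=frozenset()):
    """Count days from start to end inclusive, weighting non-holiday
    weekdays as 1 and weekend days or holidays as 0.5."""
    total = 0.0
    day = start
    while day <= end:
        if day.weekday() >= 5 or day in holidays:  # Saturday=5, Sunday=6
            total += 0.5
        else:
            total += 1.0
        day += timedelta(days=1)
    return total


def average_daily_taps(total_taps, start, end, holidays=frozenset()):
    """Normalize a period's total taps by its transit ridership days."""
    return total_taps / ridership_days(start, end, holidays)


# July 2016 with one assumed holiday: 20 weekdays + 11 half-weighted days
july_days = ridership_days(date(2016, 7, 1), date(2016, 7, 31),
                           holidays={date(2016, 7, 4)})
print(july_days)  # 25.5
```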

Analysis and Results

First, monthly year-over-year (YOY) changes in pass sales were calculated and categorized at the individual level.40 (Note that “change types” – switching products, changing demand level, appearing, or disappearing – must be identified using individual-level data. Once data is aggregated separately in two time periods, it is not possible to separate customers who were active in both periods from customers who were only active in one and “appeared” or “disappeared” in the other.) These changes could have been calculated across any time interval within the study period; monthly YOY changes were selected to control for seasonality and to allow time for customers to leave the system (“disappear” from AFC) after temporarily switching to another fare product. These YOY calculations were performed for each fare product and for each month in FY17 (changes from July 2015 to July 2016, August 2015 to August 2016, etc.). Changes in sales were then aggregated up by pass type and change type. The same YOY change calculations and aggregations were also performed for each month in FY16 in order to later develop a baseline of traceable fare product switching and churn in cards for FY2017 (a counterfactual of what would have happened in FY2017 in the absence of a fare change).
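A simplified sketch of this individual-level categorization is shown below. It is not the procedure used to build the thesis dataset: the data layout (one fare product per card per month), the card identifiers, and the function names are all assumptions made for illustration, and changes in demand level on the same product are ignored.

```python
from collections import Counter


def classify_changes(before, after):
    """Count change types by fare product between two months.

    before / after: dict mapping card_id -> fare product held that month.
    Returns a Counter keyed by (product, change_type).
    """
    counts = Counter()
    for card in before.keys() | after.keys():
        old, new = before.get(card), after.get(card)
        if old == new:
            continue  # same product in both months: no change in sales
        if old is not None:
            # Card left this product: traceable switch or disappearance
            counts[(old, "switch_out" if new is not None else "disappear")] += 1
        if new is not None:
            # Card joined this product: traceable switch or appearance
            counts[(new, "switch_in" if old is not None else "appear")] += 1
    return counts


def net_components(counts, product):
    """Return (net traceable switching, net churn in cards) for a product."""
    net_switching = counts[(product, "switch_in")] - counts[(product, "switch_out")]
    net_churn = counts[(product, "appear")] - counts[(product, "disappear")]
    return net_switching, net_churn


# Hypothetical cards observed in the same month of two consecutive years
before = {"A": "monthly", "B": "monthly", "C": "ppu", "D": "monthly"}
after = {"A": "monthly", "B": "ppu", "C": "monthly", "E": "monthly", "F": "monthly"}
counts = classify_changes(before, after)
net_switching, net_churn = net_components(counts, "monthly")
```

In this toy example the number of monthly passes rises from three to four; the net change of +1 is the sum of net traceable switching (0) and net churn in cards (+1), which is the decomposition that cannot be recovered once each period is aggregated separately.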

Figure 6-15 shows the different change types for each pass product. Recall that each monthly observation is the year-over-year change in sales (relative to the same month of the previous year). The graphs illustrate the high degree of churn in both fare products and cards that goes on all the time, absent any policy changes, relative to the net sales levels that were shown above in Figure 6-14. In any given month, roughly 10,000 customers have switched into and out of a non-Corporate LinkPass relative to the same month a year prior, and roughly half of non-Corporate LinkPasses are on cards and tickets that were not active one year earlier. The level of card churn is similar for Corporate Monthly LinkPasses; however, new Corporate passes are always issued on new, specially-coded CharlieCards, so there is no traceable switching into Corporate passes from pre-existing cards. As discussed earlier, 7-day LinkPasses were predominantly on CharlieTickets until they became available to load on CharlieCards around the same time as the fare change; as a result, there was practically no traceable switching in or out of 7-day passes until May 2016, and nearly all 7-day passes in AFC are on cards and tickets that were not observed a year earlier. Importantly for this analysis, in spite of very substantial movements into and out of each pass product, the net traceable switching and net churn in cards for each is relatively smooth (in spite of clear seasonality in different directional movements).

40 Changes were calculated separately for each CharlieCard; CharlieTickets were aggregated by fare product for convenience, since they are not traceable year-over-year and are therefore all categorized as churn in cards.

Figure 6-15: Traceable Switching and Churn in Cards for MBTA LinkPasses

[Figure panels: Non-Corporate Monthly LinkPass; Corporate Monthly LinkPass; 7-Day LinkPass]

Sources: MBTA AFC, MBTA pass sale program administrative data

The next step in the analysis is to focus on YOY net churn in cards after the fare change (excluding YOY net switching) in order to estimate elasticities. An important element of this is estimating a baseline level of YOY net churn in cards that should be subtracted from FY2017 net churn to isolate the impact of the fare change. Two different options are considered for this baseline or counterfactual:

1. the average value from months in FY2016, and
2. a linear trend of months in FY2016, projected forward to FY2017.

The use of a linear trend may help control for steady changes in non-fare factors that are affecting sales, such as population growth or expansion of ride-hailing. Conveniently, simple time trends do not require explicit information about those factors; however, since these factors are not included as explicit controls, use of a linear time trend does require the assumption that the trend in the year before the fare change was likely to continue in the absence of the fare change. By the same token, using the average for the year before the fare change assumes that any trend across months that year was a result of random variability or was a short-term process that would not have continued absent the fare change. Choosing between these baseline assumptions is complicated by the significant variability in monthly net churn; prior trends cannot be easily distinguished from noise.41 Using a longer pre-period (more monthly observations) would help in estimating a prior trend and selecting between baseline options; however, earlier data would extend before the MBTA’s previous fare change in July 2014. (Recall that net churn in cards is a year-over-year difference, so net churn for July 2015 uses data from July 2014.)
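The two baseline options can be illustrated with a short sketch. The function names are hypothetical and the monthly values below are invented solely to show how the choice of baseline can change the conclusion; they are not MBTA data.

```python
def average_baseline(pre, n_post):
    """Option 1: the pre-period average, held constant in the post period."""
    mean = sum(pre) / len(pre)
    return [mean] * n_post


def trend_baseline(pre, n_post):
    """Option 2: a least-squares linear trend through the pre-period months,
    projected forward into the post period."""
    n = len(pre)
    x_bar = (n - 1) / 2
    y_bar = sum(pre) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in enumerate(pre))
             / sum((x - x_bar) ** 2 for x in range(n)))
    intercept = y_bar - slope * x_bar
    return [intercept + slope * (n + i) for i in range(n_post)]


def beyond_baseline(post, baseline):
    """Total post-period net churn in excess of the baseline."""
    return sum(actual - expected for actual, expected in zip(post, baseline))


# Invented YOY net churn values with a steady downward pre-period trend
pre = [10.0, 8.0, 6.0, 4.0]
post = [2.0, 0.0]
effect_avg = beyond_baseline(post, average_baseline(pre, len(post)))
effect_trend = beyond_baseline(post, trend_baseline(pre, len(post)))
```

With these invented values, the average baseline attributes a decline of 12 to the fare change, while the trend baseline attributes none, because the post-period values simply continue the earlier trend.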

Selection of a baseline raises a broader question about statistical uncertainty. This analysis uses simple calculations of differences between actual and assumed baseline net churn to estimate elasticities. Given the variability in net churn in pass sales (evident in the summaries below) and the small number of observations, it is not expected that the results are statistically significant; however, they are, perhaps, the best estimates available using only the July 2016 fare change. If desired, equivalent regression models could alternatively be specified as follows and estimated using the 24 monthly observations in order to quantify the uncertainty in estimated elasticities:

41 Note that this variability does not appear to reflect seasonality, which is largely eliminated by the year-over-year differencing used to develop net churn in the first place.


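1. Using the average value in FY2016 as the baseline:

$$\mathit{netchurn}_t = \alpha + \lambda_1 (\mathit{postfarechange}_t) + \varepsilon_t$$

2. Using a linear trend of months in FY16 as the baseline:

$$\mathit{netchurn}_t = \alpha + \beta t + \lambda_1 (\mathit{postfarechange}_t) + \varepsilon_t$$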

The charts below zoom in on YOY net churn in cards. The top graph for each pass type shows this net churn in the context of YOY net total change in sales. The bottom graph for each pass type then shows only net churn in cards along with the two potential FY17 baselines (based on the FY16 average or trend). The impact on sales that can be (cautiously) attributed to price elasticity and the fare change is the difference between one of the baselines and the actual YOY net churn in sales in FY2017.

For non-Corporate Monthly LinkPasses, Figure 6-16 shows that removing net traceable switching limits the net decline in sales that could be attributed to an elasticity effect (as opposed to fare product switching). The impact of the fare change on net churn in cards (and thus the estimated elasticity) depends heavily on the selection of a baseline in the bottom graph. If it is believed that the downward trend in YOY net churn in cards before the fare change would have continued after the fare change, then it appears the fare change had little to no elasticity impact. If it is instead believed that YOY net churn would have maintained the same monthly average as FY16, then the fare change had a substantial elasticity impact (the area in FY17 above net churn in cards and below the FY16 average dashed line).

Figure 6-16: Monthly Year-Over-Year Change in Net Churn for Non-Corporate Monthly LinkPasses with Alternative Baseline Trends

Sources: MBTA AFC, MBTA pass sale program administrative data


The picture for Corporate LinkPasses in Figure 6-17 is the opposite of that for non-Corporate LinkPasses. There was an upward trend in YOY net churn in cards in FY16. If that trend had continued, the fare change arguably depressed Corporate LinkPass sales (an indication of a non-zero elasticity). If the trend was just noise or unlikely to continue, then there is no discernible elasticity impact on Corporate LinkPass sales.

Figure 6-17: Monthly Year-Over-Year Change in Net Churn for Corporate Monthly LinkPasses with Alternative Baseline Trends

Sources: MBTA AFC, MBTA pass sale program administrative data

Finally, Figure 6-18 for the 7-day LinkPass would normally suggest a substantial elasticity effect (under either baseline). However, as discussed above it seems likely that much of the drop in net churn in cards for 7-day passes resulted from customers moving their 7-day LinkPass purchases from CharlieTickets onto CharlieCards that were already active (once that became an option at FVMs). That switch in media would appear in AFC as negative churn in cards (abandoning a CharlieTicket) and positive traceable switching (adding a 7-day pass to a CharlieCard previously used for stored value or a monthly pass). The magnitude of these media shifts is not known, so it is not possible to estimate a 7-day LinkPass elasticity with any confidence using this approach.


Figure 6-18: Monthly Year-Over-Year Change in Net Churn for 7-Day LinkPasses with Alternative Baseline Trends

Sources: MBTA AFC, MBTA pass sale program administrative data

Table 6-3 summarizes and quantifies the results in the graphs at the annual level.42 The total net change in sales from FY16 to FY17 is broken out into net traceable switching and net churn in cards. The churn in cards is further broken out into a baseline component (either average or trend from FY16, as described above) and a "beyond baseline" component, which can be attributed to the fare change. This final component is converted to a percent of FY16 sales, which is divided by the percent change in the pass price to estimate elasticity.

The final step is to adjust the pass elasticities based on an assumed elasticity for pay-per-use. Table 6-4 and Table 6-5 present example calculations for these adjustments. The net change in full-fare pay-per-use average daily ridership from FY16 to FY17 in the dataset was about -5,400. (Note that pay-per-use ridership was very similar in FY15 and FY16, so no baseline trend is removed.) Based on the pay-per-use panel analysis presented earlier in this chapter, a pay-per-use elasticity of -0.7 is assumed for example calculations. This elasticity would imply a change in average daily ridership of about -12,600. The difference between this assumed elasticity effect and the observed net change in pay-per-use ridership is roughly +7,000.

42 As discussed above, uncertainty in parameters is not estimated; it is not expected that the resulting estimates are statistically significant, given the limited time period of analysis and monthly variability in net churn in cards.


Table 6-3: Calculation of Unadjusted Elasticities for MBTA LinkPasses

| | Non-Corporate Monthly LinkPass | | Corporate Monthly LinkPass | | 7-Day LinkPass | |
| --- | --- | --- | --- | --- | --- | --- |
| FY16 Sales | 1,042,860 | | 1,052,635 | | 2,454,765 | |
| FY17 Sales | 924,577 | | 1,040,791 | | 2,351,644 | |
| Net Change in Sales | -118,283 | | -11,844 | | -103,121 | |
| Net Traceable Switching | -50,421 | | -54,674 | | 157,753 | |
| Net Churn in Cards | -67,862 | | 73,074 | | -260,874 | |
| Baseline Assumption | FY16 Avg | FY16 Trend | FY16 Avg | FY16 Trend | FY16 Avg | FY16 Trend |
| Baseline Net Churn in Cards | 9,848 | -56,923 | 73,934 | 95,151 | -38,492 | -25,547 |
| Beyond Baseline (Sales) | -77,710 | -10,939 | -860 | -22,077 | -222,382 | -235,327 |
| Beyond Baseline (% of FY16 Sales) | -7.5% | -1.0% | -0.1% | -2.1% | -9.1% | -9.6% |
| % Change in Price | +12.7% | +12.7% | +12.7% | +12.7% | +11.8% | +11.8% |
| Implied Elasticity | -0.59 | -0.08 | -0.01 | -0.17 | -0.77 | -0.81 |

Table 6-4: Difference Between Assumed Pay-Per-Use Elasticity and Observed Change in Pay-Per-Use Ridership

| | Pay-Per-Use |
| --- | --- |
| FY15 Avg Daily Ridership | 253,273 |
| FY16 Avg Daily Ridership | 251,240 |
| FY17 Avg Daily Ridership | 245,814 |
| Net Change in Avg Daily Ridership | -5,426 |
| Assumed Elasticity Effect | -12,562 |
| % Change in Fare | +7.1% |
| Assumed Elasticity | -0.7 |
| Implied % Change in Ridership | -5.0% |
| Remainder = Net Switching + Induced Ridership | 7,136 |

This gain in pay-per-use ridership (offsetting some of the pay-per-use losses from the assumed elasticity effect) should be driven by customers switching from passes to pay-per-use. So, this is expected to be approximately equal and opposite to the net switching on the three major pass products (adjusted down to account for induced ridership on passes, which is assumed to be 10%); however, the adjusted sum of net traceable switching on the pass products is about -13,600 average daily taps – about 6,500 lower than it should be to balance switching on pay-per-use (Table 6-5).43 If all of the assumptions made here are correct, this suggests that about 6,500 average daily trips that were initially attributed to positive net churn were actually positive net switching from pay-per-use to passes. To correct for this, this amount is added to net pass switching (i.e. more switching into passes) and the same amount is subtracted from net churn in passes, allocating the adjustment to the pass products based on their ridership shares; the adjustment is converted from ridership to pass sales based on its share of FY16 average daily ridership. The end effect is that pass elasticities are lowered (made larger in magnitude) by about 0.1, relative to the unadjusted estimates.

43 The analysis above focused on sales, but the same calculations were performed for average daily ridership, removing Feb. 2015. As seen from aggregate totals in Figure 6-14, ridership followed a very similar pattern to sales (with the exception of Snowmageddon in Feb. 2015).


Table 6-5: Calculation of Adjusted Elasticities for MBTA LinkPasses

| | Non-Corporate Monthly LinkPass | | Corporate Monthly LinkPass | | 7-Day LinkPass | | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Net Switching in Avg Daily Ridership | | | | | | | |
| Implied by Pay-Per-Use | | | | | | | -7,136 |
| Traceable in AFC - Unadjusted | -11,020 | | -12,104 | | 8,126 | | -14,998 |
| Traceable in AFC - Adjusted (/1.1) | -10,018 | | -11,003 | | 7,387 | | -13,635 |
| Difference between Implied and Traceable | | | | | | | 6,499 |
| Allocation (FY16 Ridership Shares) | 40% | | 31% | | 29% | | |
| Allocated | 2,618 | | 2,007 | | 1,873 | | |
| Adjustment From Churn to Switching | | | | | | | |
| FY16 Avg Daily Ridership | 168,724 | | 129,367 | | 120,723 | | |
| Adjustment as % of FY16 Ridership | +1.6% | | +1.6% | | +1.6% | | |
| FY16 Sales | 1,042,860 | | 1,052,635 | | 2,454,765 | | |
| Adjustment in Sales | +16,182 | | +16,333 | | +38,090 | | |
| Net Change in Sales | -118,283 | | -11,844 | | -103,121 | | |
| Adjusted Net Traceable Switching | -34,239 | | -38,341 | | 195,843 | | |
| Adjusted Net Churn in Cards | -84,044 | | 56,741 | | -298,964 | | |
| Baseline Assumption | FY16 Avg | FY16 Trend | FY16 Avg | FY16 Trend | FY16 Avg | FY16 Trend | |
| Baseline | 9,848 | -56,923 | 73,934 | 95,151 | -38,492 | -25,547 | |
| Beyond Baseline (Sales) | -93,892 | -27,121 | -17,193 | -38,411 | -260,472 | -273,417 | |
| Beyond Baseline (% of FY16 Sales) | -9.0% | -2.6% | -1.6% | -3.6% | -10.6% | -11.1% | |
| % Change in Price | +12.7% | +12.7% | +12.7% | +12.7% | +11.8% | +11.8% | |
| Implied Elasticity | -0.71 | -0.21 | -0.13 | -0.29 | -0.90 | -0.94 | |
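The arithmetic behind Table 6-4 and Table 6-5 can be traced with the sketch below. The function and variable names are hypothetical, the inputs are the rounded values printed in the tables (including the -5.0% implied change in pay-per-use ridership), and only the FY16 average baseline is shown, so small rounding differences from the published figures are possible.

```python
def adjusted_elasticity(product, ppu_remainder, traceable, ridership, sales,
                        net_churn, baseline, pct_price_change, induced=0.10):
    """Pass elasticity after moving untraceable switching out of net churn."""
    # Pass switching implied by pay-per-use is equal and opposite to the remainder
    implied_switching = -ppu_remainder
    # Deflate traceable pass switching (in ridership) for assumed induced ridership
    traceable_adjusted = sum(traceable.values()) / (1 + induced)
    difference = implied_switching - traceable_adjusted
    # Allocate the difference by ridership share, then express as % of ridership
    allocated = difference * ridership[product] / sum(ridership.values())
    adjustment_pct = allocated / ridership[product]
    # Convert to sales and move that amount from net churn to net switching
    adjusted_net_churn = net_churn - adjustment_pct * sales
    return (adjusted_net_churn - baseline) / sales / pct_price_change


# Table 6-4: remainder = observed change minus the assumed elasticity effect
elasticity_effect = 251_240 * -0.05
ppu_remainder = -5_426 - elasticity_effect

# Table 6-5 inputs (average daily ridership by pass product)
traceable = {"non_corp": -11_020, "corp": -12_104, "seven_day": 8_126}
ridership = {"non_corp": 168_724, "corp": 129_367, "seven_day": 120_723}

non_corp = adjusted_elasticity("non_corp", ppu_remainder, traceable, ridership,
                               sales=1_042_860, net_churn=-67_862,
                               baseline=9_848, pct_price_change=0.127)
corp = adjusted_elasticity("corp", ppu_remainder, traceable, ridership,
                           sales=1_052_635, net_churn=73_074,
                           baseline=73_934, pct_price_change=0.127)
print(round(non_corp, 2), round(corp, 2))  # -0.71 -0.13
```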

While the example calculations in Table 6-4 and Table 6-5 are based on a pay-per-use elasticity of -0.7, other values could be assumed; Figure 6-19 shows the implied monthly pass elasticities after adjustment based on different assumptions for pay-per-use elasticity. (As discussed earlier, the 7-day pass elasticities are not persuasive given changes in pass distribution at the same time as the fare change.) A higher assumed pay-per-use elasticity would imply that a greater portion of the net churn identified for passes (used to calculate pass elasticities) was actually net switching to pay-per-use; this reduces the pass elasticity, resulting in the downward slope in the graph.

Figure 6-19: Sensitivity of LinkPass Elasticity Adjustment to Assumed Pay-Per-Use Elasticity


This exercise was an attempt to adjust aggregate pass sales to isolate the impact of price elasticity from two other simultaneous customer responses to the fare change (fare product switching and resulting induced ridership). The approach demonstrated in this section is not theoretically or statistically rigorous and required many assumptions; however, it was deemed the best option given constraints of the policy scenario and the data that were available, and the results for monthly LinkPasses are not unreasonable. Figure 6-20 compares selected “forensic bounding” results for these monthly passes (adjusted using a pay-per-use elasticity of -0.7) to the current assumptions used in the MBTA’s FERRET model and to a naïve elasticity based on the percent change in net sales from FY16 to FY17 (not accounting for fare product switching). Relative to a naïve calculation, the “forensic bounding” approach moderates extreme results, lowering the non-Corporate elasticity and raising the Corporate elasticity by accounting for fare product switching observed in AFC.

There is still a wide range in the “forensic bounding” estimates depending on the assumed baseline (whether trends from FY16 would have continued into FY17), and final adjustments depend on an assumed elasticity for pay-per-use fares. Regardless, the results suggest that the elasticities for non-Corporate and Corporate LinkPasses are greater in magnitude than assumed in the FERRET model (-0.15), and that non-Corporate LinkPasses are more sensitive to price than Corporate ones. Without a better understanding of the FY16 trends, it seems most reasonable to assume that the baseline is closer to the FY16 average. A weighted average of the elasticity estimates giving the FY16 average baseline twice as much weight as the FY16 trend baseline results in elasticities of -0.54 for the non-Corporate LinkPass and -0.18 for the Corporate LinkPass.
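The weighted averages quoted above follow directly from the adjusted estimates in Table 6-5, as the short check below shows. The function name and argument names are illustrative only.

```python
def weighted_elasticity(avg_baseline_estimate, trend_baseline_estimate,
                        avg_weight=2.0, trend_weight=1.0):
    """Weighted average of the two baseline-specific elasticity estimates."""
    total_weight = avg_weight + trend_weight
    return (avg_weight * avg_baseline_estimate
            + trend_weight * trend_baseline_estimate) / total_weight


# Adjusted estimates from Table 6-5 (FY16 average baseline, FY16 trend baseline)
non_corporate = weighted_elasticity(-0.71, -0.21)
corporate = weighted_elasticity(-0.13, -0.29)
print(round(non_corporate, 2), round(corporate, 2))  # -0.54 -0.18
```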

Figure 6-20: Comparison of Selected MBTA LinkPass Elasticity Estimates with MBTA FERRET Model Assumptions and Naïve Before-and-After Estimates


6.5 Estimation of Fare Product Choice Utility Parameters at the CTA

Fare product choices are rarely modeled as part of transit fare change scenario evaluation. However, as will be discussed more in Chapter 7, fare product choice and customer switching between fare products can have important implications for the impacts of fare changes depending on the fare products that are offered, relative changes in the competitiveness of substitutable products with each other (a function of baseline prices and price changes), and the distribution of travel demand. Fare product choice is particularly important at agencies like the CTA and the MBTA that offer both pass and pay-per-use “tariff” options to customers; altering the ratio between a pass price and corresponding pay-per-use fares (called the “pass multiple”) can potentially generate substantial switching between products.

In this section, CTA Ventra (AFC) data are used to estimate a simple multinomial logit model of transit fare product choice. The model estimates are used as parameter inputs to the CTA fare change scenario prediction model presented in Chapter 7.

6.5.1 Methodology

As discussed in Chapter 3, some fare prediction models include parameters that make marginal adjustments to demand for any particular fare product based on changes in the price of another product (such as cross-price elasticities) or changes in the relative price of another particular product (such as the diversion factors in the MBTA’s FERRET model). A few other prediction models have accounted for fare product choice with greater realism and flexibility using parameters and formulas based on random utility theory (typically using some form of a multinomial logit model). All but one of these latter studies relied on customer survey data to estimate choice model utility parameters; the one exception, Zureiqat (2008), used AFC data at Transport for London. Relative to customer surveys, AFC data severely restricts the variables that are available to predict fare product choice; however, AFC data has the advantage of revealed preferences (rather than stated preferences), comprehensive coverage, and low cost (since it is already available to many transit agencies).

The ultimate goal of this exercise was to provide parameter inputs to the scenario prediction procedure that is applied to the CTA in Chapter 7. This affected the model specification and estimation in four ways:

1. The CTA offers both pass and pay-per-use fare options to its customers, and there was interest in raising fares differently for the two kinds of fare products. This would cause some customers to switch fare products, so it was important to model fare product choices with some realism. As a result, a multinomial logit model was preferred to cross-elasticities or diversion factors.
2. Resources were not available to conduct a customized survey that could be used to estimate a fare product choice model, so estimation had to be performed using only AFC data.
3. The most recent CTA fare change at the time of model development was in January 2013, which was followed shortly by roll-out of the new Ventra fare collection system. In effect, the choice model had to be estimated using Ventra data from after the 2013 fare change, meaning that there was no variation in fare levels. This is different from the AFC model in Zureiqat (2008), which spanned 26 months and multiple fare changes at Transport for London.
4. The prediction procedure used in Chapter 7 is a static, semi-aggregate spreadsheet model. In order to apply the choice model estimates to “representative individuals” in this spreadsheet, the choice utility parameters had to be steady-state (or convertible to steady-state), and the resulting “synthetic baseline” fare product market shares had to approximately match observed baseline market shares. This is another key difference from Zureiqat (2008), where the fare product choice model was autoregressive (fare product utilities depended on the choice in the previous time period) and the resulting parameter estimates were applied to simulation of individual-level panel data.

A simple cross-sectional multinomial logit specification was selected for four major fare products at the CTA – pay-per-use (PPU), 3-day passes, 7-day passes, and 30-day passes. Ventra provides a very good way to track most full-fare customers over time as they make different fare product choices; every day on which transit rides are taken could be considered a new choice situation. Ventra AFC data also provides a key factor that relates to fare product choice – the portfolio of rides that a customer took in the previous week, and more specifically what that portfolio of rides would cost under each fare product option; following Zureiqat (2008), the cost of the previous week’s rides can be used as a proxy for the expected costs of the coming week’s rides. These real-life choice situations – in which customers choose among fare products based in part on expected costs – allow estimation of multinomial logit formulas for the probability that a customer will choose any particular fare product (given their expected ridership).
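For reference, a multinomial logit model gives the probability of choosing product i as exp(V_i) divided by the sum of exp(V_j) over all available products, where V is the systematic utility. The sketch below illustrates that formula only; the cost coefficient, product constants, and weekly costs are hypothetical placeholders and are not the estimates from this thesis.

```python
import math


def logit_probabilities(utilities):
    """Multinomial logit choice probabilities from systematic utilities."""
    # Subtract the maximum utility before exponentiating for numerical stability
    highest = max(utilities.values())
    exp_v = {name: math.exp(v - highest) for name, v in utilities.items()}
    total = sum(exp_v.values())
    return {name: value / total for name, value in exp_v.items()}


# Hypothetical inputs, for illustration only
beta_weekly_cost = -0.1                       # assumed disutility per dollar
alpha = {"PPU": 0.0, "7day": -0.5}            # assumed product constants
weekly_cost = {"PPU": 30.0, "7day": 28.0}     # assumed expected weekly costs

utilities = {name: alpha[name] + beta_weekly_cost * weekly_cost[name]
             for name in weekly_cost}
probabilities = logit_probabilities(utilities)
```

Because only differences in utility matter in a logit model, one product (here pay-per-use) can serve as the reference with its constant fixed at zero.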

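The level of observation for the multinomial logit model is a single Ventra account ($n$) on a specific day ($t$) with observed ridership. The systematic utility formula for each fare product is shown and described below.

$$V_{\mathit{PPU},n,t} = \beta_{\mathit{WeeklyCost}} \cdot \mathit{WeeklyCost}_{\mathit{PPU},n,t}$$

$$V_{\mathit{3day},n,t} = \alpha_{\mathit{3day}} + \beta_{\mathit{WeeklyCost}} \cdot \mathit{WeeklyCost}_{\mathit{3day},n,t} + \sum_{k} \gamma_{\mathit{3day},k} \cdot \mathbf{1}(\mathit{PercentRailBin}_k)$$

$$V_{\mathit{7day},n,t} = \alpha_{\mathit{7day}} + \beta_{\mathit{WeeklyCost}} \cdot \mathit{WeeklyCost}_{\mathit{7day}} + \sum_{k} \gamma_{\mathit{7day},k} \cdot \mathbf{1}(\mathit{PercentRailBin}_k)$$

$$V_{\mathit{30day},n,t} = \alpha_{\mathit{30day}} + \beta_{\mathit{WeeklyCost}} \cdot \mathit{WeeklyCost}_{\mathit{30day}} + \sum_{k} \gamma_{\mathit{30day},k} \cdot \mathbf{1}(\mathit{PercentRailBin}_k)$$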

This specification models the attractiveness of a fare product for individual $n$ on day $t$ as a function of three factors: weekly cost, customer segment, and other systematic preferences.

1. Weekly cost. $\beta_{\mathit{WeeklyCost}}$ describes the importance of weekly cost in fare product choices. Weekly costs for each fare product are measured in a way that is intended to be consistent with an actual choice situation:

a. $\mathit{WeeklyCost}_{\mathit{PPU},n,t}$ is measured as the “use value” of rides on account $n$ in the previous week (i.e. in the seven days before day $t$). “Use value” is the amount that would have been paid for those rides if pay-per-use fares were applied, regardless of which fare product was actually used. It is tempting to use rides in the week following the choice, but this would introduce endogeneity since ridership levels are affected by fare product choice. Instead, it is imagined that customers think back on the frequency of their transit travel last week to estimate their expected travel this week. However, this is merely a proxy for expected travel; in reality, customers have additional information about their expected upcoming travel needs, but this private information is not observed in AFC data. This means that the expected weekly cost of pay-per-use is measured with error. As in linear regression, this is expected to bias $\beta_{\mathit{WeeklyCost}}$ toward zero – the attenuation or dilution problem (Stefanski and Carroll 1985); resulting fare product choice predictions will tend to understate the impact of changes in weekly cost (such as changes in fares and prices) on fare product choices.

b. $\mathit{WeeklyCost}_{\mathit{3day},n,t}$ is measured as the price of a 3-day pass multiplied by one-third of the number of days that account $n$ was active in the previous week (with a minimum of the price of a 3-day pass). Similar to the logic for pay-per-use, the expected cost of using a 3-day pass is the cost of serving last week’s rides using a 3-day pass. If rides were taken on 3 days or fewer, only one 3-day pass would need to be purchased and $\mathit{WeeklyCost}_{\mathit{3day},n,t}$ would simply be the price of a single 3-day pass. If (for example) rides were taken on 5 days last week, then the expected weekly cost would be 5/3 multiplied by the 3-day pass price. As with $\mathit{WeeklyCost}_{\mathit{PPU},n,t}$, use of a proxy for expected days of travel in the coming week is expected to result in attenuation bias.

c. $\mathit{WeeklyCost}_{\mathit{7day}}$ is the price of a 7-day pass. The expected cost does not depend on any individual-level information (such as a customer’s total expected ridership or number of active days). It also does not vary over time, since as mentioned above there are no changes in fares or prices during the period of analysis.

d. $\mathit{WeeklyCost}_{\mathit{30day}}$ is similarly the prorated price of a 30-day pass (the price multiplied by 7/30).

The $\beta_{\mathit{WeeklyCost}}$ coefficient is a generic parameter, meaning that a single value is applied to all fare products. Unlike in Zureiqat (2008), it is not possible to estimate alternative-specific $\beta_{\mathit{WeeklyCost}}$ parameters because there were no fare changes in the study period to generate variation in the weekly cost of 7-day passes and 30-day passes.[44] The generic $\beta_{\mathit{WeeklyCost}}$ parameter is estimated based on variation in individual-level weekly transit travel, which creates variation in the weekly cost of the pay-per-use and 3-day pass alternatives.

2. Customer segment. The $\gamma_k$ terms capture the impact of membership in a particular customer segment on the utility of each fare product. Specifically, observations are divided into four groups based on the percentage of last week’s rides that were taken on rail ($k \in$ {0-24%, 25-49%, 50-74%, 75-100% rail}). $\mathbf{1}(\mathit{PercentRailBin}_k)$ is an indicator variable for membership in group $k$. The $\gamma_k$ terms are allowed to differ across the different products: $\gamma_{\mathit{3day},k}$, $\gamma_{\mathit{7day},k}$, and $\gamma_{\mathit{30day},k}$, all measuring differences in utility relative to the PPU alternative.

3. Other systematic preferences. The alternative-specific constant (ASC) terms $\alpha_{\mathit{3day}}$, $\alpha_{\mathit{7day}}$, and $\alpha_{\mathit{30day}}$ capture other systematic preference for each fare product, conditional on weekly cost and customer segment and measured relative to the PPU alternative. There are many unobserved factors that may lead customers to systematically prefer one product over another, even if a single customer segment were considered and the weekly cost of each product were the same. For example, the convenience of paying less frequently and the inconvenience of having to pay up front are captured in the ASCs for the pass products.

[44] Zureiqat (2008) estimates separate cost coefficients for passes with different time validities, which allows for pass costs to be expressed as totals rather than pro-rated to a specific time period across all products. As a result, Zureiqat’s cost coefficients capture a mix of aversion to total cost and aversion to up-front cost. This may actually be less interesting from a policy perspective than expressing all costs on the same time interval (e.g. weekly) and interpreting alternative-specific constants as capturing any aversion to up-front costs across product tariffs.
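The weekly cost definitions in item 1 can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used for estimation: the function name is hypothetical, and the pass prices are assumptions inferred from the weekly costs quoted in this chapter ($28 for a 7-day pass, $23.33 as 7/30 of $100, and $46.67 as 7/3 of $20).

```python
# Hypothetical full-fare pass prices, inferred from weekly costs quoted in the text.
PRICE_3DAY = 20.00
PRICE_7DAY = 28.00
PRICE_30DAY = 100.00


def expected_weekly_costs(use_value_prev_week, active_days_prev_week):
    """Return the proxy weekly cost of each fare product for one account-day."""
    # 3-day pass: price scaled by (active days / 3), never less than one pass.
    cost_3day = PRICE_3DAY * max(active_days_prev_week / 3.0, 1.0)
    return {
        "PPU": use_value_prev_week,          # previous week's rides at PPU fares
        "3day": cost_3day,
        "7day": PRICE_7DAY,                  # no individual-level variation
        "30day": PRICE_30DAY * 7.0 / 30.0,   # prorated to one week
    }


# Example: $20 of rides spread over 5 active days in the previous week.
costs = expected_weekly_costs(20.0, 5)
```

Only the pay-per-use and 3-day pass costs vary across accounts and days, which is why the generic cost coefficient is identified from those two alternatives.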

While the level of observation for this specification is account-days with transit ridership, not all account-day observations are useful. For customers who purchase pass products, the first day with ridership on a pass represents a real choice; however, following that first day and for the remainder of the term of the pass product, the customer enjoys zero-marginal-cost ridership on the pass. While a customer could theoretically choose to ride pay-per-use while holding a valid pass, this would be highly unusual, and valid pass products are automatically used for validation by the CTA Ventra system before stored value. Since the choice to “hold” and use a previously-purchased pass on these days is uninformative, these observations are replaced by the original choice situation (on the day that the pass was purchased); this effectively weights pass choice situations by pass usage. If pass “hold” days were simply dropped during model estimation, passes would represent a very small share of total observed choice decisions.

6.5.2 Data

The multinomial logit model is estimated using Ventra data from June 2016 through May 2017. The data were first filtered to include only accounts that exclusively used a combination of the four major full-fare products mentioned earlier – pay-per-use (PPU), 3-day passes, 7-day passes, and 30-day passes; customers using limited-use products (such as single-ride tickets) and free or discounted products (such as senior passes or U-Passes) were excluded. (Note that this filtering is similar to the restrictions used for the Observed Baseline in the prediction model presented in Chapter 7.)

Fare transaction data were pre-processed to create the variables in the logit systematic utilities. Transactions were aggregated to the daily level for each account, and the most frequently-used fare product on each day was identified as the actual fare product choice for that day. Ridership, "use value," the number of active days, and the percentage of rides on rail (versus bus) over the previous week were then calculated for each account on each day. After filtering and pre-processing, there were about 80 million customer-days with transit travel (about 57 million that were not pass “hold” days) across about 2 million Ventra accounts.
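The pre-processing steps described above can be sketched as follows. This is an illustrative outline only: the record layout, function names, and fare amounts are hypothetical and do not reflect the Ventra schema or the scripts actually used.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta


def daily_choices(transactions):
    """Map (account, day) -> most frequently used fare product that day.

    Each transaction is a hypothetical (account, day, product, mode, ppu_fare) tuple.
    """
    counts = defaultdict(Counter)
    for account, day, product, _mode, _fare in transactions:
        counts[(account, day)][product] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def previous_week_features(transactions, account, day):
    """Summarize one account's rides in the seven days before `day`."""
    start = day - timedelta(days=7)
    rides = [t for t in transactions if t[0] == account and start <= t[1] < day]
    if not rides:
        return None  # no previous-week proxy is available
    rail = sum(1 for t in rides if t[3] == "rail")
    return {
        "rides": len(rides),
        "use_value": sum(t[4] for t in rides),   # cost at pay-per-use fares
        "active_days": len({t[1] for t in rides}),
        "pct_rail": rail / len(rides),
    }


# Tiny made-up example (fares are placeholders).
tx = [
    ("A", date(2016, 7, 18), "PPU", "rail", 2.25),
    ("A", date(2016, 7, 18), "PPU", "bus", 2.00),
    ("A", date(2016, 7, 19), "PPU", "rail", 2.25),
    ("A", date(2016, 7, 20), "7day", "rail", 2.25),
]
choices = daily_choices(tx)
features = previous_week_features(tx, "A", date(2016, 7, 20))
```

At the scale reported here (tens of millions of customer-days), the same logic would more plausibly run as grouped database queries than as Python loops.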

Table 6-6 shows an example of the AFC data after pre-processing. This illustrates the logic of the choice situations: each day a customer rides transit, they consider the cost of making their expected transit trips under the different fare product options (which is approximated using the cost of taking the previous week's rides) and other preferences over fare products, and then they select a fare product (which is observed in AFC). In the example, on 7/20/2016 the customer faces expected weekly costs of only $20 using pay-per-use (the lowest cost of all the alternatives), and the customer is observed selecting pay-per-use that day. The next day, the same customer faces a higher expected weekly cost of riding pay-per-use ($23, based on the portfolio of rides taken over the last seven days, including 7/20/2016). This time, the customer chooses a 7-day pass. Skipping ahead one week to 7/28/2016, the 7-day pass has expired and the customer once again faces a choice between fare products. At this point, the expected weekly cost of pay-per-use is $43.25 based on ridership over the previous week – much higher than the weekly cost of a 7-day or 30-day pass; however, the customer chooses to ride pay-per-use. In all likelihood, this customer knew that they needed to ride the CTA frequently one week (beginning 7/21/2016) and less frequently the next week (beginning 7/28/2016); to a researcher using AFC data, however, that information is unknown and its effect on fare product choices is modeled as random variation. This error in measurement of expected weekly costs cuts in both directions; for every customer switching from a pass to pay-per-use based on private knowledge that their transit use is decreasing, another customer switches from pay-per-use to a pass based on the private knowledge that their transit use is increasing. As explained earlier, errors of this sort in explanatory variables result in attenuation of predicted choice probabilities; the actual impact of weekly cost on fare product choice is expected to be greater than estimated by the model.
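For the linear-regression analogue of this attenuation, the classical errors-in-variables result makes the direction of the bias explicit. The notation below ($x^*$ for true expected cost, $u$ for the proxy error) is introduced only for illustration; the multinomial logit case has no equally simple closed form, so this should be read as intuition rather than as a correction factor for the estimates in this chapter.

```latex
x = x^{*} + u, \qquad y = \beta x^{*} + \varepsilon, \qquad
u \text{ uncorrelated with } x^{*} \text{ and } \varepsilon
\;\;\Longrightarrow\;\;
\operatorname{plim}\,\hat{\beta}_{\mathrm{OLS}}
  = \beta \cdot \frac{\sigma_{x^{*}}^{2}}{\sigma_{x^{*}}^{2} + \sigma_{u}^{2}}
```

Because the ratio lies between 0 and 1, the estimated coefficient is pulled toward zero, and more so when the proxy error variance is large relative to the variation in true expected cost.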

Table 6-6: Example Data Format for Choice Model Estimation

| Ventra (AFC) Account | Date | Percent Rail Ridership | Fare Product Alternative | Weekly Cost | Observed Choice |
|---|---|---|---|---|---|
| Example Account A | 7/20/2016 | 50-74% | PPU | $20.00 | 1 |
| | | | 3-Day Pass | $35.75 | 0 |
| | | | 7-Day Pass | $28.00 | 0 |
| | | | 30-Day Pass | $23.33 | 0 |
| Example Account A | 7/21/2016 | 50-74% | PPU | $22.00 | 0 |
| | | | 3-Day Pass | $31.75 | 0 |
| | | | 7-Day Pass | $28.00 | 1 |
| | | | 30-Day Pass | $23.33 | 0 |
| … | … | … | … | … | … |
| Example Account A | 7/28/2016 | 75-100% | PPU | $43.25 | 1 |
| | | | 3-Day Pass | $45.50 | 0 |
| | | | 7-Day Pass | $28.00 | 0 |
| | | | 30-Day Pass | $23.33 | 0 |

As described earlier, days on which a pass is "held" (i.e. a previously-purchased pass is still valid) were also replaced with the choice situation on the day when the pass was first selected. This effectively weights real (non-"holding") pass choice situations based on subsequent pass use; otherwise, pay-per-use selections would represent the vast majority of informative choice situations in the model. Choices made during the first seven days of the study period were removed, since the arbitrary start date for the study period prevents observation of a full week before observations on those days. Similarly, choices were dropped whenever an account had not been used within the past seven days, since this would prevent use of the previous week to calculate expected weekly costs for PPU and 3-day passes; this applies both to the first observation of a card in AFC and to active days after a period of inactivity longer than one week.
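The two exclusion rules in this paragraph (the first seven days of the study period, and days with no activity in the preceding seven days) can be sketched as below. The function name and inputs are hypothetical, and the replacement of pass "hold" days is deliberately left out of this sketch.

```python
from datetime import date, timedelta


def eligible_choice_days(active_days, study_start):
    """Keep one account's active days that have a usable previous week."""
    days = sorted(set(active_days))
    kept = []
    for i, day in enumerate(days):
        # Rule 1: a full prior week must fall inside the study period.
        in_first_week = day < study_start + timedelta(days=7)
        # Rule 2: the account must have been used in the previous seven days.
        recently_active = i > 0 and (day - days[i - 1]).days <= 7
        if not in_first_week and recently_active:
            kept.append(day)
    return kept


# Made-up account history within a study period starting 6/1/2016.
history = [
    date(2016, 6, 3),
    date(2016, 6, 9),
    date(2016, 6, 10),
    date(2016, 6, 25),
    date(2016, 6, 27),
]
kept_days = eligible_choice_days(history, date(2016, 6, 1))
```

In this example 6/3 is dropped under the first rule and 6/25 under the second, while 6/27 is retained because the account was active two days earlier.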

The final dataset contained about 67 million observations (customer-days). It was not computationally feasible to estimate the logit model using all observations, so a random sample of 100,000 observations (individual-level fare product choices) was taken for estimation. Table 6-7 and Figure 6-21 compare fare product market shares, previous week's use value, and previous week's percent of rides on rail for the sample and the universe of observed daily choices. The sample closely matches the full set of observations; fare product market shares, for example, differ by no more than 0.1 percentage points.


Table 6-7: Fare Product Market Shares in Sample and Population of Daily Choices

| | PPU | 3-Day Pass | 7-Day Pass | 30-Day Pass |
|---|---|---|---|---|
| Sample | 68.9% | 0.1% | 7.7% | 23.3% |
| Population | 68.8% | 0.1% | 7.8% | 23.3% |

Notes: “Sample” is a random sample of 100,000 observations. “Population” is observations after filtering described in text.

Figure 6-21: Previous Week’s Use Value and Percent Rail Rides in Sample and Population of Daily Choices

Notes: “Sample” is a random sample of 100,000 observations. “Population” is observations after filtering described in text.

Finally, choice situation observations in the sample were weighted during estimation. Weights were used to scale the previous week’s use value and percent rail distributions in the choice model data to current week’s use value and percent rail distributions in the spreadsheet model Observed Baseline data presented in Chapter 7. More specifically, for each fare product that was actually selected, the number of customer-days in the 100,000-observation choice model dataset was tabulated for binned combinations of the previous week’s use value ($0-4, $5-9, …, $35-39, and $40+) and the previous week’s percent of rail use (0-24%, 25-49%, 50-74%, and 75-100%). Similarly, using the Observed Baseline data in the spreadsheet model presented in Chapter 7, trips were tabulated for binned combinations of the current calendar week’s use value and percent of rail use. Each trip total from the spreadsheet model data was divided by the corresponding number of customer-days from the daily choice data to create weights; the weights were then applied to individual daily choices in the 100,000-observation sample. Table 6-8 shows the resulting weights for the choice model observations.


Table 6-8: Weights for Choice Model Estimation

For each chosen fare product, “Days” is days in the choice model sample, “Trips” is trips in the Ch 6 model, and “Weight” is the choice model weight.

| Percent Rail | Use Value | PPU Days | PPU Trips | PPU Weight | 3-Day Days | 3-Day Trips | 3-Day Weight | 7-Day Days | 7-Day Trips | 7-Day Weight | 30-Day Days | 30-Day Trips | 30-Day Weight |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0-24% | $0-4 | 3,461 | 3,657,530 | 1,057 | 3 | 6,214 | 2,071 | 75 | 72,321 | 964 | 179 | 93,096 | 520 |
| 0-24% | $5-9 | 3,355 | 4,204,194 | 1,253 | 1 | 16,823 | 16,823 | 83 | 168,674 | 2,032 | 290 | 221,178 | 763 |
| 0-24% | $10-14 | 4,553 | 6,181,875 | 1,358 | 0 | 0 | N/A | 194 | 416,406 | 2,146 | 631 | 790,571 | 1,253 |
| 0-24% | $15-19 | 3,275 | 5,287,177 | 1,614 | 0 | 0 | N/A | 335 | 813,508 | 2,428 | 840 | 1,373,342 | 1,635 |
| 0-24% | $20-24 | 2,731 | 4,575,746 | 1,675 | 2 | 24,794 | 12,397 | 550 | 1,399,774 | 2,545 | 1,111 | 2,178,611 | 1,961 |
| 0-24% | $25-29 | 681 | 1,362,560 | 2,001 | 1 | 15,490 | 15,490 | 538 | 1,403,498 | 2,609 | 588 | 1,392,435 | 2,368 |
| 0-24% | $30-34 | 214 | 514,434 | 2,404 | 0 | 0 | N/A | 363 | 1,215,885 | 3,350 | 361 | 939,658 | 2,603 |
| 0-24% | $35-39 | 85 | 198,233 | 2,332 | 2 | 6,956 | 3,478 | 286 | 922,038 | 3,224 | 186 | 581,399 | 3,126 |
| 0-24% | $40+ | 56 | 146,711 | 2,620 | 4 | 12,085 | 3,021 | 344 | 1,584,458 | 4,606 | 168 | 826,976 | 4,922 |
| 25-49% | $0-4 | 453 | 539,291 | 1,190 | 0 | 0 | N/A | 27 | 22,461 | 832 | 39 | 21,837 | 560 |
| 25-49% | $5-9 | 1,243 | 1,813,436 | 1,459 | 0 | 0 | N/A | 40 | 94,087 | 2,352 | 195 | 128,386 | 658 |
| 25-49% | $10-14 | 1,696 | 2,353,303 | 1,388 | 0 | 0 | N/A | 105 | 207,921 | 1,980 | 344 | 381,902 | 1,110 |
| 25-49% | $15-19 | 1,514 | 2,527,957 | 1,670 | 1 | 46,837 | 46,837 | 185 | 447,911 | 2,421 | 570 | 833,259 | 1,462 |
| 25-49% | $20-24 | 1,400 | 2,517,699 | 1,798 | 1 | 38,089 | 38,089 | 351 | 887,357 | 2,528 | 747 | 1,473,915 | 1,973 |
| 25-49% | $25-29 | 748 | 1,493,896 | 1,997 | 5 | 23,401 | 4,680 | 439 | 1,268,623 | 2,890 | 680 | 1,638,164 | 2,409 |
| 25-49% | $30-34 | 292 | 656,865 | 2,250 | 2 | 15,319 | 7,660 | 384 | 1,228,401 | 3,199 | 486 | 1,268,530 | 2,610 |
| 25-49% | $35-39 | 82 | 258,292 | 3,150 | 2 | 9,657 | 4,829 | 269 | 925,457 | 3,440 | 243 | 805,092 | 3,313 |
| 25-49% | $40+ | 55 | 181,135 | 3,293 | 3 | 16,700 | 5,567 | 374 | 1,563,950 | 4,182 | 300 | 1,083,358 | 3,611 |
| 50-74% | $0-4 | 1,486 | 1,702,411 | 1,146 | 2 | 8,490 | 4,245 | 40 | 38,204 | 955 | 151 | 63,329 | 419 |
| 50-74% | $5-9 | 1,855 | 2,478,292 | 1,336 | 1 | 26,887 | 26,887 | 50 | 88,945 | 1,779 | 282 | 193,412 | 686 |
| 50-74% | $10-14 | 2,061 | 2,889,539 | 1,402 | 0 | 0 | N/A | 92 | 174,294 | 1,895 | 481 | 512,793 | 1,066 |
| 50-74% | $15-19 | 2,297 | 3,544,110 | 1,543 | 1 | 61,378 | 61,378 | 156 | 362,853 | 2,326 | 819 | 1,243,474 | 1,518 |
| 50-74% | $20-24 | 2,443 | 3,942,876 | 1,614 | 2 | 48,807 | 24,404 | 297 | 711,111 | 2,394 | 1,209 | 2,299,542 | 1,902 |
| 50-74% | $25-29 | 872 | 1,658,591 | 1,902 | 0 | 0 | N/A | 290 | 863,697 | 2,978 | 832 | 1,913,518 | 2,300 |
| 50-74% | $30-34 | 288 | 612,830 | 2,128 | 1 | 11,602 | 11,602 | 243 | 707,377 | 2,911 | 457 | 1,256,908 | 2,750 |
| 50-74% | $35-39 | 96 | 213,295 | 2,222 | 0 | 0 | N/A | 158 | 511,648 | 3,238 | 245 | 736,909 | 3,008 |
| 50-74% | $40+ | 38 | 123,412 | 3,248 | 2 | 8,899 | 4,450 | 195 | 686,206 | 3,519 | 241 | 720,042 | 2,988 |
| 75-100% | $0-4 | 4,697 | 5,734,255 | 1,221 | 3 | 24,721 | 8,240 | 52 | 44,461 | 855 | 396 | 145,265 | 367 |
| 75-100% | $5-9 | 4,930 | 6,566,088 | 1,332 | 4 | 73,270 | 18,318 | 54 | 111,999 | 2,074 | 726 | 471,145 | 649 |
| 75-100% | $10-14 | 5,601 | 7,262,754 | 1,297 | 6 | 106,719 | 17,787 | 74 | 185,107 | 2,501 | 1,282 | 1,254,209 | 978 |
| 75-100% | $15-19 | 6,157 | 9,100,959 | 1,478 | 4 | 92,712 | 23,178 | 163 | 341,207 | 2,093 | 1,987 | 2,861,809 | 1,440 |
| 75-100% | $20-24 | 7,807 | 12,158,038 | 1,557 | 6 | 59,529 | 9,922 | 294 | 724,861 | 2,466 | 3,509 | 6,297,155 | 1,795 |
| 75-100% | $25-29 | 1,778 | 3,129,172 | 1,760 | 3 | 24,674 | 8,225 | 296 | 701,589 | 2,370 | 1,650 | 3,598,937 | 2,181 |
| 75-100% | $30-34 | 426 | 791,204 | 1,857 | 1 | 8,752 | 8,752 | 170 | 449,989 | 2,647 | 623 | 1,582,648 | 2,540 |
| 75-100% | $35-39 | 120 | 244,180 | 2,035 | 1 | 4,795 | 4,795 | 82 | 312,313 | 3,809 | 322 | 755,640 | 2,347 |
| 75-100% | $40+ | 61 | 118,930 | 1,950 | 1 | 4,614 | 4,614 | 73 | 294,468 | 4,034 | 137 | 569,397 | 4,156 |
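The weight calculation behind Table 6-8 is a simple ratio for each cell (chosen product, percent rail bin, use value bin). The sketch below uses a hypothetical function name and reproduces two cells from the table as a check.

```python
def choice_model_weight(trips_in_baseline, days_in_sample):
    """Weight for one cell: baseline trips per sampled customer-day."""
    if days_in_sample == 0:
        return None  # empty cell, shown as N/A in Table 6-8
    return trips_in_baseline / days_in_sample


# Two cells from Table 6-8 (0-24% rail, $0-4 use value).
ppu_weight = choice_model_weight(3_657_530, 3_461)
three_day_weight = choice_model_weight(6_214, 3)
```

Cells with very few sampled days (most 3-day pass cells) produce large weights resting on one or two observations, which is one reason the weighting procedure is listed among the possible improvements.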

The motivation for weighting was to improve the Synthetic Baseline predictions in the spreadsheet model presented in Chapter 7, which use the estimated fare product choice model to simulate baseline fare product market shares. It was found that using these weights during choice model estimation improved the match between the Synthetic Baseline and the Observed Baseline. This is likely due to the imperfect correspondence between the two datasets: individual customer-day observations categorized by the previous week’s ridership are used to estimate the choice model, but the choice model parameters are then applied to aggregated customer-week observations categorized by the current calendar week’s ridership in the spreadsheet model (as described in Chapter 7). The logic of this approach is that groups of similar individuals can be treated like individuals (the “classification” approach described in Ben-Akiva and Lerman (1985), pp. 138 and 151) and that, in steady-state, the previous week’s ridership is the same as the current week’s ridership (at least on average). However, in reality, these different datasets do not match perfectly in terms of fare product market shares (either total or for disaggregate segments defined by weekly ridership and other rider characteristics). Weighting helped to align the different datasets.

Alternative approaches were considered to facilitate application of the choice model estimates to an aggregate prediction model. In cases where a choice-based sample is used for estimation, Ben-Akiva and Lerman (1985) (pp. 237-238) explain that weights are not needed during estimation; constant terms can merely be adjusted by subtracting $\ln(\mathit{MarketShare}_{i,\mathit{sample}} / \mathit{MarketShare}_{i,\mathit{population}})$ from the constant term for each alternative $i$. While the sampling used here for choice model estimation was random, not choice-based, the different natures of the choice model data and the spreadsheet model data did result in minor differences in fare product market shares in the two datasets; however, simple adjustments to alternative-specific constants did little to improve the match between the Synthetic Baseline and the Observed Baseline (particularly at a disaggregate level). Alternatively, the exact same group of customers used to estimate the choice model could be used to create the customer segments for the spreadsheet model; Ben-Akiva and Lerman (1985), p. 151, explain a procedure for adjusting the scale parameter of a choice model to convert an individual-level model to a market segment-level model. However, this approach is less intuitive and less modular; rather than estimating a discrete choice model and applying the results to a separate “classification” or segmentation of customers in an observed baseline, the choice model would need to be applied to the exact same dataset that was used for estimation (with some scaling to recover market totals).

6.5.3 Analysis and Results

Table 6-9 shows the parameter estimates for models using the unweighted choice data and the weighted choice data. As expected, the generic coefficient on the weekly cost variable is negative, meaning that an increase in the expected cost of a fare product lowers the utility of selecting that product. The coefficients on the customer segment dummy variables all measure preferences relative to pay-per-use. These coefficients for 30-day passes are positive, indicating that customers who took a higher share of their rides on rail (relative to the reference level of 0-24% rail) were more likely to select 30-day passes than PPU (all else constant); for 7-day passes, these coefficients decrease as rail use increases, showing that 7-day passes are less likely to be chosen by customers with higher rail use (again, with weekly cost held constant).

The alternative specific constant is highest for pay-per-use, which has a constant of 0 as the reference level. If both weekly cost and customer segment (percent rail) are held constant, customers in the sample preferred pay-per-use to any of the pass products. This may come as a surprise, since passes provide some conveniences over pay-per-use; however, the inconvenience of loading value for pay-per-use travel has been reduced in Chicago by the Ventra system, and the alternative-specific constants also capture any aversion to the up-front cost of passes and any value of the assurance that transit costs will not exceed the pass price.

The McFadden R-squared (also called pseudo-R-squared or $\rho^2$) values for both the unweighted and weighted models appear to be low. However, this choice model was not expected to have very strong explanatory power given the reliance on previous week’s travel behavior to predict subsequent fare product choices (without having access to private knowledge about expected changes in demand). Additionally, in travel demand contexts a McFadden R-squared of 0.2-0.4 often represents an excellent fit (McFadden 1977). The closest comparison to this choice model, Zureiqat (2008), achieved a much higher McFadden R-squared (0.837) by modeling inertia in fare product choices through “previous choice” dummy variables. Adding a “previous choice” dummy variable to the CTA choice model provides a similarly dramatic improvement in explanatory power (McFadden R-squared of 0.6832 for the unweighted model and 0.6189 for the weighted model); however, as discussed earlier, inertia was deliberately excluded from the model to more easily apply the results in a “steady-state” prediction model (described in Chapter 7). Prior CTA fare product choice models estimated using survey data also had higher explanatory power without inertia; however, they relied on additional explanatory variables in the survey data that are not available in AFC (such as car ownership). Explanatory power of an AFC choice model could be improved by refining the model specification and improving AFC-based segmentation. (See the list of possible model improvements and extensions below.)

Table 6-9: Fare Product Choice Model Estimates

| Parameter | Without Weighting: Estimate | Std. Error | t-value | With Weighting: Estimate | Std. Error | t-value |
|---|---|---|---|---|---|---|
| 3-Day Pass Constant | -5.760 | 0.278 | -20.7 | -3.982 | 0.139 | -28.6 |
| 7-Day Pass Constant | -0.731 | 0.023 | -31.2 | -0.104 | 0.022 | -4.7 |
| 30-Day Pass Constant | -0.790 | 0.019 | -42.0 | -0.693 | 0.020 | -34.4 |
| Weekly Cost | -0.110 | 0.001 | -114.2 | -0.137 | 0.001 | -127.8 |
| 25-49% Rail : 3-Day Pass | 0.750 | 0.385 | 1.9 | 1.069 | 0.177 | 6.0 |
| 25-49% Rail : 7-Day Pass | 0.224 | 0.034 | 6.6 | 0.059 | 0.032 | 1.9 |
| 25-49% Rail : 30-Day Pass | 0.276 | 0.029 | 9.6 | 0.149 | 0.031 | 4.8 |
| 50-74% Rail : 3-Day Pass | -0.125 | 0.434 | -0.3 | 0.841 | 0.170 | 4.9 |
| 50-74% Rail : 7-Day Pass | -0.371 | 0.036 | -10.4 | -0.476 | 0.033 | -14.6 |
| 50-74% Rail : 30-Day Pass | 0.308 | 0.026 | 11.7 | 0.244 | 0.028 | 8.8 |
| 75-100% Rail : 3-Day Pass | 0.065 | 0.334 | 0.2 | 0.825 | 0.151 | 5.5 |
| 75-100% Rail : 7-Day Pass | -1.406 | 0.036 | -38.9 | -1.435 | 0.032 | -45.3 |
| 75-100% Rail : 30-Day Pass | 0.276 | 0.022 | 12.7 | 0.228 | 0.023 | 9.9 |

| Summary statistic | Without Weighting | With Weighting |
|---|---|---|
| N | 100,000 | 100,000 |
| Null Log-Likelihood | -79,859 | -94,515 |
| Log-Likelihood | -69,135 | -77,225 |
| McFadden R^2 | 0.1342 | 0.1829 |

Notes: Estimation was performed using the mlogit package in the open source software R.
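As a quick check on the fit statistics in Table 6-9, the McFadden R-squared can be recomputed from the reported log-likelihoods as one minus the ratio of the model log-likelihood to the null log-likelihood. The function name is hypothetical; small differences in the fourth decimal place are expected because the table reports rounded log-likelihoods.

```python
def mcfadden_r2(log_likelihood, null_log_likelihood):
    """McFadden pseudo-R-squared: 1 - LL(model) / LL(null)."""
    return 1.0 - log_likelihood / null_log_likelihood


# Log-likelihoods as reported in Table 6-9.
unweighted = mcfadden_r2(-69_135, -79_859)
weighted = mcfadden_r2(-77_225, -94_515)
```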

The implications of these estimates for aggregate ridership and revenue by fare product are explored using a prediction spreadsheet model in Chapter 7 (using results for the weighted choice data). To give an initial flavor though, these coefficients can be used to plot the probability that an individual customer will select each fare product given their weekly use value and the percent of rides they take on rail. For example, the probability that an individual customer ($n$) will choose pay-per-use is given by:

$$P_n(\mathit{PPU}) = \frac{e^{V_{\mathit{PPU},n}}}{e^{V_{\mathit{PPU},n}} + e^{V_{\mathit{3day},n}} + e^{V_{\mathit{7day},n}} + e^{V_{\mathit{30day},n}}}$$


Probabilities for other products are calculated similarly. Figure 6-22 shows how these choice probabilities change over different weekly use values (i.e. as transit demand varies between individuals or for one individual over time). Each plot is for a single customer segment (percent rail ridership).

As weekly use value increases, the weekly cost of riding pay-per-use approaches and then exceeds the weekly cost of using a pass ($46.67 for 3-day pass, $28 for 7-day pass, and $23.33 for 30-day pass). As a result, the probability of selecting pay-per-use decreases from close to 100% at low use values to about 5% at $45. The probabilities of selecting a 7-day or 30-day pass are almost equal for customers primarily riding bus; as mentioned earlier, 30-day passes are preferred more and more to 7-day passes as the percent of a customer’s rides on rail increases, with less than 15% of rail users choosing 7-day passes at high use values. The preference of bus users for 7-day passes is surprising since 30-day passes have a lower weekly cost than 7-day passes ($23.33 versus $28 before 2018) and purchasing multiple 7-day passes is likely less convenient for bus users than rail users (due to ticket vending machines in rail stations). The difference might be explained by lower incomes and aversion to the upfront cost of a 30-day pass among customers who rely on bus service, as suggested in Verbich and El-Geneidy (2017). It may also reflect differences in user captivity and service quality; customers who only have access to bus service may be less inclined to commit to extended periods of transit use.
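The probabilities discussed above can be reproduced approximately from the weighted estimates in Table 6-9. The sketch below is an illustration under stated assumptions rather than the code behind Figure 6-22: the function and variable names are hypothetical, the pass weekly costs are the values quoted in the preceding paragraph, and the 0-24% rail bin is treated as the reference level with zero effect.

```python
import math

# Weighted estimates from Table 6-9; PPU is the reference alternative.
BETA_WEEKLY_COST = -0.137
ASC = {"PPU": 0.0, "3day": -3.982, "7day": -0.104, "30day": -0.693}
RAIL_EFFECT = {
    "0-24%": {"3day": 0.0, "7day": 0.0, "30day": 0.0},  # reference bin
    "25-49%": {"3day": 1.069, "7day": 0.059, "30day": 0.149},
    "50-74%": {"3day": 0.841, "7day": -0.476, "30day": 0.244},
    "75-100%": {"3day": 0.825, "7day": -1.435, "30day": 0.228},
}
# Weekly pass costs as quoted in the text.
PASS_WEEKLY_COST = {"3day": 46.67, "7day": 28.00, "30day": 23.33}


def choice_probabilities(use_value, rail_bin):
    """Multinomial logit probabilities for one customer and rail segment."""
    costs = dict(PASS_WEEKLY_COST, PPU=use_value)
    utilities = {
        product: ASC[product]
        + BETA_WEEKLY_COST * costs[product]
        + RAIL_EFFECT[rail_bin].get(product, 0.0)  # no rail term for PPU
        for product in ASC
    }
    denominator = sum(math.exp(v) for v in utilities.values())
    return {p: math.exp(v) / denominator for p, v in utilities.items()}


bus_rider = choice_probabilities(45.0, "0-24%")
rail_rider = choice_probabilities(45.0, "75-100%")
```

Under these assumptions, the pay-per-use probability at a $45 use value comes out near 5% for the 0-24% rail segment, and the 7-day pass probability for the 75-100% rail segment stays below 15%, consistent with the description above.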

Figure 6-22: Predicted Fare Product Choice Probabilities for an Individual Customer by Weekly Use Value

6.5.4 Possible Extensions

The choice model estimated in this section could be improved and extended in many different ways:

• The weighting procedure could be improved – weighting by number of customer-weeks rather than trips from the spreadsheet model, and normalizing weights by fare product – or, if possible, eliminated.

• The measurement error in expected weekly costs could be described, and the model could be modified to correct or mitigate the resulting attenuation bias. This arises from use of the previous week’s travel as a proxy for the upcoming week, and the result is an expected understatement of the impact of fares and prices on fare product choice. This issue was not discussed in Zureiqat (2008), which also used the previous week’s travel to approximate expected weekly costs.

• The model specification could potentially be improved by moving to a nested logit structure. Nesting has been used in past CTA fare product choice models, and it is sensible that pass products would be nested together.

• Model specification could also be improved by taking advantage of the panel structure of the dataset. Conditional or fixed effects logit specifications could control for unobserved, time-invariant characteristics of individual customers, focusing on how choices change over time for the same individual. A mixed logit model could also control for individual effects, additionally allowing for individual heterogeneity or random variation in preferences; for example, the effect of weekly cost on fare product choice could be estimated as a random variable with a Normal distribution.

• Along similar lines, the model could include inertia in fare product choice, as in Zureiqat (2008), which would greatly improve explanatory power; this would require finding a way to convert the estimated coefficients into a steady-state choice probability formula for application to a static prediction model. (Zureiqat (2008) performs a conversion of linear regression coefficients from short-term to long-term/steady-state, but he does not do the same for his logit model coefficients.)

• The model specification could additionally include better segmentation of customers; this would allow the factors driving fare product choice to vary more realistically across different groups of transit users. While AFC data does not contain all of the variables in a customer survey (such as income, race, and car ownership), AFC data does provide many features describing patterns of purchases and travel. For example, a customer’s preferred sale channel could be used to segment customers; as discussed in Chapter 5, sale channels are highly correlated with preferences for 7-day passes over 30-day passes. Travel pattern features developed from AFC data could also be used to develop cluster-based segmentations; the Appendix provides an example of using cluster-based segmentation of travel frequency and regularity from AFC to distinguish commuters and non-commuters.

• Some significant differences between groups of customers cannot be explicitly identified using AFC data. However, a latent-class model could be used to distinguish unobserved groups (such as customers who switched fare products because their transit demand changed and those who switched products for other reasons).

• The use of a one-week window for a customer’s recent use value is a convenient balance, observing travel frequency without requiring observation of an individual over an extended period of time. However, one week of ridership may not be indicative of longer-term travel frequency; model predictions might be improved by looking back longer than a single week to estimate expected weekly costs by fare product.

• The choice model could be estimated using data that spans a fare change (which was not possible at CTA at the time of analysis). This would provide additional variation in fare levels and choices. That variation would allow estimation of alternative-specific coefficients on weekly cost (which is impossible without variation in the price of each product) and potentially inclusion of other variables in the choice model (such as distinct variables for weekly cost and upfront cost).


6.6 Summary and Conclusions

This chapter demonstrated how transit AFC data can be used to estimate parameters describing three key fare-related behaviors – how many additional rides customers take when they use pass products (measured by induced ride factors), sensitivity of ridership on different fare products to changes in fares (expressed as point elasticities), and how customers choose among different fare product alternatives (modeled using choice model utility parameters). AFC systems can be used to develop cross-sectional data on individuals, aggregated time series data on total sales and ridership, and panel data tracing cards or accounts over time; each has advantages and disadvantages, and all three formats are used at different points in the chapter. Parameters are estimated at either the CTA or the MBTA using the data and methods summarized in Table 6-10. The use of both pass products and pay-per-use fares at these agencies (a very common aspect of fare structure) highlights the importance of accounting for fare product choice when analyzing past or potential future fare changes (especially changes altering relative product prices or pass “multiples”).

Table 6-10: Summary of Fare-Related Parameters and Estimation Methods

The first analysis estimates upper bounds on induced ride factors at the CTA. It is difficult to estimate these parameters (describing the impact of fare product choice on ridership) because of reverse causality – expected ridership is a primary driver of fare product choice. AFC panel data and account filtering are used to focus on customers who plausibly switched between passes and pay-per-use for reasons other than large changes in expected ridership. The resulting overall upper bound is 1.11 for 30-day passes and 1.21 for 7-day passes (that is, a customer using a 30-day pass can be expected to use the CTA for up to 11% more trips on average than if the same customer were to use pay-per-use, and the ridership impact of a 7-day pass relative to pay-per-use is about twice as large). The upper bounds vary across different trip types, with pass use having the largest impact on bus rides, transfer trips, and off-peak and weekend trips (relative to rail, one-seat, and peak weekday rides). While still affected by product self-selection, these results affirm the role of pass products in growing and sustaining transit ridership, and they suggest that the shift away from passes following the CTA’s 2013 pass price increases (described in Chapter 2) likely contributed to subsequent ridership declines.


The second analysis estimates fare elasticities for three major fare products at the MBTA – pay-per-use, Corporate Monthly LinkPasses, and all other Monthly LinkPasses – using the experience of the July 2016 fare change. The three empirical challenges are identification of the specific passes sold through the Corporate Program, use of a single, simultaneous fare change event, and the need to separate fare product switching from other changes in ridership and sales; this final challenge is a result of the specific fare change, which increased LinkPass prices by about 12% and pay-per-use fares by only 6% (causing some customers to switch from LinkPasses to pay-per-use). AFC panel data for customers riding pay-per-use both before and after the fare change is used to estimate pay-per-use elasticities (conditional on fare product choice and non-zero transit ridership). A separate panel data set of monthly year-over-year changes for each fare product is used to remove observable fare product switching; the panel data is then aggregated and analyzed as a time series to estimate LinkPass elasticities (adjusted for consistency with pay-per-use elasticities). The resulting elasticity estimates – -0.7 for pay-per-use, -0.5 for non-Corporate LinkPasses, and -0.2 for Corporate LinkPasses – are more moderate than naïve before-and-after estimates but significantly larger than current elasticity assumptions in the MBTA’s fare change model (FERRET).

The third analysis estimates a multinomial logit fare product choice model at the CTA. There were no recent fare changes, and stated preference surveys about customer fare product preferences were not available. Following Zureiqat (2008), a simple product choice model for four major CTA fare products – pay-per-use, 3-day pass, 7-day pass, and 30-day pass – was estimated using individual-level AFC data on observed CTA travel and actual fare product selection; ridership in the past week on each Ventra account was used to calculate proxy values for the expected weekly cost of using pay-per-use, which customers are assumed to compare to pass prices when selecting a fare product. The resulting product utility formulas suggest an inherent preference for pay-per-use (if costs and customer use of rail versus bus were held constant). Preferences between 7-day passes and 30-day passes vary considerably by customer rail use; customers taking most of their transit trips on bus were more likely to prefer 7-day passes, and those taking more trips on rail preferred 30-day passes. The choice model estimates are used in the CTA prediction model in Chapter 7.

The estimation methods and specific estimates in this chapter can be used to develop parameter inputs to fare models for predicting the ridership and revenue impacts of potential fare policy changes. Prediction modeling is the focus of the next chapter.


7 Incorporating Fare Product Choice in Modeling of Fare Change Scenarios

7.1 Introduction

In Chapter 4, three fare-related customer behaviors were identified that have an important impact on ridership and revenue: induced ridership, fare product choice, and elasticity. In Chapter 6, parameters associated with each behavior were estimated using AFC data at the MBTA and the CTA, but this leaves several questions for fare policy scenario analysis. How specifically can the three customer behaviors be incorporated into a single set of model calculations (ideally that are easy to understand and straightforward to implement using AFC data)? How flexible is the resulting model in its ability to analyze a range of different fare policy scenarios, and how accurately can it predict customer behaviors?

This chapter illustrates a simple prediction procedure originally developed at the CTA that can be populated using AFC data and can represent all three of the key customer behaviors using the kinds of parameters that were estimated in Chapter 6. This is potentially significant in two ways:

1. While the basic model structure in this chapter is not new, it is potentially relevant to nearly all major transit agencies and yet has not been widely implemented. Throughout this thesis, it has been argued that transit agencies offering multiple tariff options – nearly all major transit agencies around the world (Translink 2016) – need to consider the implications of all three key customer behaviors mentioned above when they evaluate fare changes. The most common ridership and revenue modeling techniques, however, have only crude ways of accounting for fare product choice (and associated induced ridership). The model structure shown here has the potential to improve on this current practice.

2. For the first time, this model structure is applied using only AFC data. As originally developed at the CTA, the fare product choice parameters required by this model structure were estimated using specially-designed customer surveys and applied to system-wide customer surveys. The model shown in this chapter instead uses the product choice parameters estimated using AFC data in Chapter 6 and applies them directly to AFC data.

The chapter begins by describing the proposed prediction procedure and comparing it to some of the alternative modeling approaches that were described in Chapter 3. The model is then demonstrated using example scenarios at the CTA to better understand the functionality and potential usefulness of the model. Finally, model predictions are evaluated using actual outcomes following the January 2018 fare change at the CTA.

7.2 Methodology

7.2.1 Selection of a Prediction Procedure

Several factors were considered in selecting an appropriate prediction procedure for the CTA:


1. Desired model applications and outputs. The CTA was interested in exploring scenarios that changed fares differently for different products and trip types (for example, changing rail fares differently from bus fares, or changing pass prices in different proportions than pay-per-use fares). There was also interest in scenarios that added new differentiation, such as peak vs. off-peak pricing, and that modified transfer pricing policies. These applications required a model that distinguished different fare products and disaggregated demand by existing and potential fare rules (such as bus vs. rail trips and peak vs. off-peak trips).

2. Important customer behaviors implied by the fare structure. As a reminder, Chapters 3 and 4 described several simultaneously-determined customer behaviors that can drive the ridership and revenue impacts of fare change scenarios – mode choice, fare product choice, and induced demand. Mode choice and induced demand are relevant to all fare change scenarios, while fare product choice can sometimes be ignored (e.g., if an agency only offers a single tariff option). At the CTA, however, customers can choose between pay-per-use fares and period pass products, and there was interest in changing pass prices and pay-per-use fares in different proportions; fare product choice (and associated induced ridership) would likely play an important role in product-level ridership and revenue predictions. This required a model that explicitly accounted for price changes and relative price levels across fare product alternatives and allowed for realistic switching between fare products.

3. Available demand data and behavioral parameters. Recent fare-related surveys (or resources to conduct them) were not available for this modeling effort, so any prediction procedure needed to be built on a combination of CTA AFC data, existing estimates of behavioral parameters at the CTA, and behavioral parameter estimates from other transit agencies in the academic literature.

4. Simplicity and flexibility of implementation. A simple implementation in Excel was preferred. This would allow CTA staff to easily modify scenario parameters and generate custom output summaries using a familiar program.

7.2.2 Model Mechanics

The selected prediction procedure is based on the logic of models developed for the CTA by Multisystems, Inc. from 1990 to 2005, which are described in Chapter 3 (Fleishman, Koppelman, and Schofer 1991; Multisystems, Inc 2000). Figure 7-1 describes the overall structure of the model. First, AFC data on transit trips are aggregated by segment in a way that retains some information on total weekly ridership per account, forming an Observed Baseline. To allow for later prediction of customer fare product switching, a Synthetic Baseline is created by allocating ridership and revenue to fare products using current fare levels and a multinomial logit formula (for which parameters were estimated in Chapter 6); this formula is recalibrated until the Synthetic Baseline approximately matches observed fare product shares for each segment (and overall). The same fare product choice formula is then applied using Scenario fares to calculate switching between fare products under alternative fare levels. In addition to predicting these changes in fare product market shares, the model also adjusts ridership and revenue for changes in final customer travel costs (net of any fare product switching); elasticities are used to adjust for “direct” demand responses, and induced ride factors are used to scale ridership for customers who switch between pay-per-use products and pass products. (Passes have zero marginal costs for each trip, so the same customer will tend to ride more with a pass than with a pay-per-use fare product.) The end result of these adjustments is Scenario ridership and revenue by segment, which can be compared to the Synthetic Baseline to find predicted impacts of the fare change Scenario.


The model mechanics are described in more detail in the sections below.

Figure 7-1: Structure of Fare Product Choice and Elasticity Fares Scenario Model

Observed Baseline

First, AFC data on trips and “use value” (the applicable pay-per-use fare for trips) are aggregated along two dimensions to create an Observed Baseline of ridership and revenue by segment. One dimension for aggregation is “customer-week” attributes, where a customer-week is an observation of a single card or account in AFC data over one week. The most important customer-week attribute is the amount of transit system use in one week, measured as use value. Customer-weeks are grouped into nine $5 ranges of use value ($0-4, $5-9,…, $35-39, and $40+). For this application, customer-weeks were also grouped by the share of rides taken on rail in a week. The second dimension for aggregating AFC data is trip type; time period, mode, and number of transfers were used for this application. The result of aggregation is Observed Baseline ridership and revenue by combinations of customer-week type and trip type (each called a “segment”). The segments in the model are shown in Figure 7-2. For example, the segment in the top left cell is the total number of peak, one-seat, rail trips from customer-weeks with a total of $0-4 worth of trips and 0-24% of trips taken on rail. The revenue associated with these trips is based on either the applicable per-trip fare (for pay-per-use fare products) or on pass prices (for customer-weeks using a pass).
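The grouping of customer-weeks into segments can be sketched in a few lines of Python. This is an illustration only, not the spreadsheet implementation used at the CTA. The record keys (`weekly_use_value`, `rail_rides`, `total_rides`, `trip_type`, `n_trips`) are hypothetical, and the treatment of boundary values (for example, placing $4.75 in the $0-4 range and a 25% rail share in the 25-49% range) is an assumption.

```python
from collections import defaultdict

USE_VALUE_LABELS = ["$0-4", "$5-9", "$10-14", "$15-19", "$20-24",
                    "$25-29", "$30-34", "$35-39", "$40+"]
RAIL_SHARE_LABELS = ["0-24%", "25-49%", "50-74%", "75-100%"]


def use_value_range(use_value):
    """Map weekly use value (dollars) to one of nine $5 ranges."""
    # Assumption: ranges are half-open, so $4.75 falls in "$0-4".
    return USE_VALUE_LABELS[min(int(use_value // 5), 8)]


def rail_share_range(rail_rides, total_rides):
    """Map the weekly share of rides taken on rail to one of four ranges."""
    share = rail_rides / total_rides
    # Assumption: a share of exactly 25% falls in "25-49%", and so on.
    return RAIL_SHARE_LABELS[min(int(share * 4), 3)]


def aggregate_segments(records):
    """Sum trips by (use value range, rail share range, trip type).

    `records` is an iterable of dicts with hypothetical keys:
    weekly_use_value, rail_rides, total_rides, trip_type, n_trips.
    """
    segments = defaultdict(float)
    for row in records:
        key = (use_value_range(row["weekly_use_value"]),
               rail_share_range(row["rail_rides"], row["total_rides"]),
               row["trip_type"])
        segments[key] += row["n_trips"]
    return dict(segments)
```

Each key of the returned dictionary corresponds to one cell of the segment grid in Figure 7-2.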


Figure 7-2: Model Segments Defined by Customer-Week Type and Trip Type

[Figure: grid of model segments. Columns are trip types: one-seat rides (Rail, Bus), one-transfer trips (Rail→Rail, Bus→Rail, Rail→Bus, Bus→Bus), and two-transfer trips (Rail→Rail→Rail, Bus→Rail→Rail, Rail→Bus→Rail, Bus→Bus→Rail, Rail→Rail→Bus, Bus→Rail→Bus, Rail→Bus→Bus, Bus→Bus→Bus), each split into Peak and Offpeak. Rows are customer-week types: four ranges of % rail use (0-24%, 25-49%, 50-74%, 75-100%), each split into nine use value ranges ($0-4, $5-9, …, $35-39, $40+). Cell entries are labeled Rides and Value.]


Synthetic Baseline

Second, a simple multinomial logit fare product choice formula is used to create a Synthetic Baseline by allocating customer-weeks and associated ridership to different fare products. Each group of customer-weeks (defined by a range of use value and percent of trips on rail) is treated as a “representative individual” for modeling fare product choices; this avoids the need to use sample enumeration to simulate fare product choices for each individual customer (which would be very data-intensive and require a database system rather than a simple spreadsheet). For each customer-week type \(c\) (across all trip types \(t\)), the logit systematic utility for each fare product \(j\) is calculated as

\[ V_{c,j,\mathit{Base}} = \alpha_{c,j} + \beta_{\mathit{WeeklyCost}} \cdot \mathit{WeeklyCost}_{c,j,\mathit{Base}} \]

where \(\alpha_{c,j}\) and \(\beta_{\mathit{WeeklyCost}}\) are exogenous input parameters and \(\mathit{WeeklyCost}_{c,j,\mathit{Base}}\) is the weekly cost for customer-week type \(c\) of using fare product \(j\) under Baseline prices (across all trip types). If there are \(J\) total products, then the market share of a particular fare product \(j\) for customer-week type \(c\) is calculated using the multinomial logit probability formula:

\[ \mathit{Share}_{c,j,\mathit{Base}} = \frac{e^{V_{c,j,\mathit{Base}}}}{\sum_{i=1}^{J} e^{V_{c,i,\mathit{Base}}}} \]

The same fare product shares are applied to all trip types within each customer-week type.

\[ \mathit{customer\_weeks}_{c,j,\mathit{Base}} = \mathit{customer\_weeks}_{c,\mathit{Base}} \cdot \mathit{Share}_{c,j,\mathit{Base}} \]

\[ \mathit{trips}_{c,t,j,\mathit{Base}} = \mathit{trips}_{c,t,\mathit{Base}} \cdot \mathit{Share}_{c,j,\mathit{Base}} \]
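A minimal Python sketch of the share calculation follows. The cost coefficient and the Baseline pass prices are taken from this chapter (Table 7-2 and the Baseline fare levels); prorating each pass price to a 7-day week by simple division is an assumption about how the weekly cost is formed, and the function names are illustrative.

```python
import math

BETA_WEEKLY_COST = -0.137  # generic cost coefficient reported in Table 7-2
PASS_PRICES = {"3-Day": 20.0, "7-Day": 28.0, "30-Day": 100.0}  # Baseline prices
PASS_DAYS = {"3-Day": 3, "7-Day": 7, "30-Day": 30}


def weekly_costs(ppu_weekly_cost):
    """Weekly cost of each fare product for one customer-week type."""
    costs = {"PPU": ppu_weekly_cost}
    for product, price in PASS_PRICES.items():
        # Assumption: pass prices are prorated linearly to 7 days.
        costs[product] = price * 7.0 / PASS_DAYS[product]
    return costs


def logit_shares(alpha, costs, beta=BETA_WEEKLY_COST):
    """Multinomial logit shares: exp(V_j) / sum over i of exp(V_i)."""
    utilities = {j: alpha[j] + beta * costs[j] for j in costs}
    v_max = max(utilities.values())  # subtract the max for numerical stability
    exp_v = {j: math.exp(v - v_max) for j, v in utilities.items()}
    total = sum(exp_v.values())
    return {j: e / total for j, e in exp_v.items()}


# Alternative-specific constants for the 0-24% rail group (Table 7-2).
alpha = {"PPU": 0.0, "3-Day": -3.982, "7-Day": -0.104, "30-Day": -0.693}
shares = logit_shares(alpha, weekly_costs(ppu_weekly_cost=30.0))
# Trips in each segment would then be allocated as trips * shares[product].
```

Because the cost coefficient is negative, the pay-per-use share falls as the weekly pay-per-use cost of a customer-week type rises relative to the prorated pass prices.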

The simple idea here is that customers select fare products based in part on a rough comparison of the cost of taking their typical weekly portfolio of trips using different fare products; this explains the importance of aggregating AFC data by customer-week type, in order to retain some information about the frequency of customer travel even in an aggregated, Excel-based analysis. For pass products, the weekly cost is the prorated pass price; for pay-per-use, the weekly cost is the average pay-per-use fare per customer-week for all trips taken by a given customer-week type. Figure 7-3 provides a conceptual example of how fare product market shares are calculated for each customer-week type.

Figure 7-3: Example Calculation of Fare Product Market Shares for One Customer-Week Type


The resulting fare product allocation will not necessarily match the Observed Baseline fare product market shares. To crudely correct for this, the \(\alpha_{c,j}\) constant terms in the formula are adjusted until the Synthetic Baseline roughly matches the Observed Baseline at the segment level. (This recalibrated logit choice formula is also applied later to the Scenario for consistency with the Synthetic Baseline.)
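One common way to automate this kind of constant adjustment is to add the log ratio of observed to predicted share to each constant. The sketch below illustrates that technique under stated assumptions; it is not the procedure used in the thesis model, where the adjustment terms were set by use value range to give an approximate match. The function names and the example shares are hypothetical.

```python
import math


def logit_shares(alpha, beta, costs):
    """Multinomial logit shares for one customer-week type."""
    exp_v = {j: math.exp(alpha[j] + beta * costs[j]) for j in alpha}
    total = sum(exp_v.values())
    return {j: e / total for j, e in exp_v.items()}


def recalibrate_constants(alpha, beta, costs, observed, tol=1e-9, max_iter=50):
    """Shift constants so predicted shares match observed shares.

    Assumes every observed share is strictly positive and that the
    observed shares sum to one.
    """
    alpha = dict(alpha)
    for _ in range(max_iter):
        predicted = logit_shares(alpha, beta, costs)
        if max(abs(predicted[j] - observed[j]) for j in alpha) < tol:
            break
        for j in alpha:
            # A product that is under-predicted receives a positive adjustment.
            alpha[j] += math.log(observed[j] / predicted[j])
    return alpha


# Hypothetical example for a single customer-week type.
alpha0 = {"PPU": 0.0, "7-Day": -0.104, "30-Day": -0.693}
costs = {"PPU": 30.0, "7-Day": 28.0, "30-Day": 100.0 * 7 / 30}
observed = {"PPU": 0.50, "7-Day": 0.20, "30-Day": 0.30}
alpha_adj = recalibrate_constants(alpha0, -0.137, costs, observed)
```

Only the constants change in this sketch; the weekly cost coefficient, which drives predicted changes in market share, is left as estimated.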

So far customer-weeks and segment-level trips have been allocated to fare products, but revenue can also be calculated for the Synthetic Baseline. For pay-per-use, revenue is calculated at the segment level from pay-per-use trips and the applicable fare (based on the trip type for each segment).

\[ \mathit{Revenue}_{c,t,\mathit{PPU},\mathit{Base}} = \mathit{fare}_{t,\mathit{PPU},\mathit{Base}} \cdot \mathit{trips}_{c,t,\mathit{PPU},\mathit{Base}} \]

For passes, revenue is calculated initially using the prorated weekly cost of a pass type and the number of customer-weeks allocated to that pass type. However, due to an imperfect correspondence between customer-weeks and pass sales, this resulted in an overestimate of pass revenue. (For example, one 7-day pass may span two customer-weeks but should only be counted as one sale.) To correct for this, weekly revenue was scaled down until total Synthetic Baseline revenue matched observed baseline revenue from AFC sales records for each pass type.

\[ \mathit{Revenue}_{c,\mathit{PassType},\mathit{Base}} = \mathit{price}_{\mathit{PassType},\mathit{Base}} \cdot \mathit{ScaleFactor}_{\mathit{PassType}} \cdot \mathit{customer\_weeks}_{c,\mathit{PassType},\mathit{Base}} \]

Scenario

Scenario predictions involve re-allocating ridership to fare products under Scenario fare levels and making several demand response adjustments for effective change in customer transit costs (net of fare product switching).

First, new weekly costs for each customer-week type under each fare product are calculated using the Scenario fare levels. The recalibrated multinomial logit formula (adjusted earlier to improve the match between the Observed Baseline and the Synthetic Baseline) is then applied again to the new weekly costs to create new unadjusted fare product market shares; net shifts into or out of each fare product from the Synthetic Baseline to the Scenario can be observed directly.

\[ V_{c,j,\mathit{Scn}} = \alpha\_\mathit{adj}_{c,j} + \beta_{\mathit{WeeklyCost}} \cdot \mathit{WeeklyCost}_{c,j,\mathit{Scn}} \]

\[ \mathit{Share}_{c,j,\mathit{Scn}} = \frac{e^{V_{c,j,\mathit{Scn}}}}{\sum_{i=1}^{J} e^{V_{c,i,\mathit{Scn}}}} \]

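Fare elasticities (exogenous inputs to the model) are assigned to each segment based on trip attributes. For net "non-switchers" – customer-weeks at or below the Baseline level for a given fare product \(j\) – these elasticities (\(\varepsilon\)) are applied to the segment-level percent change in applicable fare for the selected fare product to adjust customer-weeks and ridership in response to the price increase. As with the Baseline revenue calculation, this elasticity adjustment is performed at the level of trip types (\(t\)) for PPU and at the level of customer-week types (\(c\)) for each pass product.

\[ \mathit{trips}_{c,t,\mathit{PPU},\mathit{Scn},\mathit{NonSwitch}} = \left(1 + \varepsilon_{t,\mathit{PPU}} \cdot \frac{\mathit{fare}_{t,\mathit{PPU},\mathit{Scn}} - \mathit{fare}_{t,\mathit{PPU},\mathit{Base}}}{\mathit{fare}_{t,\mathit{PPU},\mathit{Base}}}\right) \cdot \min\left(\mathit{trips}_{c,t,\mathit{PPU},\mathit{Base}},\ \mathit{trips}_{c,t,\mathit{PPU},\mathit{Scn}}\right) \]

\[ \mathit{trips}_{c,\mathit{PassType},\mathit{Scn},\mathit{NonSwitch}} = \left(1 + \varepsilon_{\mathit{PassType}} \cdot \frac{\mathit{price}_{\mathit{PassType},\mathit{Scn}} - \mathit{price}_{\mathit{PassType},\mathit{Base}}}{\mathit{price}_{\mathit{PassType},\mathit{Base}}}\right) \cdot \min\left(\mathit{trips}_{c,\mathit{PassType},\mathit{Base}},\ \mathit{trips}_{c,\mathit{PassType},\mathit{Scn}}\right) \]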
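The non-switcher adjustment is a linear (point) elasticity applied to the smaller of the Baseline and unadjusted Scenario trip counts. A minimal Python sketch follows. The function name is illustrative, and the $2.50 Scenario rail fare and the trip counts are hypothetical inputs chosen only to make the arithmetic concrete.

```python
def non_switcher_trips(base_trips, scn_trips, base_fare, scn_fare, elasticity):
    """Elasticity-adjusted trips for customers who stay on a fare product.

    Trips retained on the product are min(base_trips, scn_trips); these
    are scaled by one plus the elasticity times the percent fare change.
    """
    pct_change = (scn_fare - base_fare) / base_fare
    return (1.0 + elasticity * pct_change) * min(base_trips, scn_trips)


# Peak rail pay-per-use: elasticity -0.2 (Table 7-5), Baseline fare $2.25,
# hypothetical Scenario fare $2.50. The logit step allocates 900 trips to
# pay-per-use in the Scenario versus 1,000 in the Synthetic Baseline.
adjusted = non_switcher_trips(1000.0, 900.0, 2.25, 2.50, -0.2)
# 900 * (1 - 0.2 * 0.25 / 2.25) = 880 trips
```

The same function applies to a pass product if pass prices are supplied in place of per-trip fares.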

For net "switchers" to a given fare product in the Scenario – customer-weeks above the Baseline level – the change in weekly cost from the Baseline to the Scenario (net of fare product switching) depends on which fare products were switched from (i.e., which were selected in the Synthetic Baseline). A Baseline weighted average weekly cost is calculated for “switchers” to each fare product (\(\to j\)) from all other products (\(-j\)). Change from this Baseline weighted average weekly cost to Scenario weekly cost is then used for elasticity adjustment of “switchers” (rather than using change in a particular pass price or per-trip fare).

\[ \mathit{AvgWeeklyCost}_{c,\mathit{Base},\mathit{Switch} \to j} = \frac{\sum_{i \in -j} -\min\left(0,\ \mathit{Share}_{c,i,\mathit{Scn}} - \mathit{Share}_{c,i,\mathit{Base}}\right) \cdot \mathit{WeeklyCost}_{c,i,\mathit{Base}}}{\sum_{i \in -j} -\min\left(0,\ \mathit{Share}_{c,i,\mathit{Scn}} - \mathit{Share}_{c,i,\mathit{Base}}\right)} \]
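The weighted average can be read as follows: each other product is weighted by the share it lost between the Synthetic Baseline and the Scenario, and products that gained share receive zero weight. A short Python sketch is shown below; the function name, shares, and weekly costs are hypothetical.

```python
def avg_base_cost_for_switchers(target, base_shares, scn_shares, base_costs):
    """Baseline weekly cost averaged over products that lost share.

    Weights are the share losses of every product other than `target`.
    Returns None if no other product lost share.
    """
    weights = {
        i: -min(0.0, scn_shares[i] - base_shares[i])
        for i in base_shares
        if i != target
    }
    total = sum(weights.values())
    if total == 0.0:
        return None
    return sum(weights[i] * base_costs[i] for i in weights) / total


# Hypothetical shares: pay-per-use gains 6 points, drawn from both passes.
base_shares = {"PPU": 0.60, "7-Day": 0.15, "30-Day": 0.25}
scn_shares = {"PPU": 0.66, "7-Day": 0.13, "30-Day": 0.21}
base_costs = {"PPU": 30.0, "7-Day": 28.0, "30-Day": 100.0 * 7 / 30}
avg_cost = avg_base_cost_for_switchers("PPU", base_shares, scn_shares, base_costs)
```

Here the 30-day pass lost twice as much share as the 7-day pass, so its Baseline weekly cost receives twice the weight in the average.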

Finally, adjustments are made to account for ridership induced by pass use; a customer will tend to take more transit rides using a zero-marginal-cost pass than if the same customer were riding pay-per-use.

Induced ride factors (\(\gamma_t\)) are applied to “switchers” that moved from a pass to pay-per-use or vice versa. (Since there is only one pay-per-use option, all “switchers” into pay-per-use are adjusted; however, only the subset of “switchers” into passes that are coming from pay-per-use are adjusted. Note also that customer-weeks are not adjusted using induced ride factors, only trips.)

\[ \mathit{trips\_adj}_{c,t,\mathit{Scn},\mathit{Pass} \to \mathit{PPU}} = \mathit{trips}_{c,t,\mathit{Scn},\mathit{Pass} \to \mathit{PPU}} \cdot (1 - \gamma_t) \]

\[ \mathit{trips\_adj}_{c,t,\mathit{Scn},\mathit{PPU} \to \mathit{Pass}} = \mathit{trips}_{c,t,\mathit{Scn},\mathit{PPU} \to \mathit{Pass}} \cdot (1 + \gamma_t) \]
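As a brief illustration, the two adjustments can be written as a single Python function. The function name and direction labels are hypothetical; the factor of 0.2 is the off-peak value listed in Table 7-5.

```python
def adjust_for_induced_rides(trips, gamma, direction):
    """Scale switcher trips by an induced ride factor.

    direction is "ppu_to_pass" (trips scaled up by 1 + gamma) or
    "pass_to_ppu" (trips scaled down by 1 - gamma).
    """
    if direction == "ppu_to_pass":
        return trips * (1.0 + gamma)
    if direction == "pass_to_ppu":
        return trips * (1.0 - gamma)
    raise ValueError(f"unknown direction: {direction}")


# Off-peak trips with an induced ride factor of 0.2.
to_pass = adjust_for_induced_rides(100.0, 0.2, "ppu_to_pass")
to_ppu = adjust_for_induced_rides(100.0, 0.2, "pass_to_ppu")
```

Note that the two adjustments as specified are not exact inverses: scaling 100 trips up by 20% and then down by 20% yields 96 trips rather than 100.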

After Scenario trips and customer-weeks have been adjusted using elasticities and induced ride factors, Scenario revenue is calculated as in the Synthetic Baseline.

Overview

To recap, the mechanics of the prediction procedure can be broken into 9 steps:

Observed Baseline

1. Aggregate observed AFC customer-weeks, transit trips, and “use value” during a Baseline period by combinations of customer-week type and trip type (with each combination referred to as a “segment”).

Synthetic Baseline

2. Create a Synthetic Baseline allocation of each segment to different fare products using a logit fare product choice model and Baseline fare levels.
3. Recalibrate the logit fare product choice model parameters so that the Synthetic Baseline trips approximately match observed fare product shares for each segment.
4. Calculate Synthetic Baseline revenue for each segment using fare product allocations and Baseline fares.
5. Recalibrate revenue for pass products so that the Synthetic Baseline revenue matches observed total revenue by pass type.

Scenario

6. Create a preliminary Scenario allocation of each segment to fare products using the recalibrated logit model parameters and Scenario fare levels.
7. Adjust the portion of each segment that did not switch fare products (“non-switchers”) using elasticities and the applicable change in weekly price.
8. Adjust the portion of each segment that did switch fare products (“switchers”) using elasticities, the segment-level average change in weekly price for “switchers,” and induced ride factors.
9. Combine results for “non-switchers” and “switchers” to calculate Scenario customer-weeks, trips, and revenue.

Conceptual Example

While not strictly correct, one could loosely think of an individual customer going through the different steps of the model. Imagine, for example, a customer using pay-per-use who takes 15 rides in a week in the Observed Baseline (as shown in Figure 7-4). After recalibration of the fare product choice model parameters, suppose that customer is still assigned to pay-per-use in the Synthetic Baseline. If pay-per-use fares were to increase in the Scenario, this customer might switch over to using a pass product. Finally, after switching, the customer’s ridership is scaled up by an induced ride factor to account for the zero marginal cost of each ride on the pass.

Figure 7-4: Conceptual Example of Model Logic

7.2.3 Model Data and Parameters

Observed Baseline

The model was implemented using CTA Ventra (AFC) data on sales and ridership for one year, from May 30, 2016 to May 29, 2017.45 In order to focus on major fare products, the model includes only four full-fare, account-based (i.e. non-ticket) fare products – pay-per-use (PPU), 3-day pass, 7-day pass, and 30-day pass – which account for the majority of CTA ridership and fare revenue. Customer-weeks including any discounted-fare rides or using any non-selected fare products were excluded from the model. Since the model is not comprehensive, evaluation of model results should focus on percent changes rather than levels, and scaling would be required to provide estimates for system-wide impacts.

45 Note that May 29, 2017 was included in error; the intent was to include only whole weeks, from Monday through Sunday, but instead an entire year was included (52 whole weeks plus one day). As a result, May 29, 2017 was mistakenly treated as a distinct week, which shifted the distribution of use value by customer-week down from the true distribution. The effect of this error on model results is believed to be small.

To develop the Observed Baseline, CTA Ventra transactions were rolled up to the trip level based on fare transfer information, and weekly total trips and "use value" were calculated for each account. As described above, trip totals were differentiated by time period of the first tap (“peak”=7-9am and 4-6pm on weekdays, “off-peak”=all other times), mode of rides within each trip (bus or rail), and number of transfers. Retaining this level of detail allowed for accurate assignment of Baseline transfer fares and modeling of Scenarios with alternative transfer pricing policies and peak period pricing. Table 7-1 summarizes trips by trip type in the Baseline.
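The time period rule can be expressed as a short Python function. This is a sketch of the stated definition only: treating the end of each peak window as exclusive (so a 9:00am tap is off-peak) and ignoring holidays are assumptions, and the function name is illustrative.

```python
from datetime import datetime


def time_period(first_tap):
    """Classify a trip as "peak" or "off-peak" by its first tap time.

    Peak is 7-9am and 4-6pm on weekdays; all other times are off-peak.
    Assumptions: window ends are exclusive and holidays are not treated
    specially.
    """
    is_weekday = first_tap.weekday() < 5  # Monday is 0, Sunday is 6
    hour = first_tap.hour
    if is_weekday and (7 <= hour < 9 or 16 <= hour < 18):
        return "peak"
    return "off-peak"


# Tuesday, May 31, 2016 at 8:15am falls in the morning peak window.
example = time_period(datetime(2016, 5, 31, 8, 15))
```

Because the period is assigned from the first tap, a trip that starts at 8:55am and transfers at 9:10am would still count as a peak trip under this rule.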

Table 7-1: Model Baseline Trips by Trip Type (Millions)

| Trip Type | Peak | Offpeak | Total | Share |
|---|---|---|---|---|
| Rail | 33.2 | 41.5 | 74.7 | 44.9% |
| Bus | 12.6 | 22.3 | 34.9 | 21.0% |
| Rail-Rail | 1.4 | 3.4 | 4.8 | 2.9% |
| Bus-Rail | 4.7 | 8.0 | 12.7 | 7.7% |
| Rail-Bus | 4.0 | 7.6 | 11.7 | 7.0% |
| Bus-Bus | 3.8 | 10.9 | 14.8 | 8.9% |
| Rail-Rail-Rail | 0.1 | 0.2 | 0.2 | 0.1% |
| Bus-Rail-Rail | 0.2 | 0.6 | 0.7 | 0.4% |
| Rail-Bus-Rail | 0.1 | 0.2 | 0.3 | 0.2% |
| Bus-Bus-Rail | 0.3 | 0.8 | 1.1 | 0.6% |
| Rail-Rail-Bus | 0.2 | 0.4 | 0.6 | 0.3% |
| Bus-Rail-Bus | 1.1 | 2.9 | 4.0 | 2.4% |
| Rail-Bus-Bus | 0.3 | 0.9 | 1.3 | 0.8% |
| Bus-Bus-Bus | 1.0 | 3.5 | 4.5 | 2.7% |
| Total | 63.0 | 103.2 | 166.2 | 100% |
| Share | 37.9% | 62.1% | 100% | |

Source: CTA Ventra, May 30, 2016 to May 29, 2017
Notes: Only includes customer-weeks exclusively using some combination of four fare products: PPU, 3-day pass, 7-day pass, and 30-day pass.

Customer-weeks were further grouped into nine $5 ranges of total weekly use value ($0-4, $5-9,…, $35-39, and $40+) and four ranges of weekly rail use (0-24%, 25-49%, 50-74%, and 75-100%) before final aggregation for the model. Figure 7-5 shows the fare product breakdown of trips by customer-week use value range in the Observed Baseline. The shape of the pay-per-use share of trips across use value ranges illustrates the value of retaining this customer-week-level information in aggregate trips; the shape is similar to a logistic curve, reflecting the declining propensity of individuals to choose pay-per-use as their use value (or weekly cost of using pay-per-use) increases above the weekly cost of a pass.


Figure 7-5: Trips by Customer-Week Type in the Observed Baseline

Source: CTA Ventra, May 30, 2016 to May 29, 2017

Note that Baseline fare levels used in the model are pre-2018 CTA fares. Baseline pay-per-use fares are $2.00 for bus, $2.25 for rail, and $0.25 for transfers. Baseline pass prices are $20 for a 3-day pass, $28 for a 7-day pass, and $100 for a 30-day pass.

Fare Product Choice Parameters and Synthetic Baseline

The prediction procedure relies on a multinomial logit fare product choice formula for both the Synthetic Baseline and the Scenario. The parameters for this logit formula were taken from a model estimated using a random sample of 100,000 individual-level fare product choices observed in Ventra over the same time period as the aggregated model data (June 2016 through May 2017). This model was presented in Chapter 6. Initial application of the fare product choice formula resulted in three systematic errors in fare product market shares across customer-week types:

• slight over-allocation to pay-per-use
• over-selection of 7-day passes over 30-day passes for customer-weeks with low weekly use values
• over-selection of 30-day passes over 7-day passes for customer-weeks with high weekly use values

Adjustment terms were added to fare product systematic utilities for different customer-week use value ranges in order to better approximate the Observed Baseline (while still relying on the estimated coefficient for weekly cost in model calculations). The multinomial logit utility parameters and adjustments are shown in Table 7-2. Following adjustment, the Synthetic Baseline reasonably approximated observed fare product market shares across customer-week types, as shown in Table 7-3. (Recall that precise matching or prediction of absolute market shares is not critically important, since the purpose of the model is to predict changes in market shares. Changes are driven primarily by the coefficient on weekly cost, which was not adjusted from the model estimated in Chapter 6.)46

Table 7-2: Multinomial Logit Utility Parameters for Fare Product Choice

EXOGENOUS PARAMETERS

Generic: βWeeklyCost = -0.137

| Alternative-Specific | PPU | 3-Day Pass | 7-Day Pass | 30-Day Pass |
|---|---|---|---|---|
| Alt-Specific Constant | - | -3.982 | -0.104 | -0.693 |
| Customer-Week % Rail: 0-24% | - | - | - | - |
| Customer-Week % Rail: 25-49% | - | 1.069 | 0.059 | 0.149 |
| Customer-Week % Rail: 50-74% | - | 0.841 | -0.476 | 0.244 |
| Customer-Week % Rail: 75-100% | - | 0.825 | -1.435 | 0.228 |

ADDITIVE ADJUSTMENT FACTORS

| Customer-Week Use Value | PPU | 3-Day Pass | 7-Day Pass | 30-Day Pass |
|---|---|---|---|---|
| $0-4 | - | - | - | - |
| $5-9 | - | - | - | - |
| $10-14 | - | - | -0.150 | 0.150 |
| $15-19 | -0.250 | - | -0.150 | 0.150 |
| $20-24 | -0.250 | - | -0.150 | 0.150 |
| $25-29 | -0.250 | - | 0.100 | -0.100 |
| $30-34 | -0.250 | - | 0.100 | -0.100 |
| $35-39 | -0.250 | - | 0.200 | -0.200 |
| $40+ | - | - | 0.300 | -0.300 |

46 See Chapter 6 for additional discussion of the disconnect between the setting for choice model estimation and the spreadsheet model.


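Table 7-3: Calibration of Synthetic Baseline Fare Product Market Shares to Observed Baseline Shares

Difference in Predicted Market Share: "Initial" columns are Initial Synthetic - Observed Baseline; "Adjusted" columns are Adjusted Synthetic - Observed Baseline.

| Customer-Week Use Value | Initial PPU | Initial 3-Day | Initial 7-Day | Initial 30-Day | Adjusted PPU | Adjusted 3-Day | Adjusted 7-Day | Adjusted 30-Day |
|---|---|---|---|---|---|---|---|---|
| $0-4 | -1% | 0% | 0% | 1% | -1% | 0% | 0% | 1% |
| $5-9 | 1% | 0% | 0% | -1% | 1% | 0% | 0% | -1% |
| $10-14 | 3% | 0% | 1% | -3% | 2% | 0% | 0% | -2% |
| $15-19 | 4% | 0% | 1% | -5% | -2% | 0% | 1% | 1% |
| $20-24 | 2% | 0% | 2% | -4% | -6% | 0% | 2% | 4% |
| $25-29 | 8% | 0% | -3% | -5% | 2% | 0% | 2% | -4% |
| $30-34 | 7% | 0% | -7% | 0% | 3% | 0% | -2% | -1% |
| $35-39 | 7% | 0% | -11% | 5% | 4% | 0% | -3% | -1% |
| $40+ | 1% | 0% | -15% | 14% | 1% | 0% | -2% | 1% |
| Overall | 3% | 0% | -1% | -2% | -1% | 0% | 1% | 0% |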

For calculating revenue by fare product, the revenue per customer-week for each pass type was calibrated to observed total pass sales for the Baseline period (using only Ventra accounts that were included in the model). The correction factors applied to prorated weekly pass prices for revenue calculations in the model are shown in Table 7-4. (It would have also been appropriate to apply these adjustment factors to the weekly costs for passes before applying the fare product choice model and calibrating the Synthetic Baseline fare product market shares above.)

Table 7-4: Calibration of Pass Revenue per Customer-Week to Observed Pass Sales

| | PPU | 3-Day | 7-Day | 30-Day | Total | Formula |
|---|---|---|---|---|---|---|
| Observed Baseline | | | | | | |
| Trips (millions) | 100.7 | 1.0 | 22.0 | 42.5 | 166.2 | (1) |
| Share of Trips | 61% | 1% | 13% | 26% | 100% | |
| Sales (thousands) | N/A | 185 | 1,756 | 1,043 | | |
| Revenue ($millions) | $222.3 | $4.2 | $49.9 | $104.0 | $380.4 | (2) |
| Synthetic Baseline (pre-calibration) | | | | | | |
| Trips (millions) | 99.4 | 0.7 | 22.8 | 43.2 | 166.2 | (3) |
| Share of Trips | 60% | 0% | 14% | 26% | 100% | |
| Revenue ($millions) | $220.7 | $3.5 | $64.5 | $114.4 | $403.1 | (4) |
| Revenue Calibration | | | | | | |
| Synthetic/Observed Trips | 0.99 | 0.70 | 1.04 | 1.02 | | (5) = (3)/(1) |
| Scaled Observed Revenue | $219.4 | $2.9 | $51.9 | $105.8 | $380.1 | (6) = (2)*(5) |
| Correction Factor | 0.99 | 0.85 | 0.81 | 0.93 | | = (6)/(4) |

Sources: CTA Ventra
Notes: The correction factor for PPU is not used in the model. It is included here only as a rough check on the pay-per-use fare levels assigned to different trip types in the model.
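The calibration arithmetic in Table 7-4 can be traced with a short Python sketch. The inputs below are the rounded values printed in the table, so the computed factors differ slightly (by roughly 0.01) from the published correction factors, which were presumably calculated from unrounded data. The variable and function names are illustrative.

```python
# Rounded inputs from Table 7-4 (trips in millions, revenue in $ millions).
observed_trips = {"3-Day": 1.0, "7-Day": 22.0, "30-Day": 42.5}        # row (1)
observed_revenue = {"3-Day": 4.2, "7-Day": 49.9, "30-Day": 104.0}     # row (2)
synthetic_trips = {"3-Day": 0.7, "7-Day": 22.8, "30-Day": 43.2}       # row (3)
synthetic_revenue = {"3-Day": 3.5, "7-Day": 64.5, "30-Day": 114.4}    # row (4)


def correction_factor(pass_type):
    """Scale factor applied to the prorated weekly pass price."""
    trip_ratio = synthetic_trips[pass_type] / observed_trips[pass_type]   # (5)
    scaled_observed = observed_revenue[pass_type] * trip_ratio            # (6)
    return scaled_observed / synthetic_revenue[pass_type]                 # (6)/(4)


factors = {p: round(correction_factor(p), 2) for p in observed_trips}
```

Scaling observed revenue by the trip ratio in step (5) keeps the correction from also absorbing differences in trip allocation between the Synthetic and Observed Baselines.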


Elasticity and Induced Ride Factor Assumptions

Elasticities and (to a lesser degree) induced ride factors have a major effect on the magnitude of predicted fare change impacts in the Scenario relative to the Synthetic Baseline. For demonstration purposes and due to the timing of model development, the values for these parameters in the model were selected based on prior fare analyses at the CTA (Cambridge Systematics in 2012 and Multisystems, Inc. in 1991, 2000, and 2005) and based on recommendations in an academic review of fare elasticities, Litman (2017). These parameter assumptions are shown in Table 7-5.

Ideally, these parameters would be updated regularly using agency-specific information and experience. Time and data availability prevented this thesis from developing updated parameter estimates for both transit agency case studies; however, this limitation motivated the MBTA elasticity analyses and CTA induced ride factor analysis presented in Chapter 6.

Table 7-5: Model Elasticity and Induced Ride Factor Assumptions

Pay-Per-Use Elasticities

| Trip Type | Elasticity | Selection |
|---|---|---|
| Peak Rail | -0.2 | Between Cambridge Systematics (2012) and Multisystems (2005), within Litman (2017) short-term range |
| Offpeak Rail | -0.5 | Similar to Cambridge Systematics (2012), Multisystems (2005), within Litman (2017) short-term range |
| Peak Bus | -0.3 | Similar to Cambridge Systematics (2012), Multisystems (1991, 2005); high end of Litman (2017) short-term range |
| Offpeak Bus | -0.6 | Similar to Cambridge Systematics (2012) and Multisystems (2005); high end of Litman (2017) short-term range |

Pass Elasticities

| Pass Type | Elasticity | Selection |
|---|---|---|
| 3-Day | -0.2 | Value from Multisystems (1991, 2005) |
| 7-Day | -0.2 | Value from Multisystems (1991, 2005) |
| 30-Day | -0.2 | Value from Multisystems (1991, 2005) |

Induced Ride Factors (Proportional Change, PPU → Pass)

| Trip Type | Factor | Selection |
|---|---|---|
| Peak Rail | 0.05 | Conservatively half of the values from Multisystems (2000) |
| Offpeak Rail | 0.2 | Conservatively half of the values from Multisystems (2000) |
| Peak Bus | 0.05 | Conservatively half of the values from Multisystems (2000) |
| Offpeak Bus | 0.2 | Conservatively half of the values from Multisystems (2000) |

7.2.4 Comparison to Existing Modeling Approaches

This section compares the model methodology to three related modeling approaches: the MBTA FERRET elasticity spreadsheet model, the discrete-continuous panel data model developed in Zureiqat (2008), and prior product choice and elasticity models at the CTA.


Comparison to the MBTA FERRET Model: Logit Formula versus Diversion Factors or Cross-Elasticities for Fare Product Choice

The MBTA FERRET model described in Chapter 3 provides a useful comparison between the choice and elasticity model presented in this chapter and a traditional elasticity spreadsheet model.

The primary difference is the ability to model fare product choices with some realism. The choice and elasticity model predicts fare product market shares based on the weekly cost of alternative fare products for different types of “representative individuals” (i.e. different customer-week types with different frequencies of transit ridership). The FERRET model, by contrast, attempts to capture marginal switching between fare products by applying diversion factors to total fare product demand (without distinguishing customer types). As described in Chapter 3, the magnitude of fare product switching in the FERRET model is determined by the assumed diversion factor and the ratio of the proportional changes in the prices of two alternative fare products; while this does result in more switching for larger relative changes in product prices, it does not depend on the relative price levels before and after a fare change (i.e. how competitive two products are with each other).
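The logit side of this contrast can be illustrated with a minimal sketch. It assumes a generic binary logit in which utility is linear in weekly cost; the cost coefficient, fare, and pass prices below are hypothetical placeholders, not the parameters estimated in Chapter 6 or used in the spreadsheet model. The point is only that predicted shares depend on how competitive the products are at each ridership frequency, which a single diversion factor applied to total product demand cannot capture.

```python
import math

def logit_shares(weekly_costs, cost_coef):
    """Return logit market shares given each product's weekly cost.

    cost_coef is a hypothetical (negative) utility weight on weekly cost.
    """
    utilities = {name: cost_coef * cost for name, cost in weekly_costs.items()}
    # Subtract the largest utility before exponentiating for numerical stability.
    u_max = max(utilities.values())
    exp_u = {name: math.exp(u - u_max) for name, u in utilities.items()}
    total = sum(exp_u.values())
    return {name: value / total for name, value in exp_u.items()}

def weekly_cost_by_product(rides_per_week, ppu_fare, pass_weekly_price):
    """Weekly cost a representative individual faces under each product."""
    return {"ppu": rides_per_week * ppu_fare, "pass": pass_weekly_price}

COST_COEF = -0.3   # hypothetical value, for illustration only
PPU_FARE = 2.25    # illustrative fare

# The same pass price increase, evaluated for three customer-week types.
for rides in (4, 10, 16):
    before = logit_shares(weekly_cost_by_product(rides, PPU_FARE, 25.0), COST_COEF)
    after = logit_shares(weekly_cost_by_product(rides, PPU_FARE, 30.0), COST_COEF)
    print(f"{rides:>2} rides/week: pass share {before['pass']:.2f} -> {after['pass']:.2f}")
```

In this sketch the size of the shift away from passes differs by customer-week type, because the price increase matters most for riders whose weekly pay-per-use cost is close to the pass price.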

As a result of the mechanics of diversion factors and very low diversion factor assumptions, fare product switching has a relatively small impact on the FERRET model's predictions even in fare change scenarios that significantly change the competitiveness of different fare products with each other. Table 7-6 shows predicted changes in ridership at the fare product level for several major MBTA fare products from the final FERRET analysis of the July 2016 fare change. The second set of results is for the same scenario, but with all “cash-pass” diversion factor adjustments removed. The impact of the diversion factors is to shift approximately 350,000 Monthly LinkPass rides (about 0.3% of Baseline Monthly LinkPass rides) to pay-per-use; FERRET does not apply diversion factors to 7-day passes. These impacts seem quite small given that Monthly and 7-day LinkPasses became much less competitive with pay-per-use for many MBTA customers; this contributed to FERRET’s overprediction of revenue for these pass products, shown in Figure 7-6. (FERRET overprediction also likely stemmed from elasticity assumptions that were too low for Non-Corporate Monthly LinkPasses and pay-per-use, as discussed in Chapter 6.)

Table 7-6: FERRET Predicted Bus and Rapid Transit Ridership With and Without Diversion Factors (MBTA July 2016 Fare Change Scenario, Selected Full-Fare Tariff Types)

                                     Pay-Per-Use   7-Day LinkPass   Monthly LinkPass
Baseline Ridership                    85,099,865       43,610,592        112,822,885

Scenario With Diversion Factors
  Scenario Ridership                  84,150,639       42,835,930        110,328,223
  Change in Ridership                   -949,227         -774,662         -2,494,662
  % Change in Ridership                    -1.1%            -1.8%              -2.2%

Scenario Without Diversion Factors
  Scenario Ridership                  83,789,859       42,835,930        110,679,250
  Change in Ridership                 -1,310,007         -774,662         -2,143,635
  % Change in Ridership                    -1.5%            -1.8%              -1.9%

Impact of Diversion Factors
  Scenario Ridership                    +360,780                0           -351,027
  Change in Ridership                   +360,780                0           -351,027
  % Change in Ridership                    +0.4%               0%              -0.3%

Sources: MBTA FERRET Model (Final Option, FY16 Fare Change)

Figure 7-6: FERRET Predictions Versus Actual Percent Changes in MBTA Revenue (FY16 to FY17, MBTA July 2016 Fare Change Scenario, Selected Full-Fare Tariff Types)

Sources: MBTA FERRET Model (Final Option, FY16 Fare Change), MBTA Accounting

The shortcomings of FERRET model diversion factors are made even clearer by looking at fare change scenarios with a range of different prices. For example, if the prices of Monthly and 7-day LinkPasses were increased with pay-per-use fares held constant, the pass “multiples” would increase and customers would need to take a higher and higher number of rides to “break even” on purchasing a pass (as opposed to riding pay-per-use). Travel demand for any individual customer is limited, so by the time the monthly pass multiple exceeds 50 (more than two trips per weekday) most customers would likely shift their ridership from passes to pay-per-use (which will have a lower weekly cost given their travel needs). This is the result using the CTA choice and elasticity model, shown in Figure 7-7 – a substantial shift away from passes to pay-per-use as the pass multiple starts to exceed regular commuters’ travel needs.

Figure 7-7: CTA Choice and Elasticity Model Predicts Substantial Switching to Pay-Per- Use as 30- and 7-day Pass Multiples are Increased

Notes: In the scenarios in the figure, 7-Day Pass prices are changed proportionally with 30-Day Pass prices. All other fares and prices are held constant at Baseline levels. The Baseline 30-Day Pass multiple is 44.4.
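The break-even logic behind the pass multiple can be sketched in a few lines. The prices here are assumptions: a $100 pass and a $2.25 fare are used only because together they reproduce the 44.4 Baseline multiple quoted in the notes, and the weekday count assumes travel on five of every seven days of a 30-day pass.

```python
def pass_multiple(pass_price, ppu_fare):
    """Number of pay-per-use rides that cost the same as one pass."""
    return pass_price / ppu_fare

def breakeven_rides_per_weekday(multiple, pass_days=30):
    """Rides per weekday needed to break even, assuming weekday-only travel."""
    weekdays = pass_days * 5 / 7  # assumption: 5 of every 7 days are weekdays
    return multiple / weekdays

PPU_FARE = 2.25  # illustrative; with a $100 pass this gives a 44.4 multiple
for price in (100.0, 112.5, 125.0):
    m = pass_multiple(price, PPU_FARE)
    print(f"pass ${price:.2f}: multiple {m:.1f}, "
          f"break-even {breakeven_rides_per_weekday(m):.2f} rides per weekday")
```

Under these assumptions a multiple of 50 requires more than two rides every weekday to break even, which is the threshold the text identifies as exceeding most customers' travel needs.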

However, in the MBTA FERRET model, as pass prices increase there is only a gradual and continuous shift in market share from passes to pay-per-use (shown in Figure 7-8). When the monthly pass multiple is 50 (Monthly price is $121.50 and the 7-day price is $30.50 versus a $2.25 CharlieCard rail fare), FERRET still predicts that LinkPasses will account for much more ridership than pay-per-use. (Note that the MBTA has a much higher Baseline pass market share than the CTA; the MBTA has a lower Baseline pass multiple, and MBTA pay-per-use using the Charlie system is less convenient than the CTA Ventra system.)

Figure 7-8: MBTA FERRET Model Predicts Little Switching to Pay-Per-Use as Monthly and 7-Day LinkPass Multiples are Increased

Source: MBTA FERRET Model (Final Option, FY16 Fare Change). Notes: In the scenarios in the figure, 7-Day LinkPass prices are changed proportionally with Monthly LinkPass prices. All other fares and prices are held constant at Baseline levels. Baseline Monthly LinkPass multiple is 37.6 (the leftmost scenario in the figure).

The same issue can be seen in the extreme if the price of a monthly or 30-day pass is increased while holding the price of a 7-day pass constant. Once a monthly pass costs more than 4 (or 4.2) 7-day passes, most monthly pass-holders should switch to purchasing 7-day passes; however, the FERRET model does not even include diversion factor assumptions for choices between 7-day and monthly passes, so there is no predicted change in 7-day LinkPass sales and ridership as Monthly LinkPass prices are increased (and, as seen in Figure 7-8, only a modest shift from Monthly LinkPass to pay-per-use). By contrast, the CTA choice and elasticity model outputs in Figure 7-9 show the expected shift in pass market shares, since customer-week demand is allocated to fare products based in part on relative weekly price levels. (As in the previous example, the CTA choice and elasticity model presented here also models switching from passes to pay-per-use, especially as the weekly cost of the 30-day and 7-day passes is equalized.)

Figure 7-9: CTA Choice and Elasticity Model Captures Fare Product Switching as 30-Day Pass Prices are Increased and 7-day Pass Prices are Held Constant

Notes: In the scenarios in the figure, 30-Day Pass prices are increased and all other prices are held constant, increasing the ratio between the 30-day and 7-day pass prices. The Baseline ratio in pass prices is about 3.6.

Comparison to Zureiqat (2008): Semi-Aggregate and Static versus Disaggregate and Dynamic

The choice and elasticity spreadsheet approach presented in this chapter could be considered a static and semi-aggregate approximation of the discrete-continuous econometric model presented in Zureiqat (2008) and described in Chapter 3. It is static in that responses to fare change scenarios are predicted for a single future time period, whereas Zureiqat’s model simulates customer choices at regular intervals over time. It is semi-aggregate because the spreadsheet calculations are performed on ridership figures that are aggregated, but they are aggregated separately for groups of transit accounts with different characteristics (chiefly different weekly ridership frequency); where Zureiqat simulates individual choices, the spreadsheet model works with groups of individuals or “representative individuals” (but not merely groups of trips).

The approach in Zureiqat (2008) has several advantages relative to the spreadsheet approach presented here.

• Zureiqat’s model is more rigorous; it was adapted from econometric models of simultaneous discrete and continuous choices, which were derived from microeconomic theory of demand. This ensures a level of consistency between treatment of the different choices in the model, which in reality are made simultaneously by transit customers. (A customer’s choice of fare product depends on expected ridership, but ridership also depends on fare product choice and the resulting marginal fare faced by the customer.) The spreadsheet approach here includes similar individual behaviors – fare product choice (logit formula), frequency of ridership (elasticities), and ridership adjustments that relate loosely to Zureiqat’s “selectivity bias” correction (induced ride factors); however, those behaviors are combined in a simple sequential fashion without any assurance of theoretical consistency.

• Zureiqat’s approach is an integrated method of both parameter estimation and scenario prediction, while the spreadsheet approach requires input parameter assumptions (or selection and use of separate parameter estimation methods). One challenge this presents for the choice and elasticity spreadsheet approach is the calibration of a synthetic baseline of fare product choices to observed baseline fare product market shares (before even looking at a fare change scenario). This difficulty arises because choice model parameters are estimated using different data (either individual-level AFC data or survey data) than the semi-aggregate data to which they are applied in the spreadsheet model. As in Multisystems, Inc. (2000) and Cambridge Systematics (2012), manual adjustments must be made to the logit parameters to better match reality. Zureiqat’s approach avoids this issue somewhat by using the same model structure and data for both parameter estimation and prediction; both the baseline and scenarios are then simulated for each individual starting from the end of the dataset going forward week by week (as suggested in Ben-Akiva and Lerman (1985), p. 147).

• Zureiqat’s model predicts not only the steady-state impacts of a fare policy change, but also how long it will take for those impacts to be realized. Zureiqat predicts individual-level fare product choices and ridership frequency on a weekly basis based in part on the decisions that had been made in the previous time period; the coefficients on these lagged choices determine how quickly the system transitions to a new steady state. In the Transport for London policy examples in Zureiqat (2008), it appears to take 3-5 months for the full effect of policy changes to be observed. The static spreadsheet approach in this chapter does not predict how long it will take to observe the full impact of a fare change. In fact, if a logit fare product choice model were estimated using lagged choices as an explanatory variable, the parameters would require some sort of conversion to steady-state before they could be used as input parameters in the spreadsheet approach.47

• While this chapter primarily focuses on prediction rather than parameter estimation, Zureiqat’s model has the advantage of providing a method for estimating behavioral parameters from fare change events that included fare product switching. As discussed in Chapter 6, the presence of fare product switching presents a significant challenge for estimating elasticities using other methods.

However, Zureiqat (2008) also has some disadvantages that make the spreadsheet approach in this chapter more practical for transit agency application:

• Zureiqat’s use of disaggregate panel data allows his model to realistically and consistently model simultaneous ridership and fare product choices for observed individuals, but it does not account for churn in accounts or cards (disappearance and appearance of users in AFC, or what Zureiqat (2008) describes on page 111 as “generation and suppression” of smartcards); Zureiqat’s model simulates future demand by assuming that no cards disappear or appear from the panel moving into the future. In reality, transit agencies like TfL, the MBTA, and the CTA have significant turnover of cards and accounts. As discussed in Chapter 6, this churn includes part of the effect of transit users switching to and from other modes. In the absence of changes in policy or service or competition, the churn in cards would net to zero and ignoring it would not have much consequence; however, fare changes could clearly contribute to mode switching via churn in cards. This effect is typically wrapped up in elasticity estimates that are applied to transit ridership. Zureiqat’s model is not able to capture that element of mode choice (people joining or leaving transit entirely), and the elasticities in his model are conditional on non-zero transit use (i.e., more or less frequent ridership by transit users). This suggests that his model may perform well at predicting fare product market shares, but might underestimate the impacts of fare changes on overall transit ridership levels. Of course, any model relying on AFC data faces limitations of “choice-based sampling” – AFC only includes data on individuals who chose to use transit (as opposed to or in addition to other modes). However, the semi-aggregate spreadsheet model presented in this chapter has an advantage here. Aggregating ridership by customer-week type nets out any “normal” churn in cards (in the absence of a policy change) and allows for application of elasticities that capture both net changes in frequency of use for transit users and net shifts in mode choice to or from transit (i.e., elasticities that are conditional on fare product choice but not conditional on mode choice).48

• Zureiqat’s model requires estimates of many non-standard parameters, including elasticities conditional on mode choice and fare product choice and selectivity bias adjustment factors. These parameters must be estimated as part of his integrated method of estimation and simulation-based prediction, which may be restrictive. The spreadsheet approach presented here allows for use of elasticities and induced ride factors from previous studies.

• The elasticities and selectivity bias adjustment factors in Zureiqat’s model have less intuitive interpretations than the parameters used in the choice and elasticity spreadsheet model. As explained above, Zureiqat’s elasticities are conditional on both mode choice and fare product choice, and selectivity bias adjustments do not have a straightforward interpretation like a multiplicative induced ride factor.

• Zureiqat’s model is relatively inflexible in representing complex fare structures or applying different elasticities to different market segments (such as by mode and time of day for each fare product), since an additional continuous model of ridership frequency would need to be estimated for each product-mode-time combination. The semi-aggregate approach in this chapter retains the flexibility of a traditional elasticity spreadsheet model, allowing for arbitrary disaggregation or segmentation of elasticity adjustments (by customer or trip attributes).

• As a result of its data requirements and integrated estimation procedure, Zureiqat’s model requires special software to estimate parameters and simulate fare change scenarios. The spreadsheet-based prediction procedure uses exogenous behavioral parameters, and it uses “representative individuals” to model fare product choices for groups of similar customers rather than requiring simulation of choices for all individual customers; as a result, it can be implemented in Excel or any familiar data processing software.

47 Page 88 of Zureiqat (2008) shows a related conversion for elasticities, but not for logit choice model parameters.

48 It might be possible to include an “unsampled choice” in the logit fare product choice formula of a spreadsheet model (a simplified implementation of Newman, Ferguson, and Garrow (2012)) to more explicitly describe choice about leaving the transit system entirely. This would make the spreadsheet model approach more like Zureiqat’s model because it would then require elasticities that are conditional on both mode choice and fare product choice.

Past CTA Models: Survey Data vs. AFC Data

Relative to the MBTA FERRET model and Zureiqat (2008), the prediction procedure presented in this chapter is very similar to the past CTA fare models described in Chapter 3. However, there are two important differences.

First, all of the previous CTA choice and elasticity models of the past two decades have relied heavily on customer surveys. Stated preference surveys over different fare product pricing scenarios were used to estimate a logit fare product choice model. The resulting logit formula was then applied to separate system-wide passenger survey responses, and the resulting fare product choices were weighted or scaled based on the sample rates of the customer survey; this allowed for the use of explanatory variables such as income and car ownership, which are not observed in AFC, but it also relied on expensive surveys and introduced potential error in scaling factors. This chapter instead uses logit model estimates developed from AFC data using very simple specifications (presented in Chapter 6), and the resulting logit model formula is applied directly to segments of AFC data (rather than to weighted customer surveys). While these differences do not significantly affect the mechanics of the prediction procedure, they demonstrate the potential to model incremental fare scenarios using revealed preferences over fare product options and simple models and without the cost of customer surveys.

Use of AFC data without customer surveys does have some disadvantages. The approach used here is limited to incremental fare change scenarios; scenarios including dramatic changes in fare structure or introduction of new fare products could still require customer surveys, since AFC data only reflects revealed preferences under the current fare structure and products. Additionally, as shown in Chapter 6, the AFC-only logit model explains much less total variation in fare product choices than previous logit models estimated on survey data, which introduced much more variation in fares (by design) and could use additional explanatory variables such as respondent demographics. However, the explanatory power (and, hopefully, prediction accuracy) of an AFC-only fare product choice model could be improved over the simple example in this thesis by exploring alternative specifications and introducing AFC-based segmentation (e.g. cluster-based segmentation distinguishing “commuters” from “visitors” using only AFC travel frequency and regularity); the same AFC-based segments could then be distinguished as different customer-week types in the spreadsheet model. See the Appendix for an example of such segmentation.

Second, Cambridge Systematics (2012) departed from the methodology of its Multisystems predecessors in the way that it applied demand response adjustments after predicting fare product market shares. The Cambridge Systematics model calculated one system-wide average fare for the baseline and one system-wide average fare for the scenario based on product-level prices; the percent change between these average fare levels was applied to elasticities for four broad segments of CTA ridership (combinations of bus/rail and weekday/weekend), and the resulting total system-wide elasticity impact was re-allocated to fare products based on predicted scenario market shares. This use of highly aggregated average fare levels for elasticity adjustments seems to have little connection to product-level changes and marginal decision-making of individual customers. Elasticities surely vary across fare products, and continuing users of products that did not change in price should not experience an elasticity adjustment just because system-wide average fares changed; elasticities should ideally be applied to individual- or segment-level changes in costs given specific fare product choices and product prices. Cambridge Systematics also did not apply any induced ridership adjustment for customers who switched between zero-marginal-cost pass products and pay-per-use.
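The difference between the two ways of applying elasticities can be seen in a stylized sketch. This is not a reproduction of either consultant's model: it uses a single segment, a simple proportional elasticity form, and hypothetical rides, fares, and elasticities, and it ignores fare product switching and induced ride factors entirely.

```python
def pct_change(old, new):
    """Proportional change from old to new."""
    return (new - old) / old

def average_fare_adjustment(rides, old_fares, new_fares, elasticity):
    """Apply one ridership-weighted average fare change to every product."""
    total = sum(rides.values())
    old_avg = sum(rides[p] * old_fares[p] for p in rides) / total
    new_avg = sum(rides[p] * new_fares[p] for p in rides) / total
    factor = 1 + elasticity * pct_change(old_avg, new_avg)
    return {p: rides[p] * factor for p in rides}

def product_level_adjustment(rides, old_fares, new_fares, elasticities):
    """Apply each product's own elasticity to its own fare change."""
    return {
        p: rides[p] * (1 + elasticities[p] * pct_change(old_fares[p], new_fares[p]))
        for p in rides
    }

# Hypothetical inputs: only the pay-per-use fare changes.
rides = {"ppu": 100.0, "pass": 100.0}
old_fares = {"ppu": 2.00, "pass": 1.00}   # average fare per ride
new_fares = {"ppu": 2.50, "pass": 1.00}

aggregate = average_fare_adjustment(rides, old_fares, new_fares, elasticity=-0.3)
by_product = product_level_adjustment(
    rides, old_fares, new_fares, elasticities={"ppu": -0.3, "pass": -0.2}
)
print("average-fare approach:", aggregate)
print("product-level approach:", by_product)
```

In the average-fare version, pass ridership falls even though the pass price did not change; in the product-level version, continuing pass users are unaffected, which matches the behavior argued for in the paragraph above.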

The application of average fares to aggregated ridership in the 2012 fare modeling work (without considering either variation in product elasticities or the implications of product marginal costs) was an unfortunate modification from earlier modeling methodologies. While the effect of these choices on total fare change scenario impacts is unclear, it certainly reduces credibility of product-specific predictions. The spreadsheet model presented in this chapter returns to the mechanics of earlier CTA models: allowing for variation in elasticity across products (and arbitrary customer and trip type segments), applying demand response adjustments based on segment-level changes in fare product choices and prices, and including induced ride factors to adjust for the ridership impact of zero-marginal-cost passes.

7.3 Model Demonstration

This section presents example applications of the choice and elasticity spreadsheet model to fare change scenarios at the CTA. First, sensitivity analysis is used to describe the implications of model assumptions. Then, model results for several plausible fare change scenarios are presented and discussed.

7.3.1 Sensitivity to Exogenous Parameters

The sensitivity of the model results to the behavioral parameter assumptions described earlier can be seen by looking at a single fare change scenario while varying those assumptions. The figures below show predicted impacts of a scenario that raises bus pay-per-use fares by 25 cents and rail pay-per-use fares by 50 cents while leaving pass prices constant (all relative to pre-2018 Baseline prices). Each graph shows changes in ridership or revenue from the Baseline to the Scenario, divided by fare product.

First, Figure 7-10 shows the important impact of elasticity assumptions. Elasticities with higher magnitudes result in reduced ridership and revenue. Recall that this scenario did not change pass prices, so pass ridership and revenue are largely unaffected as elasticities are increased; however, they do decline slightly since elasticities are still applied to customers who switch from pay-per-use to a pass (based on the change in their weekly cost of ridership net of fare product switching).

Figure 7-10: Sensitivity of Results to Elasticities (Scenario: PPU Bus Fare +$0.25, PPU Rail Fare +$0.50, Pass Prices Unchanged)
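The direction of this sensitivity can be reproduced with a minimal sketch for a single fare segment. It assumes a simple proportional elasticity form (percent change in rides equals elasticity times percent change in fare), which may differ from the functional form used in the spreadsheet model, and an illustrative $2.25 base rail fare. Fare product switching is ignored.

```python
def scenario_change(base_fare, fare_increase, elasticity):
    """Return (ridership change, revenue change) as proportions of baseline."""
    fare_change = fare_increase / base_fare
    ridership_change = elasticity * fare_change
    revenue_change = (1 + ridership_change) * (1 + fare_change) - 1
    return ridership_change, revenue_change

BASE_RAIL_FARE = 2.25  # illustrative assumption
for elasticity in (-0.2, -0.5, -0.8):
    rides, revenue = scenario_change(BASE_RAIL_FARE, 0.50, elasticity)
    print(f"elasticity {elasticity:+.1f}: ridership {rides:+.1%}, revenue {revenue:+.1%}")
```

For the same fare increase, a higher-magnitude elasticity produces a larger ridership loss and a smaller revenue gain.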

Second, Figure 7-11 shows the impact of induced ride factors. In this scenario, there is net switching from pay-per-use to passes, so these parameters merely scale ridership up for the additional pass-holders. Higher induced ride factor assumptions result in higher ridership (or lower net ridership losses), but they do not affect revenue (since pass revenue does not vary with individual ridership level). Note that in a scenario with net switching from passes to pay-per-use, a higher induced ride factor would reduce revenue by scaling down ridership of customers who switch to pay-per-use.

Figure 7-11: Sensitivity of Results to Induced Ride Factors (Scenario: PPU Bus Fare +$0.25, PPU Rail Fare +$0.50, Pass Prices Unchanged)
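A minimal sketch of the induced ride factor for customers who switch from pay-per-use to a pass is shown below. It treats the factor as a multiplicative scale-up of the switchers' former pay-per-use rides, consistent with the "Proportional Change, PPU → Pass" label in Table 7-5; the ride count, number of switchers, and pass price are hypothetical.

```python
def apply_switch_to_pass(ppu_rides, n_switchers, pass_price, induced_ride_factor):
    """Rides and pass revenue for customers who switch from pay-per-use to a pass."""
    pass_rides = ppu_rides * (1 + induced_ride_factor)
    pass_revenue = n_switchers * pass_price  # does not depend on rides taken
    return pass_rides, pass_revenue

for factor in (0.0, 0.05, 0.2):
    rides, revenue = apply_switch_to_pass(
        ppu_rides=1000.0, n_switchers=80, pass_price=28.0, induced_ride_factor=factor
    )
    print(f"factor {factor:.2f}: rides {rides:.0f}, pass revenue ${revenue:,.2f}")
```

Ridership rises with the factor while pass revenue stays fixed, which is why the factor affects only the ridership results in a scenario with net switching toward passes.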

Finally, the fare product choice model utility parameters have important impacts on ridership and revenue. However, unlike elasticities and induced ride factors, the choice model parameters are used to construct not only the Scenario predictions but also the Synthetic Baseline. As discussed in Chapters 3 and 4, the impact of the choice model parameters is also highly dependent on the prices of fare products relative to each other and to the distribution of use value in the customer population. As a result, sensitivity analysis of changes in fares with respect to choice model parameters is difficult to interpret; the effect of the choice model is best seen across a range of fare change scenarios, as presented in the next section.

7.3.2 Example Applications

The purpose of the choice and elasticity spreadsheet model is to inform the actual design and selection of fare changes. To demonstrate, this section compares results for several fare change scenarios that were plausible options for fiscal year 2018 at the CTA. The primary impetus for a fare change in 2018 was closing a projected operating budget deficit. Following the major reduction in pass market share after the last fare change in 2013, CTA staff were also interested in focusing fare increases on pay-per-use fares to make pass products relatively more attractive. Since fare product switching was one potential goal of a fare change, there was interest in using a prediction procedure that captured fare product choice.

Figure 7-12 shows predicted changes in ridership and revenue by fare product from the Baseline (in the center of each graph) under different pay-per-use fare levels, leaving pass prices unchanged. The range of pay-per-use fare levels is extreme in order to illustrate the functionality of the model. (All scenarios use the specific parameter values described earlier in Section 7.2.3.) The black lines show that, as expected, increasing pay-per-use fares is predicted to decrease total ridership and increase total revenue. However, even modest net changes reflect much larger predicted movements of customers between fare products; if pay-per-use fares are increased, pay-per-use ridership is predicted to decrease much more than the net change in ridership, but most of that ridership “loss” is actually switching over to passes. The right graph of changes in revenue shows the counterintuitive prediction that pay-per-use revenue could actually decrease if pay-per-use fares are increased, even as net revenue increases; this is due to customers switching from pay-per-use to passes, which is a function of the relative prices of passes and pay-per-use in the Baseline.

Figure 7-12: Predicted Change in Ridership and Revenue Under Alternative Pay-Per-Use Fares

With these dynamics in mind, several scenarios were explored to find options that met the CTA’s revenue needs while also improving the relative attractiveness of passes. Top-line model predictions for three of those scenarios are shown in Table 7-7.

Table 7-7: Predicted Ridership and Revenue Impacts for Selected Fare Change Scenarios

                                         Change in Trips     Change in Revenue    Change in Pass Share
Scenario                                 millions    %       $millions    %       %
Step-up Transfer, Peak Rail +25¢         +0.4        +0.2%   +$1.8        +0.5%   +0%
Step-up Transfer, Rail +50¢              -3.1        -1.9%   +$12.0       +3.2%   +7%
Rail and Bus +25¢, 30-day Pass +$5       -4.4        -2.6%   +$15.5       +4.1%   +5%

The first scenario combines a pay-per-use fare increase for peak-period rail trips only with a new “step-up” transfer pricing policy. Rail trips are believed to be less price-sensitive than bus trips, and peak trips are believed to be less price-sensitive than off-peak trips; concentrating fare increases on peak rail trips would take advantage of relatively low elasticities for those trips, mitigating ridership losses and improving revenue yield. The CTA’s current transfer policy charges $0.25 for each transfer, regardless of mode. A step-up policy would only charge a fee for customers transferring from bus to rail, and the fee would be the difference between the bus fare and the rail fare (so that customers would pay the single-ride fare for the most expensive mode that they used). Switching to a step-up transfer policy would reduce the total fare for bus-to-bus and rail-to-bus transfer trips, which are expected to have higher elasticities than one-seat rail trips. This scenario is not predicted to generate much revenue, but it does increase both revenue and ridership; this is a good reminder that combining fare increases on relatively inelastic trips (like peak rail) and effective fare reductions on relatively elastic trips like bus transfer trips can actually improve ridership while generating revenue.
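The two transfer policies can be compared with a short sketch. The $0.25 flat transfer fee comes from the text; the $2.00 bus fare and $2.25 rail fare are illustrative assumptions, and the sketch ignores transfer time windows and any limit on the number of transfers.

```python
FARES = {"bus": 2.00, "rail": 2.25}  # illustrative single-ride fares (assumed)
FLAT_TRANSFER_FEE = 0.25             # current policy described in the text

def flat_transfer_total(legs):
    """Current policy: first-leg fare plus a flat fee for each transfer."""
    return FARES[legs[0]] + FLAT_TRANSFER_FEE * (len(legs) - 1)

def step_up_total(legs):
    """Step-up policy: pay extra only when transferring to a pricier mode."""
    total = FARES[legs[0]]
    highest = total
    for mode in legs[1:]:
        if FARES[mode] > highest:
            total += FARES[mode] - highest
            highest = FARES[mode]
    return total

for trip in (["bus", "bus"], ["rail", "bus"], ["bus", "rail"], ["rail"]):
    print(f"{' -> '.join(trip):12s} flat ${flat_transfer_total(trip):.2f}  "
          f"step-up ${step_up_total(trip):.2f}")
```

With these assumed fares, bus-to-bus and rail-to-bus trips each cost $0.25 less under the step-up policy, while bus-to-rail and one-seat trips are unchanged.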

The second scenario combines the change in transfer policy with a larger and broader increase in pay-per-use rail fare (+$0.50). This has many of the advantages of the first scenario – concentrating fare increases on inelastic trips and providing a “give-back” (effective fare reduction) on elastic trips – while also generating significant fare revenue. The significant increase in pay-per-use rail fare is predicted to increase pass market share by 7%, bringing passes and pay-per-use close to parity and partially reversing the reduction in pass sales following the 2013 pass price increases.

Finally, the third row in Table 7-7 is the fare change scenario that was ultimately selected and implemented in January 2018 – increasing pay-per-use fares by $0.25 and increasing the 30-day pass price by $5. This was predicted to generate about 30% more revenue than the second scenario, but with relatively greater ridership losses and less of an increase in pass market share. The next section compares the model predictions for this scenario to actual year-over-year changes in the first few months following the fare change.

7.4 Model Evaluation (January 2018 Fare Change)

Validation of fare policy prediction models at any particular transit agency can be challenging, since fare changes are typically infrequent and it can take months for their full effect to be observed. The CTA changed fare levels on January 7, 2018, raising pay-per-use rail and bus fares by $0.25 and 30-day pass prices by $5. While it is too early to draw any firm conclusions about the impacts of this change, the first three months of 2018 provide an opportunity for preliminary evaluation of the prediction model presented in this chapter.

7.4.1 Evaluation Data

In this section, the predicted percentage impacts of the fare change (before it took place) are compared to actual year-over-year percentage changes in ridership and revenue for early 2018 (relative to early 2017). The ridership and sales data for these comparisons was pre-processed in the same way as the Observed Baseline data in the choice and elasticity spreadsheet model, to improve comparability. (As a reminder, then, this evaluation focuses only on four major full-fare, account-based fare products and does not represent system-wide totals; it is best to focus on percent changes rather than levels of ridership or revenue.)

While ridership and pay-per-use revenue can be tabulated easily for any specified time period, sales and revenue for CTA 30-day passes are lumpy. There is a one- or two-day spike around the 26th of each month as pre-paid benefits sales are processed, and other 30-day pass sales increase around the first and then decline throughout the month. To capture similar lumpiness in pass sales in both 2017 and 2018, the time period selected for validation is eight weeks long, from Monday 1/16/17 through Sunday 3/12/17 (in 2017) and from Monday 1/15/18 through Sunday 3/11/18 (in 2018).

7.4.2 Evaluation Results

Table 7-8 and Figure 7-13 summarize the overall changes in ridership and revenue by fare product predicted by the choice and elasticity spreadsheet model and actually observed at the beginning of 2018. The spreadsheet model predicted a somewhat larger decline in ridership (-2.6%) and somewhat smaller increase in revenue (+4.1%) than were actually realized at the beginning of 2018 (-1.9% ridership and +6.3% revenue). Looking at the breakdown of these changes by fare product, prediction errors were quite large. The spreadsheet model predicted substantial switching from PPU to 7- and 30-day passes, resulting in flat PPU revenue but an increase in pass sales – especially on 7-day passes, which did not change in price and thus became more attractive relative to both 30-day passes and pay-per-use. In early 2018, however, it appears there was much less switching from pay-per-use to pass products, and particularly to 7-day passes. Thirty-day pass sales and ridership were almost constant in spite of a 5% increase in price, which was not far from model predictions and could be the net result of an elasticity effect and some customers switching from PPU to 30-day passes; however, there were much smaller declines in PPU ridership than predicted, and there was actually a small loss in 7-day pass ridership and revenue.


Table 7-8: Comparison of Predicted and Actual Early 2018 YOY Ridership and Revenue Impacts of the January 2018 Fare Change

                            PPU      3-Day Pass   7-Day Pass   30-Day Pass   Total
Change in Ridership (% of Baseline Total, Across All Fare Products)
  Model Prediction          -6.2%    +0.1%        +2.7%        +0.7%         -2.6%
  Actual (Early 2018 YOY)   -1.8%    +0.0%        -0.2%        +0.1%         -1.9%
  Actual - Predicted        +4.4%    -0.1%        -2.9%        -0.7%         +0.7%
Change in Revenue (% of Baseline Total, Across All Fare Products)
  Model Prediction          -0.1%    +0.1%        +2.3%        +1.8%         +4.1%
  Actual (Early 2018 YOY)   +4.5%    +0.0%        -0.0%        +1.7%         +6.3%
  Actual - Predicted        +4.7%    -0.1%        -2.3%        -0.1%         +2.2%

Sources: CTA Ventra
Notes: “Early 2018 YOY” is a comparison of Jan 15 – Mar 11, 2018 with Jan 16 – Mar 12, 2017.
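The "Actual - Predicted" rows of Table 7-8 can be recomputed from the two rows above them. The following is a minimal sketch that transcribes the published (rounded) values; the function and variable names are illustrative and are not part of the spreadsheet model. Because the inputs are already rounded to one decimal place, recomputed differences can differ from the reported ones by up to 0.1 percentage points.

```python
# Values transcribed from Table 7-8 (percent of baseline total, all fare products).
PRODUCTS = ["PPU", "3-Day Pass", "7-Day Pass", "30-Day Pass", "Total"]

TABLE_7_8 = {
    "ridership": {
        "predicted": [-6.2, 0.1, 2.7, 0.7, -2.6],
        "actual": [-1.8, 0.0, -0.2, 0.1, -1.9],
        "reported_error": [4.4, -0.1, -2.9, -0.7, 0.7],
    },
    "revenue": {
        "predicted": [-0.1, 0.1, 2.3, 1.8, 4.1],
        "actual": [4.5, 0.0, -0.0, 1.7, 6.3],
        "reported_error": [4.7, -0.1, -2.3, -0.1, 2.2],
    },
}


def prediction_errors(predicted, actual):
    """Return actual minus predicted, in percentage points, rounded to 0.1."""
    return [round(a - p, 1) for p, a in zip(predicted, actual)]


def max_rounding_gap(measure):
    """Largest gap between recomputed and reported errors for one measure."""
    rows = TABLE_7_8[measure]
    recomputed = prediction_errors(rows["predicted"], rows["actual"])
    return max(abs(r - e) for r, e in zip(recomputed, rows["reported_error"]))


for measure, rows in TABLE_7_8.items():
    errors = prediction_errors(rows["predicted"], rows["actual"])
    for product, err in zip(PRODUCTS, errors):
        print(f"{measure:9s} {product:12s} {err:+.1f} pts")
```

The largest product-level errors are for PPU and 7-day passes, with opposite signs, which is the pattern discussed in the text: less switching from pay-per-use into 7-day passes than the model predicted.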

Figure 7-13: Comparison of Predicted and Actual Early 2018 YOY Ridership and Revenue Impacts of the January 2018 Fare Change

Sources: CTA Ventra Notes: “Early 2018 YOY” is a comparison of Jan 15 – Mar 11, 2018 with Jan 16 – Mar 12, 2017.

In order to interpret these results, it is important to consider potential sources of the errors or discrepancies observed in Figure 7-13:

1. "External factors" (such as winter weather) or preexisting trends,
2. a lag in the effects of the fare change,
3. error in the model's behavioral parameters, and
4. deficiencies in the model structure, including failure to distinguish important customer segments.

There are not any known “external factors” that would have a significant impact on fare product choice between early 2017 and early 2018. Chicago had a mild winter in 2017, with only 8.4 total inches of snow in January through March (versus 27.5 inches in 2018);49 if anything, this should have depressed 2018 ridership and revenue relative to model predictions, and the impact on fare product choice would be unclear. Other potential factors like gas prices, the national economy, and population and demographic shifts are not explored in this thesis; given the relatively short time frames of analysis, effects of long-term and macroeconomic factors seem likely to be small, and their impact on fare product choice would again be ambiguous.

Another way to check for confounding impacts of factors that are not related to fares is to look at pre-existing trends. The evaluation results in Figure 7-13 assume that ridership and revenue for each fare product would have been flat but for the fare change; if they were actually trending up or down, then the impact of the fare change would need to be separated from the continuation of those trends. (This is the logic that was applied in Chapter 6 to estimate pass elasticities at the MBTA following the July 2016 fare change.) Figure 7-14 and Figure 7-15 show year-over-year percent change in pass sales and ridership by fare product in the year before the fare change and through March 2018. Thirty-day pass sales were relatively flat in the year before the fare change, with a slight year-over-year decline in sales, so pre-existing trends should not have affected results for 30-day passes. Seven-day pass sales, however, were down by about 8% on average in 2017 relative to 2016 but appeared to stabilize in early 2018. If the decline in 7-day pass sales would have continued in the absence of the fare change, then the flat year-over-year 7-day pass sales observed in Figure 7-13 actually represent an increase relative to this trend; this suggests that the fare change may have induced more switching from pay-per-use to 7-day passes than is apparent from the evaluation results above (which show little change in 7-day pass sales, suggesting little switching from pay-per-use). Ridership is more variable than sales, and there is not a consistent trend in pay-per-use ridership before the fare change. An individual-level panel analysis of fare product switching could shed more light on these pre-existing trends and the impact of the fare change on fare product switching.
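The trend adjustment described in this paragraph can be illustrated with a short calculation. This is a minimal sketch, not a procedure used in the thesis: it assumes that the 2017 trend in 7-day pass sales (about -8% year over year) would have continued unchanged without the fare change, it treats the observed early 2018 change as exactly flat, and the function name is hypothetical.

```python
def trend_adjusted_change(observed_yoy, trend_yoy):
    """Change relative to a counterfactual in which the prior trend continued.

    Both arguments are year-over-year changes expressed as fractions
    (for example, -0.08 for an 8% decline).
    """
    if trend_yoy <= -1.0:
        raise ValueError("trend_yoy must be greater than -100%")
    return (1.0 + observed_yoy) / (1.0 + trend_yoy) - 1.0


# Illustrative inputs taken from the text: roughly flat observed sales,
# against an assumed continuation of the -8% trend seen during 2017.
seven_day = trend_adjusted_change(observed_yoy=0.0, trend_yoy=-0.08)
print(f"7-day pass sales relative to trend: {seven_day:+.1%}")
```

Under these assumptions, flat sales correspond to an increase of a little under 9% relative to the continued trend. The result is only as reliable as the assumption that the prior decline would have persisted.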

49 National Weather Service, WFO Monthly/Daily Climate Data, Chicago-O’Hare (https://forecast.weather.gov/product.php?site=LOT&issuedby=ORD&product=CF6&format=TXT&version=1&glossary=0)


Figure 7-14: Year-Over-Year Changes in 7-day and 30-day Pass Sales, Before and After Jan 2018 Fare Change

Source: CTA Ventra Notes: Includes only full-fare, account-based passes.

Figure 7-15: Year-Over-Year Changes in Ridership by Fare Product, Before and After Jan 2018 Fare Change

Source: CTA Ventra Notes: Includes only full-fare, account-based fare products.

The fare change could also take more time to fully impact customer behaviors. Chapter 2 showed that the shift away from passes following the January 2013 fare change was spread over an entire year, and it seems possible that even more time would be required for customers to shift toward passes; a pass purchase is a larger decision than loading or spending small increments, and it may take some time for customers to estimate their expected spending under the higher pay-per-use fares (and slightly higher 30-day pass price).50 The best way to address this is to re-evaluate the model predictions after more time has passed. As a preliminary check, however, year-over-year comparisons for early 2018 can be split into two four-week comparisons:

1. 1/15/18 – 2/11/18 compared to 1/16/17 – 2/12/17
2. 2/12/18 – 3/11/18 compared to 2/13/17 – 3/12/17
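The alignment of these comparison windows can be checked mechanically. The sketch below uses only the dates stated in the text; the helper name and dictionary labels are illustrative. It confirms that each window runs Monday through Sunday, that the full windows span eight weeks and the sub-periods four weeks, and that the 2018 window starts exactly 52 weeks after the 2017 window.

```python
from datetime import date


def is_full_weeks(start, end, weeks):
    """True if [start, end] runs Monday through Sunday and covers `weeks` weeks."""
    inclusive_days = (end - start).days + 1
    return start.weekday() == 0 and end.weekday() == 6 and inclusive_days == 7 * weeks


# Evaluation windows as given in the text: (start, end, length in weeks).
WINDOWS = {
    "2017 full": (date(2017, 1, 16), date(2017, 3, 12), 8),
    "2018 full": (date(2018, 1, 15), date(2018, 3, 11), 8),
    "2017 first half": (date(2017, 1, 16), date(2017, 2, 12), 4),
    "2018 first half": (date(2018, 1, 15), date(2018, 2, 11), 4),
    "2017 second half": (date(2017, 2, 13), date(2017, 3, 12), 4),
    "2018 second half": (date(2018, 2, 12), date(2018, 3, 11), 4),
}

for name, (start, end, weeks) in WINDOWS.items():
    print(f"{name:17s} aligned: {is_full_weeks(start, end, weeks)}")

# Offset between the 2017 and 2018 windows, in days (52 weeks = 364 days).
offset_days = (WINDOWS["2018 full"][0] - WINDOWS["2017 full"][0]).days
print("offset in days:", offset_days)
```

Matching on day of week rather than calendar date keeps the same number of weekdays and weekends in each comparison, and both full windows include the pre-paid benefits sales spike around the 26th of the month.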

Figure 7-16 shows results for these two periods separately. Actual 7-day and 30-day pass sales and revenue increased from the first period to the second period, bringing them closer to model predictions; a continuation of this trend would suggest that the model was correct in predicting switching into pass products. However, actual pay-per-use ridership and sales also improved from the first period to the second period, taking them farther from model predictions. Increases in both pass sales and pay-per-use sales from the first period to the second likely result from direct demand response (the “elasticity effect”), external factors, or pre-existing trends (not fare product choice); as mentioned above, it would be informative to explore individual-level switching between fare products to better understand this pattern.

Figure 7-16: Comparison of Predicted and Actual Impacts for Two Periods in Early 2018

Sources: CTA Ventra

50 Advertisements that explicitly describe the improvement in the pass multiples (perhaps personalized with recent pay-per-use spending on individuals’ accounts) might accelerate and amplify the shift toward passes.

Finally, the discrepancies between product-level model predictions and actual year-over-year sales and ridership changes could be a result of errors in model parameters or model structure. The pattern of prediction errors suggests that fare product choice utility parameters could be incorrect, predicting more switching between fare products than actually occurred. To further examine this possibility, model predictions and YOY actuals could be broken down and compared by the customer-week use value ranges in the model ($0-4, $5-9, …, $35-39, $40+). This would be informative because baseline weekly use value (the weekly cost of using pay-per-use) is a primary driver of fare product switching in the model. An obstacle to this comparison is that the distribution of use value shifted when pay-per-use fares changed – for example, a customer-week with use value of $28.00 in 2017 would have a use value of $31.11 in 2018 for the same trips. To control for this change, use value for actual fare transactions in 2018 would need to be calculated under 2017 fare rules (to group similar customer-weeks in data for both early 2017 and early 2018). This is left to future work.
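The re-pricing step described above might look like the following. This is a minimal sketch under stated assumptions, not the thesis's procedure: the fare levels are assumed values chosen to be consistent with the $0.25 increase and the $28.00 to $31.11 example (a ratio of 2.50/2.25), transfers and reduced fares are ignored, the use value ranges are assumed to be based on whole dollars, and all names and the example customer-week are hypothetical.

```python
# Assumed full-fare pay-per-use prices in cents. These values are not given in
# this section; they are chosen to match the $0.25 increase described in the text.
FARES_2017 = {"rail": 225, "bus": 200}
FARES_2018 = {"rail": 250, "bus": 225}


def use_value_cents(trips, fares):
    """Pay-per-use cost of one customer-week (ignores transfers and reduced fares)."""
    return sum(fares[mode] * count for mode, count in trips.items())


def use_value_bin(cents):
    """Assign a weekly use value to the ranges $0-4, $5-9, ..., $35-39, $40+."""
    dollars = cents // 100  # assumption: ranges are based on whole dollars
    if dollars >= 40:
        return "$40+"
    low = 5 * (dollars // 5)
    return f"${low}-{low + 4}"


# Hypothetical customer-week observed in early 2018.
trips = {"rail": 12, "bus": 1}

as_charged = use_value_cents(trips, FARES_2018)  # under 2018 fares
repriced = use_value_cents(trips, FARES_2017)    # same trips under 2017 fare rules

print(use_value_bin(as_charged), use_value_bin(repriced))
```

In this example the same trips fall in a higher use value range when priced at 2018 fares, so grouping 2018 customer-weeks by their 2017-rule use value is what makes the two years comparable.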

Regarding model structure, one possible source of prediction error could be failure to differentiate important market segments in the model. Recall that fare product choices are also affected by market segment, and the segmentation used was a simple grouping of customer-weeks by the percent of ridership on rail. Based on the choice model estimates presented in Chapter 6, one of the main implications of this segmentation was that customer-weeks with a high share of rail use were more likely to prefer 30-day passes to pay-per-use and were less likely to switch to a 7-day pass. It is possible that differing preferences between pay-per-use and passes or between 7- and 30-day passes could be captured better using a different segmentation or a different choice model specification (as discussed in Chapter 6). One alternative segmentation would be to use customers’ most frequent or most recent sale channel; analyses in Chapter 5 illustrate that the mix of sale channels varies across pass type, and transit travel frequency, time, and mode all vary across sale channels within pass type. Testing of alternative segmentations is also left to future work.

7.5 Conclusions and Extensions

This chapter presented a ridership and revenue prediction procedure at the CTA that captures both fare product choice and demand response adjustments (elasticities and induced ride factors) in a simple spreadsheet model. This prediction procedure makes several methodological and implementation improvements relative to other modeling approaches:

• Relative to traditional elasticity spreadsheet models such as the MBTA FERRET model, it more realistically captures customer switching between fare products when relative product prices change.
• Relative to the individual-level AFC panel data model in Zureiqat (2008), it is less data-intensive, better able to predict system-wide ridership and revenue (net of churn or attrition in transit cards and accounts), and more flexible to use standard parameters and to represent complex fare structures.
• Relative to prior CTA fare models using fare product choice formulas based on stated preference customer surveys (which can have sampling and response bias issues), it uses only revealed preference Ventra ridership data for baseline ridership and to estimate a logit fare product choice formula.

Example scenario evaluations show the flexibility of the model and highlight the potential of different pricing strategies – concentrating fare increases on products and trip types that are less price-sensitive, reducing fares on those that are more price-sensitive, and altering pass multiples to shift ridership into pass products (which can stimulate ridership and mitigate future losses). One particular opportunity that stands out is introducing free bus-to-bus and rail-to-bus transfers as a targeted means of stimulating ridership (or at least mitigating losses) on price-sensitive bus and transfer trips.

A very preliminary evaluation of the model using the first two months after the January 2018 fare change suggests that the model predicted more switching between fare products than has actually occurred; however, some of the prediction error is likely explained by a prior downward trend in 7-day pass sales, and it is likely too early to observe the full impacts on fare product choice. There are several potential next steps for evaluation and improvement of the model:

• Evaluate model predictions for the January 2018 fare change after more time has elapsed.
• Improve the logit specification used to estimate the fare product choice formula in the model.
• Improve the transit customer segmentation used in both the fare product choice model and the ridership and revenue prediction model.
• Examine and address the disconnect between customer-weeks in the spreadsheet model and customer-days in the fare product choice model. This disconnect creates several challenges – the model’s Synthetic Baseline required manual adjustment to closely match the Observed Baseline for each customer-week type, and weekly revenue for rolling-period passes required calibration to actual revenue because sales can cross customer-week boundaries. One possible solution is to use an incremental logit formulation that avoids the need for a Synthetic Baseline.
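The incremental logit formulation mentioned in the last item pivots directly off observed product shares, so no synthetic baseline is needed. In its standard form, the new share of product i is P'_i = P_i exp(ΔV_i) / Σ_j P_j exp(ΔV_j), where P_i is the observed share and ΔV_i is the change in utility. The sketch below illustrates this form only; the shares, cost coefficient, and cost changes are hypothetical and are not values estimated in this thesis.

```python
import math


def incremental_logit(base_shares, delta_utility):
    """Pivot observed shares by utility changes: P'_i is proportional to P_i * exp(dV_i)."""
    weights = {
        product: share * math.exp(delta_utility.get(product, 0.0))
        for product, share in base_shares.items()
    }
    total = sum(weights.values())
    return {product: weight / total for product, weight in weights.items()}


# Hypothetical observed shares for one customer-week type (not thesis values).
base_shares = {"PPU": 0.50, "7-day": 0.20, "30-day": 0.30}

# Hypothetical cost coefficient (utility per dollar of weekly cost) and
# hypothetical changes in weekly cost under a fare change scenario.
BETA_COST = -0.10
delta_cost = {"PPU": 3.00, "7-day": 0.00, "30-day": 1.25}
delta_utility = {p: BETA_COST * c for p, c in delta_cost.items()}

new_shares = incremental_logit(base_shares, delta_utility)
for product, share in new_shares.items():
    print(f"{product:7s} {base_shares[product]:.3f} -> {share:.3f}")
```

With no utility changes, the function returns the observed shares exactly, which is the property that would remove the need to calibrate a synthetic baseline against the observed one.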

This thesis and the model presented in this chapter are focused on incremental fare changes. However, elements of the methodology and some of the resulting insights could be applied to more dramatic changes in fare policy. One example is fare integration. Just as free transfers between CTA rail and bus are a good opportunity to shift fares away from price-sensitive bus and transfer trips and increase ridership (or mitigate ridership losses), fare integration with Metra provides a similar opportunity to grow ridership by reducing the financial cost of transfer trips that are already unpleasant or inconvenient in other (non-financial) ways. Another related example is bundling CTA passes with other services, such as Metra, Divvy bike share, or ride hailing. The model in this chapter showed that decreasing pass multiples by increasing pay-per-use fares has the potential to shift ridership toward passes; however, using fare increases also comes with a ridership cost. Adding services or benefits to existing CTA passes would make them more attractive in absolute terms, potentially drawing additional customers from pay-per-use and other modes without reducing pay-per-use ridership in the process.

The model in this chapter does not explicitly represent any drivers of transit demand other than fares; it effectively estimates the impacts of potential fare changes assuming that everything else remains constant. This is typically a reasonable way to compare and evaluate potential fare changes, since it focuses on changes attributable to fare policy. However, it may perform poorly at predicting future revenue and ridership if other factors are changing at the same time as fares. If these factors are unrelated to fares (such as population shifts), model results could simply be scaled up or down based on the assumed or modeled impact of those factors. But in other cases, changes in unobserved factors may directly affect fare-related behaviors, making the model parameters and model results incorrect.

One such factor in recent years is the growth of ride-hailing. When ride-hailing services reach new customers or the price of ride-hailing is reduced, ride-hailing becomes more attractive relative to transit (or more competitive with transit). This should result in a higher transit fare elasticity; however, elasticities in the model are exogenous and fixed, so they would not adjust for this change and may underestimate ridership losses resulting from fare increases. There are three potential options to better capture the impact of ride-hailing in a fare change scenario model. The first is to differentiate additional customer segments and trip types that may be particularly likely to switch from transit to ride-hailing (ideally based on data about ride-hailing use for different transit customer segments or in geographies served by transit), and to assume higher elasticities for these customers and trips; however, in order to capture changes in sensitivity to transit fares, these elasticities would need to be regularly estimated or modified. A second option is to estimate cross-elasticities of transit demand with respect to ride-hailing prices, penetration, and/or service quality; these cross-elasticities could be applied in the model based on known or expected changes in ride-hailing. Estimation of cross-elasticities would require data on historical ride-hailing pricing, service quality (such as wait times), and trip volumes in relevant service areas. The third option is to directly model mode choices instead of relying on elasticities, explicitly representing ride-hailing as an alternative to transit. Estimation of a mode choice model would require individual-level stated or revealed preference data on trip information and mode choices. All of these options require additional data on ride-hailing.
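The second option, applying a cross-elasticity alongside the fare elasticity, can be sketched as follows. This is an illustration under stated assumptions rather than a description of the spreadsheet model: it assumes a constant-elasticity functional form, and the elasticity values, price changes, and function name are all hypothetical.

```python
def adjusted_ridership(base_rides, fare_ratio, own_elasticity,
                       ridehail_price_ratio=1.0, cross_elasticity=0.0):
    """Constant-elasticity adjustment for a fare change and a ride-hailing price change.

    fare_ratio and ridehail_price_ratio are new price divided by old price.
    """
    if fare_ratio <= 0 or ridehail_price_ratio <= 0:
        raise ValueError("price ratios must be positive")
    return (base_rides
            * fare_ratio ** own_elasticity
            * ridehail_price_ratio ** cross_elasticity)


# Hypothetical scenario: a 10% fare increase with a fare elasticity of -0.3,
# evaluated with and without a 20% drop in ride-hailing prices
# (hypothetical cross-elasticity of +0.1).
base_rides = 1000.0
fare_only = adjusted_ridership(base_rides, fare_ratio=1.10, own_elasticity=-0.3)
with_ridehail = adjusted_ridership(base_rides, fare_ratio=1.10, own_elasticity=-0.3,
                                   ridehail_price_ratio=0.80, cross_elasticity=0.1)

print(f"fare change only:       {fare_only:.1f}")
print(f"with ride-hailing shift: {with_ridehail:.1f}")
```

A positive cross-elasticity means that cheaper ride-hailing lowers transit ridership beyond the effect of the fare change itself, which is the loss that a model with fixed fare elasticities alone would miss.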


8 Summary and Conclusions

This chapter reviews the key findings from this thesis and draws cross-cutting conclusions for incremental transit fare change analysis and decision making.

8.1 Key Findings

8.1.1 Chapter 2: Case Study Background

• The context and challenges for fare policy at the MBTA and the CTA are similar in many ways. Both agencies are experiencing a multi-year decline in bus ridership and are starting to lose rail ridership. Both operate alongside new and growing transportation alternatives (such as ride-hailing) and across neighborhoods with sharp differences in demographic composition and wealth. Both have experienced tightening operating budgets that have motivated recent fare increases (in July 2016 at the MBTA and in January 2018 at the CTA), but ridership stability and growth is also an important bottom-line objective to sustain support for public subsidies (over half of their operating budgets). Along with most major transit agencies around the world, the CTA and MBTA both offer customers a choice between period passes and pay-per-use ridership, with each accounting for a significant share of ridership and revenue. Otherwise, both charge “flat” fares on bus and rail that do not vary by distance, zone, or time of day. Within the last six years, both agencies broke from across-the-board fare changes and increased the multiple on their pass products. Both use a variety of different sale channels to distribute fare products and collect payments, including fare vending machines, employers, retail stores, and smartphone apps.
• There are also several important differences between the MBTA and CTA that are relevant to fare policy. The MBTA operates bus, rapid transit, and commuter rail services, while the CTA operates bus and rail (with commuter rail service provided by Metra). The MBTA’s service area has a greater level of municipal fragmentation. The majority of MBTA ridership and revenue is in pass products; in contrast, the majority of the CTA’s ridership and revenue has been in pay-per-use since the pass price increase and rollout of Ventra in 2013. The retail network plays a larger role as a sale channel at the CTA than at the MBTA. The CTA’s current AFC system is account-based with open-loop payment, while the MBTA’s current fare technology is card-based with closed-loop payment. Fare policy decisions at the MBTA are made by a very active board under strong influence from the Governor of Massachusetts; at the CTA, the President and Mayor of Chicago play much larger roles in fare policy decision making.

8.1.2 Chapters 3 and 4: Literature Review and Framework

• Microeconomic theory suggests that optimally efficient transit fares are less than agency operating costs (accounting for external benefits of transit use and external costs of motor vehicles) and should be differentiated by trip type (including by level of crowding on transit vehicles). However, direct application of this theory to calculate and set specific “optimal” prices is prohibitively difficult.
• There are many pricing strategies that are reasonable business alternatives to pure cost-based pricing. These include several means of “price discrimination” (group and channel pricing) and “product differentiation” (horizontal and vertical differentiation and self-selection). Pricing strategies are implicit in any transit agency fare structure and fare levels. Pricing of passes relative to pay-per-use fares reflects a combination of group pricing (different prices for different groups of customers based on their frequency of travel) and self-selection (customers will sort themselves into passes and pay-per-use based in part on the inconvenience of pre-payment or convenience of infrequent payment that they experience). This self-selection is the foundation of the “deep discounting” pricing strategy developed in the late 1980s and widely adopted in the 1990s, which paired per-ride fare increases with highly discounted multi-ride or unlimited-use pass products.
• There are many methods for exploratory data analysis that can be applied to AFC data to learn about the relationship between pricing strategy and actual customer behaviors (in terms of transit purchases and ridership). Recent applications of clustering methods (a machine learning technique) to AFC data have produced concise customer features that could be very useful for adding segmentation to fare policy analysis.
• Three key customer behaviors affecting the impact of fare policy changes, especially at agencies offering customers both pass and pay-per-use options, are induced ridership (especially the ridership impact of using a pass instead of pay-per-use), mode choice (shifting some or all travel between transit and other modes), and fare product choice (selection among different fare products or payment options).
• There are many possible approaches to modeling the ridership and revenue impacts of potential fare change scenarios, which vary in the level of realism with which they represent agency fare structure and customer behavior. Modeling approaches all require both estimation of behavioral parameters describing the customer behaviors listed above and a procedure for applying those parameters to predict the impact of fare change scenarios.
• The most common approach to modeling fare change scenarios – elasticity spreadsheet modeling like the FERRET model used by the MBTA – includes only crude adjustments to account for fare product choice. These models are generally unable to realistically predict impacts at the level of specific fare products if prices are changed differently for different products.
• The modeling approach used in recent decades at the CTA combines a fare product choice model with a conventional elasticity spreadsheet model to reflect each of the three key customer behaviors identified above; however, previous iterations of the model required expensive, stated-preference surveys (with their attendant sampling and response bias issues) to estimate fare product choice utility parameters, and they were populated with data from the CTA’s legacy AFC system.
• The academic literature on transit fare elasticities suggests that long-run elasticities are over twice as large as short-run, bus trips are twice as price-sensitive as rail trips, off-peak trips are twice as price-sensitive as peak trips, and fare integration with free transfers may generate substantial increases in ridership. There are few estimates, however, of elasticities and cross-elasticities for different transit fare products or payment structures (such as passes versus pay-per-use). Likewise, there are few studies of fare product choice models, and only one estimated using AFC data (Zureiqat 2008).
• Before-and-after estimation of elasticities using a short time period around a single fare change event presents challenges in controlling for non-fare or “external” factors that affect transit demand and accounting for switching between fare products. The confounding effect of external factors can be mitigated (or at least recognized) by estimating time trends in the period immediately before the fare change. The effects of fare product switching can be removed from analyses (at least partially) using individual-level AFC data or modeled directly using integrated discrete-continuous panel data estimation.
• Taken as a whole, these different literatures form a natural sequence that could be used to structure iterative fare policy analysis using AFC data, even in the absence of a pressing fare policy question or crisis: 1) identify the pricing strategies implicit in the current fare structure and possible alternative fare structures, 2) use exploratory techniques with AFC data to observe relationships between these strategies and observed customer behaviors, and 3) model demand for relevant fare products by market segments in a way that reflects key customer behaviors in order to predict the ridership and revenue impacts of changes to pricing strategy (i.e. to fare structure and fare levels).
• A simple pass pricing model can be constructed using four parameters to describe the important fare-related behaviors described above (mode choice, fare product choice, and induced ridership on pass products). Sensitivity analysis of this model demonstrates several fundamentals of pass pricing and promotion. Offering passes can increase both ridership and revenue, and pass prices should be set somewhere below the revenue-maximizing multiple to capture additional ridership. While setting particular pass prices involves a tradeoff between ridership and revenue, agencies can potentially increase both ridership and revenue by shaping preferences and improving passes; possible strategies include marketing, sale channel design, and bundling passes with other services (like bike share and ride hailing).

8.1.3 Chapter 5: Strategic Use of Pass Sale Channels

• The mix of pass sale channels at the MBTA and CTA is similar, but the relative size of different channels varies widely. The CTA sells a much larger share of passes through retail stores and online sale channels (web site, mobile app, and auto-load) than the MBTA, while the MBTA’s pre-tax employer pass program accounts for a much larger share of passes (51% of total full-fare pass revenue) than the equivalent employer pass program at the CTA (20% of full-fare pass revenue).
• Customers purchasing passes through different sale channels at the MBTA and the CTA exhibit different characteristics. Customers who purchase passes through employers, online, or via mobile app (only at the CTA) use transit less frequently than customers who purchase passes through fare vending machines, ticket windows, and retail stores. (In addition to reaching different groups of people, employer and online/mobile sale channels include automatically renewing purchase options that reduce the salience of fares. Employer sale channels additionally provide a tax benefit and potential employer subsidies that might not be realized by customers in other sale channels.) Customers purchasing passes through employers are also more peaked in the timing of their ridership and more likely to use only rail than customers purchasing through other sale channels.
• Customers using different pass sale channels also exhibit differences in their willingness-to-pay for passes and their sensitivity to fare changes. AFC data show that the “use value” of passes (what a customer would have paid in pay-per-use fares for the ridership on their pass) is higher for passes purchased at FVMs and the retail network than through employers and online. Moreover, the employer-based pass sales experienced much smaller declines than other sale channels following the July 2016 fare increase at the MBTA.


• Based on the analysis of “use value” using AFC data, the MBTA and CTA both capture significant surplus revenue while modestly boosting ridership as a result of employer-based pass sales. The Corporate Pass Program at the MBTA is estimated to contribute $8.4 million per year in additional revenue from monthly LinkPasses, and likely over $11 million if commuter rail passes are included. The Pre-Paid Benefits program at the CTA is much smaller (accounting for a smaller share of lower total pass sales than at the MBTA), but it still generates an estimated $3 million each year that would not be collected otherwise. This surplus comes from some combination of two factors: 1) automatic renewal of employee pass purchases, which reduces price salience for all employees (even those who would have mostly purchased passes without an employer program or tax benefits), and 2) the impact of reduced effective price due to federal and state tax benefits, which attracts some customers from pay-per-use to passes.

8.1.4 Chapter 6: Parameter Estimation

• By filtering accounts, AFC data can be used to estimate plausible upper bounds on the median induced ridership effect of a customer using a pass instead of pay-per-use (in the absence of any fare change or other exogenous shock to fare product choice).
• At the CTA, this approach suggests an upper bound on induced ridership of +11% for the 30-day pass and +21% for the 7-day pass (i.e. customers using 30-day passes take as much as 11% more rides on average than they would have if they used pay-per-use, and customers using 7-day passes take as much as 21% more rides).
• The same analysis at the CTA shows larger induced ridership effects for bus rides and transfer trips than for rail rides and one-seat trips, suggesting that customer choices and customer switching between passes and pay-per-use are likely to affect bus and transfer ridership more than rail and one-seat ridership.
• AFC panel data regressions can be used to estimate the fare elasticity of pay-per-use ridership (conditional on choosing pay-per-use) using cards or accounts that use pay-per-use before and after a fare change, even in a setting with substantial switching between fare product options.
• Fixed effects panel regressions on pay-per-use customers around the July 2016 fare change at the MBTA suggest a conditional elasticity of around -0.7 overall and -0.5 for peak-period and rail ridership (considerably higher than the values assumed in the MBTA’s current fare modeling tool).
• By separating cards or accounts that are continuously active from those that become active or inactive, detailed AFC data can be used to partially remove the effects of fare product switching from an analysis of aggregate pass sales (even in a context with significant baseline switching between fare products and significant churn or attrition in cards and accounts). This allows for crude estimation of price elasticities for transit passes following fare changes that induce switching between fare products (where it would be incorrect to attribute fare product switching to price elasticity).
• Application of this approach to the July 2016 fare change at the MBTA yields elasticity estimates for Corporate (i.e. employer-based) and Non-Corporate Monthly LinkPasses. Consistent with previous studies, the results suggest that Non-Corporate LinkPasses are more elastic (i.e. more sensitive to changes in fares) than Corporate LinkPasses. Under reasonable assumptions about baseline trends, the elasticity for Non-Corporate LinkPasses is estimated to be between -0.5 and -0.7 (much higher than the MBTA’s current modeling assumption of -0.15 for all LinkPasses), and the elasticity for Corporate LinkPasses is estimated to be between -0.1 and -0.2.
• AFC data can also be used to estimate fare product choice utility parameters for current fare products based on customers’ observed product choices and previous “use value” (pay-per-use price of observed ridership), even in the absence of a fare change.
• Estimation of a multinomial logit model on a panel of CTA accounts shows that customer choices between passes and pay-per-use can be very sensitive to prices when “use value” is close to pass prices. CTA customers tend to select pay-per-use over 30-day passes even after controlling for ridership level and product prices, and customers taking a greater share of their transit rides on rail tend to prefer 30-day passes over 7-day passes.

8.1.5 Chapter 7: Incorporating Fare Product Choice into Fare Modeling

• Card- or account-level AFC data can be used to populate a simple fare change scenario model that captures three key customer behaviors using parameters for product choice utility, induced ridership, and elasticity. As shown in Chapter 6, these parameters can be estimated using AFC data, which provides revealed preference information under the current fare structure. This avoids the need to rely on stated-preference customer surveys for analysis of incremental fare change scenarios, such as changing fare levels for existing products. (Scenarios with more dramatic changes in fare structure may still require surveys to estimate parameters.)
• This modeling approach – combining an elasticity spreadsheet model with a fare product choice model – can in theory provide much more realistic product-level predictions of fare change impacts than elasticity models with fixed diversion factors in situations where product prices change by different proportions (as in recent fare changes at the CTA and MBTA).
• Results from an example implementation of this modeling approach at the CTA predicted that the CTA fare change in January 2018 would have impacts of -3% on ridership, +4% on fare revenue, and +5% in pass market share, and that pay-per-use revenue would be relatively flat net of fare product switching.
• While the actual impacts are best measured over a year or longer, the first two months after the fare change saw year-over-year ridership losses of 2%, revenue gains of 6%, and a change in pay-per-use revenue of +4.5% (all relative to January through March of 2017). The differences between predicted and actual changes could result from external factors, pre-existing trends, a lag in fare change impacts, error in the model’s parameters, or deficiencies in the model’s structure (such as failure to distinguish important customer segments); preliminary summaries suggest that the fare change reversed a pre-existing downward trend in 7-day pass sales, and it is too early to observe the full impact of the fare change.
• Scenarios explored using the CTA ridership and revenue model suggested several principles for targeting fare increases at the CTA, including focusing increases on lower-elasticity trip types (rail trips and peak-period trips) rather than raising fares across the board (including on bus and off-peak trips), moving to a step-up transfer policy (rather than flat 25 cent transfer fee) as a targeted give-back to mitigate ridership losses from any bus fare increases, and raising pay-per-use fares more than pass prices to shift some customers back into passes (which have many advantages for the CTA).
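
As a minimal illustration of how an elasticity model and a fare product choice model can be combined, the following Python sketch evaluates a single hypothetical customer under a pay-per-use fare increase. It is not the CTA model described in Chapter 7: the binary logit specification, the function names, and every parameter value (cost coefficient, pass constant, elasticity, induced ride factor, fares, pass price) are assumptions chosen only to show the mechanics.

```python
import math


def pass_probability(use_value, pass_price, beta_cost, asc_pass):
    """Binary logit probability of choosing a pass over pay-per-use.

    use_value: pay-per-use cost of the customer's observed rides (dollars).
    beta_cost and asc_pass are hypothetical utility parameters.
    """
    v_pass = asc_pass + beta_cost * pass_price
    v_ppu = beta_cost * use_value
    return 1.0 / (1.0 + math.exp(v_ppu - v_pass))


def evaluate_scenario(base_rides, fare_old, fare_new, pass_price,
                      elasticity=-0.3, induced_factor=1.1,
                      beta_cost=-0.1, asc_pass=-0.5):
    """Expected rides and revenue for one customer after a fare change.

    Assumed behaviors (illustrative, not the thesis specification):
      1. product choice: logit on pass price vs. use value at the new fare
      2. elasticity: pay-per-use rides scale as (fare_new / fare_old) ** elasticity
      3. induced ridership: pass holders ride induced_factor times base rides
    """
    use_value = base_rides * fare_new
    p_pass = pass_probability(use_value, pass_price, beta_cost, asc_pass)
    ppu_rides = base_rides * (fare_new / fare_old) ** elasticity
    pass_rides = base_rides * induced_factor
    rides = p_pass * pass_rides + (1.0 - p_pass) * ppu_rides
    revenue = p_pass * pass_price + (1.0 - p_pass) * ppu_rides * fare_new
    return {"pass_share": p_pass, "rides": rides, "revenue": revenue}


# Hypothetical customer: 40 rides per month, $100 pass,
# pay-per-use fare rising from $2.25 to $2.50 with the pass price held fixed.
before = evaluate_scenario(40, 2.25, 2.25, 100.0)
after = evaluate_scenario(40, 2.25, 2.50, 100.0)
print(f"pass share: {before['pass_share']:.2f} -> {after['pass_share']:.2f}")
print(f"expected rides: {before['rides']:.1f} -> {after['rides']:.1f}")
```

Because the pay-per-use fare rises while the pass price is held fixed, the pass becomes relatively more attractive and the predicted pass share increases. This is the kind of switching that a model with fixed diversion factors would not capture.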

8.2 Overall Conclusions and Recommendations
Several noteworthy conclusions can be drawn from the cumulative findings of this thesis.

First, readily-available AFC data provides many opportunities to learn about fare policies and customer behaviors and to improve analysis of incremental fare changes; however, the AFC-based tools and techniques illustrated in this thesis may take too much time to develop during a short political window of opportunity to make changes in fare policy. In order to take full advantage of AFC data to inform fare changes, fare policy analysis must be an ongoing effort between fare change opportunities. Agencies can facilitate this ongoing analysis by implementing data infrastructure and data policies that improve usability of detailed, card- or account-level AFC data; most of the analyses in this thesis depend on the ability to observe individuals across trips or over time, whether simply to measure frequency of ridership within one month or to estimate trends in activity over time. Treating fare policy analysis as a constant or periodic exercise may provide useful insights between fare changes (such as opportunities to expand sale programs or better market fare products), and it allows analysts to play a more active role in shaping and informing discussion about fare change alternatives.

Second, as seen in the January 2013 fare change at the CTA and the July 2016 fare change at the MBTA, altering the relative prices of pass products and pay-per-use fares (or “pass multiples”) can result in significant switching between fare products, which in turn affects net changes in sales and ridership. Failing to account for this fare product switching could lead to incorrect conclusions about the fare elasticities for different fare products (both when reviewing past fare changes and when predicting the impact of future fare changes).
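
The bias from ignoring fare product switching can be illustrated with a small Python sketch. The prices and sales figures are hypothetical (they are not MBTA or CTA results), the function names are illustrative, and the log-arc elasticity formula is one common convention rather than necessarily the one used elsewhere in this thesis.

```python
import math


def pass_multiple(pass_price, single_fare):
    """Number of pay-per-use rides that cost the same as one pass."""
    return pass_price / single_fare


def log_arc_elasticity(q_before, q_after, p_before, p_after):
    """Elasticity as the ratio of log changes in quantity and price."""
    return math.log(q_after / q_before) / math.log(p_after / p_before)


# Hypothetical fare change: pass price +10%, pay-per-use fare unchanged.
pass_before, pass_after, fare = 75.0, 82.5, 2.25
print(f"pass multiple: {pass_multiple(pass_before, fare):.1f} -> "
      f"{pass_multiple(pass_after, fare):.1f}")

# Hypothetical pass sales: 1,000 before and 900 after. Of the 100 lost sales,
# 60 customers switched to pay-per-use and 40 stopped buying fares altogether.
sales_before, sales_after, switched = 1000, 900, 60

naive = log_arc_elasticity(sales_before, sales_after, pass_before, pass_after)
# Add switchers back so that only customers lost to the system count
# as a demand response to the pass price.
adjusted = log_arc_elasticity(sales_before, sales_after + switched,
                              pass_before, pass_after)
print(f"naive elasticity: {naive:.2f}, net of switching: {adjusted:.2f}")
```

In this constructed case the naive figure overstates price sensitivity, because most of the decline in pass sales reflects customers moving to another fare product rather than leaving the system.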

Third, transit agencies should consider promoting pass products and favoring a high pass market share. A simple model of pass pricing shows that passes should always be priced somewhat below revenue maximization in order to capture inexpensive gains in ridership (which is essential to justify and sustain support for public subsidy of transit operations). Induced ride factor estimates using CTA Ventra data suggest that customers who purchase passes use transit more than they would if they paid per-ride fares (due to zero marginal costs on passes), and elasticity estimates using AFC data at the MBTA show that passes are less sensitive to price changes than pay-per-use trips. In the short run, it may seem advantageous to concentrate fare increases on low-elasticity passes; however, this shifts market share toward pay-per-use, reducing ridership for existing customers and leaving a greater portion of an agency’s patronage “at risk” of greater decline in the future.
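
The short-run appeal of raising prices on low-elasticity products follows from simple arithmetic, sketched below in Python. The constant-elasticity demand form and the 10% price increase are assumptions for illustration; the elasticities of -0.15 and -0.7 fall within the MBTA estimates summarized in Section 8.1 for Corporate LinkPasses and pay-per-use ridership, respectively. The sketch deliberately ignores fare product switching and induced ridership, so it shows only the short-run view that the paragraph above cautions against.

```python
def constant_elasticity_response(price_change, elasticity):
    """Proportional changes in demand and revenue for a proportional price change.

    Assumes demand = k * price ** elasticity (an illustrative functional form).
    """
    ratio = 1.0 + price_change
    demand_change = ratio ** elasticity - 1.0
    revenue_change = ratio ** (1.0 + elasticity) - 1.0
    return demand_change, revenue_change


# Hypothetical 10% price increase applied to each product in isolation.
for label, e in [("low-elasticity pass", -0.15), ("pay-per-use", -0.7)]:
    d, r = constant_elasticity_response(0.10, e)
    print(f"{label}: demand {d:+.1%}, revenue {r:+.1%}")
```

Under these assumptions the same percentage increase yields more revenue and a smaller demand loss on the low-elasticity product, which is why the long-run effects on market share and induced ridership need to be weighed separately.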

Fourth, there are several ways a transit agency might increase its pass market share. The most straightforward option is typically to increase pay-per-use fares more than pass prices, improving pass “multiples.” Ridership losses from increasing pay-per-use fares are mitigated by some customers shifting to passes, and losses can be further reduced by concentrating fare increases on lower-elasticity trip types like rail and peak-period trips (which are less price-sensitive than bus and weekend trips). Another option is to make passes more attractive without changing prices, which would increase revenue and ridership by shifting lower-use customers into passes. One way this can be accomplished is by expanding or marketing sale channels that allow auto-renewal of pass purchases, which reduces price sensitivity. Employer pass programs have the added substantial benefit of pre-tax purchase and potential employer subsidies, which lower the effective price of passes and make them attractive to a larger share of potential transit riders. Another possibility for improving passes is to bundle them with discounts on other transportation services, such as bike share or ride hailing.

8.3 Research Extensions
Transit fare policy analysis is a rich research area. The following are a few of the many potential extensions to this work:

• The pre-tax sale of transit passes through employee payroll deduction was found to generate additional revenue and ridership at the MBTA and CTA; however, this tax benefit is only available at a subset of employers who participate in direct pass sale programs or use third-party employee benefits administrators. Extending tax-free transit purchases to all transit commuters would correct this inequity and would increase agency revenue and ridership. Policies to achieve this – such as allowing transit agencies to certify customer eligibility for a tax deduction based on ridership frequency – could be explored and evaluated.
• Pay-per-use elasticities at the MBTA were estimated for a very limited set of trip types. Additional trip features could be developed (trip distance, trips ending downtown, etc.) and used to further differentiate elasticity estimates.
• In this thesis, fare product choice model estimation with CTA Ventra data used a very simple specification and did not make full use of the panel nature of the Ventra data. Logit choice model specification and estimation could be improved, and machine learning techniques for fare product choice prediction could be explored.
• This thesis used the first few months after the CTA January 2018 fare change to provide an initial evaluation of a fare change scenario model. Additional retrospective evaluation of the CTA’s 2018 fare change could include later validation of the fare scenario model, summary and visualization of individual-level shifts in fare product choice and ridership, estimation of elasticities and product choice models, etc.
• One limitation of the CTA product choice and elasticity scenario model presented in this thesis is the inability to capture rapid changes in ride hailing as a competing transportation alternative. Cross-elasticities with respect to ride hailing prices and service could be estimated (if ride hailing data were available), or the model could be extended to explicitly model mode choices between transit, ride hailing, and other modes.
• Analyses in this thesis were disaggregated in various ways, but never spatialized. Exploratory AFC analysis, fare-related parameter estimation, and fare change scenario modeling could all be segmented or disaggregated by geography.
• Along similar lines, this thesis did not focus on equity. The results of fare change scenario modeling and other analyses could be disaggregated and spatialized to explore the distributional impacts of fare policies and fare changes.
• Bundling transit passes with discounts on other services is one potential strategy for making passes more attractive relative to pay-per-use, resulting in a higher pass market share and increases in both ridership and revenue. Bundled pass products could be designed, piloted, and evaluated.
• The Appendix demonstrates an initial clustering of MBTA AFC cards based on temporal travel features. In future work, AFC-based clusters could be used to segment analysis of fare change impacts and modeling of potential fare change scenarios.
• Fare integration and introduction of free transfers anecdotally stimulate large increases in transit ridership. Potential integration of CTA bus and rail with Metra commuter rail fare policy could be analyzed.
• This thesis focused on incremental changes in fare structure and fare levels. Methods and models could be adapted to analyze major changes to fare structure, such as switching from flat fares to distance-based fares or introducing fare capping.
• Halvorsen (2015) and Basu (2018) apply statistical and machine learning techniques to AFC data to detect and predict changes in user behaviors. These behavior change detection techniques could be used to identify transit users whose demand changed for reasons unrelated to fare policy (such as moving home location) and to remove them from disaggregate analysis of fare change impacts and fare sensitivity.

References
Abadie, Alberto, Alexis Diamond, and Jens Hainmueller. 2010. “Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program.” Journal of the American Statistical Association 105 (490): 493–505. Andrews, Steven, and Annette Demchur. 2016. “Potential MBTA Fare Changes in SFY 2017: Final Option Impact Analysis.” MBTA. https://mbta.com/events/2016-03-16/fiscal-management- control-board-meeting. Antos, Justin, and Michael D. Eichler. 2016. “Planning a Successful Monthly Pass for Metrorail.” In TRB 96th Annual Meeting Compendium of Papers. https://trid.trb.org/view/1437823. Balcombe, Richard, Roger Mackett, Neil Paulley, John Preston, Jeremy Shires, Helena Titheridge, Mark Wardman, and Peter White. 2004. “The Demand for : A Practical Guide.” http://www.demandforpublictransport.co.uk/TRL593.pdf. Basso, Leonardo J., and Sergio R. Jara-Díaz. 2012. “Integrating Congestion Pricing, Transit Subsidies and Mode Choice.” Transportation Research Part A: Policy and Practice 46 (6): 890–900. https://doi.org/10.1016/j.tra.2012.02.013. Basu, Abhishek. 2018. “Data-Driven Customer Segmentation and Personalized Information Provision in Public Transit.” Thesis, Massachusetts Institute of Technology. Bates, John, David Ashley, and Geoff Hyman. 1987. “The Nested Incremental Logit Model: Theory and Application to Modal Choice.” Proceedings of the 15th PTRC Summer Annual Meeting 290: 75–98. Ben-Akiva, Moshe E., and Steven R. Lerman. 1985. Discrete Choice Analysis: Theory and Application to Travel Demand. Vol. 9. MIT press. Bench, Clinton. 2006. “Technical Report: Impact Analysis of a Potential MBTA Fare Increase and Restructuring in 2007.” MBTA. Block-Schachter, David. 2016. “AFC2 Board Update.” MBTA, June 27. https://cdn.mbtace.com/sites/default/files/fmcb-meeting-docs/2016/june/062716-AFC2-Board- June-27.pdf. Borjian, Shahrzad, Jake Schabas, and John Segal. 2017.
“Exploratory Method for Practitioners Analyzing the Impact of Integrated Fare Structures in Decentralized Metropolitan Regions.” Transportation Research Record: Journal of the Transportation Research Board 2652 (January): 124–34. https://doi.org/10.3141/2652-14. Boyle, Daniel K. 2006. Fixed-Route Transit Ridership Forecasting and Service Planning Methods. TCRP Synthesis 66. Washington, D.C: Transportation Research Board. Brakewood, Candace, and George Kocur. 2011. “Modeling Transit Rider Preferences for Contactless Bank Cards as Fare Media.” Transportation Research Record: Journal of the Transportation Research Board 2216 (December): 100–107. https://doi.org/10.3141/2216-11. ———. 2013. “Unbanked Transit Riders and Open Payment Fare Collection.” Transportation Research Record: Journal of the Transportation Research Board 2351 (December): 133–41. https://doi.org/10.3141/2351-15. Bueno, Paola Carolina, Juan Gomez, Jonathan R. Peters, and Jose Manuel Vassallo. 2017. “Understanding the Effects of Transit Benefits on Employees’ Travel Behavior: Evidence from the New York-New Jersey Region.” Transportation Research Part A: Policy and Practice 99 (May): 1–13. https://doi.org/10.1016/j.tra.2017.02.009. Cable, Dustin. 2013. “The Racial Dot Map: One Dot Per Person for the Entire U.S.” Weldon Cooper Center for Public Service, Rector and Visitors of the . http://demographics.virginia.edu/DotMap/. Cambridge Systematics. 2010. “Ridership and Revenue Budget Econometric Forecasting Model Version 3.0: Final Report and User Guide.” WMATA. ———. 2012. “CTA Ridership and Revenue Model: Draft Final Report.” Chicago Transit Authority.

Carbajo, Jose C. 1988. “The Economics of Travel Passes: Non-Uniform Pricing in Transport.” Journal of Transport Economics and Policy 22 (2): 153–73. Chan, Raymond, Maulik Vaishnav, Scott Wainwright, Paul Murray, and Alex Cui. 2018. “Data Driven Opportunities from an Account-Based Fare Payment System.” In TRB 97th Annual Meeting Compendium of Papers. https://trid.trb.org/view/1496865. Cosgrove, Joseph, Elizabeth Moore, and Annette Demchur. 2011. “MBTA Title VI Report.” MBTA. https://cdn.mbta.com/uploadedfiles/About_the_T/Fare_Proposals_2012/Map%20of%20MBTA% 20Service%20District%281%29.pdf. CTA. 2017a. “CTA President’s 2018 Budget Recommendations.” CTA. https://www.transitchicago.com/assets/1/6/2018_Budget_Book_2017-11- 21_FINAL_web_version.pdf. ———. 2017b. “CTA Annual Ridership Report: Calendar Year 2016.” CTA. https://www.transitchicago.com/assets/1/6/2016_Annual_-_Final.pdf. Davidoff, Paul. 1965. “Advocacy and Pluralism in Planning.” Journal of the American Institute of Planners 31 (4): 331–38. https://doi.org/10.1080/01944366508978187. Dawson, Robert L. 2018. “Increasing MBTA Ridership and Revenue with Company Commuter Benefit Programs.” White Paper 175. Pioneer Public. Pioneer Institute. Deaton, Angus, and John Muellbauer. 1980. “An Almost Ideal Demand System.” The American Economic Review 70 (3): 312–326. Delucchi, Mark A. 2004. Summary of the Nonmonetary Externalities of Motor-Vehicle Use. Institute of Transportation Studies, University of California, Davis. Dodgson, John S. 1986. “Benefits of Changes in Urban Public Transport Subsidies in the Major Australian Cities.” Economic Record 62 (2): 224–235. Elmore-Yalch, Rebecca. 1998. A Handbook: Using Market Segmentation to Increase Transit Ridership. TCRP Report 36. Washington, D.C: National Academy Press. ESRI. 2016. “Wealth Divides: Exploring the Stark Dividing Lines between Rich and Poor in American Cities.” 2016. http://storymaps.esri.com/stories/2016/wealth-divides/. Filler, Larry. 2015. 
“Opportunities to Expand the Massachusetts Bay Transportation Authority’s Corporate Pass Program.” A Better City. Fleishman, Daniel, ed. 1996. Fare Policies, Structures, and Technologies. TCRP 10. Washington, D.C: Transportation Research Board/ National Academy Press. ———. , ed. 2003. Fare Policies, Structures and Technologies: Update. TCRP 94. Washington, D.C: Transportation Research Board. Fleishman, Daniel, Frank S. Koppelman, and Joseph L. Schofer. 1991. “Consumer-Based Transit Pricing at the Chicago Transit Authority.” https://trid.trb.org/view.aspx?id=934809. FMCB. 2017. “Fiscal Management and Control Board Strategic Plan.” https://d3044s2alrsxog.cloudfront.net/sites/default/files/fmcb-meeting-docs/reports-policies/2017- mbta-strategic-plan.pdf. FTA. 2016. “National Transit Database Transit Profiles: 2015 Top 50 Summary.” National Transit Database. https://www.transit.dot.gov/sites/fta.dot.gov/files/docs/Transit%20Profiles%202015%20- %20Top%2050%20Agencies%20with%20Summary.pdf. ———. 2017. “National Transit Database Transit Profiles: 2016 Top 50 Summary.” National Transit Database. https://cms.fta.dot.gov/sites/fta.dot.gov/files/docs/ntd/66026/top-50-summary-and- complete-profile-set_1.pdf. ———. 2018. “National Transit Database Monthly Module Adjusted Data Release, December 2017.” https://www.transit.dot.gov/ntd/data-product/monthly-module-adjusted-data-release. Goulet-Langlois, Gabriel. 2015. “Exploring Regularity and Structure in Travel Behavior Using Smart Card Data.” https://dspace.mit.edu/bitstream/handle/1721.1/99546/925480984- MIT.pdf?sequence=1.

Greene-Roesel, Ryan, Joe Castiglione, Camille Guiriba, and Mark Bradley. 2018. “BART Perks: Using Incentives to Manage Transit Demand.” In TRB 97th Annual Meeting Compendium of Papers. https://trid.trb.org/view/1496413. Gwilliam, Ken. 2008. “A Review of Issues in Transit Economics.” Research in Transportation Economics 23 (1): 4–22. https://doi.org/10.1016/j.retrec.2008.10.002. Halvorsen, Anne. 2015. “Improving Transit Demand Management with Smart Card Data: General Framework and Applications.” Massachusetts Institute of Technology. http://dspace.mit.edu/handle/1721.1/99543. Harris, Annie, Robert Thomas, and Daniel Boyle. 1999. “Metropolitan Atlanta Rapid Transit Authority Fare Elasticity Model.” Transportation Research Record: Journal of the Transportation Research Board 1669 (January): 123–28. https://doi.org/10.3141/1669-15. Hensher, David A. 1998. “Establishing a Fare Elasticity Regime for Urban Passenger Transport.” Journal of Transport Economics and Policy 32 (2): 221–46. ———. 2008. “Assessing Systematic Sources of Variation in Public Transport Elasticities: Some Comparative Warnings.” Transportation Research Part A: Policy and Practice 42 (7): 1031–42. https://doi.org/10.1016/j.tra.2008.02.002. Hensher, David A., and Jenny King. 1998. “Establishing Fare Elasticity Regimes for Urban Passenger Transport: Time-Based Fares for Concession and Non-Concession Markets Segmented by Trip Length.” Journal of Transportation and Statistics 1 (1): 43–61. Hickey, Robert. 2005. “Impact of Transit Fare Increase on Ridership and Revenue: Metropolitan Transportation Authority, New York City.” Transportation Research Record: Journal of the Transportation Research Board 1927 (January): 239–48. https://doi.org/10.3141/1927-27. Hickey, Robert, Alex Lu, and Alla Reddy. 2010. 
“Using Quantitative Methods in Equity and Demographic Analysis to Inform Transit Fare Restructuring Decisions.” Transportation Research Record: Journal of the Transportation Research Board 2144 (December): 80–92. https://doi.org/10.3141/2144-10. Hong, Yi. 2006. “Transition to Smart Card Technology: How Transit Operators Can Encourage the Take- up of Smart Card Technology.” Massachusetts Institute of Technology. http://dspace.mit.edu/handle/1721.1/38237. Hörcher, Daniel, Daniel J. Graham, and Richard J. Anderson. 2018. “The Economic Inefficiency of Travel Passes Under Crowding Externalities and Endogenous Capacity.” Journal of Transport Economics and Policy (JTEP) 52 (1): 1–22. ICF Consulting. 2003. “TCRP Report 87: Strategies for Increasing the Effectiveness of Commuter Benefits Programs.” 87. TCRP. Washington, D.C: Transportation Research Board. ———. 2005. “TCRP Report 107: Analyzing the Effectiveness of Commuter Benefits Programs.” 107. TCRP. Washington, D.C.: Transportation Research Board. https://doi.org/10.17226/21979. Imbens, Guido W., and Jeffrey M. Wooldridge. 2009. “Recent Developments in the Econometrics of Program Evaluation.” Journal of Economic Literature 47 (1): 5–86. Iseki, Hiroyuki, Chao Liu, and Gerrit-Jan Knaap. 2015. “Origin-Destination Land Use Ridership Model for Fare Policy Analysis (For the Washington Metropolitan Area Transit Authority).” National Center for Smart Growth Research and Education. http://planitmetro.com/wp- content/uploads/2015/12/151215_LandUseRidershipModel_FinalReport.pdf. Jain, A. K., M. N. Murty, and P. J. Flynn. 1999. “Data Clustering: A Review.” ACM Comput. Surv. 31 (3): 264–323. https://doi.org/10.1145/331499.331504. Jain, Anil K. 2010. “Data Clustering: 50 Years beyond K-Means.” Pattern Recognition Letters, Award winning papers from the 19th International Conference on Pattern Recognition (ICPR), 31 (8): 651–66. https://doi.org/10.1016/j.patrec.2009.09.011. Jain, Nihit. 2011. 
“Assessing the Impact of Recent Fare Policy Changes on Public Transport Demand in London.” Thesis, Massachusetts Institute of Technology. http://dspace.mit.edu/handle/1721.1/66866.

Jansson, Jan Owen. 1979. “Marginal Cost Pricing of Scheduled Transport Services: A Development and Generalisation of Turvey and Mohring’s Theory of Optimal Bus Fares.” Journal of Transport Economics and Policy 13 (3): 268–94. Jara-Díaz, Sergio, Diego Cruz, and César Casanova. 2016. “Optimal Pricing for Travelcards under Income and Car Ownership Inequities.” Transportation Research Part A: Policy and Practice 94: 470–482. Kamfonik, Dianne E. 2013. “Quantifying the Current and Future Impacts of the MBTA Corporate Pass Program.” Massachusetts Institute of Technology. http://dspace.mit.edu/handle/1721.1/82837. Kieu, Le Minh, Ashish Bhaskar, and Edward Chung. 2014. “Transit Passenger Segmentation Using Travel Regularity Mined from Smart Card Transactions Data.” Transportation Research Board 93rd Annual Meeting, January, 12–16. Kocur, George. 2015. “Unified Transportation Payment Media: Options for Massachusetts.” A Better City. Koppelman, Frank S. 1983. “Predicting Transit Ridership in Response to Transit Service Changes.” Journal of Transportation Engineering 109 (4): 548–64. https://doi.org/10.1061/(ASCE)0733- 947X(1983)109:4(548). Krizek, Kevin J., and Ahmed El-Geneidy. 2007. “Segmenting Preferences and Habits of Transit Users and Non-Users.” Journal of Public Transportation 10 (3): 5. Kumar, Ashok. 1980. “Pivot Point Modeling Procedures in Demand Estimation.” Journal of Transportation Engineering 106 (6). Kuzmyak, J. Richard, John E. Evans, and Richard H. Pratt. 2010. Employer and Institutional TDM Strategies -- Traveler Response to Transportation System Changes. Washington, D.C.: National Academies Press. https://doi.org/10.17226/14393. Lambrecht, Anja, Katja Seim, and Bernd Skiera. 2007. “Does Uncertainty Matter? Consumer Behavior under Three-Part Tariffs.” Marketing Science 26 (5): 698–710. Lathia, Neal, and Licia Capra. 2011. 
“How Smart Is Your Smartcard?: Measuring Travel Behaviours, Perceptions, and Incentives.” In Proceedings of the 13th International Conference on Ubiquitous Computing, 291–300. ACM. http://dl.acm.org/citation.cfm?id=2030152. Lipscombe, Peter. 2016. “Transit Fare Policy: An International Best Practices Review for Metro Vancouver.” Vancouver: Translink. https://www.translink.ca/- /media/Documents/plans_and_projects/transit_fare_review/Transit-Fare-Policy-Best-Practices- Review_Full-Report_2016.pdf. Litman, Todd. 2004. “Transit Price Elasticities and Cross-Elasticities.” Journal of Public Transportation 7 (2): 3. ———. 2017. “Transit Price Elasticities and Cross - Elasticities.” Victoria Transport Policy Institute. https://www.vtpi.org/tranelas.pdf. Lozada, John, Nicholas Hart, and Annette Demchur. 2017. “MBTA Title VI Report.” MBTA. https://cdn.mbta.com/sites/default/files/2017-11/2017-2020-mbta-title-vi-report.pdf. MBTA. 2012. “MBTA Service and Fare Changes Effective July 1, 2012.” https://cdn.mbta.com/uploadedfiles/Fares_and_Passes_v2/MBTA_Service%20and%20Fare%20C hanges%20July%201.pdf. ———. 2014a. “MBTA Fare Changes Effective July 1, 2014.” https://cdn.mbta.com/uploadedfiles/Fares_and_Passes_v2/MBTA%20Fare%20Changes_July201 4.pdf. ———. 2014b. Ridership and Service Statistics. 14th ed. MBTA. http://www.mbta.com/uploadedfiles/documents/bluebook%202010.pdf. ———. 2015. “MBTA Statement of Revenue and Expenses: FY 1991 to FY 2016 Budget.” https://mbta.com/financials/mbta-budget. ———. 2016a. “MBTA Fare Changes Effective July 1, 2016.” https://cdn.mbtace.com/uploadedfiles/About_the_T/Board_Meetings/FINAL%20FARE%20CHA NGES.pdf.

———. 2016b. “MBTA Tariff and Statement of Fare and Transfer Rules.” https://cdn.mbta.com/sites/default/files/fmcb-meeting-docs/2016/june/060616-Tariff-and- Statement-of-Fare-and-Transfer-Rules.pdf. ———. 2016c. “Massachusetts Bay Transportation Authority Automated Fare Collections System Services: Request for Proposals for Systems Integrator.” MBTA. https://cdn.mbtace.com/sites/default/files/projects/afc2-rfp-systems-integrator.pdf. McCollum, Brian E., and Richard H. Pratt. 2004. Traveler Response to Transportation System Changes Handbook, Third Edition: Chapter 12, Transit Pricing and Fares. Washington, D.C.: Transportation Research Board. https://doi.org/10.17226/13800. McElroy, David P. 2009. “Integrating Transit Pass Ownership into Mode Choice Modelling.” Thesis. https://tspace.library.utoronto.ca/handle/1807/17698. McFadden, Daniel. 1977. “Quantitative Methods for Analyzing Travel Behaviour of Individuals: Some Recent Developments.” Cowles Foundation Discussion Paper 474. Cowles Foundation for Research in Economics, Yale University. https://econpapers.repec.org/paper/cwlcwldpp/474.htm. Metrolinx. 2015. “Approach to Fares Around the World.” Fare Integration. Metrolinx. http://www.metrolinx.com/en/regionalplanning/fareintegration/20150302_Approach_to_Fares_E N.pdf. Miller, Caroline, and Ian Savage. 2017. “Does the Demand Response to Transit Fare Increases Vary by Income?” Transport Policy 55 (April): 79–86. https://doi.org/10.1016/j.tranpol.2017.01.006. Miller, Mark A., Larry S. Englisher, Rick Halvorsen, and Bruce Kaplan. 2005. “Transit Service Integration Practices: An Assessment of U.S. Experiences,” March. https://escholarship.org/uc/item/5pk4n6j1. Mohring, Herbert. 1972. “Optimization and Scale Economies in Urban Bus Transportation.” The American Economic Review 62 (4): 591–604. Multisystems, Inc. 2000. “Fare Structure Pricing Research and Update of Ridership/Revenue Fares Model.” Chicago Transit Authority. 
Nair, Harikesh, Jean-Pierre Dubé, and Pradeep Chintagunta. 2005. “Accounting for Primary and Secondary Demand Effects with Aggregate Data.” Marketing Science 24 (3): 444–460. Narayanan, Sridhar, Pradeep K. Chintagunta, and Eugenio J. Miravete. 2007. “The Role of Self Selection, Usage Uncertainty and Learning in the Demand for Local Telephone Service.” Quantitative Marketing and Economics 5 (1): 1–34. Neff, John, and Matthew Dickens. 2017. “2016 Public Transportation Fact Book,” February. https://trid.trb.org/view/1483547. Nelson, Peter, Andrew Baglino, Winston Harrington, Elena Safirova, and Abram Lipman. 2007. “Transit in Washington, DC: Current Benefits and Optimal Level of Provision.” Journal of Urban Economics 62 (2): 231–51. https://doi.org/10.1016/j.jue.2007.02.001. Newman, Jeffrey, Mark Ferguson, and Laurie Garrow. 2012. “Estimating Discrete Choice Models with Incomplete Data.” Transportation Research Record: Journal of the Transportation Research Board 2302 (December): 130–37. https://doi.org/10.3141/2302-14. Ofsevit, Ari. 2017. “The Amateur Planner: How Many People Use Commuter Rail? More than You Might Think.” The Amateur Planner (blog). March 1, 2017. http://amateurplanner.blogspot.com/2017/03/how-many-people-use-commuter-rail-more.html. Oram, R. L. 1990. “Deep Discount Fares: Building Transit Productivity With Innovative Pricing.” Transportation Quarterly 44 (3). https://trid.trb.org/view/312654. Ortega-Tong, Meisy Andrea. 2013. “Classification of London’s Public Transport Users Using Smart Card Data.” Massachusetts Institute of Technology. http://dspace.mit.edu/handle/1721.1/82844. Parry, Ian W. H., and Kenneth A. Small. 2009. “Should Urban Transit Subsidies Be Reduced?” The American Economic Review; Nashville 99 (3): 700–724. http://dx.doi.org/10.1257/aer.99.3.700. Paulley, Neil, Richard Balcombe, Roger Mackett, Helena Titheridge, John Preston, Mark Wardman, Jeremy Shires, and Peter White. 2006. “The Demand for Public Transport: The Effects of Fares,

Quality of Service, Income and Car Ownership.” Transport Policy, Innovation and Integration in Urban Transport Policy, 13 (4): 295–306. https://doi.org/10.1016/j.tranpol.2005.12.004.

Phillips, Robert Lewis. 2005. Pricing and Revenue Optimization. Stanford University Press.

Pincus, Kate Samantha. 2014. “Analysis of the 2012 Massachusetts Bay Transportation Authority Fare Increase Using Automated Fare Collection Data.” Massachusetts Institute of Technology. http://dspace.mit.edu/handle/1721.1/89869.

Rosenfield, Adam. 2018. “Driving Change: How Workplace Transportation Benefit Reforms Can Nudge Solo Car Commuters Towards Sustainable Travel Modes.” Massachusetts Institute of Technology.

Sharaby, Nir, and Yoram Shiftan. 2012. “The Impact of Fare Integration on Travel Behavior and Transit Ridership.” Transport Policy 21 (May): 63–70. https://doi.org/10.1016/j.tranpol.2012.01.015.

Shmueli, Galit, Peter C. Bruce, Inbal Yahav, Nitin R. Patel, and Kenneth C. Lichtendahl Jr. 2017. Data Mining for Business Analytics: Concepts, Techniques, and Applications in R. John Wiley & Sons.

Sound Transit. 2015. “ST3 Regional High-Capacity Transit System Plan: Transit Ridership Forecasting Methodology Report.” http://www.wsdot.wa.gov/partners/erp/background/ST3%20Draft%20RidershipForecastingMethodologyReport_6March2015.pdf.

Steer Davies Gleave. 2017. “GTHA Fare Integration: Draft Preliminary Business Case.” Metrolinx. http://www.metrolinx.com/en/regionalplanning/fareintegration/Draft%20Preliminary%20Business%20Case%20for%20Fare%20Integration%20in%20the%20GTHA%20v4.0.pdf.

———. n.d. “Ticket Sales and Passenger Forecasting Model Development for Metro-North Rail Road.” Accessed April 21, 2017. http://na.steerdaviesgleave.com/casestudies/ticket-sales-and-passenger-forecasting-model-development-metro-north-rail-road.

Stefanski, Leonard A., and Raymond J. Carroll. 1985. “Covariate Measurement Error in Logistic Regression.” The Annals of Statistics, 1335–1351.

Stuntz, Andrew, John Attanucci, and Frederick P. Salvucci. 2017. “A Process for Transit Fare Structure and Fare Level Analysis: Case Study at the Massachusetts Bay Transportation Authority.” In TRB 96th Annual Meeting Compendium of Papers. https://trid.trb.org/view/1438026.

Taplin, John HE, David A. Hensher, and Brett Smith. 1999. “Preserving the Symmetry of Estimated Commuter Travel Elasticities.” Transportation Research Part B: Methodological 33 (3): 215–232.

Tirachini, Alejandro, David A. Hensher, and John M. Rose. 2014. “Multimodal Pricing and Optimal Design of Urban Public Transport: The Interplay between Traffic Congestion and Bus Crowding.” Transportation Research Part B: Methodological 61 (March): 33–54. https://doi.org/10.1016/j.trb.2014.01.003.

Train, Kenneth. 1993. Qualitative Choice Analysis: Theory, Econometrics, and Applications to Automobile Demand. Cambridge, MA: MIT Press.

Train, Kenneth E., Daniel L. McFadden, and Moshe Ben-Akiva. 1987. “The Demand for Local Telephone Service: A Fully Discrete Model of Residential Calling Patterns and Service Choices.” The RAND Journal of Economics 18 (1): 109–23. https://doi.org/10.2307/2555538.

Translink. 2016. “Transit Fare Review: Peer Agencies at a Glance.” Transit Fare Review. Vancouver: Translink. https://www.translink.ca/-/media/Documents/plans_and_projects/transit_fare_review/Appendix-G--Peer-Review.pdf.

TranSystems. 2007. Elements Needed to Create High Ridership Transit Systems. Vol. 66. Transportation Research Board. http://www.trb.org/Publications/Blurbs/158910.aspx.

Turvey, Ralph, and Herbert Mohring. 1975. “Optimal Bus Fares.” Journal of Transport Economics and Policy 9 (3): 280–86.

Verbich, David, and Ahmed El-Geneidy. 2017. “Public Transit Fare Structure and Social Vulnerability in Montreal, Canada.” Transportation Research Part A: Policy and Practice 96 (February): 43–53. https://doi.org/10.1016/j.tra.2016.12.003.

Vickrey, William. 1980. “Optimal Transit Subsidy Policy.” Transportation 9 (4): 389–409.

Wardman, Mark, and Jeremy Toner. 2003. “Econometric Modelling of Competition between Train Ticket Types.” In Proceedings of the European Transport Conference, 24. London, UK.

Weesie, Louis, Nederlandse Spoorwegen, Freek Hofker, Maarten Kroes, Alex Mitrani, and Steer Davies Gleave. 2009. “The Impact of Tariff Differentiation on Time of Day Choice and Rail Demand in the Netherlands.” http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.678.3166&rep=rep1&type=pdf.

WMATA. 2009. “Short Term Ridership Forecasting Model (Version 3.0).” January. https://www.wmata.com/initiatives/plans/upload/Short-Term-Ridership-Forecasting.pdf.

———. 2016. “Evaluation of WMATA’s Rail Fare Activities.” November 10, 2016. https://www.wmata.com/about/inspector-general/upload/Evaluation-of-WMATA-s-Fare-Activities-Report-OIG-17-02-Post.pdf.

Zureiqat, Hazem Marwan. 2008. “Fare Policy Analysis for Public Transport: A Discrete-Continuous Modeling Approach Using Panel Data.” Massachusetts Institute of Technology. http://dspace.mit.edu/handle/1721.1/43748.

Appendix: Cluster-Based Segmentation Using AFC Data

Motivation and Literature Review

The Massachusetts Bay Transportation Authority (MBTA) operates bus, heavy/light rail, commuter rail, and ferry transit service in the Boston metro area. The MBTA collects large amounts of data on its customers' transit purchase and travel behaviors passively through its fare collection system. This Automated Fare Collection (AFC) data is used to inform a wide range of operations, planning, and policy decisions (such as replacement service during construction, routes and schedules for new services, and fare changes).

While AFC data provides good coverage on the bus and rapid transit portions of the network (which use automatic fare validation), the data is limited to times and locations of purchases and travel. In order for the MBTA to learn more about its customers and develop more effective customer segmentations for different applications, the agency must either conduct expensive surveys and pilot programs or glean additional insights from its existing AFC data. One opportunity to glean more from AFC data is temporal travel patterns. The MBTA can easily segment customers by total ridership, but beyond this simple metric there are currently no succinct ways of describing the frequency and regularity with which customers ride the MBTA. For example, it could be useful to distinguish commuters with highly regular travel times, commuters with irregular or flexible travel times, and irregular users.

In the last few years, several master's theses from the MIT Transit Lab have focused on applying clustering techniques to transit AFC data. Ortega-Tong (2013) applied k-means clustering to Oyster Cards at Transport for London (TfL) using a variety of spatial, temporal, transit mode, and fare class features, ultimately identifying eight clusters of regular and occasional users that were interpretable but unstable over time. Goulet-Langlois (2015) extended this work at TfL by clustering cards on more detailed activity sequences and new metrics of travel regularity; the resulting cluster-based segmentation has been used for several applications at TfL, including evaluation of station-level retail offerings and customer information based on the types of riders that regularly use each station. Halvorsen (2015) applied k-means clustering to morning travelers on the MTR system in Hong Kong in order to segment an analysis of customers who shifted their travel time in response to a new fare incentive program (the "Early Bird Promotion"). Finally, Basu (2018) extended and generalized Halvorsen’s work, using k-means to develop cluster-based segmentations of MTR cards that can be easily adapted to a range of applications (with particular attention to personalized information provision, such as relevant incident alerts through a transit agency mobile app). Basu created separate short-term and long-term segmentations using custom spatial and temporal features derived from AFC data.

This Appendix begins to extend this line of research to the MBTA for the first time. It focuses on temporal features only, drawing directly on the final features selected in Basu (2018).

Data and Features

Clusterings are developed using a random sample of 10,000 CharlieCards (MBTA smartcards) and four weeks of MBTA automated fare collection (AFC) data, from Monday 10/2/2017 through Sunday 10/29/2017.

AFC records contain card information (serial numbers and fare media stock types that uniquely identify specific cards or tickets) and transaction information including timestamp, AFC device, type of transaction, fare product, and transaction amount (for payments or fare deductions). This analysis focuses on the transaction timestamps for fare validations (i.e. transit rides as opposed to fare product purchases).

This analysis also uses output of an existing origin-destination-transfer (ODX) inference process built on top of the MBTA AFC data. Among other outputs, ODX identifies the first AFC transaction in each transfer trip, which allows me to develop features based on trips rather than rides (avoiding double-counting of transfer trips). ODX also infers destinations and arrival times for most (but not all) trips in the AFC data; ideally final trip arrival times might be used rather than trip departure times for certain temporal features, but there were enough missing destination inferences that this proved infeasible.

This clustering work is based on the five temporal features that Basu (2018) adopted for long-term segmentation applications at MTR in Hong Kong. Two of these features are measured over four weeks for each card (the entire period used for clustering):

1. Active: The proportion of weekdays in the study period with observed travel

2. Range: The proportion of the study period between the first and last weekday of observed travel (inclusive)

Three additional temporal features are measured over one day, and average values are taken for each card over the four-week period:⁵¹

3. Intensity: The average number of trips per day on days with observed travel

4. First: The average start time of the first trip on days with observed travel (in fractional hours)⁵²

5. Last: The average start time of the last trip on days with observed travel (in fractional hours)
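The five feature definitions above can be illustrated with a minimal Python sketch for a single card. This is not the thesis implementation (the features were created in PostgreSQL); the function names are hypothetical, and the sketch assumes that all five features are computed on weekdays only and that range uses the number of weekdays in the study period as its denominator.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta


def weekdays_between(start, end):
    """All Monday-Friday dates from start to end, inclusive."""
    n_days = (end - start).days + 1
    days = (start + timedelta(days=i) for i in range(n_days))
    return [d for d in days if d.weekday() < 5]


def temporal_features(trip_starts, period_start, period_end):
    """Five temporal features for one card (hypothetical helper).

    trip_starts: datetimes of trip starts (one per trip, not per ride).
    Assumption: only weekday trips inside the study period are counted.
    """
    study_days = weekdays_between(period_start, period_end)
    study_set = set(study_days)
    hours_by_day = defaultdict(list)
    for ts in trip_starts:
        if ts.date() in study_set:
            # Start time in fractional hours, e.g. 17:30 -> 17.5
            hours_by_day[ts.date()].append(
                ts.hour + ts.minute / 60 + ts.second / 3600
            )
    if not hours_by_day:
        return None  # no weekday travel observed for this card
    active_days = sorted(hours_by_day)
    n_active = len(active_days)
    span = weekdays_between(active_days[0], active_days[-1])
    daily = hours_by_day.values()
    return {
        "active": n_active / len(study_days),
        "range": len(span) / len(study_days),
        "intensity": sum(len(h) for h in daily) / n_active,
        "first": sum(min(h) for h in daily) / n_active,
        "last": sum(max(h) for h in daily) / n_active,
    }


example = temporal_features(
    [datetime(2017, 10, 2, 8, 0),
     datetime(2017, 10, 2, 17, 30),
     datetime(2017, 10, 4, 9, 0)],
    date(2017, 10, 2),
    date(2017, 10, 29),
)
print(example)
```

For this made-up card, which travels on two of the 20 weekdays in the study period, active is 0.10 and range is 0.15, which shows why active can never exceed range under these definitions.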

First, 57 outlier cards with intensities over 5 were removed from the sample of 10,000 cards. Features for the remaining cards are summarized in the figure below. Active and range are both bimodal, suggesting divisions between frequent and infrequent users as well as one-time versus consistent users. Intensity has a smoother distribution, but with spikes at 1 and 2 trips per day (on days with trips) that could potentially separate one-way versus two-way transit travelers. The average time of the first trip of the day is surprisingly spread out, with multiple peaks at morning rush hour and mid-day; the average time of the last trip is more concentrated on evening rush hour. Apart from some definitional correlations (e.g. active <= range), there are no obvious patterns across pairs of features.

51 Use of median values might be preferred, but there is no built-in median aggregate function in PostgreSQL, where the features were created.
52 Using the end time of the first and last trips would have been ideal, but it is frequently missing in the ODX inference data.

Scatterplot Matrix of Clustering Features

The five features used for clustering were scaled in two different ways: z-score standardization and scaling to the unit interval. Unit scaling tended to produce clusters that were primarily distinguished by the active and range features (which are already constrained to the unit interval), while clusterings with z-score standardization were differentiated by a mix of all five features. It seems preferable to make use of a variety of temporal features, so the final clusterings shown in this Appendix used z-score standardization. (Note that unit-scaled features are still sometimes used to describe clusters.)

K-Means Clustering

K-means clustering was performed using the kmeans function in the open source software R.
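Before clustering, each feature is rescaled as described above. The two scalings can be sketched in Python as follows; the thesis used R, so this is only an illustration. The function names are hypothetical, and the z-score here uses the population standard deviation, which is an assumption (R's scale function, for example, uses the sample standard deviation).

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize one feature to mean 0 and standard deviation 1.

    Assumes the feature is not constant (non-zero standard deviation).
    """
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def unit_interval(values):
    """Rescale one feature linearly so its minimum is 0 and maximum is 1.

    Assumes the feature is not constant (max greater than min).
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Made-up first-trip times (fractional hours) for four cards
first_trip_hours = [6.0, 9.0, 12.0, 18.0]
print(z_score(first_trip_hours))
print(unit_interval(first_trip_hours))
```

After z-score standardization every feature has the same variance, so each contributes comparably to Euclidean distance in k-means. After unit-interval scaling the variances still differ by feature, so features whose values are spread widely across the interval carry more weight in the distance calculation.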

To begin, different numbers of clusters were tested by looking at total within-cluster sum of squared distances. These values are shown in the figure below. Using the visual “elbow test,” clusterings with 3 clusters and 6 clusters were selected for detailed analysis.
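The quantity plotted for the elbow test is the total within-cluster sum of squared distances for each candidate k. A minimal, self-contained Python sketch of that calculation follows. It uses Lloyd's algorithm with random starting centroids rather than R's kmeans function (whose default algorithm differs), and the names kmeans, elbow_curve, and n_starts are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm.

    Returns (centroids, labels, total within-cluster sum of squares).
    points: list of equal-length tuples of already-scaled features.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct sample points as starts
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    wss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, wss


def elbow_curve(points, k_values, n_starts=5):
    """Lowest total within-cluster SS over several random starts, per k."""
    return {
        k: min(kmeans(points, k, seed=s)[2] for s in range(n_starts))
        for k in k_values
    }


# Toy one-dimensional data with two obvious groups
toy = [(0.0,), (2.0,), (10.0,), (12.0,)]
print(elbow_curve(toy, [1, 2]))
```

Running several random starts per k and keeping the best result reduces the chance that a poor initialization distorts the curve. The elbow is then read visually from the plot of these values against k.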

Total Within-Cluster Sum of Squared Distances for Different k Values

The scatterplot matrix and heat map below summarize the clustering with k=3. Cluster 1 appears to be round-trip transit commuters, with high levels of activity concentrated at morning and evening rush hours. Clusters 2 and 3 are short-term / one-time or occasional users, distinguished by whether they tended to travel in the morning and middle of the day (cluster 3) or in the afternoon and evening (cluster 2).

Scatterplot Matrix, 3 Clusters

Heat Map, 3 Clusters

Feature Averages, 3 Clusters

cluster  active  range  intensity  firston  laston
1        0.73    0.94   2.14       10.27    16.37
2        0.13    0.31   1.68       16.17    18.18
3        0.13    0.28   1.64       10.78    12.84

Moving to 6 clusters, the heat map below shows additional useful distinctions. Round-trip commuters are even more sharply delineated in cluster 2. Cluster 1 picks up cards that are active throughout the study period but on fewer weekdays, traveling primarily in the afternoon and evening (perhaps one-way transit commuters and non-workers who depend on transit). Clusters 3, 4, 5, and 6 differentiate occasional users. The intensity of transit use on active days is high for cluster 5, suggesting visitors. Typical time of travel is morning for cluster 3, midday/afternoon for cluster 4, and evening for cluster 6; the high variation in the active and range features in cluster 3 suggests that these morning travelers are a mix of occasional transit commuters and riders making one-off, non-commute trips.

Heat Map, 6 Clusters

Feature Averages, 6 Clusters

cluster  active  range  intensity  firston  laston
1        0.4     0.84   1.62       14.21    16.66
2        0.81    0.96   2.25       9.35     16.5
3        0.19    0.39   1.4        9.33     10.6
4        0.08    0.13   1.61       12.65    15.01
5        0.13    0.23   3.24       12.17    17.4
6        0.07    0.15   1.42       18       19.22

As initial checks of robustness, the data were clustered two more times using different random initializations: once with all of the data and once with half of the data. The resulting clustering for k=3 varied, with occasional travelers sometimes divided by activity level rather than time of travel, but the clustering with k=6 was qualitatively similar to the clustering presented above.

To better characterize these clusters and to provide an initial sense of policy relevance, the clusters can also be described using other card information in AFC. As an illustration, the graphs below show transit mode, fare product user type, and fare product payment structure composition of each cluster in terms of the number of fare validations (“taps”) for the clustered sample of 10,000 cards. A little over 40% of taps are at gated rail stations for transit commuters (clusters 1 and 2 with k=6), while transit mode shares vary across the clusters of occasional users. (Note that the “commuter” clusters represent a large majority of total ridership, over 80% of taps; given the importance of this group of users, it would be worthwhile to cluster them separately from occasional users in order to develop meaningful segments within the commuter population.)

Transit Mode of AFC Taps

The figures below show that there is also substantial variation in the fare product types used by the cards assigned to the different clusters (measured again using the number of AFC taps on the sample of cards during the study period). Use of pass products is generally consistent with the active and intensity features, since customer choices between fare product payment structures are driven largely by expected frequency of travel. The most frequent “commuters” in cluster 2 (with k=6) rely almost exclusively on passes, while the less-regular commuters or transit-dependent riders in cluster 1 use mostly pay-as-you-go (PAYG); in spite of the regularity of their transit travel, there is not a pass product that is financially attractive for most cards in cluster 1 given the total frequency and temporal distribution of their ridership. This type of segmented summary could inform modification or development of fare products that are strategically oriented to the observed activity patterns of transit users.

Fare Product Payment Structure (“Tariff”) of AFC Taps

Fare Product User Type/Discount Type of AFC Taps

Conclusions and Extensions

K-means clustering was applied to temporal features of transit travel at the MBTA. The goal was to identify a few interpretable, stable clusters that could potentially be used to segment transit policy and planning analyses using existing data in AFC (rather than expensive customer surveys). K-means produced clusterings with fairly even cluster sizes, relatively clear interpretations, and some stability. Using six clusters produced groups with distinctly different temporal ridership patterns. These results are promising and suggest the potential for useful cluster-based segmentation at the MBTA.

Possible extensions to this work include the following:

• Dimension reduction prior to clustering (e.g. principal component analysis)

• Seeding clusters based on expected or desired segment characteristics, or using k-means++ to improve initialization

• Clustering frequent transit travelers separately from occasional users (to identify segments that more evenly divide ridership rather than cards)

• Additional robustness checks for k-means clustering (partitioning, temporal stability over time, etc.)

• Assignment of all cards to resulting clusters using cluster centroids

• Further descriptive summaries using resulting clusters (e.g. cluster composition of ridership by time of day, day of week, mode, station/line/route, and fare product attributes)

• Analysis of changes in total ridership by cluster and individual-level changes in cluster membership, both over time and before/after policy events (such as the July 2016 fare change)

• Clustering using alternative temporal features, such as variability in first trip time

• Clustering using spatial travel features
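One of the extensions listed above, assigning all cards to the resulting clusters using cluster centroids, could look roughly like the following Python sketch. It assumes that new cards are standardized with the same per-feature means and standard deviations as the original clustering sample and are then assigned to the nearest centroid in that standardized space. The function name and all numbers in the usage example are hypothetical placeholders, not values from this analysis.

```python
def assign_to_cluster(features, centroids, means, sds):
    """Index of the nearest centroid for one card (hypothetical helper).

    features:  raw feature values for the card
    centroids: cluster centers in standardized (z-score) space
    means:     per-feature means of the original clustering sample
    sds:       per-feature standard deviations of that sample
    """
    # Scale the new card exactly as the clustering sample was scaled
    z = [(x - m) / s for x, m, s in zip(features, means, sds)]
    # Squared Euclidean distance to each centroid
    dists = [sum((a - b) ** 2 for a, b in zip(z, c)) for c in centroids]
    return dists.index(min(dists))


# Made-up two-feature example (placeholder numbers only)
toy_means = [0.5, 12.0]
toy_sds = [0.25, 4.0]
toy_centroids = [(1.0, -0.5), (-1.0, 0.5)]
print(assign_to_cluster([0.8, 9.0], toy_centroids, toy_means, toy_sds))
```

Reusing the original sample's scaling parameters matters here: re-standardizing on the full card population would shift the feature space relative to the fitted centroids.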
