Estimating Railway Ridership
Total Page:16
File Type:pdf, Size:1020Kb
28-04-2016 Estimating Railway Ridership DEMAND FOR NEW RAILWAY STATIONS IN THE NETHERLANDS TSJIBBE HARTHOLT S1496352 COMMITTEE: K. GEURS (Chairman) University of Twente L. LA PAIX PUELLO University of Twente T. BRANDS Goudappel Coffeng 0 1 I. SUMMARY Demand estimation for new railway stations is an essential step in determining the feasibility of a new proposed railway stations. Multiple demand estimation models already exist. However these are not always accurate or freely available for use. Therefore a new demand estimation model was developed which is able to provide rail ridership estimations. Main question of this thesis that will be answered is: How can the daily number of passengers of a new train station be forecasted on the basis of departure station choice and network accessibility? Aim is to estimate a demand estimation model which is valid for the whole of the Netherlands and focusses on proposed sprinter train stations. Factors determining total rail ridership Rail ridership can be determined by three main factors: Built environment factors Socio-economic factors Network dependent factors Built environment factors are factors that describe the situation in the direct environment of the station. A subdivision can be made into station environment factors based on the three d’s as described by Cervero and Knockel-man (1997): o Density: Describing the amount of activities in the proximity of the station. This could be the e.g. number of jobs, number of students, shops or total population. o Diversity: describing the diversity of the activities that take place in the proximity of the station. o Design: variables describing the properties of a station (area) as a direct consequence of its design. E.g. the accessibility by bike (bicycle parking available), design of the station itself (architecture) or perceived safety. The socio-economic variables are mainly adding an additional layer to the density variables. They give additional information on for example income, employment, age, or car ownership which can increase of decrease the probability a person will use the train. Network dependent variables describe the connectivity of the station with the other station in the network. This can be described with variables such as frequency, number of lines, intercity service available or an accessibility index. Secondly, network dependent variables can also describe the quality of the potential feeder modes such as the frequency and number of lines for bus, tram and metro or the availability of a park & ride. In total 147 variables have been categorized and tested for their explanatory value. Effects of a new station The opening of a new train station can have several effects. Generally it is assumed a new station will mainly attract new passengers. Because of increased rail accessibility (closer station proximity) after the opening of a new station, this will be most likely the case for some people. However, this increased rail accessibility will also cause an abstraction of demand from existing stations. A part of the passengers using the new station are therefore existing train users. Only their station preference has changed. Finally, a new station can also cause a decrease of passengers elsewhere along the line because of the (slightly) longer travel time. An additional stop a train has to make will increase overall travel time by three minutes on average. Existing passengers might therefore decide to use another mode of transport due to this increase in travel time. 2 Methods available to estimate travel demand Two main types of demand estimation have been identified: Aggregated demand estimation Disaggregated demand estimation Aggregated demand estimation is usually based on regression analysis according to the formula: 푌푖 = 훽0 + ∑ 훽푘훽푖푘 + 휀푖 푘 With parameters: Yi the total number of predicted passengers Β 0 The constant or intercept Β k Estimated parameter for variable k Β ik variable value i for variable k εi error term for variable i This model is commonly used since no disaggregated trip data is needed and is relatively easy to apply. However, regression models are sensitive for the quality of the variables used and potential outliers in the dataset. In order to further improve a regression model several additional actions can be performed: Reference class forecasting: With reference class forecasting all cases are assigned to separate classes together with other similar cases. This will allow for the estimation of separate models adjusted to the reference classes. The use of network distances: By using the network distances instead of Euclidian distances, the accuracy of variables such as the total population the proximity of a station will be improved. The problem of barriers in the landscape such as rivers, highways and the railway line itself limiting the actual catchment area will be solved using this method. ((Upchurch , Kuby, Zoldak, & Barranda, 2004), (O'Neill, Douglas , & JaChing, 1992), (Horner & Murray, 2004).) Distance decay modelling: In several cases it has been observed that people living further away from the station have a lower probability of using the train (Keijer & Rietveld, 2000). Adjusting to this affect with the use of distance decay can therefore improve several variables such as total population) significantly (Gutiérrez et al, 2011). The use of geo-weighted regression allows for a geographic variation in the constants of regression model. Therefore a geo-weighted model can adjust for region differences in the sensitivity of certain variables ((Blainey & Mulley, 2013). Disaggregated demand estimation is usually based on disaggregated trip data. The need for this kind of data makes it harder to apply this type of model. However this type of model is better suited to estimate effects on station choice and competition between stations. It is often applied with the use of a multinomial (or nested) logit model. Such a model will offer multiple alternatives (stations). Based on the unique situation of each case a utility will be assigned to each of the choices. The probability of choosing a choice is then calculated based on these utilities. Research method In this research a combination of these two methods will be used: A multinomial station choice model will be used to improve variables before they are used in a regression analysis. Furthermore an accessibility indicator and distance decay function are estimated to be used as model input as well. Accessibility Indicator The position of the station in relation to the rest of the network has proven to be an important factor in rail demand estimation. In this research an accessibility indicator was estimated to include this aspect in this model as well. These indicators were based on a trip distribution model estimated in Omnitrans. In total three indicators were estimated. The final index score is normalized from 0 till 1. 3 For example the closeness centrality index (CCI) was estimated as: 1 퐶퐶퐼푖 = ∑ (훿푐푖푗 ∗ 퐷푗 ∗ ) 퐶푖푗 + 1 푖푗 With parameters: 퐶퐶퐼푖 The closeness Centrality Index of station i 훿푐푖푗 The probability of taking a trip from station i to j 퐷푗 The total number of passengers arriving at station j 퐶푖푗 The number of transfers needed to get from i to j Distance decay functions Based on survey data conducted in the province of South-Holland distance decay functions were estimated. The functions are separately estimated per station type on the access side and separately for sprinter and intercity stations on the egress side. Multiple function types have been tested but a logarithmic function type proved to have the best fit. The largest difference can be observed between intercity (type 1 & 2) and sprinter stations (type 3 till 6) with intercity stations having a considerable larger catchment area and trip attractively. However, type 1 intercity stations seem to have a slightly larger catchment area than a type 2 station. At the same time type 5 sprinter stations have the smallest catchment areas. 0,8 0,7 0,6 0,5 Weight_type1 Weight_type2 0,4 Weight_type3 0,3 Weight_type4 0,2 Weight_type5 Weight_type6 Numberof trips inhabitant per 0,1 0 0 2000 4000 6000 8000 10000 12000 14000 Distance from station in metres Figure 1: Distance decay functions per station type on the access side of the trip Station choice model Also a multinomial station choice model was estimated based on survey data and the use of Biogeme. The final station choice model was based on a choice set consisting of two closest intercity stations and two closest sprinter stations. Variables included in the model were frequency, availability of guarded bicycle parking, number of BTM lines connecting the station and distance. 4 Regression analysis A regression analysis was performed on the basis of variables adjusted with the distance decay functions and the station choice model resulting in the total potential of train trips from the number of jobs, student places and total population. Furthermore the closeness centrality indexes along with several other variables were included as well. Six different models have been estimated. Two of these models are valid for all sprinter stations, four models are type specific models based on the reference classes: regional and main line models (Table 1). Table 1: Overview of all estimated regression models General Regional Regional Main line General Basic Main line basic extensive basic extensive extensive Cases 307 307 119 119 191 191 R2 0,837 0,871 0,728 0,789 0,798 0,819 Std. Error of the 1005 894 556 489 1193 1140 Estimate Application & discussion of the model Figure 2: demand abstractio of Leeuwarden as a result of the Application of the model can give a demand opening of Leeuwarden-Werpsterhoek estimation of the new station. The effects of demand abstraction of the new station on existing stations can be estimated with the station choice model (see figure 2). When applied the two general model will give the most accurate results.