To Bike Or Not to Bike? Prediction of Public Bike Availability in the Dutch Train Network
Master Thesis

Gijs de Jager (10006729)
Supervisor: Dr. F. M. Nack
Second Examiner: Prof. Dr. T. V. van Engers
Faculty of Science (FNWI), University of Amsterdam, Amsterdam

ABSTRACT
In this thesis we test whether the algorithms designed for local Bike Sharing Systems can be applied in a national environment, in order to make the rental bike a reliable part of the public transport system. A lot of research has been done on relatively small-scale bike rental systems. We contribute to this field by creating a Case Study for a Bike Sharing System with a larger network of stations and greater variation in available bikes. We apply and modify a model to predict how many bikes are available at a certain train station at a certain time in the future and, if no bike is available, how long it will take before a new bike becomes available. The Case Study is built upon the data of the OV-fiets system provided by the NS (Dutch Railroad) and the Royal Dutch Institute of Meteorology (KNMI). The predictions are made with the GAM method; the significance testing between the two algorithms is done according to the Kolmogorov-Smirnov test.

Keywords
Bike-sharing system, predictive model, GAM algorithm, NRMSE, Kolmogorov-Smirnov

1. INTRODUCTION
The OV-Fiets¹ is growing in popularity²,³. In 2014 the number of bikes that were rented was approximately 1.4 million; in 2017 the number was 3.2 million. Because of the growing popularity, people are confronted with empty bike stations⁴. The Dutch Railroad (NS) has solved this problem for the short term by buying a lot of extra bikes, but with the ongoing growth of bike rental this solution is insufficient for the long term.

¹ Best translated as PT-Bike: Public Transport Bike
² http://nieuws.ns.nl/recordaantal-ritten-met-de-ov-fiets-in-2017/
³ https://www.ad.nl/utrecht/ov-fiets-is-niet-aan-te-slepen~af126e8b9/
⁴ https://www.ovmagazine.nl/2017/06/ov-fiets-nog-niet-zo-betrouwbaar-als-de-trein-0600/

In the current situation the consumer can obtain only limited information about the number of available bikes: every fifteen minutes an update of the number of bikes at a particular train station is shown in the app or on the website. The problem with this solution, in contrast to other forms of public transport such as a bus, metro or tram, is that one cannot fit the bike into his planned journey. Ideally, a passenger going from A to B by train and then continuing his journey by bike needs to know, before his departure, whether at least one bike will be available upon his arrival at B or, if no bike will be available, how long he needs to wait for a bike.[2]

The suggested solution is that, in the absence of a live update, there should be a predicted number of bikes so that planning is feasible. Though there are Bike Sharing Systems (BSS) available in other cities and research has been done on the predictability of those systems [5][10][1][21], they are based on rather different infrastructures. In all cases those systems are designed in a context with few stations and a low number of available bikes. This results in the following research question: Can the available bike sharing algorithms be applied to solve the information problem for an environment with a large number of stations and a variety of available bikes? This question will be answered by a case study in which real data provided by the NS and the Royal Dutch Institute of Meteorology (KNMI) is used.

This paper is organized as follows. In Chapter 2 the key features of both the Dublinbikes and the OV-Fiets system are presented. The history and development of predicting bike availability and uncertainty-aware journey planning are presented in Chapter 3. In Chapter 4 the variables are described and the new algorithm is presented. In Chapter 5 the Case Study is presented, including the results as well as the recommendations to NS. The discussion follows in Chapter 7 and we conclude in Chapter 8.

2. BIKE SHARING SYSTEMS
Worldwide, there are more than 700 Bike Sharing Systems.[3] Usually, they are set in large cities like Washington, Barcelona, Paris, Lyon and Dublin. The systems are often operated by a stand-alone commercial organization. For example, in Paris and Dublin the advertising company JCDecaux⁵ owns the bikes and the stands.

⁵ jcdecaux.com

One can rent a bike by swiping a card at the central pole of the bike station and can drop the bike at any station in the city, provided at least one stand is empty. Also, at most BSS the first 30 minutes of rent are free of charge and, in Barcelona, a supplement of €4.49⁶ is charged when the rental takes longer than two hours. The combination of the many stands and the incentive to use a bike for less than 30 minutes leads to very short rental times; in Lyon, for example, this leads to a median rental time of 11 minutes[1]. For cities such as Barcelona (Bicing), Washington (Capital Bikes), Paris (Vélib), Lyon (Vélo) and Dublin (DublinBikes) certain predictive algorithms have been developed[10][21][6][2][11]. However, those applications are not integrated into a larger public service system.

2.1 Dublinbikes
In 2012 the Dublinbikes⁷ system had 550 bikes across 44 bike stations in Dublin. The stations are open from 05:00 a.m. to 0:30 a.m., seven days per week, and they had more than four million rentals between 2009 and 2012. Dublinbikes provides real-time information about the number of bikes at the stations but does not give any information on future availability. Dublinbikes uses the same pricing model as Vélo and Bicing, since they are also owned by JCDecaux.

We chose DublinBikes as an example because Chen et al.[2] use the DublinBikes data. Their model is built according to the same approach as ours, namely designing a solution for planning the rental bike into a journey, even when there is no bike available. Also, their model focuses mainly on the switch between public transport and rental bikes, whereas models designed for e.g. Bicing[5] and Vélib[6] depend heavily on bike-station-to-bike-station calculations and the way the bikes are distributed, and less on predicting the number of bikes for the journey planning of the traveler. On top of that, research has shown that the model of Chen et al. performs best.

2.2 OV-Fiets
In contrast, the OV-fiets⁸ system works differently from the BSS mentioned above. First of all, the OV-fiets system is deeply integrated in the Dutch Railway company (NS Groep), which is a semi-public company owned by the Dutch government. NS-Stations, part of the NS Group, manages the OV-fiets. The main goal of the OV-fiets is to give the [...] place for bikes, unlike the BSS stations where there is limited space for the bikes.

Unlike the other BSS, OV-fiets works nationwide. OV-fiets has around 15,000 bikes divided over 317 bike stations, which are in general placed at train stations.

3. THEORETICAL FRAMEWORK
3.1 Predictive models
Several researchers have tried to solve the problem of predicting the number of bikes in bike stands for a Dublinbikes-like system, among them Froehlich et al.[5], Kaltenbrunner et al.[10], Borgnat et al.[1] and Yoon et al.[21].

Froehlich et al. created the basis for this line of research, in which Bicing and Vélo in particular are investigated and in which the researchers try to predict accurately how many bikes are at a particular bike stand at a particular time in the future. Froehlich et al. investigate the Barcelona Bicing system, make use of a Bayesian Network (BN) and compare it to three other methods.

The first, and simplest, is Last Value (LV). LV predicts the number of bikes at a bike stand by returning the last known count. For example, when one wants to know how many bikes there will be at a particular stand in about an hour, LV checks the current value, say 25 bikes, and returns 25 bikes as the predicted value. Froehlich et al. show that LV is quite accurate until the prediction window exceeds 60 minutes, after which it is outperformed by the Historic Mean (HM) and Historic Trend (HT) methods[5].

HM calculates the historic average of the number of bikes for the same time (in history) as the predicted time. Froehlich et al. show that this predictive model is highly unstable and therefore conclude that the distribution of bikes at busy stations is very irregular.

HT combines HM and LV with an extra step: it takes the historic mean of the number of bikes at the time of the request and the historic mean at the time one wants to predict, calculates the difference between the two, and adds that difference to the value of LV.

Froehlich et al. show that the Bayesian Network outperforms these three approaches. They propose three input nodes: time, bikes and Prediction Window. time is a day divided into 24 hours. bikes is divided into 5 values of 20% each that represent the number of bikes at a station.
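
The three baseline predictors described above (LV, HM and HT) can be illustrated with a minimal Python sketch. This is not the implementation of Froehlich et al. or of this thesis. The function names, the hour-of-day time slots and the `history` layout are assumptions made for illustration only. The sign of the HT correction (historic mean at the target time minus historic mean at the request time) is also an assumption, since the description only says that "the difference" is added to LV.

```python
from statistics import mean

# Hypothetical data layout (not from the thesis): for one station, map each
# hour of the day to the bike counts observed at that hour on earlier days.
History = dict[int, list[int]]


def last_value(current_count: int) -> int:
    """LV: the forecast is simply the most recent observed count."""
    return current_count


def historic_mean(history: History, hour: int) -> float:
    """HM: average of all past counts observed at the given hour."""
    return mean(history[hour])


def historic_trend(history: History, current_count: int,
                   request_hour: int, target_hour: int) -> float:
    """HT: LV shifted by the historic change between request and target hour."""
    # Assumed sign convention: target-hour mean minus request-hour mean.
    drift = historic_mean(history, target_hour) - historic_mean(history, request_hour)
    return last_value(current_count) + drift


if __name__ == "__main__":
    # Made-up sample counts for one station at 08:00 and 09:00 on three days.
    history = {8: [20, 24, 22], 9: [10, 14, 12]}
    print(last_value(25))
    print(historic_mean(history, 9))
    print(historic_trend(history, 25, request_hour=8, target_hour=9))
```

With the made-up sample data, a request at 08:00 with 25 bikes present gives an LV forecast of 25 for 09:00, an HM forecast of 12, and an HT forecast of 25 + (12 - 22) = 15. The sketch does not clamp HT to the range of physically possible counts; whether and how to do that is left open here.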