Delays and Timetabling for Passenger Trains

CARL-WILLIAM PALMQVIST FACULTY OF ENGINEERING | UNIVERSITY 2019

317 Travel by train has increased steadily for the last 30 years. In order to build trust in and shift even more traffic to railways, more trains must arrive on time. In practice, many train delays are caused by small disturbances at stations, which add up. One issue is that the scheduled dwell times are simply too short. Another is that punctuality falls quickly if it is either warm or cold. A third is that interactions between trains rarely go as planned. One suggestion for how to reduce delays is to paint markings that show where passengers should wait. Another is to remove switches, so that remaining ones can be better maintained. A third is to make railways more resilient to the weather variations of today, and to the climate changes of tomorrow. Cost-effective improvements can also be made with timetables. More of the planning can be automated, so that planners can focus more on setting appropriate dwell times and on improving the timetabling guidelines. This way, many more trains can be on time.

Lund University Faculty of Engineering Department of Technology and Society

Bulletin 317 953103 ISBN 978-91-7895-310-3 ISSN 1653-1930 789178

CODEN: LUTVDG/(TVTT-1059)1-105 9 317 Delays and Timetabling for Passenger Trains

Delays and Timetabling for Passenger Trains

Carl-William Palmqvist

DOCTORAL DISSERTATION by due permission of the Faculty of Engineering, Lund University, . To be defended at V:C, V-House. John Ericssons väg 1, Lund. 2019-11-08 at 10:15.

Faculty opponent Birre Nyström Organisation Document name DOCTORAL DISSERTATION LUND UNIVERSITY Date of issue 2019-11-08

Carl-William Palmqvist Sponsoring organisation K2 – The Swedish Knowledge Centre for Title and subtitle Delays and Timetabling for Passenger Trains Abstract

Travel by train has increased steadily in Sweden the last 30 years. The pace has been about two to three percent per year, and we now have twice as many passengers. With growing awareness of the changing climate, the pace is increasing further.

A problem that affects both passengers and businesses in Sweden is train delays. One way to describe these is as the share of trains that arrive less than six minutes delayed. About 90% of trains in Sweden meet this standard and have done so for many years. In a way this is impressive, since there are now many more trains. Unfortunately, this also means that more and more passengers are affected by delays. This leads to irritation, threatens the shift of traffic to railways, and costs a lot for society. More trains must arrive on time.

This thesis shows that delays are mostly caused by small disturbances – up to a minute or two. Over long journeys, these small disturbances accumulate and sometimes cause quite big delays. These delays mostly occur at stations, where the trains stop, but are then unable to continue on time. It is difficult to say exactly what causes these small disturbances, but the time that the trains are supposed to be at stations – the dwell times – are often too short. Another pattern is seen between delays and weather: if it is either warm or cold, delays increase rapidly. And while winter and snow return every year, they still cause major disruptions.

The thesis holds a few suggestions to reduce delays. One is platform markings that show where the trains will stop, where the doors will be, and where the passengers should wait. This is an easy and affordable way to speed up the stops, so that the trains depart on time. Another measure is to remove switches. Then there are fewer parts that can fail, and those that remain can be maintained to a higher standard. A third way is to adapt the railway, so that it better withstands the weather variations of today, and the climate changes of tomorrow. Something that has been done in other countries is to shade and air-condition electronics and signals along the railway. Then the components to not overheat, and more trains run on time.

Many things can also be done with timetables, so that more trains run on time, without a rise in costs. More of the planning can be automated. Then more time can be spent on giving trains appropriate dwell times. Infrastructure managers should also do more to evaluate and improve the rules and guidelines that govern timetabling. In this way we can improve timetables gradually from year to year, with fewer and fewer delays as a consequence. These suggestions do not solve all of the railway’s issues, but they would lead to many more trains arriving on time.

Key words Railways, Trains, Delays, Punctuality, Timetable

Classification system and/or index terms (if any)

Supplementary bibliographical information Language English

ISSN and key title 1653-1930 ISBN 978-91-7895-310-3 (printed) Bulletin – Lund University, Faculty of Engineering, Department of ISBN 978-91-7895-311-0 (pdf) Technology and Society, 317 Recipient’s notes Number of pages 105 Price

Security classification

I, the undersigned, being the copyright owner of the abstract of the above-mentioned dissertation, hereby grant to all reference sources permission to publish and disseminate the abstract of the above-mentioned dissertation.

Signature Date 2019-09-30 Delays and Timetabling for Passenger Trains

Carl-William Palmqvist Front cover photo by Carl-William Palmqvist Back cover photo by Kennet Ruona Copyright pp 1-105 (Carl-William Palmqvist) Paper 1 © WIT Press. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 United States License. Paper 2 © by the Authors. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License. Paper 3 © by the Authors. Paper 4 © by the Authors. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 United States License. Paper 5 © by the Authors.

Faculty of Engineering Technology and Society

ISBN 978-91-7895-310-3 (printed) ISBN 978-91-7895-311-0 (pdf) ISSN 1653-1930

Printed in Sweden by Media-Tryck, Lund University Lund 2019 To Irene Table of Contents Populärvetenskaplig sammanfattning ...... 11 Acknowledgements ...... 12 Introduction ...... 17 Structure of thesis ...... 18 Background ...... 19 Timetable planning ...... 19 Quality of timetables ...... 19 Capacity allocation ...... 24 Robust timetable planning ...... 26 Run time supplements ...... 27 Dwell times ...... 28 Headway times ...... 29 Train delays and punctuality ...... 31 Delay definitions ...... 31 Earlier research on delays ...... 33 Research gaps ...... 37 Aim ...... 39 Research questions ...... 39 Delimitations ...... 40 Method and Data ...... 41 Quantitative, qualitative and mixed methods ...... 41 Working with delays and punctuality ...... 42 Measuring and observing delays ...... 42 Varying punctuality definitions ...... 43 Quantitative data used ...... 46 Train movement data ...... 46 Dwell times and passenger counts ...... 48 Timetable data ...... 50 Weather data ...... 51 Infrastructure data ...... 52 Analytical methods used ...... 54 Pre-processing the data ...... 56 Visual analysis and Welch’s t-test ...... 58 Ordinary Least Squares Regression ...... 59 Interviews with timetable planners ...... 60 Research Process and Results ...... 63 How the papers build on each other ...... 64 The pilot study ...... 65 The subsequent papers ...... 65 Overview of papers and findings ...... 67 Paper 1. The pilot study ...... 67 Paper 2. Four types of factors ...... 68 Paper 3. Margins and interactions ...... 70 Paper 4. Problems in planning practice ...... 71 Paper 5. Passengers and dwell times ...... 72 Results ...... 74 Linking the papers to the research questions ...... 74 RQ1. How are the delays distributed? ...... 76 RQ2. What factors correlate with delays, and to what extent? ...... 76 RQ3. How can the allocation of run time margins be improved? ...... 78 RQ4. How can the allocation of dwell times be improved? ...... 79 RQ5. How can the practice of timetable planning be improved? ...... 81 Concluding Discussion ...... 85 Recommendations ...... 85 Size and allocation of margins ...... 85 Scheduling dwell times ...... 87 The practice of timetabling ...... 88 Some physical changes ...... 90 Some Reflections ...... 91 Further research ...... 91 Reflections on methodology ...... 92 Contributions of the thesis ...... 95 References ...... 97 List of Included Papers ...... 105 Contribution to papers ...... 105

Populärvetenskaplig sammanfattning

Resandet med tåg har ökat stadigt i Sverige de senaste 30 åren. Takten har varit cirka två-tre procent per år, och vi nu har mer än dubbelt så många resenärer. Med ökat fokus på klimatfrågan stiger resandet nu ännu snabbare. Ett problem som drabbar både resenärer och företag i Sverige är tågförseningar. Ett sätt att beskriva dessa är den andel av alla tåg som kommer fram mer än fem minuter för sent. Räknar man så är ungefär 90 procent av alla tåg punktliga. Så har det varit i många år. Det är på ett sätt imponerande, med tanke på att det nu går många fler tåg. Tyvärr gör det också att allt fler resenärer drabbas. Detta leder till stor irritation, hotar den fortsatta överflyttningen av trafik till järnväg, och kostar mycket för samhället. Fler tåg behöver komma fram i tid. Min forskning visar att förseningarna mest beror på små störningar – upp till någon minut. Över långa resor samlas dessa små störningar ihop och gör att den totala förseningen kan bli stor. De här störningarna sker mest på stationer, där tågen ska stanna, men där de sedan inte klarar att köra vidare i tid. Det är svårt att säga exakt vad dessa störningar beror på, men den tiden tågen ska vara på stationen – uppehållstiden – är ofta för kort. Ett annat samband syns mellan förseningar och väder: om det är varmt eller kallt ökar förseningarna snabbt. Och trots att det varje år blir vinter och snö så leder det ändå till stora besvär. Jag ger en rad förslag för att minska förseningarna. Ett är markeringar på plattformen som visar var tågen ska stanna, var dörrarna kommer vara, och var resenärerna ska stå. Det är ett enkelt och billigt sätt att snabba på uppehållen, så att tågen kommer iväg i tid. En annan åtgärd är att ta bort växlar. Då finns det färre felkällor, och de som finns kvar kan skötas bättre. Ett tredje sätt är att anpassa järnvägen, så att den tål dagens vädervariationer och de klimatförändringar som är på väg. Något man gjort i andra länder är att skugga och ventilera elektronik och signaler längs med banan. Då blir de inte blir för varma, och tågen går oftare i tid. Med tidtabeller kan man också göra mycket för att fler tåg ska gå i tid, utan att det kostar mera. Mer av planeringen kan göras automatiskt. Då kan mer tid läggas på att ge tågen lagom långa uppehållstider. Trafikverket bör också göra mer för att följa upp och förbättra de regler och guider som finns för att planera. På så sätt kan vi få tidtabeller som blir bättre och bättre från år till år, med mindre och mindre förseningar som följd. Dessa förslag löser inte alla järnvägens problem, men de skulle leda till att många fler tåg kommer fram i tid. Då får vi plats för ännu fler tåg och resenärer på spåren.

11 Acknowledgements

One often hears that writing a PhD is a lonely endeavour – but I have never for a moment felt alone during my journey. Firstly, I want to thank my two amazing supervisors: Lena Hiselius and Nils Olsson. Thank you for believing in me, for your inspiring and cheerful pragmatism, and for letting me run loose with your encouragement and light touch. I am very grateful to have shared this journey with you both, and for having had the chance to learn from you. Along the way, I have heard of difficulties and torn relationships between some supervisors and students, but with you, I have always felt encouraged and supported. Thank you! If there is one thing I want to become and accomplish in my career, it is to be a good supervisor and mentor to others, just as you have been to me. An even greater debt is owed to my dear family. Thank you, Irene, for everything. Thank you for filling my life with both laughter and meaning. Thank you for your love and support. Thank you for sharing your life with me, and for the many adventures yet to come. You’re the best, and I love you. Thank you, mom and dad, for providing a wonderful childhood, a stable and supportive home, for all your trust and support, and for watching Onega during so many of my trips. Thanks to Marie- Louise and Cecilia for being such wonderful sisters. Thanks to Oma and Grampa for inspiring me to go deeper into the world of academia and for showing me that things like this can be done. Thanks to Kenneth Håkansson for taking me on as I wrote my master thesis. For believing in me and this research from the beginning and all the way through. Without you, your support, and your wide-reaching network, this thesis would not exist, and I would never have become a researcher at all. I am eternally grateful, and I look forward to continuing to work with you as we implement some of the recommendations. Thanks to Magnus Wahlborg for taking me under your wing and supporting our little group as we grow into this railway research business. Thank you for your vocal support, for introducing us to the community, for inviting us into projects, and for believing in us. Thank you to all the reference group members: Martin Aronsson, Bo-Lennart Nelldal, Britt-Marie Olsson, Hans Dahlberg, Magnus Andersson, Elisabet Spross, Bengt Holmberg, Per Konrad, Kristina Abelin, Olov Lindfeldt, and Jonas Westlund. Thank you for all the long and inspiring discussions, all your questions and input, your introductions into your respective organizations. I look forward to broadening

12 the research scope and digging into implementation together with all of you. Thank you, Anders F. Nilsson, for giving me all the data from Lupp. Thank you to the faculty opponent and to the examination committee for taking the time to read and examine my work. I look forward to having the opportunity of discussing it with you. Thank you to Karin and Anders for reviewing an earlier version of this manuscript, for a stimulating seminar, and for providing hundreds of comments, big and small. It has undoubtedly improved the thesis and made me much less anxious in releasing this version. Thank you both. Thank you to Emma Solinen, Anders Lindfeldt and Hans Sipilä for welcoming me as a fellow doctoral student of railways when I first began and did not know very much about the topic, or about doctoral studies. Thank you to Jennifer Warg and Ingrid Johansson for the collaboration in Plasa and the friendly discussions about both railways and studies. I am glad to have you as peers, and that we will continue working together. Thank you to Martin Joborn, Anders Petersson, Johanna Thörnquist-Krasemann and all the others who help to make KAJT function and who continue to help make Sweden world-leading in capacity-related railway research. Thank you to Sara Gestrelius and Abderrahman Ait Ali, and to Johan Högdahl and Erik Ronnle for our various discussions on research and on life as a doctoral candidate. Thank you to all my colleagues and hosts in Japan, for giving me some of the best times of my life. To Norio Tomii, for acting as a host, mentor and a guide. I can only dream of having anywhere close to the impact that you have had on research and practice, and I hope we can write some more papers together. Thanks to Yasufumi Ochiai for making me feel at home in the lab, for teaching me about timetabling in Japan, and for taking me out to eat so many kinds of delicious food! I really appreciate it and look forward to a continued friendship for many years. Thank you to Chikara Hirai for welcoming me at RTRI so many times, for the nomikais, dinners, and inspiring management style. If I become a manager one day, I want to be a lot like you. Thank you to Takao-san for kindly hosting me in your home and for sharing your delicious cooking with me. I hope we can cook tempura or sukiyaki again one day. Thank you to Wataru Ohnishi for showing me around, and for having one of the most delicious lunches ever, and for your deep and inspiring comments about the joys of working with and learning from your students. Thank you to Takafumi Koseki for welcoming me into your lab at the University of Tokyo, and for showing how someone so esteemed can be so modest and respectful to others. I do not think I have ever seen someone bow quite as deep as you, in all my time in Japan. I hope we can find a way to work together one day. Thanks also to all the railway companies that we visited, and to the staff, management and students at CIT. Thanks to STINT, JSPS and all the fellows of summer program 2018, for providing me the opportunity to visit Japan, and for making it so enjoyable. I will never forget it.

13 Thank you to Thomas Graffagnino for welcoming me to Switzerland and SBB. It was inspiring to see how you work, and to hear how you do so many of the things that we talk about doing here – big and small – and that they make a difference. Thank you to Francesco Corman for inviting me to your new lab at ETH Zürich. I had a great time with your students and colleagues and hope to maybe one day establish a similar setup in Sweden. You guys do great work. A big set of thanks to my great colleagues at Traffic and Roads, Technology and Society and K2. An extra shout out to our awesome administrative staff: Mia, Thomas, Carin, Helena, Astrid and Maria. Without you, the entire operation would collapse. You lay an amazing puzzle. I appreciate how you are always on hand and willing to help with just about anything, and how cooperative and constructive you are with managing the finances – a great contrast to how it seems to be at many other organizations. Thank you to Anders, John and Helena. For supporting and believing in me, and for providing such healthy role models as leaders and managers. Thank you to Hanna for helping me with popularising and disseminating my work. Thank you to Andras for your vast knowledge, unassuming manner, and sheer coolness. Thank you to Ebrahim for sharing your passion for teaching. You are one of the most inspiring teachers I have had, and I hope to one day be able to show some of that same passion and engagement in my own teaching. Thanks to Aliaksei Laureshyn, Till Koglin, Fredrik Pettersson and Fredrik Kopsch for leading the way as researchers and faculty members. It is inspiring to see you establish your own groups and careers, and I am grateful that you are so friendly and open to talking about things, rather than pulling up the ladder behind you. Thank you to Hampus and Maya for what you do in teaching, and for all the lunches and conversations we have had. Thank you to Erik and Lia for a successful cooperation on the Thredbo-paper. Thank you, Jean, for your friendship, for serving as a role model, and for the chance to work together going forward. I hope it lasts for many years yet. Thanks to Daria, for joining the railway team, and for throwing yourself into the field with such good cheer and enthusiasm! Thank you to Emma and Joel, for your easy-going manner, and for being such good mates about sharing the office, and for not complaining about my loud keyboard. Thank you to Carmelo, Chunli and Chiara for being so approachable, friendly, and open with me and the other doctoral students. Thank you to Magnus Andersson and Christina Lindkvist at MAU for your confidence in me and for our discussions. Thank you to Sadegh. One day we will make something great of that satellite data we keep talking about. Thank you to Malin, Jens, Alfred and Ulrik for all our conversation both in the lunchroom and at K2. You really help make it feel like a community. Thank you to Carl for our discussions about programming and for your help in the beginning. Thank you to Pajtim for showing the courage to be so outspoken, and for your big role in keeping the pavement-side of things on track at the department. Thank you

14 to Khashayar for always being perfect – jättebra! Thanks to Azreena for giving such an inspiring defence just before my own, and for giving me hope. Thank you, Janet, for your kind questions and to Oksana, for always lighting up the room with a smile. Thank you to Ellen and Valentin, for being willing to cross the corridor and bridge the gap between our divisions. Thanks again to all of you for creating such a nice community. Thank you to my friends, both old and new, for providing emotional support and a welcome change of scenery. Thank you to Daniel, Hampus and Jocke, for sticking with me for all these years – despite long moves and absences. Thank you to Björn, Rasmus and Eigo, for all the nice meals, games, and conversations during these past years. I treasure you all and am so grateful for your friendship. And last of all, a very big thank you to the funders of my research. Thank you to Trafikverket for providing the bulk of the funding, for encouraging me and the research, for giving me so many connections, for opening up your doors, for nurturing our railway research here in Lund, and for continuing to fund and support our research going forwards. Without you, none of this would have happened. Thank you also to KAJT for inviting us into the capacity research community, for organizing so many great conferences and events, and for providing a great platform for research and dissemination. Thank you to K2, The Swedish Knowledge Centre for Public Transportation, for funding my small projects and for your continued faith, funding, and support. Thank you for providing such a pleasant place to work, and so many great events, connections, discussions, and much more. Thank you once again to JSPS, the Japanese Society for the Promotion of Science, for providing the Summer Program that enabled me to visit Japan for the summer of 2018. Thank you to STINT for choosing me as one of the Swedish candidates for the programme. It has opened so many opportunities, connections, and research ideas for me. It was also one of the best times of my life. I will never forget it, and I look forward to coming back many more times. Thank you to Chiba Institute of Technology for hosting me during that first trip, and again for inviting me to return for an academic fellowship so soon. I really appreciate it, and some of the fruits of the fellowship are found in this thesis. So, as I began to say, four pages ago, I have had plenty of good company along this journey. I am eternally grateful and could not have done it without you. Thank you, and to those many more of you whom I could not mention here. I appreciate your support, confidence, and friendship, and so much more. Thank you all.

15

Introduction

This thesis analyses delays and timetabling for passenger railways. Railways are an important part of the transport system. After decades of decline, rail transportation in Sweden has been growing steadily by about 3% annually since the early 1990s (Trafikanalys 2018a), for both passenger and vehicle kilometres. The rise has been predominantly in passenger trains – mostly for local and regional journeys (Trafikanalys 2018a). In 2015 passenger trains made up 83% of all trains and contributed 10% of all motorised passenger kilometres, while freight trains were responsible for 19% of tonne kilometres (Trafikanalys 2018b). Investments in new infrastructure have not grown to the same extent (Trafikanalys 2018a), and a quarter of the metropolitan lines are now very heavily utilised, at levels associated with a high likelihood of delays, low average speeds and little time for infrastructure maintenance (Trafikverket 2017a). One of the key factors for the attractiveness and efficiency of the railway sector is on-time performance, of which punctuality is a commonly used indicator. In , only 56% of commuters report being satisfied with the punctuality of commuter trains, which is the lowest across all public transport options and the single most important influencing factor for their overall satisfaction with the transportation mode ( läns landsting 2017). The level of punctuality on Swedish railways has been close to 90% for the last several years, with punctuality defined as a maximum delay of five minutes at the final stop (Trafikanalys 2016). This is considered too low by the industry, which has set a goal of 95% by 2020 (Gummesson 2018). While this requires large and rapid investment, the benefits in increased attractiveness and ridership of trains would be considerable. Even small delays of only a few minutes cause considerable inconvenience to both passengers and operators. In our data of more than 200 million observations across some 7.5 million journeys on the Swedish rail network over the years 2011- 2017, delay events of three minutes or less make up 78% of delay time, while those of five minutes or less make up 85% of the delay hours. Small delays thus accumulate and are a very big part of the problem of unreliable railways. Especially so at stations, where 80% of the delay time is made up of delays up to three minutes in size. For passengers, even small delays can lead to missed connections, and to significantly larger total delays, and the punctuality for passengers is often significantly worse than for trains (Parbo, Nielsen & Prato 2016).

17 High quality timetable planning is one way to deal with small train delays. In most countries, timetables are revised at least once per year, and previous studies (e.g. Goverde 2005; Vromans 2005; Yuan & Hansen 2008) indicate that timetable properties can have a large impact on delays and punctuality. By scheduling the times that trains stop at stations (dwell times) and by adding margins (run time supplements) between stations, timetable planners can affect the risk of delays arising, and enable trains to make up for any delays that do occur, sometimes at the cost of longer travel times. By separating the trains from one another in time (increasing headway or buffer times) they can also reduce the risk of delays spreading from one train to another, this has a smaller effect on travel times, but consumes more capacity and reduces the number of trains that it is possible to run.

Structure of thesis

Following on from this short introductory chapter, the rest of the thesis is structured as follows. The second chapter further describes the Background, placing the thesis in a context and with overviews of earlier research on both train delays and timetable planning, and some theory on delays and punctuality. The chapter ends with two identified research gaps. Following on from the identified gaps, the third chapter describes the Aim of the thesis, its research questions, underlying hypotheses and delimitations. Next, the fourth chapter describes the Method and Data used in the thesis. Chapter Five describes the Research Process and Results, how the papers build on each other, an overview of the papers, and tying the findings back to the research questions. Chapter Six is a Concluding Discussion with recommendations and some reflections. Following a list of references is a list of the appended papers and my role in those publications, followed by the papers themselves.

18 Background

Timetable planning

Extensive research has been done on timetable planning in the past. In his overview of timetable research, Hansen (2009) identified some key differences between the state of the art and the state of practice: using precise and realistic estimates of dwell, run and blocking times, and reflecting the variations in behaviour of train crews. In this section a brief overview of the research on timetable planning is made. First, we consider some work with broad perspectives: (1) different ways to consider and assess timetable quality, (2) the strategic trade-off between precision and slack, and (3) some broader perspectives on timetable planning. Then we will discuss more narrow research dealing with the allocation of (4) run time margins, (5) dwell times, and (6) headways. This is followed with some more research and theory on delays and punctuality.

Quality of timetables There are many different dimensions to the concept of timetable quality (see, for instance, Goverde & Hansen 2013 and Gestrelius, Peterson & Aronsson 2019). For passengers it is desirable to have short travel times, high reliability, high frequency in both peak and off-peak, easy and synchronised connections, and a timetable that is easy to learn (e.g. Parbo, Nielsen, & Prato 2016; Kottenhoff & Byström 2010; van Hagen & van Oort 2018). For drivers and other personnel, it is desirable to have good working hours and reliability, and to be able to start and finish the day in the same place; for operators, it is good to match supply to demand and to have a high resource utilisation (Ceder 2001). For the infrastructure manager it is also important to have time left in the timetable for maintenance (see, for instance, Lidén 2018). On a higher level, the infrastructure manager must also balance different aspects of capacity. A popular illustration of this is found in Figure 1, copied from the International Union of Railways (2004), with four dimensions: the number of trains, stability, heterogeneity and average speed.

19 Figure 1 In timetable planning, this figure is a classic demonstration of the balance that must be struck between different dimensions of capacity: the number of trains, the stability, heterogeneity, and average speed. The figure illustrates schematically how mixed (or conventional) rail services compare to metro services, with higher speeds and more heterogeneity at the cost of fewer trains and less stability. Stability refers to reliability, robustness and related concepts, roughly understood as the absence of delays. Heterogneity refers to the variation in types of trains operated in the system. Figure from International Union of Railways (2004)

The role of planners To an extent, timetable planners can and must balance these dimensions, but on the other hand, they are mostly bound by the requests made by the train operating companies. Each of these companies makes their own plans, making the trade-offs they prefer, and the overall timetable becomes a mix of all these different companies. The infrastructure manager can mainly influence this outcome by setting conditions in the Network Statement, and by providing rules, regulations and guidelines for how timetables are to be planned. Individual planners do not have much influence over the overall balance – their job is to follow the guidelines and to do their best to accommodate the requests of the operating companies. On the margin, however, they can and do sometimes use their discretion to balance the different parameters.

20 Robustness of timetables The most relevant dimensions for this thesis are reliability and robustness – avoiding and recovering from delays – or stability, as it is referred to in Figure 1. Even these have multiple aspects, however: (1) to avoid systematic delays by having appropriate and realistic run and dwell times (Hansen 2009), (2) to make up any delays that do occur by ensuring that there are margins both at and between stations (Cerreto et al. 2016), (3) to avoid delays spreading between trains by ensuring that there are sufficient headway and buffer times (i.e. Carey 1999; Khoshniyat & Peterson 2015), (4) to maintain options for dispatchers to reschedule and reroute trains, so that they can more easily reduce the impact and spread of delays (Gestrelius et al. 2012).

Trains and passengers Passengers often place a higher value on travel time reliability than on the travel time itself. Both delays and punctuality are also often significantly worse when considered from the perspective of the passengers rather than the trains, partly due to variations in passenger demands and loads, and partly due to missed connections. In their review of the literature on railway timetable planning with a passenger perspective, Parbo, Nielsen and Prato (2016) stated that the difference in punctuality can be as high as 10%. Rietveld (2007) wrote more on how and why supply-oriented indicators systematically give a more favourable picture than demand-oriented ones that take the passengers into account.

Passenger perspectives Some research that explicitly takes the passengers into account has been carried out by Dollevoet (2013), in a doctoral thesis on delay management and dispatching, which examines passengers and their rerouting choices. Another doctoral thesis was presented by van der Hurk (2015), on how information on and to passengers can be used to improve rail services when there are delays. Cheng and Tsai (2014) also focused on passengers, studying which factors affect the perceived waiting time for passengers during train delays – factors that can make this time more or less tolerable. A final example discussed here is by Batley, Dargay, and Wardman (2011) using econometric models on combined demand and delay data in the UK. They find that while delays are indeed given a high value relative to travel time – and annoy passengers – they have little effect on the travel demand, in either the short or long term. While the passenger perspective is thus important, this thesis is written from the perspective of the infrastructure manager. A precondition for any further optimisation efforts is that the timetable should routinely be executed as scheduled, or at least adhere closely to the plan. Delays ruin this by causing disruptions to passengers, train operating companies and maintenance works, and much more.

21 Timetable precision Olsson et al. (2015) identified two high-level strategies in railway operations and timetable planning, that aim to achieve robustness and reliability: precision and slack. A strategy based on precision contains, as Hansen (2009) indicated, both realistic and precise estimates of run times, dwell times and headway times. In this approach, considerable effort is put into identifying and allocating these times, while little would be added in ways of margins or buffers. This requires, in turn, that considerable attention is paid during maintenance and operations, so that neither the infrastructure nor the rolling stock malfunction, and in ensuring that dispatchers, train drivers and on-board staff, as well as passengers, all behave in a disciplined and timely manner. The timetable and operations are streamlined to be efficient during normal operations, but they have less flexibility during major disruptions. The Japanese railways are prime examples of this strategy (e.g. Yabuki et al. 2018).

Timetable slack The other high-level strategy Olsson et al. (2015) identified is slack. This approach is based on the notion that things inevitably go wrong, and that it is best to be prepared for when they do. In timetables this is achieved by ensuring that trains have large margins – with which to make up delays – and that there are sizeable buffer times scheduled between trains to limit the spread of delays between them. In operations it is achieved by ensuring that there is a reserve capacity of both rolling stock and train crews, and in infrastructure by creating many possibilities for rerouting, both at stations and across the network. The drawback of this approach is that it is costly to maintain the reserve capacity, and that travel times are extended – which is costly for both passengers and operators. Carey (1998) made a related point: as more time is allowed for an activity, there is a behavioural response to make the activity take a longer time. By extending the time to accommodate for the variation, the variation is also increased. This effect can reduce, and potentially eliminate, the increase in reliability from added slack.

Heterogeneity in timetables Another important dimension in railway timetables and operations is heterogeneity (International Union of Railways 2004), see Figure 1. This expresses the variation in services run, often regarding the speeds or headways. A metro line where all trains run in the same way and stop at the same stations, and with equal headways, is a very homogeneous system. The Swedish conventional railway network where high speed, regional and local passenger trains all mix together with freight trains, and there are extra services during rush hours, is a very heterogeneous system.

22 Implications of heterogeneity Different indicators and implications of this heterogeneity can be found in at least two other recent doctoral theses. Vromans (2005) dedicated a chapter to the issue and introduced a family of indicators centred on the Sum of Shortest Headway Reciprocals (SSHR), which has since become popular in the literature. He also listed five different ways to reduce the heterogeneity, focusing on the option to equalise the number of stops among train services. Lindfeldt (2015) instead evaluated performance using empirical data, much like we do in this thesis, using the ratio between the 95th and 10th percentiles of speeds used across the network as an indicator of heterogeneity. Both authors found, broadly speaking, that delays and heterogeneity are correlated, as delays spread more easily between trains in a heterogeneous system.

Planning and learning While they did not write about timetable planning, Argyris and Schön (1996) presented an interesting concept. They considered learning as a process of understanding and eliminating the gap between the expected result and the actual result of an action. The gap can be eliminated by taking corrective measures within the existing values and norms – which they call single-loop learning. It can also be closed by changing the existing values and norms – double-loop learning. This perspective can be applied to timetable planning: single-loop learning means finding ways to apply the current rules, guidelines and policies, while double-loop learning implies choosing better rules, guidelines and policies. In a sense, this thesis is an attempt to achieve and support double-loop learning.

Planning in a wider context Timetable planning fits into a broader context in several ways. Writing about public transportation more generally, Ceder (2001) described the planning process in four steps: network route design, timetable planning, vehicle planning, and crew planning. In his thesis, Watson (2008) instead considered the contrasting needs and preferences of timetable planners and their managers in an organisational context. He found that the privatisation of British Rail had a negative effect on the timetabling process, due to a poor planning and rushed implementation of the new structure. Avelino, te Brömmelstroet & Hulster (2006) compared the politics of timetable planning in the Dutch and Swiss contexts and concluded that “timetable planning is not merely an operational process to be left to engineers or economists” (pp. 20).

23 Capacity allocation Timetable planning is one part of a larger process to allocate capacity on the railways. The process in Sweden is described in the Network Statement (Trafikverket 2015a). From this we have adapted Figure 2, which illustrates the process – with seven main steps. The train operating companies (1) begin by submitting requests for the timetable slots they want for the next year. The timetable planners at the Transport Administration (2) combine these requests into a draft timetable that contains all the trains for the whole year. If there are any conflicts between the requests, there is first (3) a process where the parties are encouraged to coordinate amongst themselves. If they are unable to do so, the Transport Administration (4) tries to settle the dispute in dialogue with the parties. If these attempts are also unsuccessful, the relevant parts of the network are (5) declared to be saturated, and (6) a set of prioritisation criteria is used to determine which requests have priority. The timetable is then (7) finalised and published. Not pictured is the option for dissatisfied train operating companies to appeal the decision to an administrative court. The court cannot change the timetable but may impose sanctions which lead to a change in practice for future years. This process is largely the same in most European countries, following EU regulations (e.g. DB Netz 2017, ProRail 2016, Ali & Eliasson 2019).

Prioritisation The main differences across European countries occur when the voluntary coordination process fails, step (6) above, which happens several times every year. When requesting capacity in Sweden, each train must be classified into one of about

Figure 2 Capacity allocation process in Sweden. Adapted from Trafikverket (2015)

24 30 categories, based on the expected number and types of passengers (local or regional, business or leisure, etc.) or freight. These categories, as well as associations between trains, are given different social cost estimates, based on the methodology in Trafikverket (2018a). The solution which minimises the social welfare costs is then accepted for the final timetable. This approach has some issues, both in accurately estimating the social welfare costs, and in combining publicly subsidised commuter services with commercial long distance services (see Ali, Warg & Eliasson 2019). Other countries have chosen alternative approaches. The United Kingdom has a more qualitative process with an overarching objective and a list of twelve evaluation criteria ( 2018). In Germany (DB Netz 2017), priority is first given to regular-interval or integrated network services, cross-border trains, and train paths for freight trains. In remaining conflicts, priority is given to the trains that would pay more in track charges. In the (ProRail 2016), a cyclic hourly pattern is prioritised, and further conflicts are settled by increasing the track charges to the point that only one actor remains interested – a form of auction.

Ad hoc process The process of timetable planning itself is thus mainly performed in stage (2), with final touches in (7). Once the annual timetable has been published, train operating companies can and do make requests for changes in the so-called ad hoc process. These are mostly for freight trains, where the demand for transportation often fluctuates to a greater extent (Trafikanalys 2018a). For passenger trains, many of the adjustments are in response to maintenance works on the railway that only apply for part of the year and are often announced at later stages. During a year there can be over 80,000 such changes, which are performed on a first come first served basis (Trafikverket 2019).

Dispatching After planning, dispatching takes over as the very important activity of managing the actual operations of trains and maintenance works on a day-to-day basis, at 3.00pm every day. Dispatchers control the signals and switches, coordinate with train drivers and working crews via radio, and ensure that the operations are safe and on time. The topic of dispatching is very large, with a rich research literature of its own. See Lamorgese et al. (2018) for a recent overview. Some further, recent examples are Andreasson, Jansson and Lindblom (2018) on the complex collaboration between drivers and dispatchers, Sandblad, Andersson and Tschirner (2015) on the use of support systems, and Gholami and Törnquist Krasemann (2018) for a heuristic mathematical approach to solving rescheduling problems in real time. Finally, consider Ghaemi et al. (2018), studying the impacts of rescheduling on passenger delays. This literature is, however, largely beyond the scope of this thesis. As is that on the operation of the trains themselves.

25 Robust timetable planning

This section describes three commonly used ways to ensure that a timetable is robust enough to cope with delays: run time supplements (often referred to simply as margins), dwell times, and headway or buffer times. These three concepts are illustrated, together with run times, in a graphical timetable in Figure 3. The Swedish norms are summarised in Table 1.

Figure 3 An illustration of run times, dwell times, headways, buffers and margins in a graphical timetable Each of the sub- sections reviews some relevant literature and presents the current norms and regulations used in Swedish timetable planning. These guidelines are summarised in Table 1.

Table 1 Timetable planning standards for passenger trains in Sweden

Robustness indicator Norm in the Swedish regulations Run time supplement + 3% across the board, included in run time calculation (Banverket 2000) Node supplement 3 - 4 minutes run time supplements to be added per pair of nodes passed, 2 minutes if partial. 2 - 4 nodes exist per railway line (Trafikverket 2015b) Dwell time at stations 2 minutes standard, 1 minute if the number of passengers is small (Banverket 2000) Headway At least 2 - 7 minutes, most commonly 3 - 5 minutes (Trafikverket 2017b)

26 Run time supplements

Size of margins One of the most common and important ways to make a timetable robust as regards delays is to include margins in the form of run time supplements. These supplements are added to the calculated minimum run times (see Pachl 2002) to create flexibility which can be used either by the driver or dispatcher to reduce delays and maintain connections between trains. To an extent, these supplements are mandated by the International Union of Railways (2000): three to seven percent added to the minimum running time, plus an additional one and one and a half minutes for every 100 km. These levels are common in Europe, while six to eight percent are common in North America (Pachl 2002). In Sweden we use a base level of three percent while both the Dutch (Goverde 2005) and Swiss (Vromans 2005) automatically add margins of seven percent.

Node supplements In Sweden (Trafikverket 2015b) and Switzerland (Vromans 2005), planners also add supplements at or between a set of strategically important locations – so-called nodes. This is one way to distribute the 90 seconds per 100 km suggested by the UIC. In Sweden, every railway line has about two to four of these nodes, and the timetable planner must allocate three or four minutes between each pair of adjacent nodes, depending on the type of train (Trafikverket 2015b). If the train only travels part of the distance between two nodes, the supplement should still be at least two minutes. In the United Kingdom, run times are based on previous performance rather than on calculations, and the supplements are thus harder to define (Rudolph 2003).

Discretion in planning In addition to these two methods, timetable planners in Sweden use their discretion when assigning time supplements. One common practice is to add seconds so that the arrival times at stations where the train stops occur at whole minutes. For instance, if a train is meant to arrive at 12:44:27, the planner might add 33 seconds, so that the arrival instead occurs at 12:45:00. Over long journeys, these can add up. Supplements are also sometimes given for trains that are scheduled to stop at a platform which is not on the main track, because this takes slightly longer to get to, and because engineering works are being done on the track, requiring lower speeds for part of the journey (Banverket 2000). These are meant to correct cases where the run time calculation is known to be wrong, and they are not considered as margins in this thesis.

27 Allocation of margins The efficiency of margins can be improved by distributing them in a good way. This has been studied by a number of authors, using different methodological approaches. For instance, Cerreto et al. (2016) presented an empirical study on the quality of run time supplement allocation in timetables, with regard to how trains make up or increase delays during journeys. To help quantify the distribution of margins along a journey, Vromans (2005) introduced the Weighted Average Distance (WAD). He also attempted to find the optimal distribution, using optimisation methods on both hypothetical and real cases, drawing the conclusion that a slight shift towards the beginning was best. Using similar methods, Vekas, van der Vlerk, and Haneveld (2012) also found that it was best for delay recovery not to use a uniform distribution, given some assumptions of the delay distributions. Solinen, Nicholson, and Peterson (2017) used a combination of empirical, simulation and optimisation methods to introduce and consider a more detailed indicator, based on critical points – points in a timetable when one train enters after a preceding train, or when one train overtakes another. These points are very sensitive to delays, both because existing delays tend to increase in these cases, and because any delays here easily spread to other trains. By ensuring that there are enough margins at these points, the allocation of time supplements can be very effective.

Dwell times One of the key issues brought up by Hansen (2009) was to use realistic and precise estimates of dwell times, and that this is often not done – either in practice or research. Peterson (2012) found evidence of this for train services in Sweden – that the dwell times were usually underestimated, without being sufficiently compensated by margins on the line. One good example of how both precise and realistic dwell times can be estimated was provided by Buchmueller, Weidmann and Nash (2008) who modelled actual dwell times in Switzerland, breaking them down into five sub-processes: (1) unlocking doors, (2) opening doors, (3) boarding and alighting, (4) closing doors, and (5) train dispatching. They then used over three million observations for calibration, to ensure that the estimates were realistic. D’Acierno et al. (2017) focused their modelling on (3) boarding and alighting, and on how this depended on the congestion and flows of passengers.

Some earlier studies Along similar lines, there have been many publications that deal with simulations of passengers moving between the train and the platform to study various kinds of passenger management strategies (e.g. Heinz (2003) in a Swedish context, and Baee et al. (2012), Kamizuru, Noguchi & Tomii (2015), Seriani & Fernandez (2015),

28 Zhang, Han & Li (2008) for some international examples). Focusing more on realism than on the precise breakdown of sub-processes, Pedersen, Nygreen and Lindfeldt (2018) studied how actual dwell times vary over time and running direction in . Li, Daamen and Goverde (2015) did something similar with track occupancy data from short intermediate stops in the Netherlands and separating between peak and off-peak hours. Other examples are provided from Italy by Longo and Medeossi (2012), who focused on simulation, and again from the Netherlands by Kecman and Goverde (2015), who were interested in real time prediction of both dwell and run times.

Swedish guidelines The Swedish guidelines (Banverket 2000) state that dwell times for passenger trains should, in general, be scheduled to be two minutes long. Sometimes longer scheduled times are required, and other times, if the number of passengers is small and the train and station are prepared for a speedier boarding process, one minute can be used instead. If the number of passengers is very small, the guidelines state that it is possible to schedule a stop without dwell time, merely slowing the train down to a stop and then starting again immediately, but if this is done the run time on the next line section should be extended, and if passenger numbers increase, the timetable should be redrawn and longer dwell times set. In general, however, it is often difficult to know precisely how much time is required for stops at stations – and as a consequence, margins are not discussed or defined as explicitly for dwell times as they are for run times.

Headway times The last aspect of detailed timetable planning that we will consider in this overview is headways, although they have not been studied explicitly in the thesis. Headways are the times that separate trains using the same infrastructure, and they are mostly relevant on double track lines. Sometimes there is a distinction between headway times, as the minimum time that is technically possible, and buffer times, which make up any additional buffers separating the trains from one another, see Figure 3. This is not done consistently, however, and in practice it is often difficult to separate the two.

Knock-on delays In a paper on ways to measure timetable reliability, Carey (1999) paid special attention to so-called knock-on delays. If trains are very close to one another in time, delays to one will quickly spread to any following, connecting or meeting trains, and these delays are sometimes called knock-on (secondary) delays. By increasing the separation between trains, a buffer is created so that delays do not spread as

29 easily, and Carey (1999) proposed that a number of indicators describing the distribution of headways in a timetable could be used to estimate the reliability of a given timetable. Yuan and Hansen (2008) found that as the scheduled buffer time between trains decreased, the knock-on delays increased exponentially.

Improving robustness Also arguing for an increase in scheduled headway or buffer times, Nelldal, Lindfeldt and Lindfeldt (2009) performed simulation experiments on existing high speed (200 km/h) trains in Sweden. They found that the punctuality for these trains would improve by 5-10 percent if the minimum headways were increased to at least five minutes. Similarly, Dewilde et al. (2013) introduced a method to increase timetable robustness in complex stations by maximising the minimum headway time between a given set of trains – one of the heuristics proposed by Carey (1999). They found that this approach improved the robustness in the station zone of Brussels by eight percent and reduced knock-on delays in the area by half. In Sweden, minimum headways vary from two to seven minutes, depending on the location, but are most usually between three and five minutes (Trafikverket 2017b).

30 Train delays and punctuality

This section describes how delays can be defined, counted and deconstructed before turning to some earlier research on delays and punctuality.

Delay definitions Delays can be considered and measured in a number of ways. The most common is arrival delay, which is defined as the difference between a train’s realised and scheduled arrival time at a certain station. Closely related is the departure delay, the difference between the realised and scheduled departure times. Both arrival and departure delays can be either positive or negative, indicating early arrivals or departures, respectively. Let 𝑡and 𝑡denote the scheduled and realised times, and 𝑑 the delay:

𝑑𝑡 𝑡 (1)

Another way to consider delays is to recognise that the delay upon arrival and departure are the result of delayed processes: either the run time between stations, or the dwell time at stations (or in some cases the depot). The run time is the time it takes to run between two adjacent stations, A and B, and can be calculated as the difference between the arrival time (arr) at station B and the departure time (dep) at station A. This can be calculated both for the realised and scheduled times.

𝑡, 𝑡, 𝑡, (2)

The run time delay is then the difference between the realised and scheduled run times, which is equivalent to the difference between the arrival delay at B and the departure delay at A.

𝑑, 𝑡, 𝑡, 𝑑, 𝑑, 𝑡, 𝑡, 𝑡, 𝑡, (3)

The dwell time is the time that a train spends at a station other than the first or final station. It is the difference between the arrival and departure times.

𝑡 𝑡 𝑡 (4)

As with run times, the dwell time can be calculated both for realised and scheduled times, and the dwell time delay is the difference between these two times.

31 Equivalently, it can be specified as the difference between the arrival and departure delays at the given station.

𝑑 𝑡 𝑡 𝑑 𝑑 𝑡 𝑡 𝑡 𝑡 (5)

These times, delays and equations are illustrated and exemplified in Figure 4.

Figure 4 An illustration of arrival, departure, run and dwell times, and the associated delays. The diagonal lines represent the train paths, with the dotted line denoting the the scheduled path, and the bold line the realised path. Scheduled and realised arrival and departure times are denoted by the vertical lines, while run times, dwell times and delays are illustrated by the horisontal arrows. The scheduled run and dwell times have been pictured twice, to help illustrate the size of the run and dwell time delays.

Early arrivals A possible objection to the definition above is that that passenger trains are often not allowed to depart early from scheduled stops. Thus, a train that arrives ahead of schedule must wait for longer, so that the following dwell time is necessarily extended, without the train necessarily departing with a delay. In some cases, this might be considered as less problematic, and not considered as a delay, but rather

32 as some different category of time. The same could be said for run times that are extended for trains that run ahead of schedule. From the producer and timetable planning perspectives taken in this thesis, however, even these deviations from the timetable are considered problematic – because the trains are consuming capacity (occupying tracks and platforms) where and when they should not be.

Other perspectives For transportation that does not follow timetables, delays are instead often discussed in the frame of travel time variability (i.e. Noland & Polak 2002; Rietveld, Bruinsma & Van Vuuren 2001). This is commonly the case for road-based transportation. The baseline can then be the time required in free-flowing traffic during good conditions, while factors like congestion and poor weather lead to longer travel times, which can be considered as a form of delay. When there is no timetable with which to compare, this is often the only alternative. Freight trains in the USA are traditionally run without timetables (Gorman 2009), and American conceptions of train delays are thus often quite distinct from those in Eurasian contexts. The realised run time is instead compared against the baseline of a free-flowing run time, without any stops or delays.

Earlier research on delays Punctuality is one of the most important factors for a railway system (Gummesson 2018) and its passengers (i.e. Stockholms läns landsting, 2017) and it directly affects the competitiveness against other transport modes (Nyström 2008). This section briefly highlights some earlier research on train delays and punctuality, from a few different perspectives relevant to this thesis.

Data and delays With increasing access to large volumes of data, researchers can study delays and their causes on a relatively detailed level. One overview of the many IT systems, sub-systems and databases in the Swedish railway industry was provided by Thaduri, Galar and Kumar (2015). They called for more researchers and practitioners to apply big data analytics to make use and sense of these large troves of data. Several different approaches have been applied successfully. In the USA, Gorman (2009) applied econometric models on freight train data with the aim of predicting delays due to congestion. Wallander and Mäkitalo (2012) used a data- mining approach to analyse train delays more generally. Another approach made by Marković et al. (2015), was to analyse the relation between train delays and various characteristics of the railway system using machine learning models.

33 25 000

20 000

15 000

10 000 Delay hours Delay 5 000

0

Figure 5 Delay hours by a number of causes. Adapted from (Trafikverket 2018c)

Delay cause coding In practice, delays have often been studied using manually reported delay causes, see for instance Veiseth, Olsson and Saetermo (2007) and Trafikverket (2018b). Figure 5 illustrates data from the latter on the eight biggest causes of delays that are monitored and targeted by the main railway actors in Sweden, as they are currently categorised in the industry. However, manual attribution of delays is prone to errors and often quite inconsistent (Nyström 2008), with an estimated reliability around 80% (Nilsson et al. 2015). Importantly, only delays that are three minutes long, or more, are coded – smaller delays are not included in these statistics or datasets.

Weather and delays Changing perspectives, the influence of weather and climate change on train delays and punctuality has received increasing attention in the literature. Some ten years ago, Koetse and Rietveld (2009) found that “studies that investigate the effects of weather or climate change on rail transport and infrastructure are scarce” (pp. 212), but they cite two studies suggesting that weather was behind about 10% of failures and accidents. Since then, however, more studies have been published. One good example is by Xia et al. (2013), who estimated how wind, temperature and rain cause delays in the Dutch railways, mainly by damaging the infrastructure. Similar work has been done by Nagy and Csiszár (2015), who highlighted the effects of weather conditions on the punctuality of Hungarian passenger trains, and by Xu, Corman and Peng (2016) who analysed the disruptions in the Chinese high-speed railway and found that almost 90% of these were due to bad weather.

34 Some authors focus on more specific conditions, such as Ferranti et al. (2016), who studied how heat causes failures in the railroad infrastructure in England, particularly in the signalling systems. In a similar paper, Ferranti et al. (2018) found that a short heatwave in the UK led to more than a doubling of delays, due in a large part to speed restrictions, and they caution that such events will become more common. Another example of the opposite scenario is by Zakeri and Olsson (2017), who investigated the impact of weather on the punctuality of local trains in the Oslo area. They found strong correlations between punctuality and temperatures below minus 7℃ and snowfall of at least 15 cm. Also relevant for winter conditions Palin et al. (2016) described how seasonal forecasts of the North Atlantic Oscillation, a pressure differential related to either cold and calm or mild and stormy winters, can help predict the scope of disruptions to both railways and other transport modes in the UK, months in advance. Continuing on the use of weather forecasts, in his doctoral thesis Wang (2018), proposed methods to incorporate these into dispatching, estimating with the help of simulations that this could reduce train delays in the UK by about 20%. Liu et al. (2018) instead wrote on the susceptibility to heavy rain and related hazards of Chinese railways. Frauenfelder et al. (2017) evaluated the vulnerability of Norwegian roads and railways to extreme weather events, also focusing on heavy rainfall and debris flows such as rock falls and avalanches, both now and in the future. Similarly, Sa’adin, Kaewunruen, and Jaroszweski (2016) evaluated the weather and climate risks facing a planned high-speed rail connection between Malaysia and Singapore, concluding that heavy precipitation – along with associated debris flows – is the biggest risk factor.

Infrastructure and delays Another frequent and related cause of delays is infrastructure failure. Veiseth et al. (2007) linked infrastructure data with delay and punctuality data to study the infrastructure’s influence on rail punctuality. They reported that some 30% of delay hours in Norway were caused by infrastructure failures and suggested that the quality of punctuality data could be improved by connecting it with infrastructure and operational databases. Stenström et al. (2015) developed a composite indicator for benchmarking and monitoring of rail infrastructure, considering four factors: failure frequency, train delays, logistic time and repair time. Also in a Swedish context, Wiklund (2006) studied the relationship between preventative maintenance of the infrastructure and severe disruptions and delays for trains – identifying that the overhead line is the most vulnerable and critical component, but also that this is difficult to make more robust using only increased maintenance. Mattsson and Jenelius (2015) provided an overview and a discussion of the research on vulnerability and resiliency in transportation networks more generally. Ferranti et al. (2016) studied heat-related infrastructure failures in Southeast England, while Hawchar et al. (2018) discussed the vulnerability of

35 critical infrastructure with regard to climate change, and Dobney et al. (2009) discussed the delays caused by rail buckling, to name but a few.

Trains and delays Traffic density can also be correlated with delays, as trains interfere with one another and can cause delays to spread between trains. For instance, Olsson and Haugland (2004) found that the management of train crossings is a key success factor on single track lines. The previously mentioned study by Gorman (2009) also showed that the number of crossings, passes and overtakes consistently had high impacts on delays for freight trains in the USA, although delays are measured differently on the American freight railways. The International Union of Railways (2004) has developed a method to calculate the capacity utilisation across railway networks. This method has been adapted by the Swedish Transport Administration for use on its network (Trafikverket 2017a), and it shows that many lines are very highly utilised, with severe congestion as a consequence.

Station stops and delays The last perspective we will cover is delays occurring at stations. For instance, Wiggenraad (2001) studied seven Dutch train stations in detail and found that the realised dwell times were longer than scheduled, that the scheduled dwell times were the same at both peak and off-peak times, and that passengers concentrated around platform access points. Yuan and Hansen (2002) also studied delays at Dutch stations and found that the mean excess dwell time was around 30 seconds. Sometimes this was due to the train having arrived early and not being permitted to depart before the schedule, but at other times it was due to a lack of discipline among train drivers and conductors. Yet another study around Dutch stations, by Nie and Hansen (2005), found that trains there operated at lower than designed speeds, and that realised dwell times at platforms were systematically extended beyond what was scheduled, both because of other trains blocking their routes, and because of the behaviour of train personnel. Similar findings have been made elsewhere in the world. For instance, Harris, Mjøsund and Haugland (2013) studied delays at stations in the Oslo area and claimed that these delays were often small, poorly recorded, and not well understood. On the opposite side of the world, in New Zealand, Ceder and Hassold (2015) also found that one of the main delay mechanisms was increased actual dwell times, caused by heavy passenger loads that were not sufficiently compensated by increasing the scheduled dwell times.

36 Research gaps

In the background, we have gone through some of the work that has been done on train delays and timetable planning. This overview reveals two gaps, which this thesis aims to fill. The first is that there are few empirical studies describing multiple causes of small delays, especially in Sweden. While there has recently been an increase in the amount of railway operations research in Sweden, much of it is focused in one way or another on simulation or optimisation. Empirical elements have been quite rare and rather limited in scope. In the international literature, the volume of empirical research is larger and increasing, however, it is typically either focused on a narrow range of factors correlated to delays, such as high temperatures, storms, or congestion. Studies that do consider multiple types of factors are usually based on manually reported causes of delay, which is problematic because of human error, and the omission of small delays. There is thus a need for more broad, empirical studies on delays that do not rely on manually reported causes. The second gap is that there are relatively few studies that evaluate the effects of timetable planning in practice. There is certainly a sizeable literature on timetable planning, with a large segment of it building on an operations research framework with mathematical modelling and optimisation, and another big part that is based on computer simulations. Fewer authors have been concerned with evaluating the effects of the timetable planning that is done in practice. Some qualitative work has been done about the situation of planners, but quantitative and empirical work identifying the different strategies, policies and decisions made by planners is rare.

37

Aim

The overarching aim of the thesis is to: increase the understanding of delays that occur for passenger trains in Sweden, in order to reduce delays, primarily through improved timetable planning. This can be conceptualised as an instance of double- loop learning in timetable planning, as illustrated in Figure 9 on page 83. To break the aim down into more manageable pieces, we have created five research questions which mirror the research gaps identified in the previous section.

Research questions

RQ1. How are the delays distributed, in a broad sense of the word?

RQ2. What factors correlate with delays, and to what extent?

RQ3. How can the allocation of run time margins be improved?

RQ4. How can the allocation of dwell times be improved?

RQ5. How can the practice of timetable planning be improved?

39 Delimitations

Infrastructure manager’s perspective This thesis is primarily written from the point of view of an infrastructure manager responsible for coordinating and scheduling railway services. The perspectives of passengers and operators are important and have a lot to offer, but they are not the focus of this thesis. This means that we do not, for instance, delve into the issues of generalised travel time, giving delays different weights based on their lengths, or consider the extent to which passengers adapt to delays that occur, or to their justified demands for information during disruptions. Instead, the focus is on being able to schedule and manage traffic in an efficient and reliable way. The benefits for the passengers follow naturally: if the trains are on time, the passengers will be as well. Similarly, the perspective is pragmatic and empirical, rather than focusing on achieving mathematically optimal solutions. Being able to create realistic timetables that the trains can reliably stick to is a precondition for later optimisation.

Conventional passenger trains The thesis is focused on trains with passengers, not freight. There are vast differences in both the timetabling and delays for these two types of transport, and we have chosen to focus on passenger trains. The demand for transportation of freight is much more variable than for passengers, and timetables for these trains are often created or adjusted at a much later stage. Even then, there are often large deviations between how long and heavy a train is scheduled to be, and what is actually used This makes the timetable unrealistic, and large deviations from it are frequent. This results in a split, with many freight trains having very large delays, while many others depart and arrive long before they are scheduled. The focus is on conventional passenger railways, with varying degrees of heterogeneity, not on metro, or tramways.

Focus within train paths In relation to timetable planning, the thesis is focused on dwell times and margins within each train path, not on aspects such as headways, buffer times and heterogeneity. These are also important, as they affect the spread of delays between trains, but they are beyond the scope of this thesis.

Other delimitations The work also does not consider explicitly the influence of dispatching, maintenance, or the physical design of the railway on delays. These are all important fields, which do relate to delays, but they are also beyond the scope of this thesis. We are ambivalent as to whether timetables should be cyclic or not, although the thesis should be relevant even for such cases.

40 Method and Data

Quantitative, qualitative and mixed methods

This thesis includes both quantitative and qualitative methods, and this section describes some of the benefits of each approach and of combining them as we have done. The emphasis has been on quantitative methods, used in Papers 1, 2 and 5. Purely qualitative methods were used in Paper 4, while Paper 3 was written using mixed methods. This is illustrated in Table 2.

Table 2 Methodological approach across the five papers

Methodological approach Paper 1 Paper 2 Paper 3 Paper 4 Paper 5 Quantitative X X X X Qualitative X X

Quantitative methods allow us to describe what happened. In the research field of railway timetabling and operations, quantitative methods have historically tended to fall into one of three main approaches: optimisation, simulation, and empirical. Many examples of these have been described in the chapter Background. With qualitative methods such as interviews, it is possible to gain insight into the values, priorities, perspectives and perceptions of relevant people and actors. In this thesis, timetable planning plays a key part, and so interviewing timetable planners is a natural and important part of the process. The qualitative work included in this thesis draws mostly on the qualitative research interview approach of Kvale (1997). Mixing both methods provides the best of both worlds: the quantitative data informs and reinforces the qualitative aspects, and vice versa (Johnson and Onwuegbuzie 2007). For instance, when studying timetable planning, the quantitative methods also allow us to see what decisions planners have made, which is a good complement to what is written in the guidelines or stated by the planners in interviews. Connecting this to the large sets of train movements enables us to see the consequences of their decisions, and to evaluate which of their strategies and decisions work well in practice, and which ones do not, in a form of triangulation (Fellows and Liu 2015).

41 Working with delays and punctuality

This section describes the different ways that delays and punctuality have been considered and measured throughout the papers included in this thesis. A brief summary is presented in Table 3, with Paper 1 considering both run and dwell times, Paper 2 considering punctuality measured at all scheduled stops, Paper 3 measuring punctuality at the final stop, and Paper 5 returning to dwell time delays. As Paper 4 used purely qualitative methods, it is not included in this section. Considering a range of different approaches in this way provides a more holistic description of the issues of delays and punctuality and helps to ensure that the findings are not sensitive to the exact specification of the specific indicator.

Table 3 Different indicators of delays and punctuality used in the papers

Indicator of delays Paper 1 Paper 2 Paper 3 Paper 5 Run time delays X Dwell time delays X X Punctuality at final destination X Punctuality at all scheduled stops X

Measuring and observing delays The analyses in Papers 1 and 5 are based on run time delays and dwell time delays. These are the differences between the actual and scheduled process times, be they movements along line sections or station stops. Please refer to Train delays and punctuality for more detailed definitions and discussions on delays.

Relative and absolute effects In Paper 1 we use the terms average delay and relative risk, with the latter indicating how much a factor increases or decreases the average delay in relative terms. This was done to normalise as regards the duration of activities (run or dwell time), as delays associated with bad weather, insufficient margins, or reduced traction power, for instance, might be expected to have relative rather than absolute effects. Other delays, perhaps due to a faulty signal or switch, might be better modelled as having absolute impacts, but this was left for subsequent papers. The relative risk of a delay is calculated as the average delay conditional on some value of an explanatory value, divided by the average delay across all station stops or line sections. This makes it easier to discern the effect of factors (e.g. the level of run time supplements, the scheduled dwell time, or the amount of precipitation) where the values or differences are small in absolute terms. The analysis of this paper also mentions delay contribution. Here we multiplied the average delay by the number of observations and compared this to the total amount of delays.

42 Issues with manual coding We do not use manually reported causes of delays in any of the papers, for a number of reasons described in the section Train delays. Most of the delays considered in the papers are also small, up to one or two minutes, whereas the causes are only reported for bigger delays of at least three minutes in size. In the data from the Swedish Transport Administration, 55% of all delay time is too small to be categorised, and in Paper 5 this figure was even higher.

Whether to normalise or not To ease comparisons between the differing lengths of line sections, between the speeds of trains, and so on, in Paper 1 we chose to use ratios instead of absolute time units. Throughout we chose to use the scheduled duration as the denominator. If the scheduled duration is 120 seconds and the total margin for that activity is 12 seconds, we registered that as a margin of 10%. In the train movement data, times are only given in whole minutes, which we simply converted to 60 seconds – the margins in the timetable were, however, presented in seconds. For durations, of run times or dwell times, we normalised with the average duration for the respective activity. In the sampled data those averages were approximately 100 seconds for station stops and 197 seconds for run times on line sections. A scheduled stop of two minutes was thus translated to a duration of 1.20 while a scheduled movement along a line section of two minutes would translate to 0.61. These ratios were then rounded to limit the number of distinct values, and to ensure that there were enough observations in each bin for the average delays to be reasonably stable. In Paper 5, which focused on dwell time delays, there was much less variation in the scheduled dwell times, so this normalisation was not required.

Varying punctuality definitions

The need for aggregation The data volumes across networks are often very large, which makes computations across individual delay observations quite demanding and difficult to perform. Considering individual delays also makes it difficult to follow trains during their journeys, and to gain a holistic picture of what happens during these journeys. For these reasons, it can be useful to study punctuality. This is an aggregate indicator, useful for providing an overview of delays across entire journeys, many trains, and entire networks. It is easier to work with than delays: the number of observations is much lower. It is also a well-known metric to both passengers and practitioners. However, it can be defined in different ways.

43 Delay thresholds In Sweden the convention is to use a threshold of five minutes, so that trains with a delay of up to and including five minutes is considered punctual, but those with delays of six or more minutes are considered unpunctual (Trafikanalys 2019). In other countries, the threshold can be on different levels, e.g. one or three minutes (SBB 2018).

Measurement locations Where punctuality is measured can also vary. The traditional method in Sweden has been to use the final stop, which we did in Paper 3. This was considered quite relevant for long-distance trains mostly serving the end-markets, but less so for regional and especially local trains, which can have many important stops and passengers who only travel part of the journey, and where the most important stations are often in the middle of the journey, although this varies from region to region. To address this, organisations like Trafikverket, Skånetrafiken and SJ have shifted to measuring punctuality at a wider range of stations. The most inclusive version of this is to include all stops, which we did in Paper 2, while more restrictive versions only include major hubs.

Mathematical notation In mathematical terms, the punctuality of train 𝑖’s arrival at station 𝑗 is a function of its arrival delay 𝑑, and the punctuality threshold 𝑘 such that:

1, 𝑑,, 𝑡 𝑝,, 𝑓𝑑,,,𝑡 (6) 0, 𝑑,, 𝑡

Punctuality 𝑃, at station 𝑗 in the time period 𝑘 is then calculated as the weighted average of the punctuality 𝑝,, of 𝑛 arrivals of train 𝑖 during period 𝑘 (which can be any given day, week, month, or year), and the weights 𝑤 (usually set to 1).

∑,,∗ 𝑃, (7) ∑

The inclusivity of the punctuality indicator depends primarily on the setting of 𝑗: this can be a specific station, the final stop for each train 𝑖 (Paper 3), all stations where the train 𝑖 has scheduled stops (Paper 2), or all points where arrivals are registered.

Channel precision This last option, to include stations or control points where the trains do not stop at all, is less relevant to passengers, because they are not affected by this. From a traffic-management perspective, however, there might be some benefits in

44 measuring how well the trains stick to their paths even barring any stops, but then even early arrivals should be identified. In Sweden such an indicator is sometimes used and discussed, called kanalprecision, or channel precision. The exact definition is that the train is allowed to be at most three minutes late, and at most two minutes early, to be considered as being inside the channel. The trains can then be measured continually, as often as the signal system allows, and an aggregate indicator can be calculated for a whole train journey, a part of the network, or a period of time.

Weighted punctuality Of course, it is also possible to consider different trains, so that 𝑖 only covers passenger trains and not freight trains, or only commuter trains, or trains from a specific company. The weights are rarely used, so that by default 𝑤 equals one. If passenger data is available, however, it is possible to set the weights based on the number of passengers getting on or off, so that trains and stations with more passengers are given higher weight.

Treatment of cancellations In both Papers 2 and 3, we considered cancellations as unpunctual. The alternative with cancelled trains is to exclude them entirely from the calculation and to instead present them separately in an indicator known as regularity (considering only the proportion of trains that complete their entire journey, without taking delays into account). In the data used for this thesis, about one percent of trains are cancelled, and excluding them from the punctuality figure would thus give an apparent improvement in punctuality of about one percentage point. By counting cancelled trains as unpunctual, the impact of any adverse weather conditions, for instance, should be captured in the analysis. Overall, however, this has made little difference to the results.

Summary Thus, while the exact definitions can and do vary depending on who is counting and why, punctuality is thus a convenient and commonly used way to aggregate delay data to present a much simpler and more holistic figure, which can be easily discussed and evaluated over time. Many other indicators exist – such as the maximum deviation, the time to recover, and the deviation area (Nicholson et al. 2015) – and can be found throughout the literature, but punctuality is certainly one of the most common.

45 Quantitative data used

This section describes the five sets and sources of quantitative data which were combined and used in Papers 1, 2, 3 and 5. Table 3 summarises which datasets were used in each paper.

Table 3 Overview of quantitative data used in the papers

Quantitative data Paper 1 Paper 2 Paper 3 Paper 5 Train movements X X X X Timetables X X X X Weather X X Infrastructure X Passengers X Capacity utilisation X

Train movement data

Scope of data Papers 1-3 contain Swedish data for the year of 2015. As a pilot study, the first paper only used data on one regional line, with about 363,000 train movements. The later Papers 2 and 3 instead consider all – approximately 32.4 million – movements in the national network. Paper 5 draws partly from the same data source, for the period of 2011-2017, but only covering the commuter trains in and around Stockholm, approximately 16.6 million movements. The paper also includes corresponding data from a commuter railway company in Tokyo from 2013-2018, with about 63.7 million train movements. These figures are summarised in Table 4.

Table 4 Overview of train movement data used for the papers

Paper Underlying train movements Years Comment 1 363k 2015 Regional line in Sweden (Skånebanan) 2 32.4M 2015 Sweden, nationwide 3 32.4M 2015 Sweden, nationwide 5 16.6M, 63.7M 2011-2017, 2013-2018 Stockholm, Tokyo

Structure of data In Sweden, this data is structured so that one row covers the departure from one station (A) and the arrival at the next station (B), with both scheduled and realised times for each. Both Norwegian and Japanese train movement data are instead structured so that one row covers both the arrival and departure at one station (A). Yet another structure is found in Danish data: one row there covers only the arrival,

46 departure, or passing time at a given station (A). There are thus several different ways to specify this kind of data, although they are functionally equivalent: from all of them it is possible to restructure the data and to calculate the arrival and departure delays. It is also possible to calculate the scheduled and actual durations of both dwell times at stations and the runtimes between them. The stations are not necessarily stations where the train stops – in Swedish the term is instead driftplatser – but also include old stations that are no longer serviced, technical stations where trains can meet, and such. The distance between these stations varies across the country. Good maps of the lines and stations can be found at Trafikverket (2018d).

Precision of data The core of the data in this thesis is made up of train movements registered by the signalling system. The system includes track circuits that detect the presence of trains and sends a message when a train enters or exits the circuits. These messages are very precise, of the order of milliseconds, and include both the scheduled and observed times. However, the signals and track circuits that send timestamps are usually located at the edges of the station areas, rather than at the middle of the platform, where the train stops. To adjust for this, an automatic adjustment is made within the signalling system, to account for the time that it takes for a train to move from the signal to the middle of the station, or vice versa, usually of the order of 10- 20 seconds. In Sweden, this adjustment does not differentiate between trains with different characteristics, and the size of the adjustment is not routinely recalibrated.

Truncation of data This causes some imprecision in the timing of train movements. For historical reasons, and partly to cover for this imprecision, the data layer Lupp (Trafikverket, n.d., 2018b) that receives the messages from the signals (sent via UTIN) truncates the data to the minute level. Thus, while the messages have a precision of milliseconds, the data stored only contain minutes. This complicates the processing and interpretation of the data. For one, it is difficult to study activities which take a short amount of time, as well as small delays. One must also be very cautious when studying small samples or individual observations and be aware that a deviation of ±1 minute can be caused by fluctuations of ±1 second, and that errors of ±30 seconds are to be expected.

Usefulness of data With large enough samples and systematic effects, however, even small variations can be captured using these data. Consider that something systematically causes a delay of ten seconds, then about one sixth of the trains will register as having been delayed by one minute, while five sixths will appear to not be delayed at all. Looking

47 at the individual trains, this would be misleading, but averaging across the sample, the average will be a delay of ten seconds.

Interpretation of data In summary, this imprecision and truncation leads to some difficulties in interpreting the train movement data. The data are thus not suitable for precise studies of individual trains with small numbers of observations. However, the errors are reduced when considering the duration of activities – be they run or dwell times – rather than arrivals or departures, and they tend to cancel our over-large numbers of observations. And when considering large samples, even small effects on the level of seconds can, if they are systematic, be captured.

Operational variables The first three papers also contain what we call operational variables. These include the distance travelled, the number of movements for each train set, the number of days a certain train service is run, the number of trains running, and train movements, a certain day, the number of interactions between trains at and between stations, the number of trains arriving at a certain station within a given hour, and other estimates of capacity utilisation at stations. One of these variables is the number of interactions between trains, defined as instances when more than one train is present at one station, or moving along the same line section, in the same direction, at the same time. These operational variables have all been derived from the train movement data described above and provide additional information.

Dwell times and passenger counts Alternative datasets were used to conduct more detailed studies of dwell times in Paper 5, from commuter trains in Stockholm and Tokyo. Both these alternative datasets enabled more precise dwell time studies, with a connection to the passenger flows and without the issues of truncation or adjustment to timing points. Such data are quite rare, and often considered a trade secret. Cross-referencing against train movement data improves the reliability of the observations. By further limiting Swedish data to the same time periods as in the Japanese, the analysis was performed over very comparable sets of high-quality data.

Swedish data About an eighth of the trains in Stockholm are equipped with an automatic passenger counting system, with sensors in the doors to detect the boarding and alighting of passengers at stations. This system also logs arrival, departure and dwell times with a precision of seconds. The data used in the paper span 5.5 million observations over the years 2013-2017, where one observation represents one door

48 at one station stop. While this does not cover all doors of all trains at all stations, over time it gives a good picture of the passenger flows and dwell times. The sensors need to be calibrated from time to time, and they are not very precise for large flows of passengers – these large flows are relatively rare, however.

Japanese data For the trains in Tokyo, the paper utilised a dataset of manual dwell time and passenger count observations. At least twice a year the railway company manually counts passengers on trains at stations, to estimate the level of on-train congestion. This is mandated by the Japanese government, as a way to monitor the peak congestion of trains – a matter of considerable public interest. As such, the observations are centred on the biggest stations, during the busiest times, where the levels of congestion are the highest. From about 6:30am to 9:00am, staff at a number of stations (eight of which are done regularly, the others sporadically) observe all trains that stop or pass by, making note of exact arrival and departure times, the train number and the number of cars, as well as the estimated congestion rate on the train. Approximately 50 trains are observed in this manner, per station, and over the years 2013-2018, more than 4,000 observations were made.

Connecting the Swedish data On a practical level, there is no explicit way to connect the Swedish data with the train movements: there is no train identification number in the detailed observations. As the timestamps are based on the doors opening and closing, rather than entering or exiting the station area, they are also different between the two datasets. Instead, we created an algorithm which matched the two datasets together, based on the origins and destinations, locations and timestamps of the observed trains. Because these trains run with high frequencies, we had to be conservative in the matching process, and overlook trains which deviated from the timetable by more than two minutes. In this way, we could be confident that the datasets were matched together correctly. The cost was a reduction in the number of observations included, but the resulting number was still in the thousands, high enough for the needs of the analysis.

Connecting the Japanese data As the measurements in Tokyo are made manually, some errors can occur. Fortunately, these data did allow for an explicit connection to the train movement data. By cross-referencing these two datasets, it was possible to identify and exclude suspicious observations, where either the signalling system or the human observer could be at fault, again to ensure that only reliable observations were considered in the analysis.

49 Timetable data The main purpose in using these detailed timetables was to calculate how much of a margin has been added, and where it was placed. The size of the margins can be expressed in two ways: as a percentage of the scheduled run time without margins, and as seconds per kilometre, the first one being more common.

Scope of the data Timetables for trains in Sweden are created and stored by the Swedish Transport Administration in the tool TrainPlan. We have received data dumps from this system by the Swedish Transport Administration spanning 2011-2018, and in Papers 2 and 3 we use data from 2015, covering almost 46,000 distinct timetable versions and over 1.1 million train journeys totalling about 32.4 million train movements. To give an overview: 83% of these were for passenger trains, 14% freight trains, and 3% service trains. Forty-three percent of journeys were longer than 100 km, 31% shorter than 50, and 25% between 50 and 100. These data include roughly 80,000 changes that have been made to the timetables during the ad hoc-process (Trafikverket 2019).

Structure of the data Each timetable variant comes with a calendar reference, a long string which specifies on which days during the timetable year it applies. These long strings can be translated into dates, which can then be matched to the train movement data. In addition to the calendars, information is given on the level of trains, train arcs, and link usage. Link usage roughly corresponds to train movements, although the infrastructure is slightly more detailed. Train arcs are series of links which have some common properties, such as top speed, train length, and train type. For instance, a train might change configuration during its journey, and the displayed train number identifier might also change during the journey. The infrastructure model also deviates somewhat from that in the train movement database. Many stations exist in both datasets, but the timetable data have some additional detail, with more timing points. These must be aggregated, so that all margins are counted.

Contents of the data The time stamps in the timetable are specified in seconds, both for arrival, departure and dwell times. Time supplements are listed explicitly, broken down across a few categories: mainly for performance, adjustments, maintenance works and phasing against other trains. Time supplements for maintenance works were not considered as margins, as they compensate for longer run times, but the three other types have been included and aggregated as margins. Performance and adjustment supplements are the most common by far, phasing supplements are uncommon.

50 Unfortunately, there is no explicit difference made between dwell times that are required for a stop, and those that are intended to act as margins. Similarly, while it is possible to calculate the headways between trains using the timetables, there is no explicit way of separating between what is the technical minimum, and what is the additional buffer time.

Processing the data To measure the distribution of margins within a timetable, we use the indicator of Weighted Average Distance (WAD) described in Vromans (2005). This is used to describe how the various time supplements in a timetable are balanced, being more towards the beginning or end of the journey, or in between. It is expressed as a value between 0 and 1, with lower values expressing a shift towards the beginning and higher values a shift towards the end of the journey. In many timetables, there are instances of negative margins: cases where the scheduled time has been manually set to be shorter than the technical minimum, the effects of this have been studied in Papers 2 and 3. Finally, variables such as the travel time without margins, the average speed of the trains, and the average distance between stops, can be derived from the timetable data and add further information.

Weather data Papers 1 and 2 used weather data from the Swedish Meteorological and Hydrological Agency (SMHI). From their website we were able to download all historical observations of snow depth, temperature, wind strength, precipitation in Sweden, which we then used to estimate the weather conditions in which the trains operated.

Scope of the data There are more weather stations in the southern and more populated parts of the country, so that the distance between train stations and meteorological stations differs from place to place. The average and maximum distances for each type of observation are summarised in Table 5. While the stations are by no means perfectly matched in geography, the weather is often quite similar on the scale of 10-20 km, which is the typical range. As with the train movement data, however, the data are better suited for analyses across large samples, rather than individual observations, where the imprecisions in the data can cause problems.

Table 5 Distance between train station and meteorological station across the country

Distance to station Precipitation Snow depth Temperature Wind speed Average (km) 10 19 16 21 Maximum (km) 44 97 50 91

51 Processing the data The weather data in these papers were linked to the punctuality of trains, which is an aggregate indicator for trains that often travel long distances, through varying weather conditions. There are several ways in which to convert these different values to one single variable. With wind, we were interested in the highest speeds and chose to take the maximum. With temperature, we tried both the average, the minimum and the maximum. We ended up choosing the minimum temperature for cold weather, and the maximum temperature for hot weather, arguing that we are most interested in the extremes, but the choice made little difference in the analysis. For precipitation and snow depth, we considered the average, maximum and sum of the measured variables. In the end, we found that the sum best explains the effect of precipitation on punctuality, while for snow depth, we found the average to work very well. These steps were not necessary in Paper 1, where we studied delays and could simply use the closest observations in both time and space, without aggregation. Temperature is measured about 18.7 times per day and station, on average. The average of these was taken to get a daily temperature value for each station. Wind strength was measured at fewer stations, with an average of 23 observations per day. To convert to a daily wind value for each station, we took the maximum value for each day, because we are mainly interested in stronger winds. Snow depth and precipitation is measured daily, but with data missing on average 9% and 0.5 % of the days, respectively.

Connecting the data To match the datasets, we first had to link the train stations to the nearest meteorological stations, for each of the four types of data, transforming the GPS coordinates of the meteorological data to the SWEREF99 coordinates used for the railway stations. The matching was done separately for each weather variable, because not all meteorological stations observe the same variables. As some stations lack observations on some days, the algorithm was set to match the two station sets for each day, to ensure that an observation could always be given.

Infrastructure data From the Swedish rail asset management database, BIS, we have information on the type and location of eight categories of infrastructure elements. These are signals, switches, bridges, tunnels, level crossings, cuttings, embankments and fences, all in all 82,700 elements. As a pilot study in investigating this data, all but 1,500 of the elements could be matched to the railway stations and links found in the train movement data. While there are some issues with the reliability and updating of the data, knowing the number of switches or signals, for instance, that trains pass by

52 can help explain part of their delays, and to explain part of the phenomenon that trains travelling longer distances often tend to have more delays. And while the exact locations of the elements may not be easy to map to the train movement data, the data can be used to describe the varying complexity of infrastructure across different parts of the network. In the future, information might be added to the infrastructure data based on the age or condition of the elements, and this could then be used to further help explain delays that occur.

53 Analytical methods used

This section describes how the data have been pre-processed, and the different analytical methods we have used. An overview of the different categories and variables is presented in Table 6, along with their units, and in which papers they have been included.

Table 6 Overview of influencing, or explanatory, variables studied in the papers, and their units.

Category Variable Unit Papers Weather Precipitation Millimetres 1, 2 Snow depth Centimetres 1, 2 Temperature Degrees Celsius 1, 2 Wind speed Metres per second 2 Timetable Average speed Kilometres per hour 2 Distribution of margins Percentage 2, 3 Dwell time Percentage, seconds 1, 5 Negative margins Binary 2, 3 Number of stops Count 2 Run time Percentage 1 Size of margins Percentage 1, 2, 3 Supplements following stops Seconds 3 Travel time Hours 2 Operational Capacity utilisation Percentage 1 Number of days operated Count 2 Distance travelled Kilometres 2 Earlier delay Seconds 1, 5 Line interactions Count 2, 3 Month Count 1 Movements per day Count 2 Movements per vehicle Count 2 Passenger count Count 5 Station interactions Count 2, 3 Trains per station and hour Count 2 Weekday Count 1 Infrastructure Bridges Count 2 Cuttings Count 2 Embankments Count 2 Fences Count 2 Level crossings Count 2 Signals Count 2 Switches Count 2 Tunnels Count 2

54 The basic approach Throughout Papers 1-3 & 5, the basic approach has been to identify and quantify the link, or correlation, between various delay-variables and other explanatory variables. An important distinction here relates to the terms of correlation, covariation and causation. In these papers, we do not put forward causal mechanisms, or attempt to describe the course of events leading up to a particular delay for a given train. It is not feasible to do this with such large numbers of observations. It is difficult, even when considering individual cases, to disentangle various causes and events from one another. A faulty switch might lead to delays, but the fault might be because of poor maintenance, because of bad weather, because of a flaw in the design or manufacturing process, because of a train passing through it at too high speeds, or with wheel defects, or because of a combination of any or all of these and more.

Causation and correlation Rather than trying to sort out what, precisely, was the cause of events in cases like these, we have stuck to the level of correlation and covariation. While we cannot say what, precisely, caused the delay described above, we might, by using many observations, conclude that delays are more likely when trains pass by many switches, when the weather is bad, when the maintenance of either the infrastructure or the vehicles is not up to par, and when the drivers do not respect the speed limits, for instance. The variables we consider are thus in many cases proxy-variables for what is happening on the ground. While this is not perfect, and it can sometimes be difficult to disentangle the effects of some variables from one another – such as the effect of travelling long distances from the effect of passing many signals – following this approach can point us in the direction of problematic areas, and sometimes suggest mechanisms that lead to delays and possible measures that can be taken to reduce them. A schematic view of this be seen in Figure 6.

Figure 6 By linking observed deviations, such as delays and recovery, to data covering weather, timetable planning, operational factors, infrastructure and passengers (and more), we can identify some patterns. From these patterns we can hypothesise about the underlying mechanisms, which lead to the delays. These hypotheses can be investigated and tested more thoroughly, and if they are found to be believable, we can then propose measures which address the mechanisms and should lead to a reduction in delays.

55 Pre-processing the data Data quality can be an important issue for quantitative methods – especially with empirical data. Missing, faulty or outlying observations might influence the results so that they are not representative. There are several approaches to deal with this, and which have been used in the making of this thesis.

Large datasets The quantitative work in this thesis has covered large samples of trains and train movements. The analysis in Paper 1 spanned 561,000 observations of run and dwell times, Paper 2 analysed the punctuality of over 883,000 trains, Paper 3 was more restrictive with 470,000 trains. Paper 5 had the smallest samples of the quantitative papers, with the analysis covering about 6,000 dwell times. Especially for Papers 1- 3, this suggests that the effects of the odd error or outlier would be to a large extent diluted.

Aggregation With such large numbers of observations, it can often be useful to aggregate observations. The underlying dataset used for Papers 2 and 3, for instance, covered about 32.4 million train movements – aggregated to about 1.1 million train journeys, which were then filtered out further in pre-processing, to the 883,000 and 470,000 observations mentioned in the previous paragraph. Aggregating from observing individual run and dwell time delays to the punctuality of train journeys meant that the variation was reduced, and the impact of faulty, missing or outlying observations reduced: as the punctuality for a given train at a given station is binary, either 0 or 1, even an outlying observation with a very large deviation from the timetable is reduced to the same scale as all the others.

Interpolation In Paper 2 we were interested in calculating punctuality across all intermediate stops, not only the last one. This introduced some difficulty, as there are some observations that are missing from the data. These are points where the train was not cancelled but there was no record of the train arriving or departing, while there were records of it arriving at surrounding stations. Some of these missing observations are seemingly random, others are more frequent at stations like Arlanda Södra, which are only a few hundred metres from another station, Arlanda Norra. Here we used a process of iteratively interpolating the missing observations from surrounding and existing ones. The details are described in depth in the paper, but we were able to go from 7.5% to 0.1% missing observations. We did not perform this process for the other papers because: (1) when we study the delays explicitly as in Paper 1 and 5, we want to focus on real observations and (2) when we only consider punctuality at the final stop, it is not possible to interpolate the relevant

56 missing observations or to be sure that the train did not stop and turn around at an earlier stop. While the interpolation may introduce some small inaccuracies, these are less important when the data are transformed into an aggregate indicator like punctuality.

Normalisation In Paper 1 we remained on the level of run and dwell times. As it was still relevant to compare across differing lengths of line sections and speeds of trains, we normalised all durations, delays and margins – to use ratios instead of absolute time units, with the scheduled duration as the denominator. This transformation of the variables made it easier to compare the figures with each other, and to identify those that were outside the normal variation. In Paper 5, where we also remained on the level of dwell times, the variation in scheduled times was much narrower to begin with, and this step of normalisation was not required.

Unfiltered delays In neither of these papers, dealing explicitly with delays, did we distinguish between primary or secondary delays, or filter the delays based on size (although for technical reasons, larger delays could not be considered for the commuter trains in Stockholm). While filtering based on size can make sense in some circumstances – larger delays causing disproportionate displeasure among passengers (Börjesson and Eliasson 2011), and smaller delays being easier to deal with using timetables – the scope of this thesis is broader, and seeks to give a better understanding of delays in general. Since small delays are so much more common, however, filtering out the larger delays (with a threshold of say 15 or 20 minutes) would have relatively little impact on the averages that we have considered. The standard deviations would be more affected by such a filter, but we have not focused on these in this thesis, and in any case, the effect would be to make the indicators less representative.

Cross-referencing Yet another approach is to use alternative data-sources to cross-reference and validate observations. This was used for both cities studied in Paper 5 and is described in the section Dwell times and passenger counts. In the case of Tokyo in particular we were able to use the automatically registered data to identify errors in the manual observations, and vice versa. In Stockholm, the restrictive set of assumptions used to combine the two datasets also limited the variation in the data, filtering out any large deviations between the two. This approach is not always feasible, and it can be quite restrictive, but it can be a good way to ensure that any remaining observations are reliable.

57 Visual analysis and Welch’s t-test In all the quantitative papers we used visual analysis of the data, both in tables and in plots, to identify the normal range as well as any outlying or abnormal observations. This was primarily relevant for the variables used to explain the variation in delays or punctuality in Papers 1-3. With such large numbers of observations, the visualisation had to be performed after a process of further aggregation – it is not feasible to plot hundreds of thousands of observations. In Papers 1-3 we performed this aggregation in essentially the same way, grouping observations by the value of the studied variables.

Variation across papers In Paper 3 we varied this approach slightly, to see if the results were sensitive to the exact grouping. Instead of grouping together trains that exceed a certain threshold, Paper 3 rounded the variables and considered the punctuality for trains within these groups. The groups could not overlap, which they did in Paper 2. How the results vary in detail across the two approaches can be seen in the section Overview of papers and findings and in the appended papers, but overall the differences were minor.

Welch’s t-test In Papers 1-3 we used a statistical test known as Welch’s t-test, partially to aid in the visual analysis – ensuring that we only plotted values where the difference in punctuality was statistically significant. This is a variation on the commonly known Student’s t-test, which allows for comparisons between samples with unequal size and variance (Welch 1947). If 𝑋 is the sample mean, 𝑁 the sample size and 𝑠 the sample variance, the t-statistic 𝑡 and the degrees of freedom 𝑣 are defined as:

(8) 𝑡 ,𝑣 ,𝑣 𝑁 1

These statistics were then compared to a two-tailed t-distribution, to see if the difference in sample means was statistically significant or not. In Paper 1 this test was used twice per studied variable: once to see if it had an influence on run time delays, and once for dwell time delays. In Papers 2 and 3 it was used to ensure that we did not include values which covered so few observations that we could not be sure of their effects. A threshold of 0.01 was set, but in many cases the p-values were much lower, frequently approaching 0. In other cases, however, the differences were not significant. Examples for this are when wind speeds were studied in Paper 2: only once the wind speed reached 5 m/s did the punctuality begin to be affected, speeds of up to 4 m/s had no statistically significant impact.

58 Curve fitting For each plot in Papers 2 and 3, we were interested in describing the shape of the relationship and thus tried to fit various functions to the scatter diagrams – choosing that which provided the best fit. As this was done on aggregated data, the plotted observations each had different weights – sometimes differing by multiples of several hundred. In Paper 2 we relied on the t-tests and did not use these weights when fitting the trendlines, rather assigning equal weight to every (aggregated) observation and fitting according to the Ordinary (unweighted) Least Squares. In Paper 3 we instead used Weighted Least Squares, with the number of observations per group used as weights. Overall, this choice made little difference to the results.

Ordinary Least Squares Regression Papers 1, 2 and 5 also used ordinary least squares regression to varying extents. In Papers 1 and 3 this was mostly to see which factors had statistically significant impacts, and which did not. In Papers 2 and 5, regression analysis was also used to see to which extent the explanatory variables could explain the variation in punctuality or dwell time delays, respectively. One estimate of this is the coefficient of determination, the R2 or adjusted R2, of the regression model. Paper 5 included a separate set of regressions for this express purpose, including both squared variables and interaction terms, intended to catch non-linear and interaction effects in the data. Such a model is, however, too complicated use in practice. A second set of models was estimated with only linear effects, and no interactions, to provide at least a rough model for how to adapt the scheduled dwell times to the flow of passengers at a given station. In this case, the coefficient estimates were more important. Thus, the regressions were used to (1) check the statistical significance of any effects, (2) check the predictive power of the data, and (3) provide useful coefficient estimates.

Predictive power The predictive power of these regression models has varied substantially – from about 4% in Paper 2, to 40% in Paper 5. There are several reasons for this difference. For one, the scope is more well-defined, with only commuter trains in either Stockholm or Tokyo, in the morning rush hours of weekdays in April or November – compared to all passenger trains in the Sweden for one full year. The variation was thus much smaller to begin with. Second, is that punctuality, studied in Paper 2, is an aggregate indicator which is less granular and detailed than the dwell time delays studied in Paper 5. The cause of events is much more complex for the former. Third, the models in Paper 2 did not include squared or interacting variables – in Paper 5 this inclusion approximately doubled the coefficient of determination. This simply suggests that the problem studied in Paper 2 is more complex, and less well- suited for predictions – not that the regressions there are any worse or less valid.

59 Interviews with timetable planners

To produce the material for Paper 4 (and parts of Paper 3), we carried out semi- structured interviews with timetable planners working at the Swedish Transportation Administration’s office in Malmö. Each interview was approximately an hour long, recorded, and transcribed in full, which resulted in written material of around 50 pages. The results were analysed by manual categorisation and concentration of meaning.

The interviewees The Swedish Transport Administration employs about 20 long-term timetable planners, who work chiefly in the annual timetabling process. In addition to these, there are short-term timetable planners who work in the ad hoc-process. The Swedish railway is divided into eight regions, and the southernmost region is planned from the office in Malmö by four timetable planners, all of whom we interviewed. Two of the planners were men and two women. All of them have worked in the industry for many years, at least since 2003 and going back as far as 1985, and with timetables for nine or more years. The region they plan for is a sort of microcosm of the railway network in Sweden, and it contains a very diverse mix of railway lines, train traffic and capacity utilisation.

Motivation We used a qualitative method because this allowed us to effectively study the values and priorities of those involved. In this choice of method, we thus applied a qualitative approach on a topic that is typically studied using quantitative methods. We prepared an interview guide based on four areas which were identified before the interviews: (1) guidelines and support, (2) rules of thumb, (3) feedback loops, (4) trade-offs, with a handful of guiding questions in each area.

Processing The process of analysing the transcribed material was carried out in sequence. The first step was to sort the different interviewer-interviewee exchanges by area, rather than chronologically – what is often called categorisation. The second step was to concentrate the meaning of the answers by cutting superfluous words and sometimes reformulating entire paragraphs into a few sentences. This process reduced the volume of text from 24,500 words to 4,500 and enabled a better overview of what was said. After the interview answers were concentrated, they were sorted into 16 new sub-areas, and the contents then summarised further, reducing the volume from 4,500 to 500 words. This made obtaining an overview of the contents manageable.

60 Alternative reading The analysis in Paper 4 is based on an alternate reading of the interview responses: instances where the planners described feedback from dispatchers about errors in the timetable. Several of their statements could explain why timetabling errors sometimes happen; these were condensed into a list of eleven reasons. At this point, we looked for different ways to group and categorise the answers, looking for themes on a higher analytical level, and came up with the three following categories, which we use in the analysis: (1) “missing tools and support”, (2) “role conflict”, and (3) “single-loop learning” (see Argyris & Schön, 1996). These three categories are used to explain and discuss the reasons behind the errors more deeply.

61

Research Process and Results

This section briefly describes each of the five papers: how they are structured, how they lead into one another, and how they fit together. The connections are illustrated in Figure 7. The papers follow the basic format of describing the situation, analysing it, and presenting recommendations for how to improve the situation. Describing the situation entails collecting, processing and summarising the data. In Papers 1 and 5 this is about the distribution of delays, in Papers 2 and 3 about the level of punctuality and how this varies over time and for different types of trains, and in Paper 4 about describing the situation of the timetable planners and the different errors that they sometimes make. Analysing the situation consists of linking different types of data together, running regressions, performing t-tests and creating plots, in the quantitative papers. In Paper 4 we analysed the transcripts more closely to find the reasons that errors are sometimes made, and to categorise them into themes. Finally, based on the analysis, all the included papers propose practical measures that can be taken to improve the situation and reduce delays. Most of these measures relate to timetable planning, as such changes are quite inexpensive and quick to carry out, while, as we show in the papers, they would have real effects on delays and punctuality. Others focus on physical changes to the infrastructure, or on changing some operational aspects. These recommendations have been collected in a separate section.

63 How the papers build on each other

This section briefly describes how the pilot study in Paper 1 set the stage for the following papers, and how these flowed from and built upon one another. This is illustrated schematically in Figure 7.

Figure 7 A schematic illustration of how the five papers are connected. Paper 1 is the pilot study, and both its findings and basic methods lead directly to Papers 2 and 3. The finding that most delays occur at stations, rather than between them, is also the basis of Paper 5. Paper 2 extends the approach from Paper 1 to more data and more factors, but still shows a big role for timetable planning, which inspires Papers 3 and 4. Paper 3 uses the same approach as Papers 1 and 2, but focuses on timetable planning. It includes some interviews with timetable planners, the analyses of which are expanded greatly upon in Paper 4. These interviews also shed light upon the scheduling of dwell times, which combined with the findings in Paper 1, lead well into Paper 5, which is focused on dwell times and dwell time delays.

64 The pilot study Paper 1 is based on a pilot study, intended to test the basic approach and data to be used in the research, and to help direct the topic of the research. We limited the geography to Skånebanan, a regional railway line in Southern Sweden, and used one full year of observations, 2015.

The basic approach The basic approach was to link train movement data with other relevant datasets that might help both explain the delays that occur and provide ways to reduce them. In this paper we also had access to timetable data showing the level and distribution of margins, to weather data including temperature, wind, precipitation and snow depth, and to data on theoretically estimated capacity utilisation.

Run and dwell time delays One of the innovations in this first paper was to consider run and dwell time delays separately, to identify more clearly when and where the delays occurred. This led to the result that the risk of being delayed was a great deal higher at stations, where as many as 40% of dwell times took at least one minute longer than scheduled. This was one of the key drivers behind later performing the study described in Paper 5, focusing more on one aspect of what happens at stations, namely the exchange of passengers.

Other findings The other main findings, to be followed up in later papers, were that the details of a timetable were strongly associated with the risk of delays, that weather had important but somewhat complicated impacts, and that theoretically derived indicators of capacity utilisation were not so strongly linked to delays. These findings, and the success of the basic method, led on to Papers 2 and 3.

The subsequent papers

Extending and broadening Paper 2 is basically an extension of Paper 1, both in that it considers the national railway network instead of a single line, and in that the number of influencing factors considered is greatly increased. Beyond some refinements and new variables from the previous datasets, a dataset of infrastructure components such as switches, signals, tunnels, bridges and more was added. The idea is that breakdowns in infrastructure sometime cause delays, and that trains passing by many components are exposed to a higher risk of such delays. To deal with the vastly increased number of observations, we shifted from studying individual delays to the punctuality of

65 trains. The findings basically validated those from Paper 1, and as it is difficult to do much about the weather, and adaptations in the infrastructure can be quite expensive and take a long time to perform, while timetables are redone every year at very little cost, we chose to look more closely at the role of timetable planning in Papers 3 and 4.

Focusing on the timetables In Paper 3 we used the same train movement and timetable data as in Paper 2. However, the aim was much more focused on describing the process and strategies in timetable planning - particularly around the issue of allocating various kinds of run time margins - and evaluating the effects of these strategies. Based on literature and interviews with planners, we identified a few different ways to think about when to allocate margins and how to distribute them. We then used the actual timetable data to see to what extent they were used, and empirical data on train movements to identify the real effects on punctuality.

Focusing on the planners Paper 4 is focused on the timetabling process, extending the analysis of the interviews with planners, and their situation, from Paper 3. As previous papers had illustrated that even seemingly small decisions in timetables impact punctuality, it seemed prudent to interview the planners who make the decisions. The interviews were centred on four topics: feedback loops, support and guidelines, trade-offs, and rules of thumb. In the analysis we then returned to the transcribed material with the aim of finding out what errors are made in timetable planning, and what the consequences and causes of these errors are. The results align well with the finding in Paper 1, that most delays happen at stations. This combination of results leads into Paper 5, where we focused explicitly on dwell time delays.

Focusing on the dwell times In Paper 5 we looked closely at dwell times and dwell time delays, following up on findings from Papers 1 and 4. The former identifying that most delays occur at stations, rather than between them; the latter identifying both that dwell times are often scheduled unrealistically, and that these outcomes are not evaluated by planners. It is also a complement to the focus on run time margins found in Paper 3. The paper is also the result of gaining access to new datasets, from commuter trains in both Stockholm and Tokyo, containing observations of dwell times and passenger counts, along with the train movements. These data enabled much more detailed study of the delays that happen at stations, and into one of the key causes of such delays – the exchange of passengers.

66 Overview of papers and findings

This section briefly summarises the aims, analyses and findings of the five included papers. This is summarised in Table 7.

Table 7 Overview of papers with aim, dataset, analysis and delay indicator used

Paper Aim Dataset Analysis Delay indicator 1 Improve knowledge on delays and One regional Regressions + t-tests + Run & dwell time the methods to study them line, 1 year plots delay 2 Identify and quantify the impact of Sweden, 1 Regressions + t-tests + Punctuality at all several weather, timetable, year plots stops operational and infrastructure variables on punctuality 3 Study the punctuality effects of Sweden, 1 Regressions + t-tests + Punctuality at final strategies for allocating margins year plots stop 4 Describe the situation for Four Qualitative based on Errors in timetable timetable planners in Sweden. planners interviews planning Identify common errors in timetables, as well as the reasons behind them. 5 To study dwell time delays and to Stockholm + Regressions Dwell time delay see how much they can be Tokyo, explained using passenger data. several years

Paper 1. The pilot study The main aim of this paper is to learn more about how delays are distributed and about how they are associated with various weather, timetable and operational variables. As a pilot study intended to test the approach and methods for future papers, the analysis is based on one year (2015) of data from one regional single- track line in Sweden.

Method In order to analyse the link between the studied variables and the delays we make use of three basic steps. The first is to determine if the studied variables had a statistically significant impact on the delays, using Welch’s t-test. For example, we looked at temperatures below zero and temperatures above zero and used Welch’s t-test to test whether the average delays for these two samples are significantly different from each other. The second step is to perform linear regressions for delays at station stops and line sections respectively, in order to give an overview of the trend of each studied variable. The third step of the analysis includes a set of plots, each emphasising different aspects of the data. These plots together give a more nuanced picture than either the t-tests or the regressions, and visual inspection can reveal if the studied variables vary along with delays in any recognisable pattern.

67 Findings We find that the delays mostly occur at stations – both the frequency and severity are much higher there than on the line sections between stations, where the average delay is very close to zero. At stations, the scheduled dwell time explains more of the delays than any other factor we consider in the paper. The average delay is significant if the scheduled dwell time is lower than 160 seconds, while the greatest delay reduction occurs at dwell times of 210 seconds, beyond that the average delays increase. On the line sections, the single most important factor is the level of margins: most of these delays are caused when trains have no margins, although negative margins do occur and contribute to a small degree. We find that there are clearly diminishing marginal returns, and that the most effective level for delay reduction is around 10%. Both temperature and snow had small but statistically significant impacts on delays, and similarly there were small but statistically significant differences in average delays across weekdays, which can be explained by the variation in the number of trains run. Finally, small arrival delays to stations tended to speed up the stops, while early or on-time arrivals are on average delayed by around 30 seconds when departing the station. Larger arrival delays are associated with even larger dwell time delays.

Paper 2. Four types of factors The purpose of this paper was to identify and quantify the link between several weather, timetable, operational and infrastructure variables and the punctuality of passenger trains in Sweden. We use data on all passenger trains in Sweden during the year of 2015. The method is similar to Paper 1, a major difference being that we study punctuality (across all scheduled stops and with a delay threshold of five minutes), rather than run and dwell time deviations. This greatly reduces the number of observations and the computational load compared to studying delays directly, and facilitates the larger study area, while giving a more holistic picture than only considering punctuality at the final destination.

Weather factors The results indicate that punctuality is clearly correlated with weather. The relationship with temperature is exponential, for both low and high temperatures. Compared to the average, punctuality drops by 50% at -30 ℃, and by 26% at 27 ℃. Strong winds also lower punctuality: when they exceed 23 m/s, it is about 9% lower than average, following a power curve. Precipitation lowers punctuality moderately, in a linear fashion, as it is accumulated throughout the journey. And while less than 6% of the observations in our dataset have average snow depths larger than 1 cm, the effect is quite large: at an average of 5 cm the drop in punctuality is about 17.5%- points, following a logarithmic curve.

68 Timetable factors The results regarding timetable variables suggest that margins are correlated to punctuality up to a point of around 12 s/km, or 25-30% of the minimum run time. Then it is around 2% higher than average, at even higher levels the correlation turns negative. Similarly, the weighted average distance (WAD) of margins is correlated to punctuality up to a point of about 0.60, where it is about 1% higher than average, before the correlation switches sign. When they exist, negative margins are linked to, on average, 2.8% lower punctuality.

Operational factors We also consider a range of operational factors. The single best indicator for punctuality is the distance travelled by a train, with about 3% per 100 km – the correlation coefficient of -0.20 is the highest in our findings. Highly correlated to this is the duration in time, where every hour is linked to a decrease in punctuality of about 1.6%. Still, increasing average speeds of trains is linked with decreasing punctuality, with the airport trains being an exception. This suggests that the ability to accommodate heterogeneous traffic, with widely varying speeds on the same line, has been overestimated. The average distance between stops also appears to be linked to lower punctuality in a mostly linear fashion, by about 1.3% for every 10 km. The link between punctuality and the number of trains is best described using a quadratic function, but the effect is relatively small – the highest punctuality drop we see is 1.2%. At stations, the number of trains that arrive per hour is instead linked to punctuality in a linear and positive manner: at volumes of at least 20 trains per station and hour, the punctuality is about 2.5% higher than average – though this may be explained by a high proportion of short-distance commuter trains, with higher punctuality. We also see that interactions between trains – instances where trains are at the same place at the same time – are associated with lower punctuality: by about 1% and 2.2% for interactions at and between stations, respectively.

Infrastructure factors Turning finally towards infrastructure, plotting the number of switches, tunnels and fences against punctuality, we find that they fit best to quadratic functions. The mechanisms behind these negative synergies are not clear, and should be studied further, but we can perhaps speculate that it has to do with difficulties of maintenance and complex environments with more trains, people and animals in the vicinity that can cause problems. Bridges, signals, level crossings, and cuttings show linear relationships to punctuality. Signals have the largest effects in terms of magnitude, being associated with punctuality drops of around 22% at the most, although this is highly correlated to the distance travelled.

69 Paper 3. Margins and interactions The purpose of this paper was to study the effects on punctuality of some strategies for allocation of margins in timetables for passenger trains by analysing empirical data. The analysis focuses on aspects that relate to the planning of a timetable, i.e. the size and distribution of margins, and on margins “within” train paths rather than those “between” them. As such, this study does not address headway or buffer times. The method and data are similar to Papers 1 and 2, with all passenger trains in Sweden during 2015. As we are interested in how margins are distributed along the journey and timetable, it is important to consider an aggregated indicator like punctuality, rather than looking at each delay directly. Here we use the conventional Swedish definition of punctuality: considering the final destination and a delay threshold of five minutes. This enables comparisons against Paper 2, and for us to see the difference between using different punctuality definitions. Another difference is that we use weighted least squares and the bi-square method to help produce more robust estimates.

Interactions between trains The results indicate a negative association between interactions and punctuality, by about 1.2 and 3.9% each at and between stations, respectively. There is a wide variation in the number of interactions at stations: the mode is 1, the mean 2.89 and the 95th percentile 8. Interactions between stations are uncommon, occurring only for 2-3% of trains.

Size of margins The results on margins indicate a positive correlation, but the slope is shallow at about 1/8: to achieve an improvement in punctuality by five percentage points, margins of the order of 40% of the minimum run time must be added. However, we estimate that when negative margins exist in a train’s timetable, its punctuality is on average 4.1% lower.

Distribution of margins The distribution of margins is also of interest. We found that the average WAD of margins was 0.56, and that the slope as regards punctuality is about 0.11. Supplements are also frequent directly following scheduled stops. They cluster around the “round” numbers of 30, 60, 90 and 120 seconds, although not having any supplements was the most common. We found that larger supplements, of at least 60 seconds, were associated with higher punctuality, while it was worse to have small supplements than to have none. These findings are statistically significant. We hypothesise that there is a tension between the added time supplements and the drivers’ behaviour, such that drivers sometimes overcompensate when given small supplements, believing their effects to be larger than they really are.

70 Paper 4. Problems in planning practice The paper is based on interviews with experienced timetable planners in southern Sweden and gives a description of their current situation. It identifies common errors in timetables, which influence the punctuality, as well as the reasons behind them.

Responsibilities The planners stated that they have a large individual responsibility in learning what is necessary and in performing quality control, and that there is not enough time for either. They described receiving frequent comments from dispatchers, centred on four areas: (a) crossing train paths at stations, (b) wrong track allocation of trains at stations, especially long trains, (c) insufficient dwell and meet times at stations and (d) insufficient headways leading to delays spreading. Yet, there is no established system or routine to keep track of or utilise comments from dispatchers. The planners also feel that important conditions change from year to year, making it difficult to draw direct comparisons and learn over time.

Difficulties The planning method has been largely the same for the last 20 years or so, but the work is becoming more difficult due to the increasing number of trains and engineering works. Problems increasingly occur at the stations, where there is insufficient capacity. This is exacerbated by ad hoc planners bending the rules to fit in more trains. Trainplan is the main tool used at the Swedish Transport Administration and it does not handle track allocation, conflict management, or provide topographical information. The more experienced planners work based more on discussions with the train operating companies than on a strict application of the guidelines. The guidelines were mentioned by all the interviewees, but they are interpreted liberally and were not described as helpful.

Margins All four planners use different methods to assign margins. Another common practice is to adjust arrival and departure times at stations to occur at whole minutes: usually adding, but sometimes subtracting, seconds up to the whole minute. Negative margins are often used for local trains on single-tracks at the request of train operating companies. The explanations for this vary from person to person, but they say that: “it has always been like this”. Only the most experienced planner adds a minute after a scheduled stop, although others were aware of the practice.

71 Dwell times The planner only deviates from requests when necessary, and this is done in dialogue with the applicant. The latter sets the dwell times. The planners say that the standard is two minutes, but they give the impression that shorter times dominate. Local trains are often given the same arrival and departure times, with no scheduled dwell time, to avoid waiting unnecessarily in case the train is delayed or the number of passengers is small. Dwell times in excess of two minutes are primarily for connections and phasing reasons, and in some cases during winter breaks.

Paper 5. Passengers and dwell times The aim of this paper is to study the dwell time delays that occur for commuter trains in Stockholm and Tokyo, and to see how many of them can be explained using passenger data. Different timetabling policies are discussed, with the intent of improving timetable planning and punctuality, and to reduce delays in both cities.

Method In the analysis we used passenger counts to explain the variation in dwell time delays. We also had information of the arrival delay to the station. The effects are studied using regression analysis, with two purposes: (1) to estimate the degree to which we can explain the variation in the dwell time delays, using passenger data, and (2) to estimate the impact of passenger volumes on the dwell time delays, in a comprehensible way. The first point relates to the coefficient of determination, the adjusted R2, while the second has to do with coefficient estimates. To fulfil the first purpose, it is relevant to include both squared variables, as well as variable interactions, as these may well, in fact, have impacts on the delays and add to the share which we can explain. The second purpose, however, is better served by keeping the models simple and straightforward, sacrificing explanatory power for interpretability.

Delay distribution In Stockholm and Tokyo, minor delays of at most five minutes make up 96 and 97% of delay hours respectively. In both cities, the delays mostly occur at stations in the form of dwell time delays. In Stockholm, 91% of the total delay time is generated at stations, while the corresponding figure in Tokyo is 88%. Thus, small dwell time delays make up a clear majority of all delays.

72 Scheduled dwell times The railway companies in the two cities have different policies on scheduling dwell times. The default in Stockholm is to use 42 seconds regardless of day, time or station, with a few exceptions. In Tokyo, the dwell times are adjusted to a much greater extent. The range is quite similar to that found in Stockholm, but the variation is greater: the 5th, 25th, 50th, 75th and 95th percentile values are 40, 45, 50, 60 and 115 seconds, respectively. Almost across the board, the dwell times in Tokyo are also longer than the 42 seconds used in Stockholm, and adjustments are made in five-second intervals, across different train services, stations, and hours.

Dwell time delays Dwell time delays are similar across the two combined datasets. In Stockholm the median delay is 6 seconds, compared to 5 in Tokyo, and the 95th percentile values are 34 and 30 seconds respectively. About the same proportion of trains make up time during the stops, in the two cities, but in Tokyo the ones that do make up slightly more time, with the fifth percentile being 21 seconds early there, compared to 8 seconds in Stockholm. The range of dwell time delays, from the 5th to the 95th percentiles, are quite small and comparable: 42 seconds in Stockholm and 51 seconds in Tokyo. As these are commuter trains, however, stops are frequent, and the seconds add up.

Explaining the variation In both cases, about 40% of the variation is explained using the full models, with squares and interactions, and about half as much by the simple linear models. In Tokyo, we find that a rise in the congestion rate of 10% corresponds to an increased dwell time delay of about one second. For arrival delay and scheduled dwell time, we find small negative effects – this implies that trains that arrive late have slightly shorter dwell times than otherwise, and that the influence of the personnel and driver are somewhat successful in reducing delays. Longer scheduled dwell times see slightly less delays, indicating that they include some margins, not just the minimum required time for the increased congestion rates. In Stockholm, the estimates are about 0.4 seconds per person and car, in either direction. We also see a slight effect from congestion, as the number of passing travellers further increases the delays.

73 Results

In this section we link the results from each of the papers to the overarching aim and research questions of the thesis, summarise them, and discuss our findings in relation to the research literature. In this way we work towards the overarching aim: to better understand the smaller delays that occur for passenger trains in Sweden, in order to improve timetable planning so that the delays are reduced. Schematically, the process of moving from observed deviations through influencing factors on to proposed measures can be illustrated as in Figure 8.

Figure 8 By linking observed deviations, such as delays and recovery, to data covering weather, timetable planning, operational factors, infrastructure and passengers (and more), we can identify some patterns. From these patterns we can hypothesise about the underlying mechanisms, which lead to the delays. These hypotheses can be investigated and tested more thoroughly, and if they are found to be believable, we can then propose measures which address the mechanisms and should lead to a reduction in delays.

Linking the papers to the research questions A brief summary of how the papers link to the research questions is shown in Table 8. The links are then discussed in turn, before we turn to the results.

Table 8 Connection between papers and research questions

Research Question Paper 1 Paper 2 Paper 3 Paper 4 Paper 5 RQ1. How are the delays distributed? X X RQ2. What factors affect delays, and to what X X X extent? RQ3. How can the allocation of run time X X X margins be improved? RQ4. How can the allocation of dwell times X X X be improved? RQ5. How can the practice of timetable XX planning be improved?

RQ1. How are the delays distributed, in a broad sense of the word? To understand and analyse the delays, we first need to know how they are distributed. The distribution can be considered along many dimensions, such as: among trains, in time, in space, and by size. The question is primarily addressed in Papers 1 and 5, which both consider delays explicitly, without using the aggregated indicator of punctuality. Paper 1 makes the distinction between dwell and run time

74 delays – happening at or between stations. It also describes how delays vary with weekdays and the months of the year, while Paper 5 describes the distribution by size, with over 95% of all delay time being made up of delays smaller than six minutes for commuter trains.

RQ2. What factors correlate with delays, and to what extent? Beyond understanding their distribution, we are interested in quantifying the link between delays and certain influencing factors, ranging from weather factors, timetable factors, operational factors, infrastructure factors and passenger factors. Papers 1 and 2 both consider the weather variables temperature, precipitation and snow depth, along with various timetable and operational factors, such as various indicators of capacity utilisation. Paper 2 also considers wind speed, more operational factors, and the complexity of the infrastructure while using a much larger dataset. Paper 5 considers the influence of passengers, using more detailed data.

RQ3. How can the allocation of run time margins be improved? One common way of reducing delays is to allocate run time margins in timetables, so that the train can run faster than scheduled and catch up to the timetable. How this allocation can be improved is addressed in Papers 1, 2 and 3. Paper 1 considers the size of margin, section by section. Papers 2 and 3 instead consider the whole journey, with the total size of margins, the distribution of these margins across the journey, the occurrence of negative margins, and, in Paper 3, margins directly following station stops.

RQ4. How can the allocation of dwell times be improved? Another way to make a timetable robust as regards delays is to schedule dwell times that are longer than necessary, so that any delays are reduced once the train reaches the station. These dwell times are considered explicitly in Papers 1 and 5, both of which present recommendations. Paper 1 considers the length of dwell times, and Paper 5 focuses on shorter stops, particularly for commuter trains. Paper 3 considers run time supplements directly following scheduled stops, intended to make up for delays that occur when dwell times are insufficient.

RQ5. How can the practice of timetable planning be improved? The process of timetable planning itself can both introduce delays and help reduce them. Paper 4 discusses at length how aspects of the timetabling process can be improved in this regard, with recommendations about both tools and support, routines for systematically evaluating outcomes, and clarifying the role of planners at different organisations. Paper 5 also touches upon the timetable planning process, focusing on the allocation and evaluation of dwell times.

75 RQ1. How are the delays distributed?

At stations, not between them In Paper 1 we found that the average delay for a train on a line section between two stations is very close to zero, while at stations almost half of all stops take longer than scheduled, and the delays that do occur also tend to be greater at station stops than on line sections. This pattern was found again in Paper 5, where more than 90% of all delay time occurred at stations.

Most are small In Paper 5 we found that in both Stockholm and Tokyo, minor delays of at most five minutes make up more than 95% of delay hours for commuter trains. For passenger trains in Sweden overall, the corresponding figure is about 85%.

In time There are small variations between months, with slightly fewer delays during spring and autumn. We also saw small but statistically significant differences in average delays across weekdays – although these were almost entirely explained by how many trains run each day. Traffic volume does not appear to be a major concern however: variations due to a higher number of trains per day are small, and in Paper 2 we found that punctuality is marginally higher at more busy stations and times.

Long distances Another way to consider how delays are distributed, is that they are larger and more frequent for trains travelling long distances. In Paper 2, the single best indicator for punctuality we find is the distance travelled by a train, with about -3% per 100 km. Closely related to this is the travel time of the journey, where the slope is about -1.6% of punctuality per hour. These are well in line with estimates in the literature showing that distance covered was statistically significant in determining punctuality.

RQ2. What factors correlate with delays, and to what extent?

Weather factors Both high and low temperatures are associated with large problems in operations, and we have identified exponential relationships between how more extreme temperatures lead to increasing drops in punctuality. In the extremes, with temperatures of positive or negative 30˚C, punctuality can drop by 50 percentage points or more, with an almost total collapse of operations. In the face of increasing

76 temperatures and more frequent heat waves, this suggests that more ought to be done to increase the railway systems’ resilience to high temperatures. Snow also has a clear effect on punctuality. Even with as little as five centimetres of snow, average punctuality was 17.5% lower than normal. These effects are substantially larger than we have found in the literature, as is the case for cold temperatures. The relation appears to be logarithmic, however, so while increasing amounts of snow does lead to more problems, the rate is diminishing. This may suggest an increased preparedness and ability to deal with snow in the regions where large snow depths are often found. When it comes to wind, delays appear to increase with increasing wind speeds: at ten metres per second punctuality is 2% lower than normal, and when we approach storm-level wind speeds of 23 metres per second, the punctuality drop is 9%. Again, this is a larger effect than we have found in comparable studies from other countries. Finally, we saw that the more precipitation a train is exposed to, the lower the punctuality. The figures are more modest, however, and the variation is only of the order of a couple of percentage points. In summary, weather can cause considerable delays, and the problems often begin even during seemingly normal conditions.

Timetable factors These are discussed under RQ.3-5.

Operational factors Punctuality is negatively correlated with the length of the journey, on average dropping by about 3% per 100 km, with the length between stops, and with the maximum speed of the trains. These patterns are likely due to problems stemming from heterogeneous speed profiles – long-distance and high-speed trains traversing multiple shorter distance and lower speed commuter train systems. This hypothesis is supported by repeated findings that interactions between trains, especially between stations, are associated with lower punctuality (by about 1% and 2-4% each, at and between stations, respectively). This issue appears to be more severe than the volume of traffic, per se, which only has a small impact.

Infrastructure factors The overall picture is that a simple infrastructure with less components performs better. We found a quadratic, negative relationship between punctuality and the number of switches. This suggests a potential of gaining disproportionately large punctuality benefits by limiting their number, particularly in large stations, where the numbers are high and even small gains in punctuality are very valuable. With signals, the relationship is linear, and more clearly related to the distance traversed.

77 Passenger factors The data on passengers we used in Paper 5 could explain approximately 40% of the variation in dwell time delays in rush hours for commuter trains. While by no means perfect, this is high compared to other studies attempting to explain delays or punctuality with empirical data. It is also a reasonable number, considering the range of other factors which affect delays. Comparing Stockholm to Tokyo, we find that the latter has about 60% less delays per degree of passenger congestion – they are much more efficient at reliably boarding and alighting. About half of this effect is due to Tokyo’s trains having twice as many doors – the other half can be explained by a combination of more appropriate scheduled dwell times, markings on the platforms, platform screen doors, more staff, and more discipline among both staff and passengers.

RQ3. How can the allocation of run time margins be improved?

Size of margins We find that levels of around 10% are the most efficient and effective, this is where the big gains are made, and that the benefits diminish at higher levels. This means that run time margins for Swedish trains are, in general, sufficiently large, and that they could in many cases be reduced somewhat, without major consequences for the punctuality of operations. Conversely, it would be quite costly to make major improvements in punctuality using only increased margins: to improve punctuality by five percentage points, journey times need to be extended by approximately 40%. This is not acceptable in most cases. One possible way of explaining the diminishing returns of margins is a behavioural response of the driver and other personnel.

No negative margins It is also important to avoid allocating any negative margins. We have seen that in 2015 about 40% of passenger trains had negative margins on at least one section, and that this was associated with a punctuality drop of 3-4%, depending on the definition of punctuality. In a time when the punctuality of the railways is about 5% lower than the stated goal of the industry, this is a very big and clear effect. Arguments about how the negative margins are compensated by positive ones on the next sections thus do not seem to hold in practice.

Distribution of margins How the margins are distributed can make a big difference. The important thing is that there are margins everywhere – not only for each train, but on every part of the journey. Otherwise the risk of delays increases dramatically. Beyond this, we have found that punctuality is highest when the centre of gravity of margins is about 0.6-

78 0.7. That is, close to the middle, but shifted a little towards the end. While there are large variations between trains, the average in Sweden is about 0.56, and could thus be shifted slightly to the second half of the journey.

Margins after stops Margins can also be placed directly after station stops – as stops are known to generate delays, which can then be recovered directly by a strategically placed time supplement. It is important that these margins are large enough – at least one minute per stop. If they are smaller than that, delays instead tend to increase. One possible explanation for this is a behavioural response: by allowing more time exiting stations, the train crew takes more time in executing that activity. In this case, they overcompensate. It is possible that this is made worse by the truncation of seconds in some sub-systems in the Swedish railway, so that the driver believes that the supplements given are larger than they really are. These problems disappear if the dwell time is long enough to begin with, without compensation from run time margins, and in these cases, punctuality is on average about three percentage points higher. The results are slightly better still if the compensation is large enough, and with a supplement of at least one minute after the stop, the punctuality is yet another percentage point higher.

RQ4. How can the allocation of dwell times be improved?

Reallocating time In Paper 1 we drew attention to the importance of dwell times. We saw a clear relation between the scheduled dwell time and the risk of delays for passenger trains on the studied regional line, Skånebanan. In the paper we presented an example of how dwell times could be re-allocated to reduce the delays. By setting 80% of dwell times to a relatively short 50 seconds, and 20% to 210 seconds, which was the level most effective at reducing delays, we estimated that delays on the line could be reduced by about 80%, without any increase in travel times.

The importance of passengers Towards the end of the research we returned to the topic of dwell times. Paper 5 in particular deals to a large extent with how they can be adapted to the number of passengers, with case studies from both Stockholm and Tokyo. In the first case, dwell times for commuter trains are generally not adapted to passenger volumes, while such adjustments are very common and important in Tokyo. This helps to keep their punctuality higher than in Stockholm, despite having many more trains, and many more passengers.

79 Often overlooked We have also found that dwell times are often overlooked in Swedish timetable planning. This is seen in the interviews in Papers 3 and 4, where timetable planners describe that they often receive feedback from dispatchers about how dwell times are insufficient, while there is no discussion or thought about how to adjust and improve these dwell times. This is also seen in Paper 5, where the difference in approach and object of focus between planners in Sweden and Japan becomes evident. While planners in Japan work in a very conscious and dedicated way to fine tune dwell times, planners in Sweden have long used the same template, with only a few exceptions. Both Papers 1 and 5 describe how most delays in Sweden arise at stations, indicating that the dwell times are not as robust, and do not contain the same level of margins, as the sections between stations.

Old strategy We suggest that timetable planners in Sweden focus more on allocating appropriate dwell times. For too long, dwell times have been treated like some sort of residual, been scheduled with a rough template that has been applied too liberally, or knowingly been given too little time. There has been a strategy that delays can arise at stations, to be recovered on the line, where there are margins. While this strategy could work, in principle, we have not found evidence that it is successful. While there are often large margins on the line sections between stations, they do not make up for the delays that occur at stations. Instead, it leads to systematic delays at stations, which introduces instability in the timetable, disrupts the scheduled interactions between trains, and largely leaves the operations to dispatchers rather than planners. Regardless of whether this strategy has been successful in the past, it does not work with today’s complex and dense traffic.

New strategy The baseline must be to allocate realistic dwell times, to allocate the time that it takes to complete a stop. Not the least amount of time that is possible, or a time that is sufficient for half of all trains, but the time that it should take for the given train at the respective station and time. This implies a move from rough rules of thumb, and that dwell times may well vary from hour to hour, station to station, and potentially from train to train. It may well be permissible to schedule longer dwell times than necessary, and to use them as a type of margin to absorb delays, but not to deliberately create delays to artificially speed up station stops. Of course, it will not be possible to entirely avoid dwell time delays, just as it is unrealistic to allocate run times that are never exceeded. But the current state is shifted heavily towards too short dwell times, and to dwell time delays being much more common than run time delays.

80 RQ5. How can the practice of timetable planning be improved?

Improved planning software For Papers 3 and 4 we interviewed timetable planners and found several weaknesses in the process for timetable planning. Based on the interviews, we found that the main planning software, Trainplan, supports neither conflict detection nor track allocation at stations. Both of these lead to recurring mistakes in planning, to delays, and to problems for dispatchers. The software was introduced around the new millennium, and has thus been used for almost two decades, without fixes to these issues. While a new tool (TPOS) is currently under development, which is intended to address these issues and modernise the planning process, the project is delayed, and it is unclear whether all the features will be implemented. This is reminiscent of the introduction of Trainplan, and it is thus important to emphasise these issues and to ensure that they are addressed by the new tools, and that these are implemented.

Improved documentation There are also deficiencies in the documentation, in a broader perspective. Timetable planners describe how important information about which interlocking systems are in use at various stations, how they work and how they must be handled from a planning perspective, is primarily documented in binders which the planners themselves must navigate, interpret and memorise. This is difficult and leads to mistakes being made. The guidelines used for planning have also remained in use for many years without major revisions, despite not giving much support in the issues which planners describe struggling with, and they are interpreted very liberally.

More systematic evaluation Connected to the lacking documentation, we see in Papers 3 and 4 that there is no systematic evaluation of timetable planning. It is up to each planner to find out what works well, and what should be improved. There are no tools to help with this, no detailed statistics, and far too little time. They also describe how important preconditions in the requests for train paths change from year to year, so that the problems and complaints that have arisen throughout one year cannot so easily be applied to the next year. This makes it very difficult to achieve a systematic evaluation, and evolution, of the timetable quality.

81 Clarified roles and responsibilities We see in Papers 3 and 4 that there is a lack of clarity and a role conflict in the role of timetable planner at the infrastructure manager – the Swedish Transport Administration. On the one hand, they are responsible for making sure that the timetable comes together, is feasible, and maintains a high quality. On the other hand, they are in close contact with timetable planners at the train operating companies. These companies run the trains, are closest to the customers, and ought to know best how they want to their timetables to be planned. Sometimes this leads to conflicts and situations where the train operating company would rather that the timetable planners at the infrastructure manager be less stringent with the guidelines and what would make for a robust timetable, to be able to run more trains. At the infrastructure manager the planners also appear to interpret the timetable requests as well motivated and thought through. They strive to make as few deviations from these as possible. The picture at the train operating companies can be quite different: that they do not have the same resources for timetable planning, that they usually stick to what they did last year, or to commercial demands, trusting that the infrastructure manager will ensure that the timetables maintain a high level of quality. Of course, this situation can lead to a gap, and to the quality suffering, as both parties assume that the other one has the responsibility.

Improved feedback loops The results presented in Paper 4 suggest that both researchers and practitioners should focus more on identifying and improving the relevant feedback-loops, to achieve a higher level of learning among those involved. Single-loop learning is both a technical and organisational issue. Since the tools are lacking, planners are hard-pressed just to finish their work. There is simply not sufficient time to perform quality control. Because there is no systematic review of the quality and outcome, there is no way to begin to improve the rules and guidelines, or to create a better timetable. As the tools do not provide enough assistance, the focus is, and must be, on creating a timetable before creating a better timetable. Creating a better timetable is what we imagine the timetable planners of the future will be tasked with doing, when more of the work has been automated and the software tools provide far more assistance. Rather than trying to manually execute all the details, they will choose which heuristics, goal-functions and constraints to apply in different scenarios, to achieve the best overall results. An illustration of this wider feedback-loop, or double-loop learning, is found in Figure 9.

82

Figure 9 A schematic illustration of the timetable planning process with single- and double-loop learning. In the case of single- loop learning, the timetable planner uses comparison between scheduled and realised time to inform their own discretion. In double-loop learning, the comparison is used to inform the understanding of various factors relating to weather, timetables, the infrastructure, passengers, and operational concerns, and this improved understanding is used to improve the requests, run time calculation, and timetable planning guidelines. The outcome is then less dependent on the discretion of individual planners, and the planners can focus more on improving the run time calculations and guidelines, as well as potentially helping the train operating companies make more informed requests, than on remembering countless details, tweaks and exceptions to the guidelines.

International relevance This study focuses on Sweden. In the literature, we have seen large similarities with the United Kingdom, which uses the same tools, and has a similarly deregulated market. The new tools that are currently being implemented in Sweden have recently been implemented, by the same supplier, in both Norway and Denmark. Experts from the infrastructure manager and largest train operator in the Netherlands (Olink and Scheepmaker 2017) describe similar issues with errors causing delays and infeasibility there. We believe that the planning process is largely similar across most European countries, although the level of deregulation and competition between train operating companies for capacity may vary, as will the tools and contexts.

83

Concluding Discussion

Recommendations

This section describes our recommendations about what to do, in practice, based on the results and conclusions. Of course, there is some overlap between this and the conclusions themselves. The recommendations are placed in four clusters: allocating margins, scheduling dwell times, timetabling process, and physical changes. All of them are intended to reduce delays and improve punctuality, while being highly cost-efficient. In practice, these can be considered for implementation by the Swedish Transport Administration in their new planning tool TPOS, and by other infrastructure managers and railway operators.

Size and allocation of margins Margins – or run time supplements – clearly play an important part in timetables. However, the process for allocating them currently takes too long, and is too arbitrary, and it both could and should be automated to a greater extent, so that planners can spend more time and energy on more difficult and important issues.

New baseline When timetables are created, one of the first steps is to calculate run times. Currently, these calculations assume that trains run 3% slower than possible, essentially adding an automatic level of margins of 3%, in excess of the levels we have discussed in the thesis. We recommend that the procedure is changed, to instead automatically add margins of 10% to all run times. This is a good base level and compromise, higher than what is common internationally, but lower than what is currently added manually, and it is very simple to do. This automatic addition should replace the current practice of node supplements, which is confusing, time-consuming and highly arbitrary. Ideally, the margins should be made explicit and displayed so that both timetable planners and train drivers can easily see them.

85 Automatic distribution Taking this a step further, the computer could be instructed to add margins in such a way that the weighted average distance comes in at around 0.67, while keeping the margins as evenly distributed as possible. This is a slightly more complicated adjustment, but it can be performed automatically either ahead of time, or on the fly, and it should improve punctuality by about one half of a percentage point. In parts of the network that are not so highly congested, it might also be worth adding a time supplement of 60 seconds directly following a scheduled stop, to further absorb delays that often happen there. While this slows down the flow of trains leaving the station and reduces the capacity, making it infeasible in some cases, in many cases this is not a problem, and the punctuality of the affected trains would likely improve by about one percentage point.

No manual rounding Planners should be instructed not to concern themselves with adding or subtracting seconds so that the trains arrive to and depart from stations at whole minutes. These adjustments take a lot of time to do while they neither improve the quality of the timetable, nor are necessary for the technical systems, passengers, or train drivers. In the rare cases that the systems do require times on whole minutes, these adjustments can and should be made automatically by the computer. Timetable planners should not spend any time on this.

No negative margins Negative margins should neither be tolerated nor possible to add in the timetabling software. With an increase in the automatic, base-line margins, it is conceivable that manual subtractions of a few seconds would still leave a positive level of margins, but it should never be possible to reduce margins to below a level of about 5%. If the trains do not fit in the timetable, they should not be run.

Role for manual additions Manually adding to the margins should still be allowed, but with the purpose of coordinating train paths, rather than absorbing delays. Another permissible case is when maintenance works are being carried out and margins are needed to compensate for reduced speeds, although properly recalculating the run times and rescheduling meetings and overtakes is much better than simply adding margins, this may not be realistic in the short term.

86 Summary If all these recommendations were implemented, the size of margins would decrease somewhat for most trains. On average, the difference may be of the order of about five percentage points of the run time. Because there are diminishing marginal returns, this reduction will not have a significant impact on punctuality, likely of the order of five tenths of a percentage point, and it will be more than made up for by the improved efficiency of the allocation, and the reduction in mistakes. The time is better spent on increasing dwell times, which are systematically too short.

Scheduling dwell times

Bigger role in planning Instead of focusing so much on margins, which can easily be automated, this thesis suggests that more time and attention should be spent on dwell times. This includes the efforts of the planners who schedule the times, the dispatchers and train crews that are engaged in operations, and for managers, analysts, planners and others who evaluate the outcomes. The ambition should be to use realistic dwell times, so that the stops routinely take the amount of time that they are supposed to. Again, it is illustrative to compare against run times, as in Paper 1: the probability of delay is three times higher for dwell times. If dwell times were scheduled realistically, instead of optimistically, the frequency and magnitude of these delays would decline rapidly, and a large portion of all delays would be eliminated. While the scheduled travel times would increase somewhat, which could be compensated by reduced margins between stations, adjusting for systematic delays does not increase the total realised travel times.

Higher precision in planning Our recommendation is to vary the dwell times by train type and station, in intervals of 5-10 seconds, depending on what is most practical from a technical standpoint. In principle, the times should also be varied over time, so that they are shorter in off-peak hours, at weekends and other times when the flows of passengers are lower. Practically speaking, however, this step is difficult to implement right away, and there are benefits to having the same timetable all day and every day, even though there are some losses in efficiency. In this case, it is much better to plan for the high flows during peak hours, when dwell times are longer and the risk of delays is higher, and to accept slightly longer times in the off-peak periods, than to be efficient while off-peak and constantly struggle with delays when the flows of trains and passengers are the highest. The arrival and departure times should be displayed to the drivers and dispatchers, but not necessarily to the passengers, with a precision of seconds.

87 Periodic delay recovery To further improve punctuality, a good strategy can be to make about one in five stops significantly longer than the others, of the order of 3-4 minutes. This has a stabilising effect on the traffic and helps to return trains to their scheduled paths. It is more relevant to use for trains that travel long distances, and where the stops are less frequent. The punctuality is much lower for these trains, and the extensions to travel time are not so big across the whole journey. For local trains that make many stops, the increases to travel time would be considerable, while the punctuality is already quite good.

Systematic evaluation and adjustment Finally, it is important to evaluate the outcomes of the dwell times, and to adjust them going forward, at least once per year. These evaluations should be done by station and train type, and the lessons are easy to apply in practice. In order to do this in a rational way, the infrastructure manager should systematically collect data on actual dwell times – on the level of seconds – from on-board systems, or at least require that the operators report the distributions of dwell times by train type and station, as a part of the information required to allocate capacity. These data are much more direct and useful for timetable planning than aggregate indicators such as punctuality and are in fact a precondition for proper scheduling.

The practice of timetabling

Clarify roles and responsibilities The Swedish Transport Administration should assume clear responsibility in ensuring that timetables maintain a high quality. Currently, there is some uncertainty if they should do so, or if that should be the domain of the train operating companies. As this requires some highly specialised expertise and resources, which not all companies possess, the baseline must however be that the infrastructure manager performs this role until the companies clearly express the interest and display the ability to do so. The infrastructure manager is also the only actor that can properly coordinate both traffic and maintenance projects from a multitude of companies. If some companies later decide that timetable planning is a core function for them, and invest in this capability, this responsibility can perhaps be shifted. As it stands, however, the only feasible solution is that the infrastructure manager assumes the responsibility.

88 Improve the tools To enable the timetable planners to fulfil this responsibility, better tools are required. These tools should, among other things, be able to: (1) automate the allocation of margins, (2) solve the issues of rounding to whole minutes, (3) eliminate the practice of negative margins, (4) detect conflicting train movements, (5) display the topography, (6) suggest reasonable dwell times, (7) support track allocation at stations and (8) keep track of feedback from dispatchers. In order to make the feedback loops shorter and address small mistakes, it should also be possible to make minor changes to timetables during the year, without having to create entirely new train identification numbers. Either on a continual basis, or on one or two occasions during the year, before the next annual timetable takes effect. While each change might only affect one or two trains, and only marginally affect punctuality, it promotes a culture of constant, gradual improvements and fine-tuning, which is very important in the long term.

Systematic evaluation and improvement Equally important is to make the evaluation of previous timetables and outcomes a routine part of the process and job description of timetable planners, and to allocate enough time for this. Many of the other recommendations we make regarding timetable planning free up time, which can then be used for these purposes. In line with shifting more attention towards evaluation, one key step is to improve the documentation of policies, strategies, guidelines and analyses, so that these things are clearly written down. Another part of this is to organise joint conferences and seminars where timetable planners can meet across regions, organisations, and sometimes countries, to share experiences, discuss policies, communicate with one another, and learn from research.

Seconds, not minutes Finally, some of the problems we have identified in timetables and timetable planning occur because planning, operation and evaluation is done on the level of minutes, rather than seconds. That dwell times are considered as residuals is often due to planners wanting to have departure times at whole minutes. The same holds for many both positive and negative margins, caused by a belief that some scheduled times need to occur at whole minutes. The first leads to waste and the other causes delays, and in either case the timetable planners spend time and mental capacity on matters that do not create value, rather than spending time on creating a high-quality timetable. Using minutes rather than seconds also makes it more difficult for train drivers and dispatchers to operate in a good way, because they receive feedback about being ahead of or behind schedule only when the deviation has reached one or more minutes.

89 Some physical changes

Markings on platforms Finally, we can recommend some physical changes in the infrastructure that would likely reduce delays. The first is to make physical markings on platforms where the trains and doors will stop. If train drivers stick consistently to these, it will become clear to the waiting passengers where they should stand. This makes the process of boarding and alighting faster and more reliable, both because it distributes passengers more evenly along the platform, and because those waiting to board are not standing in the way of those getting off the train. For trains with seat reservations, the benefits are even greater, as passengers who know where to board will do so more quickly, without having to move along the platform once the train has arrived. As small delays at stations are a very big part of the problem in Swedish railways, addressing the root cause in this way is an important step towards increasing overall punctuality.

Reduce the number of switches Another measure that can be taken, is to gradually begin removing non-critical switches that are rarely used. They are statistically disproportionately connected to decreased punctuality, introduce more complexity, risk breaking down, and need to be maintained. This maintenance diverts resources from more useful infrastructure, which in turn breaks down more frequently than would otherwise be the case. By getting rid of these rarely used switches, there will be fewer elements that can break down, and more time and money left to maintain those that remain, in a virtuous circle, with less delays and higher punctuality as a result.

Improve the resiliency Finally, a programme should be initiated to improve the resiliency of infrastructure with regard to weather and climate, particularly temperature and snow. The Swedish railways are currently very vulnerable to adverse weather, which will only become more frequent and extreme as a result of climate change. Types of measures include shielding signals and other electronics from direct sunlight, which can lead to overheating even during moderate air temperatures, limiting the number of switches so that snow and ice cause less problems and are easier to deal with, and making sure that the drainage is sufficient, to avoid flooding.

90 Some Reflections

Further research

More on dwell times One of the contributions of this thesis is to show the importance of dwell times – a large part of all delays occur at station stops. Meanwhile, one of the key differences when interviewing timetable planners in Sweden and Japan is precisely the awareness of dwell times. Put together, this suggests that dwell times should receive considerably more attention in research as well as practice. This should cover the scheduling of appropriate dwell times, better understanding the different mechanisms that lead to delays at stations, coming up with measures to address these mechanisms and delays, and testing the effectiveness of these methods both in simulations and in practice.

Interactions and dispatching decisions Interactions between trains is one of the factors most clearly linked to decreased punctuality in this thesis. Many of these interactions are due to some kind of dispatching decisions – holding a train back, shifting a train meeting from one station to another, etc. These interactions and decisions can be identified and evaluated using historical data such as that used in this thesis. This can even open a window into understanding dispatching strategies used in practice – an understanding which could improve the quality of dispatching and the punctuality of operations. The same methods can also inform simulation models about how dispatching is performed in practice, and thus help to further calibrate and validate dispatching modules.

Climate adaptation pathways We have shown that the Swedish railways are not robust when it comes to variations in weather and are particularly vulnerable to high temperatures and extreme weather conditions. With climate change, these conditions will become increasingly prevalent, and the railways must adapt. There is a large body of research around this subject in other sectors, partially under the label of adaptation pathways, and there is a lot more to be done to this end in railway research. Understanding more precisely which components break down, how these breakdowns can be avoided, where the problems are the largest and where they are expected to increase the most, when measures can and must be taken, and how to avoid investing in the wrong things, are just some of the questions that can and must be addressed.

91 Reflections on methodology

The basic approach The large amount of data used in this thesis provided a good opportunity for quantitative methods. With such large datasets they are able to detect and quantify even subtle influences on delays and punctuality and – with the help of visualisation – to describe the approximate shapes of these effects. Another strength of a quantitative and empirical approach is that it is possible to detect and evaluate both interactions of trains, and the various choices and strategies used by timetable planners. There are some difficulties, however. Visualising data, for instance, becomes difficult – on a practical level – when the number of observations range from hundreds of thousands to tens of millions. In some of the papers where we studied a wide range of factors, it can also be difficult to disentangle the effects of one influence from another. While methods like regression help with this – and are to some extent made for this – the results can be difficult to visualise and interpret when the number of factors is large. It can also be quite difficult to establish precise mechanisms of causation or chains of events that are easy to explain to practitioners, using large amounts of data. It is one thing to establish that there is a statistically significant link between high temperatures and decreased punctuality, for instance, and quite another to explain why a certain train that ran on a given day was delayed to the extent that it was.

Truncation One issue with train movement data in Sweden is that it is truncated to the level of minutes, such that the seconds are lost. This, in combination with the fact that the data are based on the occupation of signal blocks rather than the arrival to and departure from the platform, causes imprecision in the data. When run and dwell times are calculated from this data, however, these errors will tend to cancel out over large enough samples. One should take care when considering individual observations, however, as these imprecisions could mean that random variations of a few seconds give observations that vary by up to two minutes in length. These errors are not systematic, however, and with large enough samples, the estimates should not be biased to any real extent. The issues around truncation also become much less relevant when dealing with punctuality, than with individual observations.

92 Aggregation and punctuality An additional benefit of working with punctuality rather than series of individual run and dwell time observations is that it dramatically reduces the number of observations – often by a factor of one hundred or more. This means that any calculations are much easier to perform. As the computational power increases, however, this aggregation into punctuality becomes less relevant, and it is preferable to work with more direct observations of delays. Partly because the increased number of observations makes it easier to determine whether effects are statistically significant or not, and partly because delays are a more precise and direct indicator of what is going on. This can result in a greater explanatory power of the estimated models.

The interviews Interviewing planners was very rewarding. They were very frank and forthcoming, and provided good leads and new information that would not be possible to acquire in other ways. Transcribing the material was very time consuming, but being able to cross-examine the transcribed material, and return to it with new sets of questions, was very interesting and fruitful. A potential drawback with the approach is that it is based on the subjective and not necessarily accurate descriptions of the interviewees. Being able to check these against the actual data is very valuable. For instance, according to the planners themselves, the practice of giving trains negative margins on some line sections should not be very problematic at all, since they are compensated by positive margins on the next sections – but in practice, the data show that the compensation is not successful, and that the delays remain.

The interviewees The question of who is interviewed is also very important, and interviewing planners at train operating companies, or dispatchers, would certainly result in a different description of the issues at hand. Naturally, the subjectivity of the interviewer and interpreter of the transcribed material are also involved, and to some extent this is unavoidable. This is the case with quantitative work as well, however, and in both cases, there is often quite a long process going from raw data to a finished analysis and a set of conclusions. On a personal level, I found that the qualitative approach gave quite a lot of interesting insights, and I believe that there is a lot more such work to be done in the railway sector.

93 Alternative methods Moving on to alternative methods, simulation and optimisation are other commonly used approaches for research on timetable planning. They rely, however, on good delay data and descriptions of planning strategies. Knowledge and data on the sources or distributions of delays cannot be generated with these approaches themselves. Instead, it is useful to have a dialogue between the different approaches, where empirical work – such as that found in this thesis – can inform the more theoretical work on simulation and optimisation, and vice versa.

Delay cause data On the empirical side of research, the main alternative is to base the research on so- called delay cause data. The benefit, compared to what has been presented in this thesis, is that the explicit causes of the delays are known – and coded in the data. There is thus less reliance on statistical relations, and on collecting and combining multiple datasets and sources. One drawback is that the delay cause data omits small delays – the exact threshold varies by country, but in Sweden delays smaller than three minutes in size are not coded. As has been shown in this thesis, small delays make up over half of all delay time – it is not prudent to simply omit them. Other drawbacks are that the data on delay causes is already well known and utilised in the industry, and that there are well documented errors (of the order of 20%) in the coding of delays, as well as more philosophical criticism of how feasible it really is to determine one “cause” for a specific delay. A more fruitful approach might be to combine the two approaches – using alternative datasets and statistical tools to validate the delay cause data, and vice versa. Once this has been done, it might be possible use machine learning approaches to find patterns and classify even the small delays.

Qualitative methods Another approach would be to perform more qualitative work. Timetable planning, and the railway sector in general, has rich processes and traditions to study using qualitative approaches with interviews, document studies, anthropological methods, and others. There are very important interfaces to study, between timetable planning and dispatching, drivers, maintenance, and more. These are quite difficult to approach from a quantitative perspective, and qualitative methods are thus the only real alternative. A risk with relying too heavily on interviews and such, however, is that the research and findings might be limited to what practitioners already know and consider to be problems. It is also difficult to quantify the scope of problems, without using quantitative methods.

94 Contributions of the thesis

The contribution to research The contribution of this thesis to research has mainly been to present new empirical studies that: (1) describe and quantify how a range of factors are associated with delays, (2) evaluate the effects of timetable planning practice on train delays and punctuality, and (3) the overarching method and conceptual figures used to gather, connect, process and analyse vast amounts of data from different sources. The thesis is written in a time when big data is discussed and used seemingly everywhere, and my work has been a step in bringing this approach to railways and timetable planning.

The contribution to practitioners To practitioners, the most significant contribution of the work has been to demonstrate the central importance of small dwell time delays – smaller than three minutes. In Sweden, these are often overlooked, partly because of a conscious but misguided strategy, partly due to lacking tools and routines. We have shown that, in practice, most delays are small and occur at stations. If punctuality is to improve noticeably, these delays must be addressed. The other contribution to practitioners is to show that a great deal can be done with timetable planning. Partly by reducing mistakes, both conscious and inadvertent, that systematically cause delays. Partly by further improving the capacity of timetables to absorb delays. We have shown that the distribution of margins is often more important than their overall size, and that the returns from increasing them quickly diminish, suggesting a possibility of reducing margins and travel times without the risk of delays increasing. The thesis also contains a few other suggested measures around stations that would help to reduce delays in a very cost-effective manner.

The contribution to the public The main contribution to society at large is the increased knowledge about and attention to train delays – a problem which affects the daily lives of millions of people. If the recommendations in this thesis are implemented, the problem of delays should diminish significantly. Trains would not systematically be delayed at stations, and neither the summers nor the winters would cause nearly as much disruptions. The timetable would continue to get better year by year, without necessarily extending travel times, and punctuality would climb steadily. Perhaps not quite to the level achieved in countries like Japan, but the industry target of 95% is certainly within reach. This would be a great service to the public. It would raise the level of trust and confidence in the railway and serve to further increase the number and share of journeys made by train. It would also do a lot to reduce the annoyance and frustration that many people who already travel by train.

95

References

Ali, Abderrahman Ait, and Jonas Eliasson. 2019. “Railway Capacity Allocation : A Survey of Market Organizations , Allocation Processes and Track Access Charges.” VTI Working paper 2019:1. Linköping: VTI. Ali, Abderrahman Ait, Jennifer Warg, and Jonas Eliasson. 2019. “Pricing Commercial Train Path Requests Based on Societal Costs.” VTI Working paper 2019:2. Linköping: VTI. Andreasson, Rebecca, Anders A. Jansson, and Jessica Lindblom. 2018. “The Coordination between Train Traffic Controllers and Train Drivers: A Distributed Cognition Perspective on Railway.” Cognition, Technology and Work. https://doi.org/https://doi.org/10.1007/s10111-018-0513-z. Argyris, Chris, and Donald Schön. 1996. Organizational Learning II: Theory Method and Practice. Reading, Mass.: Addison-Wesley. Avelino, Flor, M.C.G. te Brömmelstroet, and G Hulster. 2006. “The Politics of Timetable Planning: Comparing the Dutch to the Swiss.” In Proceedings of the Colloquium Vervoersplanologisch Speurwerk 2006. Baee, Sonia, Farshad Eshghi, S. Mehdi Hashemi, and Rayehe Moienfar. 2012. “Passenger Boarding/Alighting Management in Urban Rail Transportation.” In Proceedings of the 2012 Joint Rail Conference. https://doi.org/10.1115/jrc2012-74102. Banverket. 2000. “Riktlinjer För Tidtabellskonstruktion För Tåg På Statens Spåranläggningar.” Internal document. Borlänge: Banverket. Batley, Richard, Joyce Dargay, and Mark Wardman. 2011. “The Impact of Lateness and Reliability on Passenger Rail Demand.” Transportation Research Part E: Logistics and Transportation Review 47 (1): 61–72. https://doi.org/10.1016/j.tre.2010.07.004. Börjesson, Maria, and Jonas Eliasson. 2011. “On the Use of ‘Average Delay’ as a Measure of Train Reliability.” Transportation Research Part A: Policy and Practice 45 (3): 171–84. https://doi.org/10.1016/j.tra.2010.12.002. Buchmueller, S., U. Weidmann, and A. Nash. 2008. “Development of a Dwell Time Calculation Model for Timetable Planning.” WIT Transactions on the Built Environment 103: 525–34. https://doi.org/10.2495/CR080511. Carey, Malachy. 1998. “Optimizing Scheduled Times, Allowing for Behavioural Response.” Transportation Research Part B: Methodological 32B (5): 329–42. https://doi.org/10.1016/S0191-2615(97)00039-8. ———. 1999. “Ex Ante Heuristic Measures of Schedule Reliability.” Transportation Research Part B: Methodological 33 (7): 473–94. https://doi.org/10.1016/S0191- 2615(99)00002-8.

97 Ceder, Avishai. 2001. “Public Transport Scheduling.” In Handbook of Transport Systems and Traffic Control, edited by K.J. Button and D.A. Hensher, 1st ed., 539–58. Oxford: Elsevier. Ceder, Avishai, and Stephan Hassold. 2015. “Applied Analysis for Improving Rail-Network Operations.” Journal of Rail Transport Planning and Management 5 (2): 50–63. https://doi.org/10.1016/j.jrtpm.2015.06.001. Cerreto, Fabrizio, Otto Anker Nielsen, Steven Harrod, and Bo Friis Nielsen. 2016. “Causal Analysis of Railway Running Delays.” In 11th World Congress on Railway Research (WCRR 2016). Cheng, Yung-hsiang, and Yu-chun Tsai. 2014. “Train Delay and Perceived-Wait Time : Passengers ’ Perspective.” Transport Reviews 34 (6): 710–29. https://doi.org/10.1080/01441647.2014.975169. D’Acierno, Luca, Marilisa Botte, Antonio Placido, Chiara Caropreso, and Bruno Montella. 2017. “Methodology for Determining Dwell Times Consistent with Passenger Flows in the Case of Metro Services.” Urban Rail Transit 3 (2): 73–89. https://doi.org/10.1007/s40864-017-0062-4. DB Netz. 2017. “Network Statement 2018.” Frankfurt am Main: DB Netz. Dewilde, Thijs, Peter Sels, Dirk Cattrysse, and Pieter Vansteenwegen. 2013. “Robust Railway Station Planning: An Interaction between Routing, Timetabling and Platforming.” Journal of Rail Transport Planning and Management 3 (3): 68–77. https://doi.org/10.1016/j.jrtpm.2013.11.002. Dobney, K., C. J. Baker, A. D. Quinn, and L. Chapman. 2009. “Quantifying the Effects of High Summer Temperatures Due to Climate Change on Buckling and Rail Related Delays in South-East United Kingdom.” Meteorological Applications 16: 245–51. https://doi.org/10.1002/met. Dollevoet, Twan. 2013. “Delay Management and Dispatching in Railways.” Doctoral thesis. Rotterdam: Erasmus University Rotterdam. Fellows, R., and A. Liu. 2015. Research Methods for Construction. 4th ed. Hoboken, NJ: Wiley-Blackwell. Ferranti, Emma, Lee Chapman, Susan Lee, David Jaroszweski, Caroline Lowe, Steve McCulloch, and Andrew Quinn. 2018. “The Hottest July Day on the Railway Network: Insights and Thoughts for the Future.” Meteorological Applications 25 (2): 195–208. https://doi.org/10.1002/met.1681. Ferranti, Emma, Lee Chapman, Caroline Lowe, Steve McCulloch, David Jaroszweski, and Andrew Quinn. 2016. “Heat-Related Failures on Southeast England’s Railway Network: Insights and Implications for Heat Risk Management.” Weather, Climate, and Society 8 (2): 177–91. https://doi.org/10.1175/WCAS-D-15-0068.1. Frauenfelder, Regula, Anders Solheim, Ketil Isaksen, Bård Romstad, Anita V. Dyrrdal, Kristine H. H. Ekseth, Alf Harbitz, et al. 2017. “Impacts of Extreme Weather Events on Transport Infrastructure in Norway.” Natural Hazards and Earth System Sciences Discussions, no. December: 1–24. https://doi.org/10.5194/nhess-2017-437. Gestrelius, Sara, Martin Aronsson, Martin Forsgren, and Hans Dahlberg. 2012. “On the Delivery Robustness of Train Timetables with Respect to Production Replanning Possibilities.” In 2nd International Conference on Road and Rail Infrastructure.

98 Gestrelius, Sara, Anders Peterson, and Martin Aronsson. 2019. “Timetable Quality from the Perspective of an Infrastructure Manager in a Deregulated Market: A Case Study of Sweden.” In Proceedings of the 8th International Conference on Railway Operations Modelling and Analysis (ICROMA) - RailNorrköping, 505–24. Ghaemi, Nadjla, Aurelius A. Zilko, Fei Yan, Oded Cats, Dorota Kurowicka, and Rob M.P. Goverde. 2018. “Impact of Railway Disruption Predictions and Rescheduling on Passenger Delays.” Journal of Rail Transport Planning and Management 8 (2): 103– 22. https://doi.org/10.1016/j.jrtpm.2018.02.002. Gholami, Omid, and Johanna Törnquist Krasemann. 2018. “A Heuristic Approach to Solving the Train Traffic Re-Scheduling Problem in Real Time.” Algorithms 11 (4): 55. https://doi.org/10.3390/a11040055. Gorman, Michael F. 2009. “Statistical Estimation of Railroad Congestion Delay.” Transportation Research Part E: Logistics and Transportation Review 45 (3): 446– 56. https://doi.org/10.1016/j.tre.2008.08.004. Goverde, and Ingo Arne Hansen. 2013. “Performance Indicators for Railway Timetables.” In IEEE ICIRT 2013 - Proceedings: IEEE International Conference on Intelligent Rail Transportation. https://doi.org/10.1109/ICIRT.2013.6696312. Goverde, Robert Michael Petrus. 2005. “Punctuality of Railway Operations and Timetable Stability Analysis.” Doctoral thesis. Delft: Delft University of Technology. Gummesson, Mats. 2018. “Tillsammans För Tåg i Tid Resultatrapport 2018.” Report. Stockholm: Järnvägsbranschens Samverkansforum. Hagen, Mark van, and Niels van Oort. 2018. “Improving Railway Passengers Experience : Two Perspectives.” Presented at Conference for Advanced Systems in Public Transport - CASPT 2018. Hansen, I. A. 2009. “Railway Network Timetabling and Dynamic Traffic Management.” International Journal of Civil Engineering 8 (1): 19–32. Harris, Nigel G., Christian S. Mjøsund, and Hans Haugland. 2013. “Improving Railway Performance in Norway.” Journal of Rail Transport Planning and Management 3 (4): 172–80. https://doi.org/10.1016/j.jrtpm.2014.02.002. Hawchar, Lara, Paraic C. Ryan, Owen Naughton, and Paul Nolan. 2018. “High-Level Climate Change Vulnerability Assessment of Irish Critical Infrastructure.” In Civil Engineering Research in Ireland Conference. Heinz, Wiktoria. 2003. “Passenger Service Times on Trains.” Licentiate thesis. Stockholm: Royal Institute of Technology. Hurk, Evelien van der. 2015. “Passengers, Information, and Disruptions.” Doctoral thesis. Rotterdam: Erasmus Rotterdam University. International Union of Railways. 2000. “UIC Code 451-1 Timetable Recovery Margins to Guarantee Timekeeping.” Paris: Union International de Chemin de Fer. ———. 2004. “UIC CODE 406 Capacity 1st Edition.” Paris: Union International de Chemin de Fer. Johnson, R. Burke, and Anthony J. Onwuegbuzie. 2007. “Mixed Methods Research: A Research Paradigm Whose Time Has Come.” Educational Researcher 33 (7): 14–26. https://doi.org/10.3102/0013189x033007014.

99 Kamizuru, Tomomi, Tomoyuki Noguchi, and Norio Tomii. 2015. “Dwell Time Estimation by Passenger Flow Simulation on a Station Platform Based on a Multi-Agent Model.” In RailTokyo2015, 1–15. Kecman, Pavle, and Rob M. P. Goverde. 2015. “Online Prediction of Train Event Times.” IEEE Transactions on Intelligent Transportation Systems 16 (1): 465–74. Khoshniyat, Fahimeh, and Anders Peterson. 2015. “Robustness Improvements in a Train Timetable with Travel Time Dependent Minimum Headways.” Presented at RailTokyo2015. Koetse, Mark J., and Piet Rietveld. 2009. “The Impact of Climate Change and Weather on Transport: An Overview of Empirical Findings.” Transportation Research Part D: Transport and Environment 14 (3): 205–21. https://doi.org/10.1016/j.trd.2008.12.004. Kottenhoff, Karl, and Camilla Byström. 2010. “När Resenärerna Själva Får Välja.” Report. Stockholm: The Royal Institute of Technology. Kvale, Steinar. 1997. Den Kvalitativa Forskningsintervjun. Lund: Studentlitteratur. Lamorgese, Leonardo, Carlo Mannino, Dario Pacciarelli, and Johanna Törnquist Krasemann. 2018. “Train Dispatching.” In Handbook of Optimization in the Railway Industry, edited by R. Borndörfer, 268:265–83. Springer. https://doi.org/10.1007/978- 3-319-72153-8_12. Li, Dewei, Winnie Daamen, and Rob M.P. Goverde. 2015. “Estimation of Train Dwell Time at Short Stops Based on Track Occupation Event Data.” In 6th International Conference on Railway Operations Modelling and Analysis, 1–20. Lidén, Tomas. 2018. Concurrent Planning of Railway Maintenance Windows and Train Services. Linköping Studies in Science and Technology. Dissertations NV - 1957. https://doi.org/10.3384/diss.diva-152491. Lindfeldt, Anders. 2015. “Railway Capacity Analysis: Methods for Simulation and Evaluuation of Timetables, Delays and Infrastructure.” Stockholm: The Royal Institute of Technology. Liu, Kai, Ming Wang, Yinxue Cao, Weihua Zhu, and Guiling Yang. 2018. “Susceptibility of Existing and Planned Chinese Railway System Subjected to Rainfall-Induced Multi-Hazards.” Transportation Research Part A: Policy and Practice 117 (December): 214–26. https://doi.org/10.1016/j.tra.2018.08.030. Longo, G., and G. Medeossi. 2012. “Enhancing Timetable Planning with Stochastic Dwell Time Modelling.” WIT Transactions on the Built Environment 127: 461–71. https://doi.org/10.2495/CR120391. Marković, Nikola, Sanjin Milinković, Konstantin S. Tikhonov, and Paul Schonfeld. 2015. “Analyzing Passenger Train Arrival Delays with Support Vector Regression.” Transportation Research Part C: Emerging Technologies 56: 251–62. https://doi.org/10.1016/j.trc.2015.04.004. Mattsson, Lars Göran, and Erik Jenelius. 2015. “Vulnerability and Resilience of Transport Systems - A Discussion of Recent Research.” Transportation Research Part A: Policy and Practice 81: 16–34. https://doi.org/10.1016/j.tra.2015.06.002. Nagy, Enikő, and Csaba Csiszár. 2015. “Analysis of Delay Causes in Railway Passenger Transportation.” Periodica Polytechnica Transportation Engineering. https://doi.org/10.3311/pptr.7539.

100 Nelldal, Bo-Lennart, Olov Lindfeldt, and Anders Lindfeldt. 2009. “Kapacitetsanalys Av Järnvägsnätet i Sverige Delrapport 1 Hur Många Tåg Kan Man Köra? En Analys Av Teoretisk Och Praktisk Kapacitet.” Report. Stockholm: The Royal Institute of Technology. Network Rail. 2018. “Network Code.” London: Network Rail. Nicholson, Gemma L., David Kirkwood, Clive Roberts, and Felix Schmid. 2015. “Benchmarking and Evaluation of Railway Operations Performance.” Journal of Rail Transport Planning and Management 5 (4): 274–93. https://doi.org/10.1016/j.jrtpm.2015.11.004. Nie, Lei, and Ingo Arne Hansen. 2005. “System Analysis of Train Operations and Track Occupancy at Railway Stations.” European Journal of Transport and Infrastructure Research 5 (1): 31–54. Nilsson, Jan-Eric, Roger Pyddoke, and Inge Vierth. 2015. “Regress - En God Idé i Järnvägssektorn?” Report. Linköping: VTI. Noland, Polak J.W. 2002. “Travel Time Variability: A Review of Theoretical and Empirical Issues.” Transport Reviews 22 (1): 39–54. https://doi.org/10.1080/01441640010022456. Nyström, Birre. 2008. “Aspects of Improving Punctuality.” Doctoral thesis. Luleå: Luleå University of Technology. Olink, H., and G. Scheepmaker. 2017. “RailSys Use by RailwayLab.” Oral presentation at the Royal Institute of Technology. Olsson, N., A. H. Halse, P. M. Hegglund, M. Killi, A. Landmark, and A. Seim. 2015. Punktlighet i Jernbanen – Hvert Sekund Teller. Oslo: SINTEF Akademisk Forlag. Olsson, Nils O.E., and Hans Haugland. 2004. “Influencing Factors on Train Punctuality - Results from Some Norwegian Studies.” Transport Policy 11 (4): 387–97. https://doi.org/10.1016/j.tranpol.2004.07.001. Pachl, J. 2002. Railway Operation and Control. Mountlake Terrace: VTD Rail Publishing. Palin, Erika J., Adam A. Scaife, Emily Wallace, Edward C.D. Pope, Alberto Arribas, and Anca Brookshaw. 2016. “Skillful Seasonal Forecasts of Winter Disruption to the U.K. Transport System.” Journal of Applied Meteorology and Climatology 55 (2): 325–44. https://doi.org/10.1175/JAMC-D-15-0102.1. Parbo, Jens, Otto Anker Nielsen, and Carlo Giacomo Prato. 2016. “Passenger Perspectives in Railway Timetabling: A Literature Review.” Transport Reviews 36 (4): 500–526. https://doi.org/10.1080/01441647.2015.1113574. Pedersen, Timothy, Thomas Nygreen, and Anders Lindfeldt. 2018. “Analysis of Temporal Factors Influencing Minimum Dwell Time Distributions.” WIT Transactions on the Built Environment 181: 447–58. https://doi.org/10.2495/CR180401. Peterson, A. 2012. “Towards a Robust Traffic Timetable for the Swedish Southern Mainline.” In WIT Transactions on the Built Environment, 127:473–84. https://doi.org/10.2495/CR120401. ProRail. 2016. “Network Statement 2017.” : ProRail.

101 Qi, Zhang, Han Baoming, and Li Dewei. 2008. “Modeling and Simulation of Passenger Alighting and Boarding Movement in Beijing Metro Stations.” Transportation Research Part C: Emerging Technologies 16 (5): 635–49. https://doi.org/10.1016/j.trc.2007.12.001. Rietveld, P., F. R. Bruinsma, and D. J. Van Vuuren. 2001. “Coping with Unreliability in Public Transport Chains: A Case Study for Netherlands.” Transportation Research Part A: Policy and Practice 35 (6): 539–59. https://doi.org/10.1016/S0965- 8564(00)00006-9. Rietveld, Piet. 2007. “Six Reasons Why Supply-Oriented Indicators Systematically Overestimate Service Quality in Public Transport.” Transport Reviews 25 (3): 319– 28. https://doi.org/10.1080/0144164042000335814. Rudolph, Raphaela. 2003. “Allowances and Margins in Railway Scheduling.” In Proceedings of the World Congress on Railway Research. Sa’adin, Binti Sazrul Leena, Sakdirat Kaewunruen, and David Jaroszweski. 2016. “Risks of Climate Change with Respect to the Singapore-Malaysia High Speed Rail System.” Climate 4 (4): 65. https://doi.org/10.3390/cli4040065. Sandblad, Bengt, Arne W. Andersson, and Simon Tschirner. 2015. “Information Systems for Cooperation in Operational Train Traffic Control.” In Procedia Manufacturing, 3:2882–88. The Authors. https://doi.org/10.1016/j.promfg.2015.07.793. SBB. 2018. “SBB Facts and Figures 2018.” Bern: SBB-CFF-FFS. https://www.itu.int/en/ITU-D/Statistics/Pages/facts/default.aspx. Seriani, Sebastian, and Rodrigo Fernandez. 2015. “Pedestrian Traffic Management of Boarding and Alighting in Metro Stations.” Transportation Research Part C: Emerging Technologies 53: 76–92. https://doi.org/10.1016/j.trc.2015.02.003. Solinen, Emma, Gemma Nicholson, and Anders Peterson. 2017. “A Microscopic Evaluation of Railway Timetable Robustness and Critical Points.” Journal of Rail Transport Planning and Management 7 (4): 207–23. https://doi.org/10.1016/j.jrtpm.2017.08.005. Stenström, Christer, Aditya Parida, Jan Lundberg, and Uday Kumar. 2015. “Development of an Integrity Index for Benchmarking and Monitoring Rail Infrastructure: Application of Composite Indicators.” International Journal of Rail Transportation 3 (2): 61–80. https://doi.org/10.1080/23248378.2015.1015220. Stockholms läns landsting. 2017. “Trafiknämnden/Trafikförvaltningen Årsrapport 2017.” Annual report. Stockholm: Stockholms läns landsting. Thaduri, Adithya, Diego Galar, and Uday Kumar. 2015. “Railway Assets: A Potential Domain for Big Data Analytics.” Procedia Computer Science 53 (1): 457–67. https://doi.org/10.1016/j.procs.2015.07.323. Trafikanalys. 2016. “Punktlighet På Järnväg 2015.” Report. Stockholm: Trafikanalys. ———. 2018a. “Bantrafik 2017.” Report. Stockholm: Trafikanalys. ———. 2018b. “Transportarbete 2000–2017.” Report. Stockholm: Trafikanalys. ———. 2019. “Punktlighet På Järnväg 2018.” Report. Stockholm: Trafikanalys.

102 Trafikverket. n.d. “Lupp v2.3. UH/Järnvägssystem. Systemöversikt.” Internal document. Borlänge: Trafikverket. ———. 2015a. “Network Statement 2017.” Borlänge: Trafikverket. ———. 2015b. “Noder i Järnvägssystemet.” Internal document. Borlänge: Trafikverket. ———. 2017a. “Järnvägens Kapacitetsutnyttjande 2016.” Report. Borlänge: Trafikverket. ———. 2017b. “Riktlinjer Täthet Mellan Tåg.” Internal document. Borlänge: Trafikverket. ———. 2018a. “Analysmetod Och Samhällsekonomiska Kalkylvärden För Transportsektorn: ASEK 6.1.” Report. Borlänge: Trafikverket. ———. 2018b. “Lupp Uppföljningssystem.” System Och Verktyg - Förvaltning Och Underhåll. 2018. https://www.trafikverket.se/tjanster/system-och- verktyg/forvaltning-och-underhall/Lupp-uppfoljningssystem/. ———. 2018c. “Tillsammans För Tåg i Tid Resultatrapport 2017.” Report. Borlänge: Trafikverket. ———. 2018d. “Trafikledning.” För Dig i Branschen - Järnväg. 2018. https://www.trafikverket.se/for-dig-i-branschen/jarnvag/Trafikledning/. ———. 2019. “Trafikverkets Årsredovisning 2018.” Annual report. Borlänge: Trafikverket. Veiseth, M., N. Olsson, and I.A.F. Saetermo. 2007. “Infrastructure’s Influence on Rail Punctuality.” WIT Transactions on the Built Environment 96 (August 2007): 481–90. https://doi.org/10.2495/UT070451. Vekas, Peter, M.H. van der Vlerk, and Willem K. Klein Haneveld. 2012. “Optimizing Existing Railway Timetables by Means of Stochastic Programming,” In: Stochastic Programming E-print Series. 8. Vromans, Michiel. 2005. “Reliability of Railway Systems.” Doctoral thesis. Rotterdam: Erasmus University Rotterdam. Wallander, Jouni, and Miika Mäkitalo. 2012. “Data Mining in Rail Transport Delay Chain Analysis.” International Journal of Shipping and Transport Logistics. https://doi.org/10.1504/ijstl.2012.047492. Wang, Ying. 2018. “Incorporating Weather Impact in Railway Traffic Control.” Doctoral thesis. Leeds: University of Leeds. Watson, Robert. 2008. “Train Planning in a Fragmented Railway: A British Perspective.” Doctoral thesis. Loughborough: Loughborough University. Welch, B. L. 1947. “The Generalization of Student’s’ Problem When Several Different Population Variances Are Involved.” Biometrika 34 (2): 28–35. Wiggenraad, P. B. L. 2001. “Alighting and Boarding Times of Passengers at Dutch Railway Stations.” Report. Delft: Delft University of Technology. Wiklund, Mats. 2006. “Förebyggande Underhållsåtgärders Effekt På Järnvägstransportsystemets Sårbarhet Försök Med Delfimetoden.” VTI Notat 23- 2006. Report. Linköping: VTI. Xia, Yuanni, Jos N. Van Ommeren, Piet Rietveld, and Willem Verhagen. 2013. “Railway Infrastructure Disturbances and Train Operator Performance: The Role of Weather.” Transportation Research Part D: Transport and Environment 18 (1): 97–102. https://doi.org/10.1016/j.trd.2012.09.008.

103 Xu, Peijuan, Francesco Corman, and Qiyuan Peng. 2016. “Analyzing Railway Disruptions and Their Impact on Delayed Traffic in Chinese High-Speed Railway.” In IFAC- PapersOnLine, 49:84–89. Elsevier B.V. https://doi.org/10.1016/j.ifacol.2016.07.015. Yabuki, H., Y. Takatori, M. Koresawa, and N. Tomii. 2018. “Increasing Robustness of Timetables by Deliberate Operation of Trains to Shorten Headways.” International Journal of Transport Development and Integration 1 (3): 432–41. https://doi.org/10.2495/tdi-v1-n3-432-441. Yuan, Jianxin, and Ingo Arne Hansen. 2002. “Punctuality of Train Traffic in Dutch Railway Stations” Proceedings of the International Conference on Traffic and Transportation Studies (ICTTS) 2002: 522–29. https://doi.org/10.1061/40630(255)73. Yuan, Jianxin, and Ingo Arne Hansen. 2008. “Closed Form Expressions of Optimal Buffer Times between Scheduled Trains at Railway Bottlenecks.” In IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC, 675–80. IEEE. https://doi.org/10.1109/ITSC.2008.4732539. Zakeri, Ghazal, and Nils O.E. Olsson. 2017. “Investigation of Punctuality of Local Trains - The Case of Oslo Area.” Transportation Research Procedia 27 (2214): 373–79. https://doi.org/10.1016/j.trpro.2017.12.080.

104 List of Included Papers

Paper 1. Palmqvist, C.-W., Olsson, N. O. E. & Hiselius, L. (2017). Delays for passenger trains on a regional railway line in southern Sweden. International Journal of Transport Development and Integration, 1(3), 421–431.

Paper 2. Palmqvist, C.-W., Olsson, N. O. E. & Hiselius, L. (2017). Some influencing factors for passenger train punctuality in Sweden. International Journal of Prognostics and Health Management.

Paper 3. Palmqvist, C.-W., Olsson, N. O. E. & Hiselius, L. (2017). An Empirical Study of Timetable Strategies and Their Effects on Punctuality. Proceedings of the 7th International Conference on Railway Operations Modelling and Analysis – RailLille 2017 (peer-reviewed).

Paper 4. Palmqvist, C.-W., Olsson, N. O. E. & Winslott Hiselius, L. (2018). The Planners’ Perspective on Train Timetable Errors in Sweden. Journal of Advanced Transportation, 2018, 1–17.

Paper 5. Palmqvist, C.-W., Tomii, N. & Ochiai, Y. (2019). Dwell Time Delays for Commuter Trains in Stockholm and Tokyo. In Press. 8th International Conference on Railway Operations Modelling and Analysis – RailNorrköping 2019 (peer-reviewed).

Contribution to papers For Papers 1-4 I was the main author, responsible for the research questions, collecting and processing both quantitative and qualitative data, as well as most references. I designed and performed the analyses, in dialogue with the co-authors, drew conclusions and wrote the papers. The co-authors helped in finding supplementary literature, in discussing the analyses, and in structuring and editing the papers. For Paper 5 I was also the main author, responsible for the research question, processing and analysing the data, as well as writing the paper. The co- authors provided access to, and explained, the Japanese data. They also helped to find literature on Japanese railways, explained the policies and practices of Japanese railway companies, and assisted in fact-checking the manuscript.

105

Paper 1

C.W. Palmqvist et al., Int. J. Transp. Dev. Integr., Vol. 1, No. 3 (2017) 421–431

Delays for passenger trains on a regional railway line in southern Sweden

C.W. Palmqvist1, N.O.E. Olsson2, L. Hiselius1 1 Lund University, Department of Traffic and Roads, Sweden. 2 NTNU Norwegian University of Science and Technology, Norway.

Abstract The aim of this article is to study how selected variables influence delays in train traffic. Data has been collected on train movements, timetables, weather and capacity utilization on a highly utilized single-track railway line in southern Sweden during 2014. Based on this dataset, we have analysed how different factors affect delays in passenger traffic. We measure delays in a novel way, as deviations from the scheduled duration for each line section and station stop, not as deviations from a published or operational timetable, and this allows us to identify when and where the delays first occur.A verage delays were much larger at station stops. The most significant factor affecting delays was the scheduled duration time at station stops and the existence of margins on line sections. If trains arrive to a line section or station stop slightly delayed they speed up the activity, otherwise they are typically delayed. The influence of weather was less significant and somewhat contradictory: snow and cold temperatures increase delays on line sections but reduce them at station stops, while precipitation made no differ- ence. Capacity utilization seems to have a negative correlation with delays, but we have too little vari- ation in the levels to be confident.A ll studied variables, except for precipitation, have impacts that are statistically significant to a very high degree of confidence, using both t-tests and regression analysis. The results of this study have important practical implications for timetable construction; for instance we estimate that a reallocation of scheduled time at stations could reduce delays by as much as 80%. Keywords: allowances, delays, margins, punctuality, railroad, railway, station stop, timetable, trains, travel time variation.

1 INTRODUCTION Punctuality is a key performance indicator and important success factor for the railway sys- tems, and a failure in this key factor affects the competitiveness of railway transports relative to other transport modes [1]. A number of factors can influence the precision of the railway traffic [2]. Harris [3] found that distance covered and train length were statistically significant in determining punctuality. Running time margins are generally added to the nominal running time when setting up a timetable, as described by Goverde [4]. Nominal running times are based on a maximal speed profile in normal conditions. To make a timetable realistic, margins are added to allow for variations in rolling stock performance, individual driving patterns for drivers, variations in weather conditions and other factors that may influence the running time. Running time margins are typically expressed as a percentage of the nominal running time. International guidelines for running time margins are given in [5]. According to Pachl [6], common running time margins in Europe are in the range of 3% to 7% and 6% to 8% in North America. According to Goverde [4], the Dutch Railways use a margin of 7% of the nominal

This paper is part of the proceedings of the 15th International Conference on Railway ­Engineering Design and Operation (COMPRAIL) www.witconferences.com

© 2017 WIT Press, www.witpress.com ISSN: 2058-8305 (paper format), ISSN: 2058-8313 (online), http://www.witpress.com/journals DOI: 10.2495/TDI-V1-N3-421-431 422 C.W. Palmqvist et al., Int. J. Transp. Dev. Integr., Vol. 1, No. 3 (2017) running time. Palmqvist [7] found that the actual running time margins for a selected line in Sweden were about 10%. Punctuality, disturbances and the optimal distribution of supple- ments are discussed at length in Vromans [8]. Kroon et al. [9] developed a measure to assess at which part of the line time supplements were added. Parbo et al. [10] discuss the use of percentiles of a sample of actual times instead of defining theoretical running times and then add a percentage as margin. Ceder and Hassold [11] found that one of the main causes for delays in New Zealand was heavy passenger load, which increases dwell times. Bender et al. [12] state that time required for passenger boarding and alighting at stations is a critical element of overall train service performance. Harris et al. [13] studied delays at stations in the Oslo area. They claim that these delays are often small in nature, poorly recorded and not well understood. Weather conditions can influence the railway traffic. Xia et al. [14] estimate the effects of weather conditions on railway operator performance of passenger train services. They found that wind gust, snow, precipitation, temperature and leaves contributed to infrastructure dis- ruptions. Wei et al. [15] studied the cascade dynamics of delay propagation during severe weather. Ludvigsen and Klaboe [16] and UIC [17] present how weather conditions affect train operator and railway infrastructure performance. Railway capacity is recognized as an important factor regarding train punctuality [18, 19]. A growing data volume and data availability has enabled researches to study delays and delay causes on a relatively detailed level. Markovic et al. [20] proposed machine learning models for analysing the relation between train arrival delays and selected characteristics of the railway system. Gorman [21] used econometric methods on US freight rail data to predict congestion delays. Total train running time was predicted based on free running time predictors (horsepower per ton, track topography and slow orders) and congestion-related factors (meets, passes, overtakes, number of trains, total train hours, train spacing variability and train departure headway). Wallander and Mäkitalo [22] used a data-mining approach for analysing rail transport delays with the aim of developing a more robust timetable structure and provide tools for rail network planning. This article has a similar objective as Wallander and Mäkitalo [22] and shows an example of how this type of analysis could be carried out. We relate delays to deviations from the scheduled running times between stations and to scheduled station times in order to distin- guish between not only the places of occurrence of the delay, but also the actual use of the infrastructure. There are well-established models based on physics for calculating nominal train running times, and such calculations are implemented in timetabling support tools. Time at stations has traditionally been set more empirically or based on tradition. Our aim is to map devia- tions from these scheduled reference times, for each section along a selected line, and to investigate influencing factors. By focusing on how long an activity should take and how long it actually took, this approach also facilitates discussions related to the most efficient use of the infrastructure. We perform the analyses for both line sections and station stops. The main aim of this article is consequently to contribute to the methods by which devia- tions can be understood through this novel approach on defining deviations. The article fur- thermore contributes to the much-needed knowledgebase on deviations and punctuality in the railway system in order to reach a robust system, yet keeping an efficient use of the infra- structure. The analysis is based on data from a single railway track in Sweden and various variables possibly influencing the occurrence and size of the delays. C.W. Palmqvist et al., Int. J. Transp. Dev. Integr., Vol. 1, No. 3 (2017) 423

2 METHOD

2.1 Key terminology and calculations

Punctuality is measured as percentage of late trains [1], while delays are measured by a time unit, such as minutes. Punctuality and delays provide an overview of the performance of a railway system and can be summarized to easily communicated numbers, such as punctuality of all trains on a line for a period, average delay or aggregated delay hours. However, from an analytical point of view, a lot of information is hidden in such aggregation. When analysing railway traffic, nuanced presentations with higher resolution are desired. Rietveld et al. [23] measure unreliability as deviations from a timetable. Noland and Polak [24] discuss travel time variability using a distribution of arrival times, without referencing a timetable. Nicholson et al. [25] instead propose three measures for the term ‘resilience’. The analyses in this article are based on differences between the actual and scheduled durations of activities, be they movements along line sections or station stops. If a stop at a station is scheduled to take two minutes but actually takes three, we would call that a delay of one minute. In the rare case that the stop only took one minute instead of two, that would register as a negative delay of one minute (-1). This article thus differs from what is typically meant by ‘delay’ in that, beyond the scheduled duration and any margins in it, we are not concerned with the timetable per se. If a train is scheduled to arrive at a station at 12:00 and depart at 12:02, thus with a scheduled duration of two minutes, but it actually arrives at 12:01 and departs at 12:03, with an actual duration of two minutes, we would not register that as a delay. Instead, we would have registered the delay, with relation to the arrival and departure times, where and when the delay first occurred. The article furthermore differs from what is typically meant by ‘travel time variability’ in that we use the timetable as the baseline for how long activities ‘should’ take, instead of the average duration as measured over some time span. If a stop at a station is scheduled to take two minutes, but actually takes three minutes, we would consider that a delay of one minute even if the average dwell time is three minutes. From the perspective of travel time variabil- ity, however, there would be no deviation. In the article, we use the term ‘average delay’. What we mean is an average conditional on the value of an explanatory variable. We also discuss relative risk, which indicates how much a factor increases or decreases the average delay in relative terms. The relative risk of delays is calculated as the average delay conditional on some value of an explanatory value, divided by the average delay across all station stops or line sections, as the case may be. This makes it easier to discern the effect of factors where the values or differences are small in absolute terms. The analysis of this article also makes use of the delay contribution. Here we multiply the average delay with the number of observations and compare this to the total amount of delays. This allows us to identify which levels lead to the most delays, overall. To ease comparison between the differing lengths of line sections, between the speeds of trains, and so on, we have chosen to use ratios instead of absolute time units. Throughout we have chosen to use the scheduled duration as the denominator. If the scheduled duration is 100 seconds and the total margin for that activity is 10 seconds, we would register that as an allowance of 0.10 or 10%. 424 C.W. Palmqvist et al., Int. J. Transp. Dev. Integr., Vol. 1, No. 3 (2017)

For durations we normalize with the average duration for the respective activity. Across our sample those averages are approximately 100 seconds for station stops and 197 seconds for line sections. A scheduled stop of two minutes is thus translated to a duration of 1.20 while a scheduled movement along a line section of two minutes would translate to 0.61. We have then rounded these ratios to one or two decimal points to limit the number of distinct values we work with and to ensure that there are enough observations in each bin for their average delays to be reasonably stable.

2.2 Data

The railway line analysed in this study is 113 km long, electrified, single track, with dense and heterogeneous regional traffic. This makes it a fairly typical case in a Scandinavian con- text. It is located in the far south of Sweden and goes almost straight to the east from Helsing- borg on the western coast to near the eastern shore, connecting the and the Southern Main Line. The article is based on four main datasets: train movements on the railway line for the year of 2015, detailed timetables for these trains, weather observations from around the region and the calculated capacity utilization along the railway. These datasets have been linked together through a series of Access databases, resulting in data for over 198,000 stops at 13 stations and 363,000 movements along 20 line sections. Details about the component datasets follow below. The core of our data is that of registered train movements. We do not have data from every signal, but at certain points, typically where trains can meet. Arrival and departure of the trains are logged for purposes of traffic control, punctuality measurement, etc. From these logs we can calculate the actual duration of the activity, whether it was a stop at a station or a movement between them. Timetables for trains in Sweden are created and stored in the tool TrainPlan. The geograph- ical resolution is almost the same as above. We have received exports from this programme, in which we can see every planned activity and its allotted time. This covers both stops and movements. From this we extract the scheduled duration and any margins that may have been allotted. We have meteorological observations from measuring stations in connection to the studied railway line, over the variables – snow depth, precipitation and temperature. The two former variables are measured once per day, the latter every hour. Snow depth is presented as metres, with a precision of two decimal points. Temperature is in degrees Celsius, precipitation in millimetres. The infrastructure manager has adopted an interpretation of the UIC [18, 19] guidelines for how to calculate the capacity utilization on line sections and provided us with the results. However, this framework does not include a method to calculate the utilization of capacity at stations. In order to look at this, we define our own measure. First we calculated, for every day and hour, how many trains passed through the station. Then we took the maximum value and set it as the maximum capacity for every given station. Finally, we compared the utiliza- tion over every day and hour with the maximum for that station to get two ratios, one for the daily utilization and one for the peak utilization. A summary of studied variables is found in Table 1. C.W. Palmqvist et al., Int. J. Transp. Dev. Integr., Vol. 1, No. 3 (2017) 425

Table 1: Description of studied variables. Variable Description Earlier delay The train’s delay at the beginning of the ac- tivity, as a fraction of the scheduled duration of the activity. Early arrivals are counted as negative delays. Scheduled duration The amount of time scheduled for an activity, be it line section or station stop, before mar- gins are applied. Normalized by the average scheduled duration for the respective activity and presented as a fraction. Margins The amount of margins in the timetable for the current activity. Normalized by the scheduled duration and presented as a per- centage. Temperature Degree Celsius, measured hourly. Precipitation Rain or snowfall [mm], measured daily. Snow Depth of snow [m], measured daily. Day of the week Monday (1) through Sunday (7) Month of the year January (1) through December (12) Capacity utilization Percentage of maximum theoretical capacity on line section or at station. Calculated and analysed both for an entire day and for peak load.

2.3 Analysis

In order to analyse the impact of these variables on delays we make use of three basic steps. The first is to determine if the studied variables had a statistically significant impact on the delays. We used Welch’s t-test to do this, which is a more general version of Student’s t-test, which permits the studied samples to have different sizes and variances. Here the data is seg- mented based on the value of the explanatory variable. For example, we look at temperatures below zero and temperatures that are above zero and find out whether the average delays for these samples are significantly different from each other. A second step is to perform linear regressions for delays at station stops and line sections respectively in order to give an overview of the trend of each studied variable. These are not intended to be the focus of the article or to provide good predictive models. This regression analysis assumes that, instead of there being two categories within each variable, there is a linear relationship between the studied variable and delays occurring. The third step of the analysis includes three different plots for each variable across line sections and station stops, each emphasizing different aspects of the data. Each explanatory variable is plotted against the average delay, the relative risk and the delay contribution as defined in Section 2.1 for line sections and station stops respectively. These plots together give a more complete and nuanced picture than either the t-tests or the regressions, and visual 426 C.W. Palmqvist et al., Int. J. Transp. Dev. Integr., Vol. 1, No. 3 (2017) inspection can reveal if the studied variables vary along with delays in any recognizable pat- tern or if there is no pattern at all. Due to limitation of space this article presents only two diagrams. The result is thus mainly presented in text.

3 reSULTS AND DISCUSSION

3.1 line sections and station stops

The presentation of the results begins with an overview over the average delays and prob- abilities of being delayed on line sections and at station stops. As Table 2 shows, the average delay for a train on a line section is very close to zero, while significant delays are to be expected at the scheduled station stops. The values in Table 2 are normalized with regard to the scheduled duration. Table 2 also illustrates that for movements on line sections, the probability of being exactly on time is higher than at stations, and the probability of gaining time is higher than that of losing it. At stations almost half of all stops take longer than scheduled. The delays that do occur also tend to be larger at station stops than on line sections.

3.2 T-test and regression analysis

To test the significance of the studied variables we first use Welch’s t-test. The results of these are summarized in Table 3. Most of the variables were found to have a highly significant impact on delays. We then perform regression analyses on the line section and station stop data respectively. Both of these yield results similar to the t-test, and the key results are also summarized in Table 3. A number of coefficients differ in sign between the regression for line sections and station stops: the three weather variables increase delays on line sections while reducing them at stations, while month by month the delays decrease on line sections and increase at stations.

3.3 about the explanatory variables

In the following section we discuss the results of visual analysis of more than 60 plots over the explanatory variables impacts on delays. This analysis allows us to see patterns that are not visible in Table 3. We do not have the space to show more than a couple of these diagrams but briefly discuss the others in text.

3.3.1 Timetable factors Visual analysis indicates that small earlier deviations from the timetable affect the occurrence delays on line sections. One possible interpretation is that if trains arrive a little before the Table 2: average delays and probabilities of delay. Summary statistics Probability of finishing activity Average Standard Early On time Delayed Delayed by delay deviation more than 100% Line section 0.00 0.01 24% 58% 18% 0% Station stop 0.44 1.21 19% 34% 47% 41% C.W. Palmqvist et al., Int. J. Transp. Dev. Integr., Vol. 1, No. 3 (2017) 427

Table 3: results of t-tests and regression analysis Input Welch’s t-test Regression Variable Source Value Aver- St. Value Aver- St. p- Coef- p- age Dev. age Dev. value ficient value Earlier delay Line >0 0.00 0.00 <=0 0.00 0.01 0.00 0.00 0.00 of train section Station 0.14 1.11 0.78 1.22 0.00 −0.05 0.00 stop Scheduled Line >1 0.00 0.01 <=1 0.00 0.01 0.27 0.00 0.00 duration section relative to Station 0.08 0.73 0.54 1.30 0.00 −0.22 0.00 average stop Margins in Line >0 0.00 0.01 <=0 0.00 0.01 0.00 −0.01 0.00 timetable section Station −0.43 0.92 0.44 1.21 0.00 0.00 0.00 stop Line <0 0.00 0.01 >=0 0.00 0.01 0.00 0.00 0.00 Temperature section Station 0.35 1.20 0.44 1.21 0.00 0.00 0.01 stop Precipitation Line >0 0.00 0.01 0 0.00 0.01 0.05 0.00 0.00 section Station 0.44 1.16 0.44 1.24 1.00 0.00 0.39 stop Snow depth Line >0 0.00 0.01 0 0.00 0.01 0.00 0.02 0.00 section Station 0.29 1.14 0.45 1.21 0.00 −1.16 0.00 stop Day of week Line 6-7 0.00 0.01 1-5 0.00 0.01 0.00 0.00 0.00 section Station 0.42 1.17 0.44 1.22 0.01 −0.01 0.00 stop Month of Line 0.00 0.00 year section Station 0.02 0.00 stop Capacity Line 0.01 0.00 utilization section calculated Station 0.06 0.00 daily stop Capacity Line −0.01 0.00 utilization section calculated at Station −0.03 0.10 peak stop 428 C.W. Palmqvist et al., Int. J. Transp. Dev. Integr., Vol. 1, No. 3 (2017)

Figure 1: relative risk of delays at different margin levels on line sections.

Figure 2: average delays at stations, scheduled duration. scheduled arrival time the drivers seem to slow down, and if they are a little bit late, they try to catch up as best they can. This effect disappears as the deviations become larger, both in positive and negative values. Station stops were shortest when trains arrived a few minutes after the scheduled arrival time. Trains that arrived either on time or early to the station were on average delayed at the stations so that they depart with a delay of around half a minute, regardless of how early they were. Trains that arrived more than a few minutes late were on average delayed even more at the station. On line sections margins have greater impact on delays than the scheduled duration. While longer line sections on average had slightly lower delays than shorter ones, the effect is not statistically significant, as we see in Table 3. Most delays on line sections in our sample are caused when trains have no margins, although negative margins do occur and contribute to a small degree. The relationship is illustrated in Fig. 1. The most effective level for delay reduc- tion is around 10%, while the greatest delay reduction is achieved with margins around 80%. On line sections we find this to be the single most important variable for explaining delays. Margins are less well defined at station stops compared to margins on line sections. In the sample we study only a few hundred trains have clearly labelled margins, and those are on the order of hours per station stop, clearly intended for other purposes. Instead, the sched- uled duration of the stop is more important and is manually set by the timetable constructor. Figure 2 illustrates that the average delay is significant if the scheduled duration is lower than 160 seconds. The greatest reduction in delays occurs at a duration of 210 seconds, and C.W. Palmqvist et al., Int. J. Transp. Dev. Integr., Vol. 1, No. 3 (2017) 429 thereafter additional station time increases the average delays. This variable, scheduled time at station stops, explains more of the delays than any other, both at stations and in total. Knowing this, we might seek to reallocate the time at stations so as to reduce the delays. To do this we chose a strategy where the total scheduled duration across all stations stops is not allowed to increase, while trying to minimize the total delay time. As a result, we set 21% of the station stops to 210 seconds and the rest at 50 seconds. Using the estimated effect on delays from Fig. 2, we found that this reallocation could reduce delays by as much as 80% without increasing the scheduled travel times.

3.3.2 Weather variables None of the weather variables we study are found to have a large impact on delays; the coef- ficients and differences in average delays are relatively small in Table 3, even if the impact is statistically significant for both snow and temperature. Visual inspection reveals that an increasing snow depth increases the delays on line sec- tions and reduces delays at stations, while precipitation does not seem to have a significant effect on delays at all. Temperatures below zero contribute to raising the average delays on line sections, while milder temperatures pull the average towards trains arriving early. At sta- tions we find quite the opposite: the risk of delays decreases around the freezing point, while more normal temperatures are associated with delays.

3.3.3 Time: days of the week, months of the year Visual inspection reveals a noticeable and statistically significant difference between the average delays across the seven days of the week, with delays being the smallest at the week- end and largest on Mondays. This is almost entirely explained by how many trains run each day, as the correlation between the daily averages and the number of trains running on the same days is over 92% on line sections and 70% at stations. The risk of being delayed also varies from month to month. On line sections we do not dis- cern any pattern or seasonal effect, while at stations the risk increased slowly as the months went by. The risk was around 25% lower than average during the early part of the year and about as much higher towards the end.

3.3.4 Capacity utilization Capacity utilization is the factor that we are the least confident about, both for line sections and station stops, and especially using the peak measure. The railway line we study is not very long and only has three different levels of capacity utilization for line sections, calculated by the infrastructure manager. The regression analysis reported in Table 3 showed a small positive correlation, but our visual inspection instead shows a negative correlation between capacity utilization and delays. We hypothesize that this mismatch is due to a too small number of distinct capacity utilization levels in our sample and does not draw any strong conclusions about this variable. Capacity utilization is not defined and calculated in the same way for stations as for line sections, so we define and use a measure of our own, described in the method chapter. When we use the daily measure for stations, the risk is somewhat elevated when the capacity utili- zation is 70%. Using the peak measure, which we are less confident about methodologically, we instead see a reduction at the same level. Again, we do not think any strong conclusions should be drawn from this. 430 C.W. Palmqvist et al., Int. J. Transp. Dev. Integr., Vol. 1, No. 3 (2017)

4 CONCLUSIONS In this article we have used a novel approach of looking at where delays occur, a method which makes it easy to distinguish between delays occurring on line sections and those occur- ring at station stops. By focusing on how long an activity should take and how long it actually took, this approach also facilitates discussions about efficient use of the infrastructure. The results indicate that variables influencing delays do so to different degrees, sometimes even with different signs, on line sections and at station stops. We also showed that, at least on this railway line, delays at station stops were by far the biggest problem. We found that the timetable was the most important factor for explaining delays. Trains with slight delays make better time on average, both on line sections and at station stops. Those that instead are early or with larger delays typically require more time, increasing delays. When margins of around 10% exist on line sections the risk of delays falls, while larger margins are less efficient. On station stops a duration of 210 seconds is optimal for reducing delays, and we propose that these should be mixed with stops of 50 seconds to obtain a good balance between minimizing delays and minimizing the expected travel time. We estimate that by reallocating existing station time in this manner, it is possible to reduce delays by as much as 80%. Weather had some contradictory impacts. On line sections delays increase as temperatures fall, or if there is snow. At stations the same conditions actually improve performance, and in our sample the effect at stations was more important. Precipitation had no significant impact on delays. The average delays are slightly smaller on weekends, especially on line sections, and this is mostly because there are fewer trains running then than during the week. The number of trains also explains part of the variation of delays between months when we look at stations, but not on line sections. We do not detect a clear pattern regarding capacity utiliza- tion and its effect on delays. The results of this study have important practical implications for railway traffic planning. Some factors, such as weather, cannot be influenced, but their impact on traffic is still impor- tant to know when establishing plans. Other factors, such as the distribution of margins, and length of station stops are factors that planners can influence. From a research perspective we have shown that it is both possible and useful to gather and analyse large volumes of data about train movements in this manner: analysing even small deviations from the timetable, identifying when and where the deviations occur, considering the differences between line sections and station stops, and combining several datasets and explanatory variables. We shall continue our research expanding the network size, time period and number of variables studied, and increasing the depth of analysis.

REFERENCES [1] Nyström, B., Aspects of Improving Punctuality, Luleå University of Technology Doc- toral Thesis, 2008. [2] Olsson, N.O.E. & Haugland, H., Influencing factors on train punctuality – results from some Norwegian studies. Transport Policy, 11(4), pp. 387–397, 2004. [3] harris, N.G., Planning Passenger Railways: A Handbook. A & N Harris, London, 1992. [4] goverde, R.M.P. Punctuality of Railway Operations and Timetable Stability Analysis. The Netherlands TRAIL Research School Thesis Series no. T2005/10, 2005. [5] International Union of Railways, UIC Code 451 Timetable recovery margins to guaran- tee timekeeping – Recovery margins, 2000. C.W. Palmqvist et al., Int. J. Transp. Dev. Integr., Vol. 1, No. 3 (2017) 431

[6] Pachl, J., Railway Operation and Control. VTD Rail Publishing: Mountlake Terrace, 2002. [7] Palmqvist, C.W., Running Time Supplements for High Speed Trains. Lund University Master Thesis 257, Lund, 2014. [8] Vromans, M.J.C.M., Reliability of Railway Systems. ERIM PhD series Research in Management 62, 2005. [9] Kroon, L., Dekker, R. & Vromans, M., Cyclic railway timetabling: A stochastic optimi- zation approach. Algorithmic Methods for Railway Optimization, 16, pp. 41–66, 2007. DOI: 10.1007/978-3-540-74247-0_2 [10] Parbo, J., Nielsen, O.A. & Prato, C.G., Passenger perspectives in railway timetabling: A literature review. Transport Reviews, pp. 1–27, 2015. [11] Ceder, A. & Hassold, S., Applied analysis for improving rail-network operations. Jour- nal of Rail Transport Planning & Management, 5, pp. 50–63, 2015. [12] bender, M., Büttner, S. & Krumke, S.O., Online delay management on a single train line: beyond competitive analysis. Public Transport, 5, pp. 243–266, 2013. DOI: 10.1007/s12469-013-0070-z [13] harris, N.G., Mjøsund, C.S. & Haugland, H., Improving railway performance in Nor- way. Journal of Rail Transport Planning & Management, 3(4), pp. 172–180, 2013. [14] Xia, Y., Van Ommeren, J.N., Rietveld, P. & Verhagen, W., Railway infrastructure distur- bances and train operator performance: The role of weather. Transportation Research Part D, 18, pp. 97–102, 2013. DOI:10.1016/j.trd.2012.09.008. [15] Wei, D., Liu, H. & Qin, Y., Modeling cascade dynamics of railway networks under inclement weather. Transportation Research Part E, 80, pp. 95–122, 2015. [16] ludvigsen, J. & Klæboe, R., Extreme weather impacts on freight railways in Europe. Natural Hazards, 70, pp. 707–787, 2014. [17] International Union of Railways, Winter and Railways, UIC/SIAFI Report, 2011. [18] International Union of Railways, UIC CODE 406 Capacity, 2004. [19] International Union of Railways, UIC CODE 406 Capacity, 2013. [20] markovic, N., Milinkovic, S., Tikhonov, K.S. & Schonfeld, P., Analyzing passenger train arrival delays with support vector regression. Transportation Research Part C, 56, pp. 251–262, 2015. [21] gorman, M.F., Statistical estimation of railroad congestion delay. Transportation Re- search Part E, 45, pp. 446–456, 2009. [22] Wallander, J. & Mäkitalo, M., Data mining in rail transport delay chain analysis. Inter- national Journal of Shipping and Transport Logistics, 4(3), pp. 269–285, 2012. [23] rietveld, P., Bruinsma, F. R. & van Vuuren D.J., Coping with unreliability in public transportation chains: A case study for Netherlands. Transportation Research Part A, 35, pp. 539–559, 2001. [24] Noland, R. B. & Polak, J. W., Travel time variability: a review of theoretical and empiri- cal issues. Transport Reviews, 22(1), pp. 39–45, 2002. [25] Nicholson, G., Kirkwood, D., Roberts, C. & Schmid, F., Benchmarking and evaluation of railway operations performance. Journal of Rail Transport Planning & Management, 5(4), pp. 274–293, 2015. DOI: doi:10.1016/j.jrtpm.2015.11.004

Paper 2

Some influencing factors for passenger train punctuality in Sweden

Carl-William Palmqvist1, Nils O.E. Olsson1,2, and Lena Winslott Hiselius1

1 Lund University Faculty of Engineering, Box 118, Lund, 221 00, Sweden 2 Norwegian University of Science and Technology, Høgskoleringen 1, 7491, Trondheim, Norway

[email protected]

ABSTRACT set and agreed upon by the industry, that by 2020 the punctuality across all trains should be 95%, measured as Punctuality is regarded as an important measure of the arriving at the destination with a delay of at most five performance of a railway system, and is the one most minutes. Since 2012 the punctuality has been steady at commonly used and discussed measure both in the industry around 90 % for passenger trains and slightly below 80% for and among travelers. In many countries, the punctuality of freight trains (Transport Analysis, 2016). Large and rapid trains, and thus the performance of the railway system, is improvements are thus required, if the target of 95% is to be deemed as lacking. The aim of this article is to study and reached on time. quantify how several weather-, timetable, operational and infrastructure-related variables influence punctuality in The purpose of this study is to identify and quantify the passenger train traffic. This can contribute to better impact of several weather, timetable, operational and understanding of the performance of railway systems, and infrastructure variables on the punctuality of passenger trains help identify possible improvements. in Sweden. This is a much broader scope than is typically The study is based on a dataset containing detailed timetables seen in the literature, using an extensive dataset. Most and records of all 32.4 million train movements for all trains previous research is focused on a single type of influencing in Sweden during the year of 2015, over 1.1 million variables, such as weather or timetable properties, using departures. Supporting this is a comprehensive register of limited time periods and geographies to illustrate the effects. over 80 000 infrastructure elements, and almost 87 million While this is often necessary, there is a gap in the research weather observations. that attempts to synthesize this knowledge in a more holistic We consider the size and allocation of margins, the existence approach. This paper is intended to bridge that gap. of negative margins, two measures for traffic volume, the journey time and distance, how often different vehicle 1.1. Previous Research individuals are used, the number of line and station interactions between trains, the amount of precipitation, the 1.1.1. Weather temperature, wind speed, snow depth, and eight types of infrastructure elements. We show how these variables affect The influence of weather and climate change on train delays punctuality, and estimate how much of the variation in and punctuality has received considerable attention in the punctuality can be explained by them. literature recently. Brazil et al. (2017) found that precipitation The findings can be used to design timetables, change delayed trains on a metropolitan rail line in Dublin, operational parameters and modify infrastructure design so combining a dataset of over 6 000 train departures with that punctuality improves. They can also help identify areas hourly observations of several weather variables. Zakeri and which should be prioritized in planning, maintenance and Olsson (2017) investigated the impact of weather on research. punctuality of local trains in the Oslo area, and found strong correlations between punctuality and temperatures below - 7℃ and snowfall of at least 15 cm. Xia, Van Ommeren, 1. INTRODUCTION Rietveld and Verhagen (2013) estimate how wind, Punctuality is an important factor for the attractiveness and temperature and precipitation cause disturbances in the efficiency of the railway sector. In Sweden, a target has been railway, mainly by damaging the infrastructure. Qin, Ma and ______Jiang (2017) model how rain, snow, and different Carl-William Palmqvist et al. This is an open-access article distributed under temperature thresholds affect delays on a regional railroad in the terms of the Creative Commons Attribution 3.0 United States License, Sweden. Ludvigsen and Klæboe (2014) describe how long which permits unrestricted use, distribution, and reproduction in any cold spells, heavy snowfall and strong winds have severely medium, provided the original author and source are credited. delayed freight trains across five European countries. Xu,

International Journal of Prognostics and Health Management, ISSN 2153-2648, 2017 020

INTERNATIONAL JOURNAL OF PROGNOSTICS AND HEALTH MANAGEMENT

Corman and Peng (2016) analyze the disruptions in the trains, and excessive passenger loads. Cerreto, Nielsen, Chinese high-speed railway, and find that almost 90% of Harrod and Nielsen (2016) present a preliminary study on the these are due to bad weather. Nagy and Csiszar (2015) quality of time supplement allocation in timetables, and on highlight the effects of weather conditions on the punctuality how trains recover or increase delays that occur during a of Hungarian passenger trains. Ferranti et al. (2016) study journey. Our own previous studies also indicate that how heat causes failures in the railroad infrastructure in properties of the timetable have significant impact on delays England, particularly in the signaling systems. They also and punctuality (Palmqvist, Olsson & Hiselius, 2017a). This discuss the concept of failure harvesting, in which often stems from several strategic decisions that a planner components that fail early in the season are replaced by newer must consider when designing a timetable. On a high level, and more resilient components, reducing the vulnerability these strategies address several issues. This includes the and number of failures later in the season. balance between precision and slack (Olsson et al., 2015), and the balance between using headways to assign buffers Falling leaves often impact punctuality in the autumn, as between trains or using time supplements to assign margins described by Xia et al. (2013), and Brazil et al. (2017) among within train journeys (Nelldal, 2009). Other issues are the others. The leaves create a mulch on the rails which lowers degree to which a cyclic timetable is desired, the the adhesion between the rails and wheels, increasing both heterogeneity of traffic and the degree to which braking and acceleration times, and thus causing delays. homogenization measures are to be employed (Nelldal, Because falling leaves are not typically measured or recorded Lindfeldt & Lindfeldt, 2009). In addition, questions arise like other weather-related variables, both sets of authors use about geographical accessibility and the design of the dummy variables for each month to try to capture this effect. network to be utilized, the balance between stops at end Tahvili (2016) describes extensively and in detail how snow, points or intermediate stations, and other issues. cold temperatures and strong winds cause problems for the An overview of the state of the art in timetable research is railroad, and how the Norwegian rail sector is undertaking provided by Hansen (2009). The author concludes that the winterization measures to reduce these problems in the key issue for high quality timetables is a precise estimation future. Lehtonen (2015) described the conditions of four of blocking times, considering the signals, platforms, train snow-rich winters in the Helsinki-area, and how they caused processing, and using realistic run and dwell times. This is significant issues for the railroads there. Jaroszweski, often not the case in practice. Similarly, queuing and Hooper, Baker, Chapman and Quinn (2015) describe how a simulation models inadequately reflect speed variations and storm in the UK caused very severe delays for both road and the behavior of railway staff. Planners have tools to make rail transport, severing the main link between England and timetables robust against delays, for example by adding time Scotland. Doll et al. (2014) present a case study for adapting supplements, lowering heterogeneity in the timetable by rail roads in Austria to the different climate of 2050, and having uniform stopping patterns, finding optimal speed and conclude that the damages due to weather will increase. reducing interdependencies between trains (Parbo et al., Kellermann, Bubeck, Kundela, Dosio and Thieken (2016) 2016). simulated how the changing climate will affect the frequency of critical meteorological conditions in the Austrian railroads, Carey (1999) discusses several different heuristic measures concluding that while snowfall and extremely cold of timetable reliability, with special consideration to knock- temperatures will become less frequent, intense rainfall and on delays. These include probabilities of delays, calculated in heat waves will become more common. Ford et al. (2015) several ways, and different headway based measures. The simulate how extreme weather events like heat waves and latter are found to be easier to use and calculate, because they floods will become more common in the UK as global do not require nearly as much data. warming continues, leading to increasing disruptions to the Scheepmaker and Goverde (2015) demonstrate that it is more railroad system. Oslakovic, ter Maat, Hartmann and Dewulf energy-efficient to distribute time supplements evenly along (2013) also study how weather conditions cause failures in a train route. Vromans (2005) introduces the measure WAD, infrastructure elements in the Randstad region of the or the Weighted Average Distance, to describe how Netherlands, and how these conditions are likely to become supplements are distributed along the journey, and attempted more frequent as the climate changes. to optimize this using both analytical and numerical methods for some hypothetical and real cases, concluding that a slight 1.1.2. Timetable shift towards the beginning is best. This is further discussed As Parbo, Nielsen and Prato (2016) show, timetable in Andersson (2014). Using simulation-based methods, characteristics are important influencing factors for delays Vromans (2005) and Vekas, van der Vlerk and Haneveld and robustness in railway traffic. Kim, Kang and Bae (2013) (2012) found that a uniform distribution was sub-optimal for present and categorize different types of train delays, delay recovery, given some assumptions of the delay concluding that the important causes in South Korea are short distributions. headways, short scheduled run times, delays of preceding

2

INTERNATIONAL JOURNAL OF PROGNOSTICS AND HEALTH MANAGEMENT

1.1.3. Operational 2.1. Datasets Influencing factors on train punctuality in Norway are This study is based on a database containing three main presented by Olsson and Haugland (2004). In short, the datasets. The core set contains all train movements in authors found that in congested areas the management of Sweden, over 32.4 million, derived from the track blocking boarding and alighting passengers is the key factor, while on and signaling systems for the timetable year of 2015, which single track lines the management of train crossings is the key we use to determine the punctuality of trains. The second set success factor. Gorman (2009) used statistical analysis to contains detailed exports of all train timetables in Sweden study which factors contributed the most to delays for freight during the timetable year of 2015, this covers almost 46,000 trains in the US. He found that the number of meets, passes distinct timetable versions and over 1.1 million departures. and overtakes consistently had the highest impacts, The third dataset contains all historical meteorological suggesting that congestion was the primary cause for delays. observations of snow depth, temperature, wind strength, precipitation in Sweden, which we use to estimate the Wiggenraad (2001) studied seven Dutch train stations in weather conditions in which the trains operated. These detail. He found that dwell times are longer than scheduled, datasets were linked together, and several filters were applied that the dwell times at peak and off-peak were the same, and so that it focuses on only passenger trains, and excludes that passengers concentrated around platform access points. incomplete observations. The remaining data covers over This suggests an improvement potential of shorter real dwell 883 000 completed passenger train journeys across Sweden times if travelers could be distributed more evenly along the during one year. Freight and service trains are not included in platforms. Along the same lines, Nie and Hansen (2005) the analysis, only passenger trains, because the pre- studied trains in the station area of The Hague. They found conditions between the different types are considered too that trains operate at lower than design speeds, and that dwell different, as is the handling of both timetable planners and times at platforms are systematically extended because of traffic control. other trains blocking their routes, and because of the behavior of train personnel. 2.2. Analyzed Variables 1.1.4. Infrastructure We analyze how 36 variables across four categories affect punctuality. The breakdown by category is as follows: six Veiseth, Olsson and Saetermo (2007) links infrastructure data variables related to weather, seven variables related to the with delay and punctuality data to study the infrastructure’s timetable, seven variables related to operations, and 16 influence on rail punctuality. They report that some 30 % of variables relating to eight types of infrastructure elements. delay hours in Norway are caused by infrastructure failures, These are described in the following sections. and suggest that the quality of punctuality data can be improved by connecting it with infrastructure and operational 2.2.1. Punctuality databases. Thaduri, Galar and Kumar (2015) discuss how the many systems and sub-systems in the railway can be studied We define punctuality in the following way: trains arriving at using big data analytics, and gives an overview of the main their scheduled stops with a delay not exceeding five minutes databases used in the Swedish railways. Norrbin, Lin and are considered punctual at that stop and are given a value of Parida (2016) discuss the concept of robustness for railway 1, otherwise a value of 0 is given. In this manner, infrastructure, and present a roadmap for studying and cancellations are counted as non-punctual. For each train, we improving it. Stenström, Parida, Lundberg and Kumar calculate the average of these values to arrive at a punctuality (2015) develop a composite indicator for benchmarking and measure. A train that has four stops and arrives at three of monitoring of rail infrastructure, considering four factors: them punctually, but is not punctual at the fourth, thus failure frequency, train delays, logistic time and repair time. receives the punctuality value of 0.75. In any aggregate of Nikolic et al. (2016) discuss the poor quality of the Serbian trains, the punctuality is calculated as the average of these railway infrastructure, and adapts the measure of Overall values and presented as a fraction. This additional step, of Railway Infrastructure Effectiveness, which is another taking the average across all scheduled stops, gives a more composite indicator developed in Sweden, for use in their holistic picture. In our case, it also improves the overall national network. punctuality of passenger trains by 2.43 percentage points, to 92.17%, compared to when punctuality is only measured at 2. METHOD the destination. We only consider the punctuality of trains, not the size, frequency or distributions for delays or This section describes in turn the datasets used, the variables disturbances. analyzed, and the method of analysis. Manually reported causes for delays are not utilized in this study, for a number of reasons. The reported error causes are already relatively well known and publicized in both Sweden and Norway (see for instance Swedish Transport

3

INTERNATIONAL JOURNAL OF PROGNOSTICS AND HEALTH MANAGEMENT

Administration, 2017, and Veiseth et al., 2007). Delays must Trains often travel long distances, through varying weather be relatively large and obvious for causes to be reported, in conditions. To account for this, and because the locations of our material 55% of delays are small enough that they would meteorological observations are typically different from not be categorized. Manual attribution of delays is also prone those of the train stations, we created an algorithm that to errors and often quite inconsistent (Nyström, 2008), with matches each of the train stations to the nearest an estimated reliability around 80 % (Nilsson, Björklund, meteorological station. The matching was done separately for Pyddoke & Vierth 2015). Research is ongoing on combining each weather variable, because not all meteorological stations the attributions with data of the kind that we use in this study, observe the same variables. And because some stations lack to cross validate among the different sources. observations on some days, the algorithm was set to match the two station sets for each day, to ensure that an observation In practice calculating punctuality is more challenging than it could always be given. may appear, as a substantial number of observations are missing in the data. These are points where the train has not As each train passes several train stations, which can have been canceled but there is no record of the train arriving or different values of the weather variables, there are several departing, and there are records of it arriving at surrounding ways in which to convert these different values to one single stations. One specific example is for airport trains which variable. With wind, we were interested in the highest speeds depart from Arlanda North and stop at Arlanda South one and chose to take the maximum. With temperature, we tried minute later, before heading towards Stockholm C. Very both the average, the minimum and the maximum. We ended often the record for Arlanda South is missing, despite there up choosing the minimum temperature for cold weather, and being records of the train leaving Arlanda North, only 570 m the maximum temperature for hot weather, arguing that we to the north along the same track, and then subsequently are most interested in the extremes, but found that the choice arriving in Stockholm. There are many other examples, made little difference in the analysis. For precipitation and especially for regional trains. We have dealt with this by snow depth, we considered the average, maximum and sum going through a loop of (1) using the average delay at the two of the measured variables. In the end, we found that the sum adjacent stations, when both of these records exist, and best explains the effect of precipitation on punctuality, otherwise (2) using the delay at the previous control point if despite the complicating factor of introducing the distance there is an observation there, or if not then (3) using the delay and number of measuring stations into the variable, we find at the next control point, if there is one there, and if the that the added explanatory value more than makes up for the observation was missing there as well then starting again at reduced independence of the variable. For snow depth, we (1). In this way, small gaps in the record are filled found the average to work very well. immediately, and larger gaps are filled in step by step, as the One of the variables we use to explain punctuality is the loop is repeated. By iterating this process six times, we difference in temperature. It basically represents the reduced the share of missing observations from 7.5 to 0.1 %. difference in temperature that a train is exposed to. Because After this we calculate the punctuality, in the way described of how we have chosen to handle the temperature data, this above. temperature difference should be interpreted as being across the geography, not across time. 2.2.2. Weather data We do not have access to any information on falling leaves, Temperature is measured, on average, about 18.7 times per and because the weather and change of seasons varies quite day and station. The average of these was taken to get a daily considerably within Sweden, we do not think that falling temperature value for each station. Wind strength was leaves can be captured as well by monthly dummies in our measured at fewer stations, with an average of 23 dataset, as may have been the case in countries like the observations per day. To convert to a daily wind value for Netherlands or Ireland. Accordingly, without access to data each station, we took the maximum value for each day, or good proxies, we exclude this phenomenon from our because we are mainly interested in stronger winds. Snow analysis. depth and precipitation is measured daily, but with data missing on average 9 % and 0.5 % of the days, respectively. 2.2.3. Timetable variables An overview of this data is given in Table 1. The seven timetable variables are summarized in Table 2. Table 1. Overview of weather variables In this paper, we look at the size of margins in two slightly Variable Unit No. Obs. No. different ways: as a percentage of the scheduled runtime stations freq. obs. without margins, and as seconds per kilometer. Temperature ℃ 247 Hourly 1 680 k Wind speed m/s 166 Hourly 1 400 k To measure the distribution of margins within a timetable, we Snow depth cm 276 Daily 91 k use the measure of Weighted Average Distance (WAD) Precipitation mm 585 Daily 212 k described in Vromans (2005). This is used to describe how the various time supplements in a timetable are balanced,

4

INTERNATIONAL JOURNAL OF PROGNOSTICS AND HEALTH MANAGEMENT

Table 2. Overview of timetable variables Table 3. Overview of operational variables Variable Unit Variable Unit Number of Total margins % observations Supplement density s/km Station interactions count 2 091 619 WAD of margins % Line interactions count 20 945 Negative margins binary Trains to same station & hour count 4 355 911 Avg. speed with stops km/h Movements per vehicle count 970 Travel time (without margins) h Movements per day count 364 Distance/no. of scheduled stops km Days run per train number count 5 726 Travel distance km 883 715 being more towards the beginning or end of the journey, or in between. It is expressed as a decimal and normally takes counted, and the count is used as a variable to explain values between 0 and 1, with lower values expressing a shift punctuality variations. towards the beginning and higher values a shift towards the As a measure of traffic intensity and station size, we count end of the journey. the number of trains arriving at the same station at the same In some timetables, there are negative margins: cases where hour as the train. Another measure of traffic intensity is the the scheduled time has been manually set to be shorter than number of movements per day during the studied year, which the technical minimum. We use a dummy variable with the we count across the whole network. Some vehicles are used value of 1 if there are any instances of this in a timetable, and more frequently than others, and to study the effect of this on 0 if there are not. punctuality we count the number of movements per vehicle individual during the timetable year. A movement is defined The travel time without margins, or the scheduled duration of as crossing one line section, between two control points. the journey, is included to help differentiate between the distance covered and the time in the system. Some train numbers are also run more frequently than others: some run almost every day, others only a handful of times or The average speed is calculated to include stopping times, even once during a year. Those that run more often might be and is derived by dividing the distance by the duration of the expected to perform better, because of increased routine and journey (as defined in the timetables). increased incentives. To study this, we count the number of Another timetable characteristic is the average distance days run per train number. between stops, which is calculated by dividing the distance While passenger volumes on both trains and stations are with the number of scheduled stops. believed to be an important factor for delays, this type of data This paper does not consider headway or buffer times, or is often confidential and difficult to access, so the effects of other measures of margins between trains, only margins these must be left for future studies. assigned within each train path. 2.2.5. Infrastructure elements 2.2.4. Operational variables From the Swedish rail asset management database BIS, see An overview of the seven studied operational variables is Thaduri et al. (2015) for a description, we have high level given in Table 3. information on eight types of infrastructure elements: their type and location. An overview of these is given in Table 4. The distance covered by trains has been known to influence punctuality since at least Harris (1992), who showed that We match all but 1 500 of 82 700 elements to the train distance covered was statistically significant in determining stations and railway links. We can distinguish between punctuality. We include travel distance as a parameter, elements in station areas and those on links, but choose not to measured in kilometers. do so in this analysis to keep the number of variables from growing too large, and because the results are largely the In this paper, as in a previous one (Palmqvist et al., 2017a) same for most variables. we consider and count interactions between trains. We define an interaction as an instance when two trains are at the same To add a dimension to this analysis, we also consider the place at the same time, which can happen at a station, or on a density of elements, not only their number. We do this by line section. When on a line section, the trains need to be dividing the distance traveled by a train by the count of traveling in the same direction to be counted as an interaction. elements of a given type passed by the same train, to arrive at This happens relatively frequently on double tracks, but is an average distance between the elements. also possible on some single tracks, if there are multiple blocks between two stations. The number of interactions are

5

INTERNATIONAL JOURNAL OF PROGNOSTICS AND HEALTH MANAGEMENT

Table 4. Overview of infrastructure elements Distance Power (Distance) Infrastructure element Approximate number 30,0% of elements y = 0,0003x1,0068 Signal 31 000 R² = 0,9545 Switch 15 500 Bridge 4 000 20,0% Tunnel 200 Level crossing 11 000 92.17% 10,0% Cutting 10 000 Embankment 2 000 Fence 9 000 0,0%

0 200 400 600 800 Punctuality Punctuality dropagainst averageof Whereas the number of infrastructure elements correlates Distance in km strongly with the distance traveled, as trains covering longer distances pass by more infrastructure elements, the density measures instead depend on where the trains travel and how Figure 1. Punctuality and the distance traveled in km dense the infrastructure is there, they are not dependent on %-points lower than the average of 92.17 %, plotted in Figure the distance covered. 1, and this difference is found to be statistically significant We do not study the age or condition of infrastructure using a two-sided Welch’s t-test, with a p-value approaching elements, or reported faults on them, because we do not have 0. From these t-tests we construct one plot per studied access to that data, only their number. Track works and variable, as in Figure 1, with the threshold values on the x- temporary speed restrictions are not covered, for the same axis and the differences in punctuality, compared to the reason. average across all trains in our sample, on the y-axis. To continue the example above, we plot an x-value of 400 and a 2.3. Data Analysis y-value of 12.55 %, and then repeat the same procedure again for the x-value of 450, determine the y-value and check The relationship between punctuality and the studied whether this difference in punctuality is statistically influencing factors is analyzed using a regression, t-tests and significant with a p-value lower than 0.01. visual analysis of plots. Correlation coefficients are also presented. This procedure is repeated at intervals for each studied variable, and the results plotted in scatter diagrams, to which A linear regression is performed over all the studied variables trend lines are fitted. This is done in Microsoft Excel. The found in Table 7, with punctuality as the dependent variable results are robust with regards to the range of the studied and the other 36 variables as independent. The regression is variable and the number of points per plot. performed in R. This is primarily done to see how much of the variation in punctuality is explained by the studied For each plot, we in turn try linear, exponential, logarithmic, variables. second-degree polynomial and power functions, and choose the function which provides the best fit, as determined by the Thereafter the variables are studied individually, with regards R2-value. In some cases, where the difference in R2 was less to their influence on punctuality. This is done by first setting than 0.02, we instead opted for a linear trend line function, up several threshold values, then by performing t-tests to see for simplicity. if the punctuality above a threshold is significantly higher or lower than the average punctuality across all the trains in our 3. RESULTS dataset. See below for an example. Welch’s two-sided t-test is used to allow for the fact that the samples are of unequal The following section first describes the results of the linear size and variances. The threshold is used to distinguish regression containing all studied variables, and how each type between observed trains with higher or lower values of the of variable contributes to the overall picture, then a table studied variable. When the p-value is found to be lower than summarizing the results for each variable separately. The 0.01, the punctuality for trains in the subset, surpassing the results for each studied variable are then described and threshold, is compared to the punctuality across all trains in discussed category by category, to better illustrate the results. our dataset, and the difference is noted. 3.1. Linear regressions For instance, in our dataset there are 41 614 (out of 883 678) trains which have travelled for at least 400 km during their A linear regression across all 37 studied variables listed in journey. They had a punctuality of 79.62 % which is 12.55 Table 7 was performed in R. Due to the large number of variables considered, we omit the estimated coefficients for

6

INTERNATIONAL JOURNAL OF PROGNOSTICS AND HEALTH MANAGEMENT

Table 5. Summary results of the linear regression Table 6. Regressions by type of variable Statistic Value Type of No. Adjusted Relative Residual standard error: 0.2142 on 883641 degrees of variables variables R2-value performance freedom Weather 6 0.02068 0.48 Multiple R-squared: 0.04302 Timetable 7 0.02538 0.59 Adjusted R-squared: 0.04298 Operations 7 0.02943 0.68 F-statistic: 1103 on 36 and 883641 DF Infrastructure 8 0.02355 0.55 p-value: < 2.2e-16 Infrastructure 8 0.00643 0.15 density All of the above 37 0.04298 1.00 space concerns, and only present the summary statistics of the combined regression, which are presented in Table 5, below.

The results of the regression show that all but 5 variables (#2, lists the smallest and largest threshold values of the studied #17, #33, #36 and #37 in Table 7) have significant impact variable included in the plots, that are used to derive the trend with p-values < 0.00001. They also show that less than 5% of line functions. The Pts./plot-column describes how many the variation is explained by this model, as well as a large different thresholds were plotted to make up the diagrams, to residual standard error, showing that it has a low predictive which the trend lines were fitted. The Highest corr. to-column accuracy even if the effect of the factors is shown to be presents the correlation coefficient to the other studied significant. Changing the specification to consider variable which is the highest, as well as the identifier for that polynomial functions only improves these numbers variable. marginally. We also carried out a series of regressions to study the impact 3.3. Weather of groups of variables, and their relative importance in The result show that punctuality falls exponentially as the explaining punctuality variations. Some summary results of temperature drops below 0 ℃. At -5 ℃ the punctuality is these are presented in Table 6. These results show that the about 7.5 %-points lower than average, and at -30 ℃ it is variables we have categorized as operational have the largest about 50 %-points lower. The same pattern is found at high impact on punctuality, whereas the infrastructure density variables we used only contributed marginally. There is also temperatures. At 23 ℃ punctuality is about 5 %-points lower some overlap between the types of variables, in the than average, and by 27 ℃ it has fallen by 26 %-points. In the face of increasing temperatures and more frequent heat punctuality differences they explain. The rightmost column waves, this suggests that more ought to be done to increase is calculated as the Adjusted R2-value for that row, divided the railway systems’ resilience to high temperatures, similar by the corresponding value in the bottom-most row, to the findings of Ferranti et al. (2016), Ford et al. (2015) and containing the combined model of all 37 variables. others. Xia et al. (2013) used a series of dummy variables for temperature, and their plot of the effect on punctuality looks 3.2. Summary table similar to ours. Table 7 summarizes the results for each variable, based on The variation in temperature across the geography that the the method described in section 2.3. The #-column contains train passes through is also significant. We find a logarithmic an identifier. The Variable-column contains brief, descriptive relationship between punctuality and the difference between names of the variables. The Trend line function-column minimum and maximum temperature across the journey. contains the trend line functions from the plots, described in Even a difference of 5 ℃ lowers punctuality by about 9 %- section 2.3, where x is the threshold value of the variable, and points. This variation is highly correlated (a correlation F(x) is the decline in punctuality of trains where the threshold coefficient of 0.47) with the distance traveled, which is to be is superseded, compared to the average punctuality across all expected, as a train that travels longer passes through a larger trains of 92.17 %. Punctuality is measured at all scheduled geography and potentially larger temperature variations. That stops, and cancelations are treated as non-punctual. The R2- the temperature gradient affects punctuality is to be expected, column describes how well the trend line function fits the some of the mechanisms are described in Tahvili (2016), but points plotted in each diagram. The Range plotted-column the effect we find is larger than in Xia et al. (2013).

7

Table 7. Summary of results from the analysis

# Variable (x) Trend line function R2 Range Pts. Highest corr. to plotted /plot 1 Punctuality, at all stops Target variable, y-value N/A N/A N/A -0.16 to #21 Weather 2 Temperature, below 0℃ 0.035e-0.088x 0.97 3 - 32 18 -0.46 to #4 3 Temperature, above 0℃ 1E-04e0.2838x 0.96 15 - 27 12 0.13 to #2 4 Temperature, difference ℃ 0.05041ln(x)+0.0082 0.90 1 - 26 25 0.50 to #26 5 Wind speed, max in m/s 0.0001x2.1182 0.93 5 - 25 16 0.22 to #7 6 Snow depth, average in cm 0.113ln(x)-0.0134 0.92 1 - 14 14 -0.37 to #2 7 Precipitation, sum of mm 0.0002x + 0.0172 0.97 1 - 1000 31 0.29 to #24 Timetable 8 Total margins, % 1.3461x2-0.632x+0.0526 0.94 0.1 – 0.5 9 0.91 to #9 9 Supplement density, s/km 0.0003x2-0.0055x+0.01 0.75 2 - 20 10 0.91 to #8 10 WAD of margins, % 0.5178x2-0.556x+0.1375 0.69 0.35 – 0.80 11 -0.09 to #14 11 Negative margins, binary 0.0277x-0.0116 1.00 0 - 1 2 -0.35 to #8 12 Avg. speed with stops, km/h 0.0003e0.0457x 0.81 40 - 120 10 -0.43 to #9 13 Travel time without margins, h 0.0163x+0.0145 0.95 0.5 - 10 20 0.89 to #21 14 Distance/Scheduled stops, km 0.0013x-0.0199 0.98 10 - 125 12 0.49 to #26 Operations 15 Station interactions, count 0.0104x-0.0208 0.95 2 - 20 10 0.73 to #24 16 Line interactions, count 0.022x+0.0236 0.97 1 - 5 5 0.08 to #24 17 Trains to same station & hour -0.0014x+0.0059 0.92 3 - 30 11 0.53 to #18 18 Movements per vehicle -1E-12x2+4E-07x-0.025 0.98 100k–300k 9 0.53 to #17 19 Trains per day 1E-11x2-2E-06x+0.0819 0.94 70k - 110k 5 0.15 to #17 20 Days run per train number -4E-05x+0.0012 0.99 100 – 350 6 -0.20 to #32 21 Travel distance, km 0.0003x1.0068 0.96 50 – 750 21 0.89 to #13 Infrastructure 22 Signal, count 5E-05x-0.0055 0.93 1 - 4500 10 0.95 to #24 23 Distance/Signal, km 0.0081x2-0.0059x+0.013 0.88 0.1 - 5 16 0.79 to #25 24 Switch, count -6E-08x2+0.0002x-0.027 0.88 1 - 2000 14 0.95 to # 22 25 Distance/Switch, km 0.0033x2-0.012x+0.0154 0.98 0.1 - 10 15 0.82 to #27 26 Bridge, count 0.0007x-0.0017 0.93 1 - 300 19 0.93 to # 22 27 Distance/Bridge, km 0.0029ln(x)+0.0085 0.55 0.1 - 60 18 0.82 to #25 28 Tunnel, count 0.0001x2-0.0006x+0.009 0.94 1 - 40 10 0.75 to #22 29 Distance/Tunnel, km 0.0.0156x0.1501 0.77 0.1 - 160 25 0.44 to #13 30 Level crossing, count 0.0005x+0.0057 0.98 1 - 160 12 0.65 to #24 31 Distance/Level crossing, km 0.0016x2-0.0096x+0.004 1.00 0.1 - 10 12 0.34 to #23 32 Cutting, count 0.0004x+0.0007 0.96 1 - 200 14 0.83 to #36 33 Distance/Cutting, km -8E-06x2-0.0013x+0.004 0.92 0.5 - 100 16 0.19 to #35 34 Embankment, count 0.0053x0.5479 0.87 1 - 160 12 0.48 to #32 35 Distance/Embankment, km 1E-06x2+0.0001x+0.004 0.94 0.1 - 160 25 0.25 to #30 36 Fence, count 7E-07x2+3E-05x+0.002 0.97 1 - 400 21 0.92 to #26 37 Distance/Fence, km -4E-05x2+0.002x+0.009 0.76 0.1 - 60 19 0.48 to #25

8 The result suggests that a power curve best describes the 120 km/h, including stopping times, an exponential curve influence of wind speed on punctuality. When wind speeds provides the best fit. There is a very notable decrease in exceed 10 m/s punctuality is almost 2 % lower than average, punctuality around 120 km/h, as the airport trains have an and about 9 %-points lower when they exceed 23 m/s. This is average speed of 118 km/h and a punctuality which is 5 %- a larger effect than found by Xia et al. (2013), who estimated points better than average, whereas the high-speed trains that wind speeds of 23-26 m/s reduced punctuality by about average 128 km/h (including stops) and 12 %-points lower 3.3 % in the Netherlands. punctuality than average. This makes for a clear break in the plot, and suggests that the speed is perhaps more of a proxy A linear function approximates the relationship between variable than the real issue. precipitation, measured as the sum of precipitation across the train stations passed by the train, and punctuality. A quarter Plotting punctuality against the scheduled duration of of all trains accumulate at least 30 mm of precipitation, journeys, without margins, the result is a linear decrease in associated with a punctuality drop of 1.8 %-points compared punctuality of about 1.6 %-points per hour. The duration of to the average, and the drop increases about 2 %-points per the journey is very highly correlated to the distance expressed 100 mm. Xia et al. (2013) used a slightly different measure, in kilometers, with a correlation coefficient of around 0.87. but found mostly linear effects of a similar magnitude. The average distance between stops also appears to affect punctuality in a mostly linear fashion, by about 1.3 %-point The effect of snow depth on punctuality is best described by for every 10 km. These findings are largely in line with our a logarithmic function fitted to the average snow depth, expectations, and with the fact that long distance trains often recorded at stations across the journey. While less than 6 % perform significantly worse in terms of punctuality. of the observations in our dataset have average snow depths larger than 1 cm, the magnitude of the effect, when it is 3.5. Operations present, is quite large: at an average of 5 cm the drop in punctuality is about 17.5 %-points. These effects are The results in Table 7 furthermore suggest that punctuality substantially larger than the estimates by Xia et al. (2013). drops linearly with the number of interactions at stations and The logarithmic function may suggest an increased on line sections. For interactions on line sections, which are preparedness and ability to deal with snow in the regions rare in our data, the drop is about 2.2 %-points per interaction. where large snow depths are often found, which decreases the For interactions at stations, which are much more common, harmful influence of snow. the decrease in punctuality is approximately 1 %-point per interaction. These findings largely confirm what we have 3.4. Timetable found in earlier research (Palmqvist et al., 2017a), and with the research on congestion by Gorman (2009). The results regarding timetable variables suggest that increasing margins benefits punctuality up to a point, after The number of trains that arrive at a station during an hour is which it begins to decline. That point occurs at around 12 linked to punctuality in a linear manner. Punctuality increases s/km, or 25-30 % of the minimum run time, at which point slightly as the stations are handle more trains. At volumes of punctuality is around 2 %-points higher than the average of at least 20 trains per station and hour, the punctuality is about 92.17 %. These two different ways of measuring punctuality 2.5 %-points higher than average. This is an interesting are very highly correlated with one another: the correlation finding, as increasingly congested stations are often coefficient is 0.91. This is well in line with earlier studies on suggested as a problem for punctuality (Palmqvist et al., margins (Palmqvist et al., 2017a). 2017b). Similarly, increasing the weighted average distance (WAD) We find that vehicles that are used more often have a higher of margins raises punctuality up to a point. The highest punctuality than those that are used less frequently, with a punctuality, almost 1 %-point higher than average, can be quadratic relationship between the two variables. At 125 000 seen when the WAD is around 0.60. This confirms earlier movements punctuality is about 0.5 %-point better than findings by the authors, and shows that the effect exists even average, and at 300 000 it is 5 %-points higher. This is when punctuality is measured at intermediate stops, rather somewhat surprising, but suggests that operators prefer to use than just at the end destination. the more reliable vehicles for more intensive routes, and does a good job of keeping them in a working condition. One The presence of negative margins in a timetable has a possibly confounding factor is that airport train vehicles negative impact on punctuality, lowering it by on average 2.8 seem, in our data, to be utilized much more heavily than other %-points compared to when there are no negative margins types of passenger trains, and the punctuality for these trains present. This figure is slightly lower than what we have found is very good. in earlier research (Palmqvist et al., 2017a), but on the same order of magnitude. How the number of trains per day affects punctuality is best described using a quadratic function, but the effect is Overall, increasing average speeds of trains is linked with relatively small. The largest number of trains operating in a decreasing punctuality. Between average speeds of 60 and

9 INTERNATIONAL JOURNAL OF PROGNOSTICS AND HEALTH MANAGEMENT day we consider is 110 000, which is associated with a with what others have found, though the magnitude of the punctuality drop of 1.2 %-points. effect is larger than in the Netherlands, for instance. Train numbers that are run more frequently are slightly more For timetabling, the highest punctuality is obtained when punctual than those that run less frequently. We find a linear margins are around 25% of the minimum run time, or 12 relationship, with those running almost every day being about s/km, with a slight shift towards the end of the journey and 1.2 % more punctual than average. This is in line with the no negative margins. Punctuality also improves slightly when suggestion earlier, that punctuality is improved when the train numbers are run more frequently and the vehicles vehicles are operated more frequently, or that more reliable less frequently. Traffic volume does not appear to be a major vehicles are used to run the most important routes. concern: variations due to a higher number of trains per day Unfortunately, the risk of the variable working as a proxy for are small, and punctuality is slightly higher at more busy the kind of passenger train is also present, as airport trains stations and times. However, the number of interactions have the most days run and the highest punctuality, followed between trains should still be minimized, as they are shown by commuter trains, and so on for regional, long distance and to lower punctuality. Many of these variables can be affected high-speed trains. directly by planners and managers at train operating companies and infrastructure managers, such that punctuality The single best indicator for punctuality is the distance improves. Even simple measures, such as reducing the traveled by a train. A linear function best fits our number of trains with negative margins, have large impacts observational data, with a decline of about 3 %-points for on punctuality. every 100 km. The correlation coefficient with punctuality is -0.20, which is the highest in our findings. Similar impacts are found with different types of infrastructure elements. Most infrastructure elements are 3.6. Infrastructure highly correlated both with each other, and with the distance traveled, and for that reason it may be appropriate to construct Finally, plotting the number of switches, tunnels and fences a sort of infrastructure complexity index. The overall picture, against punctuality, we find that they fit best to quadratic however, is that a simple infrastructure with less components functions. Bridges, signals, level crossings, and cuttings performs better. This, too, is something that planners can show linear relationships to punctuality. Embankments fit affect over time, as existing infrastructure can be modified, best to a power function. Signals have the largest effects in and new infrastructure can be designed in a manner that terms of magnitude, being associated with punctuality drops supports good punctuality. of around 22 %-points at the most. The method of conducting a series of t-tests and plotting the The quadratic relationships we find between punctuality and results was successful in illustrating the relationship between the number of switches, suggests a potential of gaining punctuality and the studied variables individually. It disproportionately large punctuality benefits by limiting their illuminates and quantifies impacts that are elusive when number. Particularly in large stations, where the current using other methods. However, more work needs to be done numbers are large and even small gains in punctuality are to disentangle the effects of many of these variables from highly valuable. each other. The number of infrastructure elements depends to When considering the distance between elements, the results a high degree on the distance traveled, which was already are broadly similar across the different types of infrastructure, known to be correlated with punctuality. Dense infrastructure and the overall picture is that punctuality is better where the and stops are, on the other hand, associated with local trains infrastructure is dense. An exception is for cuttings, where traveling shorter distances at lower speeds with more margins larger distances between them is associated with higher and higher punctuality. The linear regression of all studied punctuality. Level crossings are another exception, where variables together in this paper was intended to handle the punctuality first improves as the distance between them rises sometimes-significant covariation of different variables. to around 3 km, before it declines rapidly with increasing We believe that one reason for the mediocre R2-value in this distances. regression, compared to earlier ones, is that we study all passenger trains in the national network for one year whereas 4. CONCLUSIONS, DISCUSSION AND FUTURE RESEARCH other studies have looked at a smaller subset of trains, on one In this paper, we have quantified how temperature, or two selected lines or regions, during which the conditions precipitation, snow depth and wind speed affect punctuality. have been adverse, or using a measure that is more tailored to Especially high and low temperatures can have large impacts only find the faults that are under consideration. What we on punctuality, and the effects are exponential. This is an have tried to do is much broader, and it is no surprise that the important finding, which indicates that much more attention degree to which we can explain the variation is smaller, should be given to increasing the resilience of the railways simply because the variation studied is much larger. We also with regards to heat, as the climate changes and temperatures expect that data on passenger volumes, more detailed dwell rise. Overall, our findings with regards to weather are in line times, as well as headway times in both the timetable and in

10

INTERNATIONAL JOURNAL OF PROGNOSTICS AND HEALTH MANAGEMENT realized traffic would help explain more of the variation in Insights and Implications for Heat Risk Management. punctuality. Weather, Climate and Society, vol. 8, issue 2, pp. 177- 191. It is our hope that this research can help both researchers and Ford, A., Jenkins, K., Dawson, R., Pregnolato, M., Barr, S. & practitioners hone in on what can be done to improve Hall, J. (2015). Simulating Impacts of Extreme Weather punctuality. From improved heat-resilience of components, Events on Urban Transport Infrastructure in the UK. to better allocation of margins, more standardized train International Symposium for Next Generation routes, and a simpler infrastructure with fewer components, Infrastructure Conference (ISNGI 2014), pp. 233-238, there are many things that can be improved. The findings in Vienna, Austria. this paper can also be used to assess the possible impacts of Gorman, M. F. (2009). Statistical estimation of railroad different measures, and to help prioritize between them. congestion delay. Transportation Research Part E, pp. 446-456. REFERENCES Haldeman, L. (2003). Automatische Analyse von IST- Andersson, E. V., Peterson, A. & Thörnquist Krasemann, J., Fahrplanen. Zürich. (2013). Quantifying railway timetable robustness in Hansen, I. A. (2009). Railway Network Timetabling and critical points. Journal of Rail Transport Planning & Dynamic Traffic Management. 2nd International Management, issue 3, pp. 95-110. Conference on Recent Advances in Railway Engineering Andersson, E. V., Peterson, A. & Thörnquist Krasemann, J., (ICRARE-2009), pp. 135-144, Tehran, Iran. (2015). Reduced railway traffic delays using a MILP Harris, N. (1992). Planning Passenger Railways: A approach to increase robustness in Critical Points. Handbook. London: Transport Publishing Company Journal of Rail Transport Planning & Management, Limited. issue 5, pp. 110-127. International Union of Railways (2000). UIC Code 451-1 Andersson, E., (2014). Assessment of Robustness in Railway Timetable Recovery margins to guarantee timekeeping - Traffic Timetables. Licentiate thesis. Linköping Recovery margins. Paris: International Union of University, Norrköping. Railways. Brazil, W., White, A., Nogal, M., Caulfield, B., O’Connor, Jaroszweski, D., Hooper, E., Baker, C., Chapman, L. & A. & Morton, C. (2017). Weather and rail delays: Quinn, A. (2015). The impacts of the 28 June 2012 Analysis of metropolitan rail in Dublin. Journal of storms on UK road and rail transport. Meteorological Transport Geography, vol. 59, pp. 69-76. Applications, vol. 22, pp. 470-476. Carey, M., (1998). Optimizing scheduled times, allowing for Kellermann, P., Bubeck, P., Kundela, G., Dosio, A. & behavioral response. Transportation Research Part B, Thieken, A. H. (2016). Frequency Analysis of Critical vol. 32, no. 5, pp. 329-342. Meteorological Conditions in a Changing Climate – Carey, M., (1999). Ex ante heuristic measures of schedule Assessing Future Implications for Railway reliability. Transportation Research Part B, issue 33, pp. Transportation in Austria. Climate, vol. 4, issue 25. 473-494. Kim, H., Kang, J., & Bae, Y. (2013). Types of train delay of Cerreto, F., Nielsen, O. A., Harrod, S., & Nielsen, B. F. high-speed rail: indicators and criteria for classification. (2016). Causal Analysis of Railway Running Delays. Journal of the Korean Operations Research and Paper presented at 11th World Congress on Railway Management Science Society, vol. 38, issue 3, pp.37- Research (WCRR 2016), Milan, Italy. 50. Cicerone, S., D’Angelo, G., Di Stefano, G., Frigioni, D. & Lehtonen, I. (2015). Four consecutive snow-rich winters in Navarra, A. (2009). Recoverable robust timetabling for Southern : 2009/2010-2012/2013. Weather, vol. single delay: Complexity and polynomial algorithms for 70, issue 1, pp 3-8. special cases. Journal of Combinatorial Optimization, Ludvigsen, J. & Klæboe, R. (2014). Extreme weather impacts issue 18, pp. 229-257. on freight railways in Europe. Nat Hazards, vol. 70., pp- Dewilde, T., Sels, P., Cattrysse, D. & Vansteenwegen, P., 767-787. (2013). Robust railway station planning: An interaction Nagy, E. & Csiszar, C. (2015). Analysis of delay causes in between routing, timetabling and platforming. Journal of railway passenger transportation. Periodica Rail Transport Planning & Management, issue 3, pp. 68- Polytechnica Transportation Engineering, vol. 42. pp. 77. 73-80. Doll, C., Trinks, C., Sedlacek, N., Pelikan, V., Comes, T. & Nelldal, B.-L. (2009). Kapacitetsanalys av järnvägsnätet i Schultmann, F. (2014). Adapting rail and road networks Sverige. Delrapport 3: Förslag till åtgärder för att öka to weather extremes: case studies for southern Germany kapaciteten på kort sikt. Stockholm: KTH Royal and Austria. Nat Hazards, vol. 72, pp. 63-85. Institute of Technology. Ferranti, E., Chapman, L., Lowe, C., McCulloch, S., Nelldal, B.-L., Lindfeldt, A. & Lindfeldt, O. (2009). Jaroszweski, D. & Quinn, A. (2016). Heat-Related Kapacitetsanalys av järnvägsnätet i Sverige. Delrapport Failures on Southeast England’s Railway Network: 1: Hur många tåg kan man köra? En analys av teoretisk

11

INTERNATIONAL JOURNAL OF PROGNOSTICS AND HEALTH MANAGEMENT

och praktisk kapacitet. Stockholm: KTH Royal Institute Scheepmaker, G. M. & Goverde, R. M. (2015). The interplay of Technology. between energy-efficient train control and scheduled Nie, L. & Hansen, I. A. (2005). System analysis of train running time supplements. Journal of Rail Transport operations and track occupancy at railway stations. Planning & Management, issue 5, pp. 225-239. European Journal of Transport and Infrastructure Schöbel, A. & Kratz, A. (2009). A bicriteria approach for Research, vol. 1, issue 5, pp. 31-54. robust timetabling. In Ahuja, R. K., Möhring, R. H. & Nikolic, I., Arsovski, S., Eric, M., Vujcic, S., Manojlovic, G. Zaroliagis, C. D. (Eds.) Robust and online large-scale & Jovanovic, J. (2016). Overall railway infrastructure optimization, pp. 119-144. Berlin: Springer Berlin effectiveness as a quality factor for Serbian railways. Heidelberg. Tehnicki vjesnik, vol. 2, pp. 547-554. Stenström, C., Parida, A., Lundberg, J., & Kumar, U. (2015). Nilsson, J.-E., Björklund, G., Pyddoke, R. & Vierth, I. (2015) Development of an integrity index for benchmarking and Regress – en god idé i järnvägssektorn? VTI Rapport monitoring rail infrastructure: application of composite 850. Linköping: VTI. indicators. International Journal of Tail Transportation, Norrbin, P., Lin, J. & Parida, A. (2016). Infrastructure vol. 3, issue 2, pp 61-80. Robustness for Railway Systems. International Journal Swedish Transport Administration. (2017). Tillsammans för of Performability Engineering, vol. 13, issue 3, pp 249- tåg i tid - Resultatrapport 2017. Borlänge: Trafikverket. 264. Tahvili, N. (2016). Winterization of Railways – issues and Olsson, N. & Haugland, H. (2004). Influencing factors on effects. Master thesis, Norwegian University of Science train punctuality - results from some Norwegian studies. and Technology, Trondheim. Transport Policy, issue 11, pp. 387-397. Thaduri, A., Galar, D. & Kumar, U. (2015). Railway assets: Olsson, N., Halse, A. H., Hegglund, P. M., Killi, M., van der A potential domain for big data analytics. Procedia Kooij, R., Landmark, A., Seim, A., Sørensen, A. Ø., Computer Science, vol. 53, pp 457-467. Økland, A. & Østli, A. (2015). Punktlighet i jernbanen - Transport Analysis (2016). Punktlighet på järnväg 2015. hvert sekund teller. Oslo: SINTEF akademisk vorlag. Stockholm: Transport Analysis. Oslakovic, I. S., ter Maat, H. W., Hartmann, A. & Dewulf, G. Veiseth, M., Olsson, N. O. E. & Saetermo, I. A. F. (2007). (2013). Risk Assessment of Climate Change Impacts on Infrastructure’s influence on rail punctuality. WIT Railway Infrastructure. Engineering Project Transactions on The Built Environment, vol. 96, pp 481- Organization Conference 2013, July, 9-11, Devil’s 490. Thumb Ranch, Colorado, USA. Vekas, P., van der Vlerk, M. & Haneveld, W. (2012). Pachl, J. (2002). Railway Operation and Control. Mountlake Optimizing existing railway timetables by means of Terrace: VTD Rail Publishing. stochastic programming. Stochastic Programming E- Palmqvist, C.-W., Olsson, N. O. E. & Hiselius, L. (2017a). Print Series, vol. 8. An Empirical Study of Timetable Strategies and Their Wiggenraad, I. P. B. (2001). Alighting and boarding times of Effects on Punctuality. Paper presented at the 7th passengers at Dutch railway stations. Delft, TRAIL International Conference on Railway Operations Research School. Modelling and Analysis (RailLille 2017), April, 4-7, Vromans, 2005. Reliability of railway systems. Doctoral Lille, France. thesis, Erasmus Research Institute of Management, Palmqvist, C.-W., Olsson, N. O. E. & Winslott Hiselius, L. Erasmus University Rotterdam, Rotterdam. (2017b). Punctuality problems from the perspective of Xia, Y., Van Ommeren, J. N. , Rietveld, P. & Verhagen, W. timetable planners in Sweden. Paper presented at EURO (2013). Railway infrastructure disturbances and train Working Group on Transportation Meeting 2017 operator performance: the role of weather. (EWGT 2017), September, 4-6, Budapest, Hungary. Transportation Research Part D, vol. 18, pp. 97-102. Parbo, J., Nielsen, O. & Prato, C. (2016). Passenger Xu, P., Corman, F. & Peng, Q. (2016). Analyzing Railway Perspectives in Railway Timetabling: A Literature Disruptions and Their Impact on Delayed Traffic in Review. Transport Reviews, vol. 36, issue 4, pp. 500- Chinese High-Speed Railway. IFAC-PapersOnLine, 526. volume 49, issue 3, pp 84-89. Qin, Y., Ma, X. & Jiang, S. (2017). A Stochastic Process Yuan, J. & Hansen, I. A. (2007). Optimizing capacity Approach for Modeling Arrival Delay in Train utilization of stations by estimating knock-on train Operations. Transportation Research Board 96th delays. Transportation Research Part B, vol. 41, pp. Annual Meeting, January, 8-12, Washington, D. C., 202-217. USA. Yuan, J. & Hansen, I. A. (2008). Closed form expression of Rietveld, P., Bruinsma, F. & van Vuuren, D. (2001). Coping optimal buffer times between scheduled trains at railway with unreliability in public transport chains: A case study bottlenecks. 11th International IEEE Conference on for Netherlands. Transportation Research Part A, vol. Intelligent Transportation Systems, October, 12-15, 35, pp. 539-559. Beijing, China.

12

INTERNATIONAL JOURNAL OF PROGNOSTICS AND HEALTH MANAGEMENT

Zakeri, G. & Olsson, N. O. E. (2017). Investigation of punctuality of local trains: The case of Oslo area. Paper presented at EURO Working Group on Transportation Meeting 2017 (EWGT 2017), September, 4-6, Budapest, Hungary.

ACKNOWLEDGEMENT Funding, infrastructure and train control data was provided by the Swedish Transport Administration. Detailed timetable data was provided by Dr. Martin Aronsson at RI.SE. This research was done in collaboration with K2 – The Swedish Knowledge Centre for Public Transport. We thank our two reviewers and the editor for their very insightful and constructive comments.

13

Paper 3

An Empirical Study of Timetable Strategies and Their Effects on Punctuality

Carl-William Palmqvist a, 1, Nils O.E. Olsson a, b, Lena Hiselius a a Department of Technology and Society, Lund University P.O. Box 118, 221 00 Lund, Sweden 1 E-mail: [email protected] Phone: +46 (0)70 438 57 37 b Department of Production and Quality Engineering, Norwegian University of Science and Technology S. P. Andersens veg 5, 7031 Trondheim, Norway

Abstract Punctuality in Sweden has been too low for several years, and there is now a goal in the industry that it must improve by more than five percentage points by 2020. Research has shown that timetable properties affect punctuality, and timetables can be changed relatively quickly and easily, compared to infrastructure and rolling stock. In this paper we study some strategies for assigning margins in timetables, and how they affect punctuality. The aim is to identify changes that can improve the punctuality of passenger trains. The study is based on a dataset containing detailed timetables and records of train movements for all trains in Sweden during the year of 2015, over 1.1 million departures. We find that every additional percentage point of margins improves punctuality by about 0.1 percentage points, and about the same for every percentage point increase in the weighted average distance of margins. It is common for margins on some line sections to be negative, which lowers punctuality by on average 4.1 percentage points. Small time supplements directly following station stops are associated with a punctuality drop of 2.5 percentage points. We also find that the number of interactions between trains at stations, and on line sections, lowers punctuality by 1.2 and 3.9 percentage points per instance, respectively. From these results we propose a package of changes to timetable planning which we estimate would improve overall punctuality by up to five percentage points.

Keywords Railway, timetable, punctuality, time supplements, margins.

1

1 Introduction

Punctuality is an important factor for the attractiveness and efficiency of the railway sector. In Sweden a target has been set and agreed upon by the industry, that by 2020 the punctuality across all trains should be 95%. This is measured as arriving at the final destination with a delay of at most five minutes. Since 2012 the punctuality has been steady at around 90 % for passenger trains and slightly below 80% for freight trains (Trafikanalys, 2016). Large and rapid improvements are thus required, if the target of 95% is to be reached on time. As Parbo et al. (2016) show, timetable characteristics are important influencing factors for delays and robustness in railway traffic. Previous studies indicate that properties of the timetable have an impact on delays and punctuality (Palmqvist, et al., 2017). These often stem from a number of strategic decisions a planner must consider when designing a timetable. On a high level these strategies include the balance between precision and slack (Olsson, et al., 2015), a balance between using headways to assign buffers between trains or using time supplements to assign margins within train journeys (Nelldal, 2009), the degree to which a cyclic timetable is desired, the heterogeneity of traffic and the degree to which homogenization measures are to be employed (Nelldal, et al., 2009), questions of geographical accessibility and the design of the network to be utilized, the balance between stops at end points or intermediate stations, and so on. On a more detailed level that individual timetable planners operate on, the decisions to be made are often about the size of the margins and how they should be distributed. For both of these decisions, there are a number of different approaches or strategies, which can be identified both in the literature and in practice. The punctuality effects of these decisions are often difficult to foresee for the planner, and because of the complexity of traffic, changing conditions, and sheer number of decisions each timetable planner makes, it is difficult to assess their impacts. Because of this difficulty, there has not been a convergence around best practice in timetabling, based on our interviews with planners. Furthermore, there have been no steady improvements in punctuality (Trafikanalys, 2016), which one might expect as planners learn what works and what does not.

1.1 Aim and Delimitations

The purpose of this paper is to study the effects on punctuality of some strategies for allocation of margins in timetables for passenger trains, by analyzing empirical data. The analysis focuses on aspects that relate to the planning of a timetable, i.e. the size and distribution of margins. We also focus on margins “within” train paths rather than those “between” them. As such, this study does not address headway or buffer times. Freight and service trains are not included, neither are strategic dimensions such as cyclicality, homogenization of traffic, frequency and network structure, or the specific causes for and distributions of delays.

1.2 Previous Research

An overview of the state of the art in timetable research is provided by Hansen (2009). The author concludes that the key issue for high quality timetables is a precise estimation of blocking times, taking into account the signals, platforms, train processing, and using

2

realistic run and dwell times. This is often not the case in practice, or when using deterministic models. Similarly, queuing and simulation models inadequately reflect speed variations and the behavior of railway staff. According to Parbo et al. (2016), planners have tools to make timetables robust against delays, for example by adding time supplements, lowering heterogeneity in the timetable by having uniform stopping patterns, finding optimal speed and reducing interdependencies between trains. Carey (1999) discusses a number of different heuristic measures of timetable reliability, with special consideration to knock-on delays. These include probabilities of delays, calculated in a number of ways, and different headway based measures. The latter are found to be easier to use and calculate, because they do not require nearly as much data. Influencing factors on train punctuality in Norway are presented by Olsson & Haugland (2004). In short, the authors find that in congested areas the management of boarding and alighting passengers is the key factor, while on single track lines the management of train crossings is the key success factor. Gorman (2009) used statistical analysis to study which factors contributed the most to delays for freight trains in the US. He found that the number of meets, passes and overtakes consistently had the highest impact, suggesting that congestion was the primary cause for delays. Wiggenraad (2001) studied seven Dutch train stations in detail. He found that dwell times are longer than scheduled, that the dwell times at peak and off-peak were the same, and that passengers concentrated around platform access points. This suggests an improvement potential, of shorter real dwell times if travelers could be distributed more evenly along the platform. Along the same lines, Nie & Hansen (2005) studied trains in the station area of The Hague. They found that trains operate at lower than design speeds, and that dwell times at platforms are systematically extended because of other trains blocking their routes, and because of the behavior of train personnel.

Precision and Slack In the literature on train timetables, delays and punctuality, Olsson et al. (2015) identify two major strategies. One strategy is precision, which focuses on making sure that the trains run when they should by finding what causes variation in the run and dwell times, such as the vehicles, infrastructure, and traveler behavior at stations, and addressing those issues so that the disturbances are reduced. This leads to greater efficiency and shorter travel times, among other things, but it is a difficult to implement and maintain. The other strategy is slack, and consists of introducing buffer times and redundancies to ensure that when a disturbance occurs and delays appear, the timetable has enough margins for the train to recover, and for the delays not to spread throughout the system. This strategy also improves reliability, but reduce efficiency. One way that slack reduces efficiency is discussed in Carey (1998): the commonly observed fact that, as more time is allowed for an activity, there is a behavioral response that makes the activity take a longer time. This response reduces or eliminates the expected benefit from adding the time supplement, in terms of for example increased reliability. This is important to consider when assigning both dwell times and running time supplements. The paper presents a framework which arrives at optimal time allowances based on two ratios: the behavioral response ratio, and the ratio of scheduled time costs to lateness costs. The latter is touched upon by Rietveld et al. (2001), who studied the valuation of certain travel time increases compared to uncertain ones, in the Netherlands. They found that a certain delay of 1 minute was valued at 27 cents, while a 50% chance of a delay of 2 minutes was 64 cents, which implies a strong attitude of risk aversion.

3

Headway Times and Time Supplements One way to introduce slack into the timetable is by using headway times to separate trains from each other, trying to ensure that even if one train is delayed, that delay does not spread to the surrounding traffic. The tradeoff with headway times is that the increased reliability comes at the cost of reduced capacity, as the number of trains running is necessarily decreased. Yuan & Hansen (2008) find headway times to be effective in reducing knock- on delays. In an earlier paper, Yuan & Hansen (2007) propose an analytic model for estimating knock-on delays of trains in complex stations, with the intent to optimize capacity utilization in stations. They find that as the buffer time between trains decreases, the knock-on delays increase exponentially. Similarly, Dewilde et al. (2013) introduce a method to increase timetable robustness in complex stations, basically by maximizing the minimum headway time between trains. They find that this approach improves the robustness in the station zone of Brussels by 8% and reduces knock-on delays in the area by half. The second way of adding slack in a timetable is by adding time supplements to trains, so that if they are delayed for whatever reason, they have some time reserve that they can use to catch up to the schedule. The cost of time supplements is simply that the travel times increase, which in turn consumes capacity, increases costs, and reduces the competitiveness of the trains compared to other modes of travel. This is discussed by Cicerone et al. (2009) and Schöbel & Kratz (2009). To an extent, time supplements are mandatory. The International Union of Railways (2000) requires recovery margins to be a minimum of 1.0 min/100 km for passenger trains that are motor-coach trainsets, and 1.5 min/100 km for other passenger trains. If the margins are expressed as a percentage of journey-time, values from 3 to 6 % are required, based on maximum speed and tonnage. The most commonly occurring combinations in Sweden require recovery margins of 5 %. Also in a Swedish context, Nelldal et al. (2009) performed simulation experiments on high speed trains (speed limit 200 km/h) between and Stockholm, finding that by ensuring that the minimum headway between these trains and the surrounding traffic is at least 5 minutes, punctuality would improve by 5 to 10 percentage points, depending on the direction. Alternatively, additional running time supplements of four minutes would yield the same gains. They conclude that increasing headways is preferable, because it does not extend the journey time.

Distribution of Margins When adding time supplements, we can identify five different strategies regarding their distribution from the literature, see for example Vromans (2005), Andersson (2014), and Palmqvist (2014).

1. A uniform distribution of margins

When timetables are created in Sweden, the runtimes are calculated as if the train had a technical top speed 3% lower than what it does, which practically speaking adds a uniformly distributed margin of 3%. This is often motivated by differences in train driver behavior, and in varying weather conditions. This is often described as the primary source of margins in timetables, but in fact only makes up a small fraction of the total margins. In addition to this, the signaling system permits trains to exceed the speed limit by a percentage point or two, which in effect adds a further margin, that is also uniformly distributed. Using simulation-based methods, Vromans (2005) and Vekas et al. (2012) found that a uniform

4

distribution was sub-optimal for delay recovery, given some assumptions of the delay distributions. However, Scheepmaker & Goverde (2015) show that from an energy- efficiency perspective it is better to distribute the running time supplements evenly.

2. Shifting margins towards beginning or end

If the supplements are not evenly distributed, there must by definition be more either towards the beginning or the end of the journey. The balance is this: If delays happen towards the end of the journey, margins towards the beginning of the journey are “wasted” and can no longer be used. If, however, a delay happens towards the beginning of the journey and there are not enough supplements there, the delay will be carried for a longer part of the journey affecting passengers and surrounding trains. Vromans (2005) uses the measure of Weighted Average Distance, or WAD, to describe how the supplements are distributed along the journey, and attempted to optimize this using both analytical and numerical methods for some hypothetical and real cases, concluding that a slight shift towards the beginning is best. This is further discussed in Andersson (2014).

3. Margins can be placed at or near strategic locations

These could be places where the risk of delays spreading is high, where a lot of travelers would be affected, or where punctuality is measured. In Sweden (Palmqvist, 2014) and Switzerland (Vromans, 2005; Haldeman, 2003), among others, node supplements are intended to increase punctuality into these previously defined strategic locations, called “nodes”.

4. Where disturbances happen most frequently

Attempts can also be made to place the margins where disturbances happen most frequently. This is often seen as the preferred alternative, as it minimizes both the risk of supplements being “wasted”, and the risk of delays being “carried” for long stretches. Naturally, the difficulty is in identifying these places, because disturbances are random to a large degree, and delays that occur systematically should be managed directly as the dwell- and runtimes are calculated instead of by added margins. It is complicated further by the fact that reported delays contain both primary and secondary delays, while only the primary delays are useful when locating disturbances (Vromans, 2005). However, as shown by Wiggenraad (2001), Dewilde et al. (2013) and Olsson & Haugland (2004) among others, there is ample evidence that delays often occur during scheduled stops at stations, and it may be reasonable to add margins directly after these stops.

5. Placing margins at “critical points”

A more advanced version of distributing margins is using the concept of “critical points”, introduced in Andersson et al. (2013), further described and utilized in Andersson et al. (2015). The authors define critical points as points in a timetable when and where two trains are traveling on the same line in the same direction, either when one train enters a line after a preceding train, or when one train overtakes another. These points are especially sensitive to delays, both because existing delays tend to increase in these cases, and because any delays here easily spread to other trains. The first paper introduces the concept and a way to calculate the robustness in critical points, and compares the measure to other reliability measurements. The second paper describes a mixed integer linear programming approach

5

to increase the robustness in critical points, by reallocating the existing time supplements of relevant trains.

2 Method

2.1 Study Design

This study consists of three components carried out in sequence: a literature review, an interview study, and data analysis. Firstly, a literature review was carried out covering the topics of timetabling, robustness, punctuality and delays. The emphasis was especially on the size and distribution of margins, and to factors linked to punctuality. The result of this review was then used in designing the interview study and the data analysis. The interview study aimed to further inform the data analysis, and to help improve the understanding of its results. It contained interviews with all long term timetable planners in Southern Sweden, four in number, and the senior timetable planner for the largest passenger train operating company in Sweden. Four themes were explored: feedback, guidelines, trade-offs, and rules of thumb. Each interview was approximately an hour long, recorded, and transcribed in full, resulting in a written material of around 50 pages. Throughout the paper, we will make reference to statements made by the planners during these interviews, to help inform the results and discussion. We believe that the planners we interviewed, and the issues they mention, are representative for those in the rest of the country. However, the point of the interviews was not to be comprehensive or exhaustive, but to provide some interesting input and understanding into the wider study. The data analysis was carried out on a dataset containing detailed timetables and train control data spanning one year and the totality of the Swedish railway network. The intent behind this analysis is to provide a means of studying the effects of decisions and strategies discussed in both the literature and the interviews. We believe that the empirical aspect is our primary contribution to the research field. The rest of the paper is essentially structured as a presentation and discussion of the data analysis, with reflections and input from the literature and interview studies.

2.2 Dataset

The quantitative part of this study is based on analysis of a dataset containing detailed exports of all timetables used for train traffic in Sweden during the timetable year of 2015, provided by the Swedish Transport Administration. This dataset covers almost 46,000 distinct timetable versions and over 1.1 million departures. Of these, 83 % were passenger trains, 14 % freight trains, and 3 % service trains. 43 % of journeys were longer than 100 km, 31 % shorter than 50, and 25 % between 50 and 100. All of the changes made during the ad hoc-process during the year are included. The detailed timetables are used to identify the size and distribution of margins. We also have data for all train movements in Sweden, for the corresponding period, derived from the track blocking and signaling systems. We use this data to determine whether or not trains were punctual, allowing us to link together timetable properties with real outcomes. A number of filters were the applied on the data, in order to focus on only passenger

6

trains, and to exclude incomplete observations and outliers according to a wide range of parameters. The remaining data covers over 470 000 completed passenger train journeys across Sweden during one year.

2.3 Analyzed Variables

In this paper, we are concerned with the problem of margin allocation in timetables for passenger trains, their size and how they should be distributed. The study seeks to analyze the strategies for answering those questions, and the effectiveness of those strategies in improving punctuality. The strategies are studied through an analysis of the relationship between punctuality and a selection of influencing factors summarized in Table 1. Based on the literature review and interview study, six influencing factors were selected for use in the data analysis.

Variable Definition/Unit Analyzed Proposed by Range Punctuality Fraction 0.00 – 1.00

Influencing Factors Total size of margins Fraction of run time 0.00 – 0.50 The International Union added of Railways (2000) Distribution of time Weighted average 0.00 – 1.00 Vromans (2005) supplements distance, as a fraction Negative time Yes, or no 1, 0 Interview study supplements Average supplement Seconds 0 – 120 Interview study, directly following Wiggenraad (2001) stops Interactions at stations Count 0 – 20 Gorman (2009) Interactions on line Count 0 – 5 Andersson, et al. (2013) sections

Table 1 Studied variables.

Punctuality In this study we use punctuality as the measure against which we compare different strategies and influencing factors. We define punctuality in the following way: trains arriving at their final destinations with a delay not exceeding five minutes are considered punctual and are given a value of 1, while trains arriving at their final destinations with a delay of more than five minutes are not considered punctual and are given a value of 0. In any aggregate of trains, the punctuality is calculated as the average of these values, either 1 or 0, present in the aggregate, and presented as a fraction. To illustrate: in a group of four trains where three are punctual and one is not, the punctuality is 0.75. In some cases, the overall punctuality is discussed. This we define as punctuality across all trains, not just that of the trains in any particular aggregate. For instance, in Sweden

7

during the year of 2015, 83 % of all trains carried passengers. If punctuality for passenger trains were improved by five percentage points, punctuality overall would only increase by 4.15 percentage points, because the punctuality of the other 17 % of the trains would not necessarily increase. In this way we can estimate what impact the punctuality improvements for some passenger trains would have on the overall punctuality.

Margins and Time Supplements Throughout this paper, we discuss margins and time supplements, regarding both their sizes and their distribution. In both cases, we do so in terms of either percent or fractions of minimum run times. We are able to do this because of the detailed timetables we have access to, which contain the minimum run time in one field and various time supplements in other, separate fields. The terms “margins” and “time supplements” are similar in meaning, and could perhaps be used interchangeably, we will try to explain how we interpret them. In Sweden a timetable is made up of a series of line sections and stations, for which there are run times and dwell times, respectively. For every line section there is a non-zero run time, which is in turn made up of a technical minimum, calculated by a physics-based computer program, and time supplements. These time supplements are set manually by the timetable planner and can be zero, positive and even negative. When we write of “time supplements” we mean these specific and individual supplements. When we instead write of “margins” we mean the totality of all time supplements in a given timetable, or across a number of trains. To illustrate both how we use the two terms, and to demonstrate how we calculate them, we will give an example. If a train journey is made up of 10 line sections where the technical minimum runtime is five minutes each, suppose that on two sections there are time supplements of two minutes each. On those two sections, the supplements are 0.4, if expressed as a fraction, while on the other eight sections the supplements expressed as fractions are 0.0. Across the whole journey, the train has margins of 8%. When aggregating margins across trains, there are two conceivable ways of doing so, yielding slightly different results. One is to first calculate the margins for each train individually, and then average across those values. If one train has 10% margins and another 20%, the average across the two is 15%. The other way is to take the sum of all time supplements across the trains, and divide it by the sum of the minimum run time for all trains. If the train that has 10% margins has a technical minimum runtime of 100 minutes, it has 10 minutes in time supplements. If the second train, with 20% margins, only had a technical minimum runtime of 50 minutes, it also has time supplements of 10 minutes. The sum of time supplements for both trains is then 20 minutes, and the technical minimum run time is 150 minutes, which makes for 13.33% in margins, on average. We use the first method in this paper, calculating the margins for trains individually first, and averaging across those individual margins. The second method gives a higher weight to trains traveling longer distances, and at lower speeds, which we do not necessarily consider justified. In this paper, we focus on margins “within” train paths rather than those “between” them. As such, we do not look at headway times, or buffer times as described in Pachl (2002). We know that these are also important factors, and leave them for future work.

Negative Time Supplements Throughout our research, both in the interview study and in the data analysis, we have come across instances of negative time supplements in train timetables. These are cases where the

8

planner has subtracted run time, so that the scheduled time is shorter than the technical minimum. This is described further in section 3.1 below. Assessing the impact of these negative supplements, we have chosen to use a dummy variable with the value of 1 if there are any instances of this in the corresponding timetable, and 0 if there are not. Thus, we have not looked at the number of negative margins in a timetable, the sum, or the average size, but merely identified the timetables that contain them and those that do not.

Distribution of Margins As we saw in section 1.2, there are several different ways of distributing margins when planning a timetable. There are also a number of measures that can be used to describe how the margins have been distributed across a timetable, in retrospect. In this paper, we do this in two ways. The first method we use is the Weighted Average Distance (WAD) measure described in Vromans (2005). This is used to measure how the various time supplements in a timetable are balanced, being more towards the beginning or end of the journey, or in between. It is expressed as a fraction and normally takes values between 0 and 1, with lower values expressing a shift towards the beginning and higher values a shift towards the end of the journey. A value of 0.5 implies a perfect balance. However, there are many different ways one can reach such a balance. Having a single time supplement in the very beginning and an equally large one at the very end will yield a 0.5, as will a single supplement in the middle, or a series of equally large supplements across the whole journey. The measure does not distinguish between these alternatives, although one might expect them to have different effects. We found that the WAD sometimes behaves unpredictably when used with negative time supplements. They can cause the measure to take values both below 0 and above 1, which we do not believe was the intention. Rather than developing our own measure or tweaking the formula for calculating WAD, we have chosen to exclude observations where the WAD is negative or larger than one from our analysis of WAD. The trains thus excluded make up less than 2 % of the sample, which we consider acceptable, and we still believe that the WAD is a useful measure. The other method we use to describe the distribution of margins is to look at line sections directly following scheduled stops. Often there are time supplements on these sections, intended to help recover delays that occurred at the station. To measure the extent to which this has been done consistently, as part of a conscious strategy, we have calculated the average size of time supplements directly following scheduled stops. These time supplements and their averages are given in seconds. To aid in visualization, we have rounded off the averages to intervals of 10 seconds each, then used this value as a grouping variable to create aggregates of trains for which we have calculated the degree of punctuality.

Interactions at Stations and on Line Sections In this paper, we consider and count interactions between trains. We define an interaction as an instance when two trains are at the same place at the same time, which can happen at a station, or on a line section. When on a line section, the trains need to be traveling in the same direction to be counted as an interaction. This happens relatively frequently on double tracks, but is also possible on some single tracks, if there are multiple blocks between two stations. We are well aware that the number of interactions for trains may not be as easily

9

influenced by the timetable planner as the size and distribution of margins. We include them in the paper partially because Gorman (2009) showed that interactions have a large influence on punctuality, and mostly because Andersson et al. (2013, 2015) described using scheduled interactions between trains as a trigger for assigning margins. We count the interaction for both trains, regardless of their order. We do not have any information about the track layout at stations, and we do not use any speed or runtime differentials between trains on a line section in the way that a micro-simulation might, at this stage we simply identify and count the interactions. Interactions have been identified and counted on the unfiltered dataset, including freight and service trains, passenger trains completing only parts of their journey, and so on. Interactions can be counted both in the timetable, and in realized traffic. We have tried both, and found the results to be broadly similar. Instead of presenting all of these results, we chose to use the scheduled interactions at stations, and the realized interactions on line sections, where the correlations are the highest.

2.4 Data Analysis

The relationship between punctuality and the studied influencing factors is analyzed using visual analysis of plots, regression analysis and t-tests. In order to aid the visual analysis, the number of observations of each influencing factor is reduced through an aggregation of observations. In T-SQL, the programming language we use for the most part, we use the “group by” command to aggregate observations according to different values of the influencing variables, and the punctuality is calculated within each of these aggregates. When creating the aggregates, we perform rounding to create neater groups containing more uniform numbers of observations. For instance, we round off the total size of margins so that each aggregate represents one percentage point of added runtime. Without this rounding, there are many aggregates each representing unique fractions with 14 decimal points, containing uneven numbers of observations. This would make any visualization of the results difficult indeed. The regression analysis is carried out in Matlab using Weighted Least Squares. Since the focus of the study is to analyze the general trend in the relationship between punctuality and the studied influencing factors, linear regressions are applied. The number of observations in each aggregate is used as weights, to account for the fact that some observations occur much more frequently than others. For example, the Weighted Average Distance used to describe the distribution of time supplements along the journey, took the value of 0.50 in 113 517 cases while the value of 0.90 occurred only 736 times. For this reason, we use the number of observations as weights. We also use the bisquare method to further increase the robustness of the fit against outliers in punctuality, which are otherwise evident from the plots. T-tests are used in order to study the effect on punctuality by negative time supplements and time supplements directly following station stops.

10

3 Results with Discussion

3.1 Size of Margins

Figure 1 Punctuality and size of margins.

When plotting the size of margins against punctuality, as in Figure 1, we see a positive trend, but also a wide spread. A linear trend line (weighed least squares) is also estimated and results in a positive slope term, see Table 2. The results indicate that margins do seem to increase punctuality, in general, but one should not expect immediate changes in punctuality by slightly tweaking the size of margins in an individual train. Table 2 illustrates the futility of using the size of margins as the only tool for improving punctuality: to improve punctuality by five percentage points, journey times need to be extended by approximately 40%. This is not acceptable in most cases. One possible way of explaining this difficulty is the behavioral response discussed by Carey (1998). Another is the fact that the distribution of margins also matters, and that if the distribution is inappropriate the time supplements are wasted.

Independent Variable Unit Intercept Slope R2 Margins Fraction 0.894 0.117 0.528

Table 2 Summary of linear regression analysis related to size of margins. Punctuality is the dependent variable, expressed as a fraction.

If we only adjust the average size of margins slightly, however, for instance by adding more margins to long distance trains that currently have the lowest margins and punctuality statistics, it is possible to make some gains. The average level of margins across all passenger trains in Sweden is currently 17%. If we increase this by four percentage points to 21%, which is the current average for regional trains, we would expect punctuality to improve by about 0.47 percentage points. While it is not enough of an improvement on its own, it is a start.

Negative Time Supplements While studying timetables in detail, negative time supplements were identified. These are

11

instances where the planners have removed time so that the scheduled run time is in fact shorter than the technical minimum. This practice also came up in our interviews, but not in our review of the literature. The timetable planners say that this practice is at the request of the train operating companies, that it is mainly done for local trains on single track lines, that it is accompanied by positive supplements on later line sections where the trains are intended to catch up, and that it has “always” been thus. Several different rationales are cited by the planners:  It helps timetables fit together, by making it appear as if trains will make it to a station in time for a meeting. It is a way to run more trains than would otherwise be possible and is requested by the train operating companies.  On some lines, the rules state that arrival times should occur only at whole minutes, which requires rounding. If one always rounds up, this can add several minutes to the journey time, which is often undesirable. For this reason, planners sometimes round down, subtracting seconds.  On some stations, there are not many passengers, sometimes none, and then the train operating companies think it is a waste for the train to just stand at the platform and wait for the departure time. By having very short or even non- existent dwell times, or using negative time supplements to ensure that the train is artificially late to the station, these “unnecessary” waits can be avoided.  It has “always” been done this way, despite it being against the rules.

Assessing the effects of this practice by looking at traffic data for a year, we found a t- test, separating those timetables with instances of negative margins from those without, to be more illustrative than a diagram following the count or summed time of such instances. Summary statistics and results of this test are found in Table 3.

Negative Supplements Occur Count Punctuality Variance P-value Yes 212 281 0.887 0.100 0E+00 No 316 228 0.928 0.067

Table 3 Summary of t-test results, negative time supplements.

We estimate a difference in punctuality of 4.1 percentage points, indicating that the practice of using negative time supplements is detrimental to punctuality. With the mix of trains running in Sweden, and the prevalence of this practice, we estimate that by no longer assigning any negative time supplements, overall punctuality would improve by about 1.25 percentage points. Since the negative supplements reduce scheduled run times below the technical minimum, the reduced runtimes are not “real” and eliminating them from timetables would not be expected to extend actual travel times.

3.2 Distribution of Margins

How margins should be distributed along the journey is an important and somewhat controversial topic, within both research and practice. As we saw in the literature review, there are several possible strategies and little empirical evidence on which works best.

12

Independent Variable Unit Intercept Slope R2 WAD Fraction 0.856 0.109 0.402 Average supplement following stops Seconds 0.886 0.001 0.501 Station interactions Count 0.947 -0.012 0.911 Line interactions Count 0.913 -0.039 0.974

Table 4 Summary of linear regression analysis related to distribution of margins. Punctuality is the dependent variable, expressed as a fraction.

One of the more often used measures for describing how margins are distributed is the weighted average distance (WAD), which takes a value of 0.5 if margins are perfectly balanced around the middle of the journey, a value close to 0.0 if margins are heavily shifted towards the beginning, and close to 1.0 if they are shifted towards the end. This is perhaps intended more as a descriptive than prescriptive measure, because there are a number of factors that influence where margins are suitable. All the same, among the planners we interviewed there was an explicit argument of not wanting to shift margins towards the beginning of the journey, since that would risk them being wasted in case a delay occurred later along the journey. Similarly, but coming to the opposite conclusion, both Vromans (2005) and Vekas et al. (2012) performed simulation experiments and concluded that it was better to shift margins slightly towards the beginning of the journey, so that delays that happen early are not carried for too long without being reduced by existing time supplements. Our dataset of detailed timetables and train observations allows us to add an empirical dimension to the question. In Figure 2, WAD is plotted against punctuality. The result indicates that punctuality, measured at the final destination, increases slightly as time supplements are shifted towards the end of the journey. We also note that substantial shifts are required for noticeable gains in punctuality. It should be noted that we suspect that this result depends on the method for calculating punctuality. If it were measured in a more holistic way, such as at every station stop, we expect that peak punctuality will be reached at values of WAD closer to 0.5, but have not yet confirmed this.

Figure 2 Punctuality and weighted average distance (WAD) of time supplements.

13

Figure 3 Punctuality and the average time supplement directly following stops.

As it stands, however, it appears that by shifting margins towards the later part of the journey, some punctuality gains can be made. The slope is 0.109, as can be seen in Table 4, and the average WAD across our sample is 0.56. So by shifting this to 0.65, we would expect punctuality to increase by almost a percentage point. Because this is simply shifting the distribution of margins, the total travel time would not increase. For passengers that do not travel the whole distance, however, there may be small differences.

3.3 Supplements directly following stops

One strategy when distributing supplements is to place them directly following scheduled stops. The rationale being that delays often occur at stations, and that it is good to recover delays as soon as possible. This practice was mentioned in a few of our interviews, but only one of the planners mentioned actually doing it. Analyzing the timetable data, we found supplements following scheduled stops were frequent, and that they were clustered around the “round” numbers of 30, 60, 90 and 120 seconds, although the whole range was used, and we even found some larger supplements. Not having any supplements directly following the stops was the most frequent alternative. Because we are interested in this as a conscious strategy, rather than a chance occurrence, we calculated the average size of these supplements for each train. We then grouped the trains according to their averages, and calculated the punctuality within these groups. We can identify two key results in Figure 3. Firstly, the positive slope of the trend line (see statistics in Table 4) indicating that larger supplements following stops are associated with higher punctuality. Secondly, punctuality is initially high when the average supplement is 0. These two findings combined suggest that if this strategy is to be used, it should be done with supplements of at least one minute per stop. Analyzing further with t-tests, we confirm that the punctuality is lower for trains with small supplements, and higher if it is done consistently with larger supplements, and that these differences are statistically significant. Summary statistics of the t-tests are presented in Table 5.

14

Variable Values Tested Count Punctuality Variance P-value Average time 0 seconds 77 218 0.933 0.062 1 supplement 1-59 seconds 459 416 0.908 0.084 < 0.001 directly 60+ seconds 28 526 0.943 0.054 < 0.001 following scheduled stops Table 5 Summary of t-test results. All values are compared to a baseline with no time supplements following the scheduled stops.

One hypothesis for why the small supplements are associated with a lower punctuality is the argument made by Carey (1998). By allowing more time exiting stations, the train crew takes more time in executing that activity. In this case, they overcompensate. It is possible that this is made worse by the truncation of seconds in some sub-systems in the Swedish railway, so that the driver believes that the supplements given are larger than they really are. If we for a moment disregard the causes for this phenomenon, and take it as a given, we can estimate that by removing all time supplements following scheduled stops for passenger trains, overall punctuality would improve by about 1.7 percentage points. At the same time, on average 2.3 minutes would be freed up per train, which could be used for margins elsewhere on the journey, or to reduce travel time.

Interactions with Other Trains Previous research, like Olsson & Haugland (2004) and Gorman (2009), has shown that congestion can be an important cause for delays and reduced punctuality. One measure of congestion is the number of interactions a train has with other trains: if congestion is low this number will be low, but on highly congested lines the number of interactions will be higher. The number of interactions can also be linked to trains traveling longer distances, being exposed to more traffic as it crosses through various local and regional train systems. Indeed, Harris (1992) showed that distance covered was statistically significant in determining punctuality. Thus, the number of interactions may help explain both the delays associated with congestion, and the delays associated with long journeys. Interactions can also be part of a strategy of how to distribute margins, as in Andersson, et al. (2015). For these reasons, we have looked at the link between interactions and punctuality. We have counted the number of interactions at stations and on line sections, and tracked them separately. Plots of these against punctuality are found in Figure 4 and Figure 5, respectively, while the regression results are found in Table 4.

15

Figure 4 Punctuality and the number of interactions at stations.

At stations, every additional scheduled interaction lowers punctuality by 1.2 percentage points. We have not differentiated between single and double track, considered the layout of the station areas, or any characteristics of the other train, but simply counted the number of times the train is at a station at the same time as at least one other train. There is a wide distribution of trains along this dimension: the single most common number of interactions at stations is one, which occurs for 25 % of the trains in our sample, more than 95 % of trains are covered by the count of eight, and the average number is 2.89. Thus, the largest improvement can be gained by focusing on trains that have few interactions, and either trying to reduce that number further, or improving the probability of the interactions being successful by adding or shifting margins. For instance, if the average number of interactions at stations could be reduced by 0.50 to 2.39, by multiplying with the slope found in Table 4, we would expect a gain in punctuality across passenger trains of 0.6 percentage points. If we factor in other kinds of trains, the gain in overall punctuality would be about 0.5 percentage points. Interactions on line sections are much less common in our dataset. Looking at realized traffic, 98% of the trains have no such interactions, and in the timetable, the corresponding number is 97%. When interactions do occur, however, the effect on punctuality is considerable, as shown in Table 4. Each such interaction lowers punctuality by 3.9 percentage points. Again, we have only counted the number of these interactions, without considering the order of the trains or the speed difference. We would expect the effect to be greater for faster trains following slower ones, and will follow up on this in later research. The low number of these interactions does mean, however, that even if they were to be eliminated entirely, the effect on overall punctuality would be less than 0.1 percentage points. Rather, efforts should be made to ensure that the number of line interactions remains low.

16

Figure 5 Punctuality and number of interactions on line sections.

4 Conclusions and Recommendations

Based on the results of the literature review, the interviews with planners, and our data analysis, we now give some recommendations on how timetables can be planned in the future to achieve a higher level of punctuality.

4.1 Size of Margins

We have found that increased margins are in fact associated with improved punctuality, but not strongly enough that the size alone can be used to reach the punctuality target of 95 %. Our estimate is that punctuality improves by 0.117 percentage points for each additional percentage point of run time extension. We also note that the average size of margins is about 10 percentage points larger in Swedish practice than in the international literature. This increases the downward variability in run times, makes operations less predictable and causes congestion in some places, consumes capacity, and increases travel times for passengers. As we have shown, the benefits in terms of punctuality of these margins are rather limited. But since punctuality in Sweden is currently so far from the target, and rapid improvements are required, we still advise a modest increase in the average of about four percentage points, mostly for long distance trains, rather than a decrease. Hopefully this increase can be phased out as improvements from other sources come into effect. We have found that in practice it is common for timetables to include some negative time supplements, meaning that there are some line sections where the scheduled run time is shorter than the technical minimum runtime, and that this is detrimental to punctuality. Our estimate is that by removing all instances of these from a timetable, the punctuality of the affected trains will improve by 4.1 percentage points and overall punctuality by 1.25 percentage points.

4.2 Distribution of Margins

Our results indicate that shifting time supplements towards the end of the journey slightly improves punctuality at the final destination. The effectiveness is about the same as when increasing the size of supplements. The results indicate that an increase in WAD of one

17

percentage point improves punctuality at the final destination by about 0.1 of a percentage point. We expect that the effectiveness will be lower, if one instead measures punctuality across all scheduled stops. For this reason, we only suggest a modest shift in average WAD from 0.56 to 0.65. One strategy when distributing time supplements, is to place them directly following station stops. The results of this are mixed, but the conclusion we draw is that there should either be no such supplements, or that they should consistently be of at least one minute. If supplements are lower than 60 seconds, on average, punctuality is 2.5 percentage points lower than if there were no such supplements at all. If the average is 60 or over, however, our estimate is that punctuality improves by one percentage point. This is a significant improvement, but then having time supplements of at least one minute directly after every scheduled stop adds up quickly for most trains. If this is considered too expensive in time, it is better to skip these supplements altogether, since there appears to be a threshold effect. Finally, in line with other research, interactions between trains are bad for punctuality and should be limited. Our estimate is that every additional interaction at a station lowers punctuality by 1.2 percentage points, while every additional interaction on a line section lowers punctuality by 3.9 percentage points. Thus, gains can be made by reducing the number of interactions, especially on line sections. This would mean ensuring that the first train is able to clear the line section, before the second train enters it, essentially increasing headways at some locations. If an interaction cannot be altogether avoided, holding back one train so that the interaction happens at a station instead of on a line section is expected to improve punctuality by 2.7 percentage points for both trains.

4.3 Estimated Impact of Recommendations

Throughout this paper, we have given a number of examples of how timetable variables can be tweaked in ways that affect punctuality. These changes and their estimated results are summarized in Table 6. Considered as a package, overall punctuality would increase by five percentage points, while the average size of margins is only increased by four percentage points.

Variable Affected Recommended Action Gain in Overall Punctuality Size of margins Increase average from +0.0047 0.17 to 0.21 Negative margins Eliminate all instances +0.0125 Weighted Average Distance Go from 0.56 to 0.65 +0.0100 (WAD) of margins Supplements directly Eliminate all those +0.0170 following scheduled stops lower than one minute Station interactions Reduce average from +0.0050 2.89 to 2.39 Line interactions Eliminate all instances +0.0008 All of the above All of the above +0.0500

Table 6 Estimated effects of recommendations.

Other combinations of adjustments are possible to imagine. While we have proposed a modest increase in margins of four percentage points, it is possible to imagine options where

18

the average size of margins is unchanged or even decreased down to more internationally comparable levels of 5-7%. We estimate this would lower punctuality by about one percentage point, while travel times would be reduced by 8-10 percentage points from their current levels. Given the need for rapid improvements in punctuality, however, we think a modest increase is more appropriate for the Swedish railway, at the current time. We would caution against shifting the weighted average distance of margins to more than 0.65, because we suspect that if punctuality is measured at more locations than the final destination, the ideal WAD would be closer to 0.50. Along those lines, one might imagine a scenario where the average WAD is maintained at 0.56 rather than increased. But since punctuality is so often measured at the final destination, we opt for the increase. The largest gains in punctuality are estimated to come from eliminating the practice of allowing negative margins and supplements lower than one minute following scheduled stops. These policies appear to be mistakes that should not be allowed to persist. One might further dispute the possibility of decreasing the number of interactions, especially as the capacity utilization and volume of traffic increase. We think some reduction is possible by more conscious planning, but even without the improvements we estimate from these changes, punctuality is set to improve substantially from the changes we propose. Our contacts and funders at the Swedish Transport Administration wish to comment that they are especially in favor of removing negative margins, slightly shifting the WAD of margins, and experimenting with the balance of dwell times and margins after station stops. This fits well together with other ongoing research and development which they finance.

References

Andersson, E., 2014. “Assessment of Robustness in Railway Traffic Timetables”, Linköping University, Norrköping. Andersson, E. V., Peterson, A. & Thörnquist Krasemann, J., 2013. “Quantifying railway timetable robustness in critical points”, Journal of Rail Transport Planning & Management, issue 3, pp. 95-110. Andersson, E. V., Peterson, A. & Thörnquist Krasemann, J., 2015. “Reduced railway traffic delays using a MILP approach to increase robustness in Critical Points”, Journal of Rail Transport Planning & Management, issue 5, pp. 110-127. Carey, M., 1998. “Optimizing scheduled times, allowing for behavioral response”, Transportation Research Part B, vol. 32, no. 5, pp. 329-342. Carey, M., 1999. “Ex ante heuristic measures of schedule reliability”, Transportation Research Part B, issue 33, pp. 473-494. Cicerone, S. et al., 2009. “Recoverable robust timetabling for single delay: Complexity and polynomial algorithms for special cases”, Journal of Combinatorial Optimization, issue 18, pp. 229-257. Dewilde, T., Sels, P., Cattrysse, D. & Vansteenwegen, P., 2013. “Robust railway station planning: An interaction between routing, timetabling and platforming”, Journal of Rail Transport Planning & Management, issue 3, pp. 68-77. Gorman, M. F., 2009. “Statistical estimation of railroad congestion delay”, Transportation Research Part E, pp. 446-456. Haldeman, L., 2003. Automatische Analyse von IST-Fahrplanen, Zürich. Hansen, I. A., 2009. “Railway Network Timetabling and Dynamic Traffic Management”, 2nd International Conference on Recent Advances in Railway Engineering (ICRARE-2009), pp. 135-144.

19

Harris, N., 1992. Planning Passenger Railways: A Handbook, Transport Publishing Company Limited, London. International Union of Railways, 2000. “UIC Code 451-1 Timetable Recovery margins to guarantee timekeeping - Recovery margins”, International Union of Railways, Paris. Nelldal, B.-L., 2009. Kapacitetsanalys av järnvägsnätet i Sverige. Delrapport 3: Förslag till åtgärder för att öka kapaciteten på kort sikt, Kungliga Tekniska Högskolan, Stockholm. Nelldal, B.-L., Lindfeldt, A. & Lindfeldt, O., 2009. Kapacitetsanalys av järnvägsnätet i Sverige. Delrapport 1: Hur många tåg kan man köra? En analys av teoretisk och praktisk kapacitet, Kungliga Tekniska Högskolan, Stockholm. Nie, L. & Hansen, I. A., 2005. ”System analysis of train operations and track occupancy at railway stations”, European Journal of Transport and Infrastructure Research, vol. 1, issue 5, pp. 31-54. Olsson, N. et al., 2015. Punktlighet i jernbanen - hvert sekund teller, SINTEF akademisk vorlag, Oslo. Olsson, N. & Haugland, H., 2004. “Influencing factors on train punctuality - results from some Norwegian studies”, Transport Policy, issue 11, pp. 387-397. Pachl, J., 2002. Railway Operation and Control. VTD Rail Publishing, Mountlake Terrace. Palmqvist, 2014. Gångtidstillägg för snabbtåg, Lund University, Lund. Palmqvist, C., Olsson, N. & Hiselius, L., 2017. “Delays for passenger trains on a regional railway line in Southern Sweden”, International Journal of Transport Development and Integration, vol. 1, issue 3, pp. 421-431. Parbo, J., Nielsen, O. & Prato, C., 2016. “Passenger Perspectives in Railway Timetabling: A Literature Review”, Transport Reviews, vol. 36, issue 4, pp. 500-526. Rietveld, P., Bruinsma, F. & van Vuuren, D., 2001. ”Coping with unreliability in public transport chains: A case study for Netherlands”, Transportation Research Part A, vol. 35, pp. 539-559. Scheepmaker, G. M. & Goverde, R. M., 2015. “The interplay between energy-efficient train control and scheduled running time supplements”, Journal of Rail Transport Planning & Management, issue 5, pp. 225-239. Schöbel, A. & Kratz, A., 2009. “A bicriteria approach for robust timetabling”, Robust and online large-scale optimization, pp. 119-144, Springer Berlin Heidelberg. Trafikanalys, 2016. Punktlighet på järnväg 2015, Trafikanalys, Stockholm. Trafikanalys, 2016. Railway Transport 2015 quarter 4, Trafikanalys, Stockholm. Trafikverket, 2016. Analysmetod och samhällsekonomiska kalkylvärden för transportsektorn: ASEK 6.0, Trafikverket, Borlänge. Trafikverket, 2016. Noder i Järnvägssystemet T17, Trafikverket, Borlänge. Vekas, P., van der Vlerk, M. & Haneveld, W., 2012. ”Optimizing existing railway timetables by means of stochastic programming”, Stochastic Programming E-Print Series- (SPEPS). Humboldt Universität zu Berlin, Berlin. Wiggenraad, I. P. B., 2001. Alighting and boarding times of passengers at Dutch railway stations, Delft. Vromans, 2005. Reliability of railway systems, Rotterdam. Yuan, J. & Hansen, I. A., 2007. ”Optimizing capacity utilization of stations by estimating knock-on train delays”, Transportation Research Part B, vol. 41, pp. 202-217. Yuan, J. & Hansen, I. A., 2008. ”Closed form expression of optimal buffer times between scheduled trains at railway bottlenecks”, 11th International IEEE Conference on Intelligent Transportation Systems.

20

Paper 4

Hindawi Journal of Advanced Transportation Volume 2018, Article ID 8502819, 17 pages https://doi.org/10.1155/2018/8502819

Research Article The Planners’ Perspective on Train Timetable Errors in Sweden

Carl-William Palmqvist ,1 Nils O. E. Olsson,1,2 and Lena Winslott Hiselius1

1 Faculty of Engineering, Lund University, P.O. Box 118, 221 00 Lund, Sweden 2Norwegian University of Science and Technology, Høgskoleringen 1, 7491 Trondheim, Norway

Correspondence should be addressed to Carl-William Palmqvist; [email protected]

Received 20 November 2017; Revised 5 February 2018; Accepted 8 March 2018; Published 12 April 2018

Academic Editor: Lingyun Meng

Copyright © 2018 Carl-William Palmqvist et al. Tis is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Timetables are important for train punctuality. However, relatively little attention has been paid to the people who plan the timetables: the research has instead been more centred on how to improve timetables through simulation, optimisation, and data analysis techniques. In this study, we present an overview of the state of practice and the state of the art in timetable planning by studying the research literature and railway management documents from several European countries. We have also conducted interviews with timetable planners in Southern Sweden, focusing on how timetable planning relates to punctuality problems. An important backdrop for this is a large project currently underway at the Swedish Transport Administration, modernizing the timetable planning tools and processes. Tis study is intended to help establish a baseline for the future evaluation of this modernization by documenting the current process and issues, as well as some of the research that has infuenced the development and specifcations of the new tools and processes. Based on the interviews, we found that errors in timetables commonly lead to infeasible timetables, which necessitate intervention by trafc control, and to delays occurring, increasing, and spreading. We found that the timetable planners struggle to create a timetable and that they have neither the time nor the tools required to ensure that the timetable maintains a high quality and level of robustness. Te errors we identifed are (a) crossing train paths at stations, (b) wrong track allocation of trains at stations, especially for long trains, (c) insufcient dwell and meet times at stations, and (d) insufcient headways leading to delays spreading. We have identifed eleven reasons for these errors and found three themes among these reasons: (1) “missing tools and support,” (2) “role confict,” and (3) “single-loop learning.” As the new tools and processes are rolled out, the situation is expected to improve with regard to the frst of these themes. Te second theme of role confict occurs when planners must strive to meet the demands of the train operating companies, while they must also be unbiased and create a timetable that has a high overall quality. While this role confict will remain in the future, the new tools can perhaps help address the third theme by elevating the planners from frst- to double-loop learning and thereby allowing them to focus on quality control and on fnding better rules and heuristics. Over time, this will lead to improved timetable robustness and train punctuality.

1. Introduction utilised at levels associated with high sensitivity to delays, low average speeds, and little time for infrastructure maintenance Railways are an important part of the transport system. In [2]. On the rest of the network, only about fve percent of Sweden, trains traveled 153 million km during 2015, which the segments are utilised to the same extent. When measured is an increase of 9% over fve years [1]. Passenger trafc during peak loads, these fgures are higher across the board. has increased by 16% over the same period, and in 2015 Te punctuality of trains in Sweden has been close to passenger trains made up 83% of all trains. While freight 90% for the last several years [1], with punctuality measured trafc in 2015 was at the lowest level since 1990, the freight as a maximum delay of fve minutes at the fnal stop. tonnage transported by rail has risen slightly as the loads have Tis is considered too low by the industry, which has set increased. Te capacity is most heavily utilised around the a goal that it should be 95% by 2020. Tis ambition, to three major metropolitan areas of Stockholm, Gothenburg, increase punctuality, is the background for our research and and Malmo-Lund.¨ A quarter of the metropolitan lines is for this paper. Many factors infuence punctuality, such as 2 Journal of Advanced Transportation weather [3–5], congestion [6], other operational factors [7, 8], As the timetable planning process and methods in Swe- and infrastructure [9]. Previous studies also indicate that den are about to undergo signifcant change over the next properties of the timetable can have a large impact on delays few years, this paper (1) presents the state of practice in order and punctuality, that delays ofen occur at station stops, and to establish a baseline and (2) outlines the state of the art in that dwell times are systematically underestimated [10–13]. research, which has inspired and infuenced the development Tus, there is reason to believe that errors in the timetable of the new tools and methods. It also (3) gives a description may afect punctuality. of the timetable planners current situation in Sweden and (4) Te interaction between infrastructure, capacity, and identifes common errors in Swedish train timetables, which timetableplanningisfoundonastrategic,tactical,and infuence the punctuality, as well as the reasons behind them. operational level. Te strategic level is typically long term, Tereby, the paper helps to enable future studies looking to overseveralyears,andcanberelatedtonewinfrastructureor evaluate the efects and efectiveness of the new tools being new timetable structures, while the tactical level is related to developed and implemented for train timetable planning in producing a timetable implemented in a shorter perspective, Sweden and elsewhere. typically one year. Tis paper mainly studies tactical timetable planning. Operational timetabling is related to making short- 2. Background termchangestoatimetable,ofenafewweeksordaysin advance. Timetabling has largely been studied from a technical Timetable planners prepare timetables and other doc- and optimisation perspective (see, e.g., [24]). However, umentations related to planned changes for passenger and timetabling can also be studied from an organizational freight trains. Planners are ofen faced with the challenges perspective, using other methods. For instance, Avelino et of working with complex timetabling [14]. In addition to the al. [25] study the politics of timetabling, comparing the complexityoftheplanningitself,theymustbeabletodeal Dutch and Swiss experiences and illustrating that “timetable with diferent stakeholders in the railway sector and have con- planning is not merely an operational process to be lef to fict resolution skills. A fnal timetable must satisfy the needs engineers or economists” (pp. 20). Even though the Swiss of travelers and public and private parties, while maintaining have many more train operating companies than the Dutch, the fairness, transparency, reliability, and safety of the railway their federal government still takes a much more active and system. Te train service specifcations are passed to the strategic role in ensuring an optimised travel time over the timetable planners, who produce the timetables. However, whole network. Watson [14] found that the privatization of these specifcations can be in violation of guidelines, or there British Rail had a negative efect on the timetabling processes. can be conficting needs of diferent train operators. Watson Teproblemswerearesultofpoorplanningandrushed [14] found that the complexity of the timetabling and capacity implementation of new organization of the British railway allocation processes can hinder efectiveness, highlighting sector. Since then, both the British and Swedish railways have the conficting nature of objectives for timetable planning, gained experiences from the new structures with a division especially in the privatized railway. between infrastructure and train operation. However, the inherent characteristics of the divided structure remain. While extensive research has been carried out on the Te infrastructure manager supplies capacity on the rail- human-machine interface in train trafc control in Sweden way, while train operating companies represent the demand ([15–17]; see also [18] for dispatchers in the US) and some for transport. Timetable planners are squeezed in between work has been done on the integration of timetable planning these needs, which sometimes confict [26]. Watson [14] dis- and trafc control [19], relatively little research has been cusses timetable planning as a process by which a “demand” done focusing on timetable planners and their tools. Watson for rail transport (passenger and freight) is connected to the [14, 20] covered timetable planning in the UK, which has “supply side” constraints (especially available infrastructure many similarities to the Swedish context, and ONTIME [21] capacity) in order to produce timetables that meet the contains some expert judgment on the state of practice in demand. Trough train planning, railway administrators seek Sweden. National interest in this topic has increased, as a to meet the needs of customers while utilising available large project is currently in progress at the Swedish Transport resources as well as possible. Efcient and efective train Administration, seeking to modernize the interface between planning is the key to getting the best possible performance train operating companies and the Transport Administration on a railway network. by developing new tools and routines for timetable planning. Timetabling is governed by several restrictions, such as Tese tools will, among other things, support more fexible safety requirements and organizational policies. A routine or and optimal planning, improve capacity and punctuality, heuristic approach can be applied to timetabling. Routines shorten planning lead times, and improve transparency and can be defned as “a repetitive, recognizable pattern of the handling of engineering works [22]. Since 2011, the interdependent actions, involving multiple actors” [27, pp. situation with the deregulated passenger train market in 96]. Heuristics [28] are cognitive rules of thumb, or short- Sweden is also new and rare in an international context, with cuts, that people apply, consciously or unconsciously [29]. the new divisions of responsibility resulting both in new role Argyris and Schon¨ [30] present learning as understanding conficts and in more collaborative decision-making between and eliminating the gap between the expected result and the stakeholders. To learn more about this, it is useful to talk to actual result of an action. Tis gap can be eliminated either the timetable planners. by making changes (taking corrective measures) within the Journal of Advanced Transportation 3

127 Application Dra of Final for capacity timetable timetable

3 Coordination among TOCs

4 Dispute settlement

56 Application Overutilized of prioritization line criteria

Conduct Improve capacity capacity analysis

Figure 1: Capacity allocation process in Sweden. Adapted from the Swedish Transport Administration [23].

existing values and norms, or by changing the existing values perspective for operational improvements, and Samaetal.` and norms. Te former is called single-loop learning and [40] provide an example of applying optimisation to support the latter is called double-loop learning. Single-loop learning operational management. Roth et al. [18] and Tschirner [41] is connected to doing things right, in accordance with the also studied aspects of operational management for train existing values and norms. Double-loop learning is about dispatching in trafc control centres, while Watson [14, 20] doing the right things, by questioning the existing values and considered the contrasting needs and preferences of timetable norms. Tis is a concept we will return to throughout the planners and their managers. paper. According to Loock and Hinnen [31], organizational 2.1. State of Practice in Sweden. Te following is an outline heuristics are the result of collective learning processes. Tey of the current state of practice in Swedish train timetabling, found that successful organizations refne their heuristics to give the reader a better understanding of the processes over time, as a result of feedback loops. Organizational and tools involved. Rail trafc in Sweden has been gradually heuristicscanalsointeractwithindividualheuristicsand deregulated since competition for some tracks for passenger with improvisation in the decision-making process [32]. trains was allowed 1990, with open access with competition Kirkebøen [33] shows that heuristics can be biased, such as on all tracks since 2011, while freight services have been a bias to rely on the information that is most available or competing on the tracks since 1996 [42]. a bias to search for information that confrms rather than contradicts a decision. 2.1.1. Te Capacity Allocation Process. Te annual capacity Managerial issues in railway planning have been the topic allocation process in Sweden is illustrated briefy in Fig- of a range of publications, notably Vuchic [34] and Profllidis ure 1, which is reconstructed from the Network Statement [35]. Managerial aspects of railway planning include strategic, published by the Swedish Transport Administration. Tis tactical, and operational issues, and timetabling is an impor- corresponds to the tactical level of planning described above. tant factor in all these perspectives. A strategic perspective Te process as described here is based on the Network State- on railway planning includes selecting major investments and ment published by the Swedish Transport Administration positioning the railway within the overall transport system, [23]andanexcellentaccountinHellstrom¨ [19]. First the as described by Harris et al. [36]. In a tactical perspective, train operating companies send in requests for the capacity timetabling is one step in the planning process. Ceder [37] they want during the next year. Te timetable planners at describes the train scheduling process in four steps: network the Transport Administration combine these requests and, route design, setting timetables, scheduling vehicles, and by following their rules and guidelines for how to plan assignment of crew. All of these planning steps have been timetables, come up with a draf that contains all the train of interest to researchers, especially from an optimisation timetables for one year. In case there are conficting requests, perspective [38]. Operational management issues in railway suchthatnotalltrainscanberunwhenandwherethetrain planning have also been studied: Veiseth et al. [39] study operating companies desire, there is frst a step where the how timetable improvements in a Total Quality Management parties are encouraged to coordinate among themselves. If 4 Journal of Advanced Transportation this coordination is unsuccessful, there is a process where the no longer considered viable to operate the train at all. For Transport Administration, together with the parties, tries to commuter and regional trains, this has been set to 15%, so settle the dispute. If these attempts are also unsuccessful, parts that if the travel time of a train had to be extended by more of the infrastructure can be declared to be saturated, and the than15%,itisassumedthatnoonewouldliketousethattrain planners at the Transport Administration use prioritization (thereisnodocumentationordiscussiononhowtheselimits criteria to determine which trains have priority, sometimes have been determined). Te cost of excluding a train is then entirely rejecting the other requests. In yet another step, the settobeequaltothecostofsuchatrainbeingruntothat train operating companies can appeal the decision of the maximum limit, which results in a very high cost of excluded planners to an administrative court, if they are unsatisfed train paths, severely punishing solutions that deny requests with the planners’ decisions. for capacity. A minor detail in the calculation of exclusion Te point of departure compared to other European costs is that a template is applied to estimate a reasonable level countriesiswhenthecoordinationprocessbreaksdown.Te of margins for the trains, instead of using the actual size of United Kingdom [43] has a more qualitative process with an marginsinthetrainpath.Tisisdonesothatcompaniesdo overarching objective “to share capacity on the Network for not act strategically in reducing margins in order to have their the safe carriage of passengers and goods in the most efcient competitors’ trains be excluded instead of their own. and economical manner in the overall interest of current and prospective users and providers of railway services.” Along 2.1.2. Robust Timetables. Robustness in timetables is primar- with this objective it has a list of twelve criteria, on which ily achieved by modifying time supplements, headways, and to assess the fulflment of the objective. In Germany [44], dwell times (see, i.e., [48]). In Sweden, time supplements are priority is given to regular-interval or integrated network added in several ways. Te frst way is included in the runtime services, cross-border trains, and train paths for freight trains. calculation, which is calculated as if the train had a technical If none of these criteria are sufcient, priority is given to the top speed 3% lower than it does. Practically speaking, this train paying the higher track charge. In the Netherlands [45], adds a uniformly distributed margin of 3%. Tis is ofen the track charges for the conficted train path are raised if an motivated by diferences in train driver behaviour and is agreement cannot be found by the parties, to the point that described as the primary source of margins in timetables but only one of the actors remains. Te Dutch also emphasize in fact only makes up a small fraction of the total margins the basic hourly patterns in their capacity allocation process, [49]. striving for a cyclic timetable, something that is not found in Te other explicit way to add time supplements is by the other countries’ process descriptions. assigning node supplements [50]. On each major railway line, How trains are prioritized in the timetabling is regulated between two and four stations have been designated as nodes, in the Network Statement [23]. In the request for capacity, while minor lines instead use the frst and last stations. Te the train operating companies must classify their trains idea is then to add a number of minutes as time supplements according to previously set criteria. For passenger trains, for trains traveling between two nodes. For passenger trains, the expected number of passengers, the share of passengers those that have maximum speeds above 180 km/h need four who are time-sensitive, the share of regional travelers, and additional minutes between each pair of nodes, and those that demands for high speeds are the basis for this categorisation. have lower maximum speeds instead need three minutes for Similar criteria are used for freight trains. Tere are 18 each pair of nodes. Trains that travel shorter distances on the categories for passenger trains, shown in Table 1, and 15 for major lines, not reaching a node, require two minutes. Te freight, which are not shown here. Trains can also have airport express trains are given an exception, only needing associations with other trains: these are categorised into node supplements of one minute. Freight trains require two fve categories each for passenger and freight, based on minutes for each set of nodes, or one minute for shorter thenumberofpassengersortonnageoffreight,andthree distances. categories for the turnaround of vehicles. Each prioritization In addition to these two methods, timetable planners use category is linked to a series of social cost estimates, which their discretion in assigning time supplements. One common arebasedonthemethodologyinASEK[46]andareusedto practice is to add seconds, so that arrival times occur at whole fnd a total solution, which minimises the social welfare costs. minutes. For instance, if a train would arrive at 12:44:27, the Track charges to be paid by the train operating companies planner might add 33 seconds, so that the arrival instead to the Swedish Transport Administration are determined occurs at 12:45:00. Over long journeys, this ofen adds up to using other methods and are not in any way involved in the considerable supplements. Supplements are also sometimes prioritization of trains. given for trains that are scheduled to stop at a platform which Sometimes train path requests, or associations, are not is not on the main track, because this takes slightly longer to possible to fulfl, and they are instead excluded from the get to, and because engineering works are being done on the timetable. While there is no clear guidance from welfare track, requiring lower speeds for part of the journey. Tese economic theory on how to evaluate these exclusions [47], are meant to correct cases where the runtime calculation is a cost is still assigned so that large numbers of trains are known to be wrong; however they are not really margins not simply excluded. Te cost estimate has instead been increasing the robustness. calculated roughly in the following way: for each type of train Headways, the time separation between trains going in (each prioritization criteria) an estimate has been made based thesamedirectiononthesametrack,areanotherimportant on how much the travel time can be extended before it is way to provide robustness. A short separation implies a Journal of Advanced Transportation 5

Table 1: Prioritization criteria for passenger trains and their associations, with social cost estimates. Reproduced from Appendix 4 B in the Swedish Network Statement for 2017 [23]. When alternative timetable solutions are compared to one another, the solution with the lowest total social cost estimate should be chosen. Tis entails frst calculating the social cost for each train, using the estimates above value travel time, travel distance, phasing time (shifing departures from the times requested by the train operating companies), and associations between trains for both passengers and vehicles and then summing up the social cost across all trains in the timetable scenario. Te cost estimates for rejected requests, trains that are not allowed to run in the timetable alternative, are estimated using a slightly diferent methodology outlined briefy in Section 2.1.1. Similarly, there is a threshold at which point an association is no longer considered viable, and instead of basing the cost on the time used, a fxed penalty is applied, as indicated in the column SEK/Assn. Te train operating companies are required to (accurately, to the best of their knowledge) report the prioritization category of each of their trains along with the request for capacity, based on the identifcation criteria, and the computation of the social cost estimate is then done in Trainplan for each timetable scenario. Guidelines are also provided for how to handle new train concepts, where there is no previous knowledge of the number and type of passengers.

Identifcation criteria for prioritization categories Social cost estimate Description of Share of time Share of Demand for SEK/km Number of SEK/min SEK/min category sensitive regional high speed, few transport passengers transport time phasing time passengers passengers stops distance Heavy commuter ≥700 ≥75% ≥75% 1,150 93 784 Regio-commuter ≥300 ≥75% ≥75% 736 93 474 Regio-commuter ≥300 ≥75% ≥75% 736 93 474 Regio-max ≥200 ≥75% ≥75% 499 76 212 Regio-max ≥75 ≥75% High 499 76 212 Regio-standard ≥75 ≥75% ≥75% 240 27 132 Regio-standard ≥25 ≥25% High 240 27 132 Regio-low ≥25 ≥75% ≥75% 170 29 96 Regio-low ≥75 ≥25% 170 29 96 Regio-low ≥25 ≥25% 170 29 96 Regio-mini ≥0 ≥25% 46 22 10 IC-express ≥200 ≥75% High 753 64 429 IC-standard ≥75 ≥25% 484 41 291 IC-low ≥25 ≥25% 253 38 125 IC-low ≥75 253 38 125 IC-mini ≥0801531 IC-mini ≥0801531 Unspecifed 35 11 8 Association Passengers SEK/min SEK/Assn. Conn. pas. max ≥125 647 55,300 Conn. pas. high ≥75 304 26,000 Conn. pas. std. ≥50 190 16,300 Conn. pas. low ≥20 107 9,110 Conn. pas. mini ≥0 30 2,600 Turnaround high 037,300 Turnaround std. 019,300 Turnaround low 0 11,800

high capacity and throughput but also increases the risk Te third key factor for robustness of timetables is dwell that delays spread from train to train. In Sweden, minimum time. If a dwell time is too short, the departure of the train will headways are regulated in a document [51], varying from two be delayed. Correspondingly, if the scheduled dwell time is to seven minutes, depending on the location. Te normal longer than required for the exchange of passengers or goods, range, applying to most of the network, is from three to fve the excess time can be used to make up for any previous minutes. It is unclear from the documentation to what extent delays. Te guidelines [52] state that dwell times for passenger this is a technical minimum and what has been added to trains should, in general, be two minutes long. Sometimes improve robustness, but if headways are higher than required longer times are required and other times, if the number of by the regulations, robustness would be expected to improve. passengers is small and the train and station are prepared for 6 Journal of Advanced Transportation

Table 2: Timetable planning standards for passenger trains in Sweden.

Robustness measure Norm in the Swedish regulations Running time supplement +3% across the board, included in run time calculation 3-4 minutes per pair of nodes passed; 2 minutes if partial; Node supplement 2–4 nodes exist per railway line Dwell time at stations 2 minutes standard; 1 minute if the number of passengers is small Minimum headway 2–7 minutes, most commonly 3–5 minutes a speedier boarding process, one minute can be used instead. also used in the UK railways [14] and is reviewed by Hammer- If the number of passengers is very small, it is possible to ton[57].RailSysisincreasinglybeingusedtoperformlimited schedule a stop without dwell time, merely slowing the train test runs of parts of the timetable using stochastic simulation, down to a stop and then starting again immediately, but if on a more detailed infrastructure model, and can address this is done the runtime on the next line section should be many of the issues presented in this paper. As in the UK, the extended, and if passenger numbers increase, the timetable group of trained users is small and used only as a complement should be redone and longer dwell times set. to the main planning process. Tese sofware packages, their Te norms discussed above are summarised in Table 2. use, and their limits are discussed in depth by Watson [14]. Vromans [53] compiled information on running time sup- One of the most important constraints is that Trainplan does plements used in the Netherlands, the United Kingdom, not provide automatic confict detection, meaning that the and Switzerland, and as required by the UIC which can planners must check for these manually. As each planner be repeated here as a point of reference. While this infor- plans for thousands or tens of thousands of trains per year, mation may appear dated, the timetabling norms have not this is a recurring issue. New tools and processes, intended changed signifcantly in Sweden since at least the 1990s, to improve the quality and efciency of the timetable process and there is reason to believe that the norms would be roughly along the lines discussed in ONTIME [21], are under stable in other countries as well, even though the actual development at the Swedish Transport Administration and planning practice may well develop over time. According to are to be rolled out gradually from 2018 through 2020. Vromans[53],theDutchusearunningtimesupplement With the introduction of information technology and of approximately 7% across the board, for passenger trains. increasingly powerful computers, simulation is gaining an In the United Kingdom, runtime calculations are based on increasingly important role in the railways, both in practice previous performance rather than physics-based methods, at infrastructure managers and in academia. In the early and the running time supplements are not explicitly defned 2000s researchers at the Royal Institute of Technology began [54]. Te Swiss [55] use a supplement of 7% for passenger to model the Swedish railways in the simulation sofware trains, on top of which they add one minute for every 30 RailSys [58]. Afer several years, the Swedish Transport minutes of runtime and additional supplements in some Administrationbegantousethismodeltoperformsimu- locations,usuallyathighlyutilisednodes.Asafnalpoint lations in diferent aspects of its practice. Tis is now an of comparison, the UIC [56] recommends a combination established part of the process, and parts of the annual of time and distance based supplements: between three and timetablearerunthroughsimulationsinseveraliterations seven percentage points are added to the running time, to before they are fnalised. Larger engineering works are also which one should add between one and one and a half simulated regularly in order to estimate the efects on train minutes for every 100 km. Unfortunately, we have not found trafc and to make appropriate changes to the timetable. any norms or standards with regard to dwell times or headway Tis is being developed further, so that alternate plans for times in other countries. engineering works can be tested against each other [59]. Te complexity of trafc, changing conditions, and sheer However,theteamofcapacityanalystswhouseRailSys number of decisions makes it difcult for planners to foresee is still quite small and they are organized as a group of the punctuality efects of individual decisions on the size and experts, which is involved in many projects besides the annual distribution of margins, headways, and dwell times. Because timetable planning. Te timetable planning is still done in of this difculty, there has not been a convergence around best Trainplan,andthetestrunsinRailSysarelimitedinboth practice in timetabling in Sweden, and there has not been number and scope. Tus, while some of the errors produced a steady improvement in punctuality [1], which one might in the planning process are identifed and corrected, the expect if planners were able to learn what works and what workfow of the planners is not really afected and many of does not. Nonetheless, there has been a signifcant drif from the errors still go unnoticed. the norms in how the timetables are scheduled in practice [49], as the running time supplements are routinely much 2.2. State of the Art in Sweden. As a way to utilise the larger in practice than in the norms, while the dwell times infrastructure capacity more efectively, the Swedish Trans- are signifcantly shorter. port Administration is funding research for simulation and optimisation tools for timetabling and rescheduling. Some 2.1.3. Tools. Te Swedish Transport Administration currently of this work is outlined here, to give an idea of the research uses the tool Trainplan to create timetables, a tool which is and development underway in Sweden, without intending to Journal of Advanced Transportation 7 give a comprehensive review. Tis work creates an important asthenumberofrequestedtrainpathsisincreasing,and and interesting backdrop to the ongoing development and especially as the requests are coming from competing actors. impending implementation of new tools and processes for Te welfare cost derived from this framework can also be timetable planning. used to fnd the optimal number of departures, fnd the best departure times, and estimate the economic value of diferent 2.2.1. Te Capacity Allocation Process. Te capacity allocation timetable variants. process itself, the framework for timetable planning, is also being developed and improved spearheaded by researchers at 2.2.2. Robust Timetables. One line of research is directed theSwedishInstituteforComputerScience.Asummaryand at creating timetables that are robust against minor distur- roadmap toward implementation is presented in Aronsson bances. An important prerequisite is developing methods on et al. [60]. Te background for this is twofold. One issue how to measure and quantify this robustness, ideally before is that, with deregulation and the existence of multiple, the timetables are put into operation. Andersson et al. [66] competing train operating companies, the demands put upon show that there is a clear mismatch between where margins the capacity allocation process are fundamentally diferent areplacedandwheredelaysoccur.Teyalsosuggestthat than before. Kreuger et al. [61] list some typical requirements punctualityshouldbemeasuredatallstops,notonlythe from train operating companies and their customers, and fnal destination. Peterson [67] studied two Swedish train based on these they developed a mathematical framework services, fnding that dwell times are usually underestimated for detecting and resolving conficts. Te other issue is that andnotsufcientlycompensatedbymarginsontheline the current allocation process has very long lead times, is and that the precision of the train paths decreases linearly infexible, and leads to an inefcient capacity utilisation of the with the running time. Building on this fnding, Khoshniyat infrastructure. Tis is described by Forsgren et al. [62], who and Peterson [68] modifed timetables so that the assigned found several mathematical opportunities and challenges minimum time slot in the train path is increased linearly with following an alternative process, where the decision-making the service’s travel time. Based on numerical experiments on a is postponed as far as possible into the future. double track segment of the Swedish Southern Mainline, they Gestrelius et al. [63] developed an outline for a more conclude that the modifed timetables perform better when efcient capacity allocation process. In the current process, small disturbances are introduced. train paths are made and used for the entire year. Once Several new ex ante robustness measures have been fnalised, they are not allowed to be modifed; they must proposed. Gestrelius et al. [69] propose that the number of insteadbecancelledandreplacedbyanewtrainpathinthead possible alternative meeting locations between two trains is hoc-process, which is not allowed to disturb the surrounding a robustness measure, as it gives fexibility for rescheduling. train paths. Tis ofen leads to, for instance, a train stopping Khoshniyat [70] developed a headway-based method, which at a certain station every day of the week, to await a meeting can improve robustness without imposing major changes train that only runs on Tuesdays. All other days of the in timetables, and proposes four new measures: Channel week, these scheduled stops are entirely unnecessary. Te Width, Channel Width Forward, Channel Width Behind, suggestion for the new process is to only lock down certain and Track Switching. Andersson et al. [71] also propose key characteristics of train paths, such as departure times a new robustness measure: Robustness in Critical Points from some important stations, retaining greater fexibility in (RCP), which is focused on points in the timetable that later planning, and operational stages without compromising are particularly sensitive to delays. Warg and Bohlin [72] the quality of the train paths. established a timetable performance index to evaluate the Gestrelius et al. [63] also present a method to generate benefts and robustness of a timetable from a passenger these key characteristics based on an annual timetable, using perspective, by combining simulation and socioeconomic rolling horizon planning and a mixed integer programming analyses. model, which optimises the train paths for each individual Once robustness measures have been proposed, it is day, using diferent delivery commitments for diferent oper- possible to optimise timetables around them. For instance, ators.TeythenappliedthemodelonacasestudyinSweden Andersson et al. [73] propose a model which reallocates andshowedthatthisallowsforamoreefcientutilisationof existing margin time to increase the RCP. Tis is tested the infrastructure. Aronsson et al. [64] continued this work by simulation runs of an initial timetable and one with by working to estimate the value of this uncovered capacity, reassigned margins, while introducing small delays. In the showing that a large portion of the available capacity is hidden adjusted case, total delays at the end station are 28% lower. when using the current planning methods and scheduling Solinen et al. [74] also use an optimisation model to increase rules. the RCP throughout a timetable and evaluate the results with Working further on improving the process, Svedberg simulation. Tey found that robustness increased locally, but et al. [65] developed a model to optimise the welfare cost that the relationship between ex post measures and RCP must of diferent timetable variants, containing competing train be studied further. operators,andtheyappliedthemodeltoapartoftheSwedish rail network. A model like this gives the infrastructure 2.2.3. Tools. Oneprototypetoolbasedonoptimisation, manager a more correct and transparent way to rule on which has been developed and evaluated, is called the Maraca which trains should get their requested timetables and which [75]. It is used for nonperiodic timetabling and minimises should be adjusted or excluded. Tis is increasingly important resource conficts, thus enabling the user to focus on the 8 Journal of Advanced Transportation strategic decisions. Based on a trial of this prototype, Forsgren were identifed before the interviews: (1) guidelines and et al. [76] show how computer and optimisation based tools support, (2) rules of thumb, (3) feedback loops, and (4) trade- can provide valuable insights, even before full-scale imple- ofs, with a handful of guiding questions in each area. mentation. Another tool, intended for marshalling yards, is Te process of analysing the transcribed material was designed by Gestrelius et al. [77] who design an integer pro- carried out in sequence. Te frst step was to sort the diferent grammingmodeltoscheduleshuntingtasksandtoallocate interviewer-interviewee exchanges by area, rather than by tracks in both the arrival yard and the classifcation bowl. chronology, what Kvale [80] and Hammersley and Atkinson Tis tool has a planning period of four days and optimises [81] call categorisation. Te second step was to concentrate characteristics like shunting work efort, the number or cost the meaning of the answers [80] by cutting superfuous words of tracks, and the shunting task start times. and sometimes reformulating entire paragraphs into a few Similarly, the rescheduling of trains in disturbed scenar- sentences. Tis was a necessary process, to make it feasible ios is amenable to methods and tools using optimisation. togetanoverviewofwhatwassaid,andthevolumeoftext Krasemann [78] developed an algorithm, which efectively was reduced from 24,500 words to only 4,500. All these steps delivers good solutions within the permitted time, perform- were performed manually. ing a depth-frst search using an evaluation function to Te following has been translated from Swedish and prioritize when conficts arise, and then branches according provides an example of the concentration of meaning: to a set of criteria. Krasemann [79] then shows that the approach is feasible for practical problems, using the case Te transcript oftheIronOreLineinNorthernSwedenandsolvingmany “Unfortunately, the time is short, so we don’t have diferent delay scenarios to optimality within one minute or time to quality control ourselves, rather it’s like: now less. I’ve done that train I hope it’s right. We have two occasions where Trafc Control go in and check, but 3. Methodology they can’t see everything either. So unfortunately, we can’t do the kind of quality of work that we would like To produce the material for this paper, we carried out because the resources aren’t enough, we have to focus semistructured interviews with timetable planners working on getting it done.” at the Swedish Transportation Administration’s ofce in Malmo.¨ Each interview was approximately an hour long, can be condensed to recorded, and transcribed in full, which resulted in a written “We focus on getting it done, and don’t have time for material of around 50 pages. Te results were analysed by quality control. Te Trafc Control try to check, but manual categorisation and concentration of meaning. can’t see everything.” Te Swedish Transport Administration employs about 20 long term timetable planners, who work chiefy in the annual Having concentrated the interview answers, we made a timetabling process. In addition to these, there are short- more detailed sorting, and 16 new subareas were identifed. term timetable planners who work in the ad hoc-process. Following this, we further summarised the contents, reducing Te Swedish railway is divided into eight regions, and the thevolumefrom4,500to500words.Tismadeitmanageable southernmost region is planned from the ofce in Malmo¨ to get an overview of the contents. Section 4 contains by four timetable planners, all of whom we interviewed. Two translations of these summaries, grouped into the four areas of the planners were men and two women. All of them have of the interview guide. An example of this second step of worked in the industry for many years, at least since 2003 and summarising going back as far as 1985, and with timetables for nine or more years. “Trafc Control gives daily feedback: insufcient Tis region is a sort of microcosm of the railway network meeting time, crossing train paths, stopped freight in Sweden, and it contains a very diverse mix of railway lines, train before a slop, train stops on the wrong track.” train trafc, and capacity utilisation. It includes the Southern “Feedback: want the train on another track, crossing Mainline,oneofthemostheavilyusedinthecountry,dense train paths, infeasible timetable. Adjust in ad hoc- commuter systems around Malmo-Lund-Helsingborg,¨ heavy timetable and in dialogue with train operating com- freight trafc mixed with passenger trains on the single- pany, but not always possible.” track Line, sparser passenger trafc around Ystad, “Trafc control usually tells us: trains are too long, or Karlskrona, and Kalmar, and several nonelectrifed lines with always late. Less feedback about punctuality, but there local, manual train dispatching and very low trafc volumes. canbeproblemsaroundengineeringworksoradhoc- Tus,whilewehaveonlyinterviewedplannersinoneregion, trains that only run a single day and make a mess. those planners have been exposed to a wide variety of Wemostlyfocusonthetrainnumbersthatrunmore planning conditions and circumstances. ofen.” We used a qualitative method because this allowed us to efectively study the values and priorities of those involved. In “A new group is looking at the code ‘suspected error this choice of method, we thus applied a qualitative approach in the timetable’, fnding new errors: shouldn’t have on a topic that is typically studied using quantitative methods. a meeting with zero dwell time in Morrum¨ [a small We prepared an interview guide based on four areas which station], because then the trains lose two minutes.” Journal of Advanced Transportation 9

“Easy to miss crossing train paths, because our sys- times, crossing train paths, poor track allocation, trains that tems lack confict management. You learn afer a are too long for the allocated tracks, and freight trains being while how to handle diferent locations, but it’s not stopped at inclines. However, there is no established system written down anywhere.” or routine to keep track of these comments, or to make use canbesummarisedinto of them, other than making a note mentally or on paper “Trafc control has a lot of feedback. Short meeting and trying to remember this information until it can be used times, crossing train paths, bad track allocation, trains next year. Te planners also feel that important preconditions that are too long, and freight trains stopped in slopes.” change from year to year, which makes it difcult to draw comparisons and lessons between years, making it even more Te analysis in Section 5 is based on an alternate reading difculttomakeuseoftheloosenotesthattheymakebased of the interview responses. We identifed instances in the on the feedback. interviews where the planners described feedback from trafc control about errors in the timetable, which can be seen in the 4.2. Support and Guidelines. Te planners describe a some- examples of translated and concentrated statements provided what contradictory experience of the work environment: the above,andwefocusedontheonesthatwerementioned women describe the atmosphere as open and helpful, while most frequently among the planners. We also identifed the male planners describe a lonelier experience. Te more several statements that could explain why errors sometimes occur in the timetable and condensed these into a list of experienced planners work more based on discussions with eleven reasons. At this point, we looked for diferent ways to the train operating companies than on a strict application of group and categorise the answers, looking for themes on a the guidelines. Te one who was newest at the job and who higher analytical level, and came up with the three following had more ofen switched between railway lines stressed the categories, which we use in the analysis: (1) “missing tools and importance of studying the geography and signalling systems support,” (2) “role confict,” and (3) “single-loop learning” extensively. Te guideline document [52] was mentioned (see Argyris and Schon¨ [30] and Section 2 above). Tese three by all the interviewees, but it is interpreted liberally and categoriesareusedtoexplainanddiscussthereasonsbehind was not described as helpful. A new version is said to the errors more deeply. be coming soon, but this has been said for several years. Trainplan is the main tool and while it does many things, 4. Results the version used at the Swedish Transport Administration does not handle track allocation, manage conficts, or provide Te questions in the interview guide were structured around topographical information. Te train operating companies four areas, and we will begin by reporting summaries of ofen agree among themselves and apply detailed timetables. the responses according to each area. Tese summaries add Te planner only adjusts when necessary, and this is done in to the contextual understanding of the timetable planning dialogue with the train operating companies. As a planner, in Sweden and document issues in ways that could not one must know which signal-box models are present and be achieved by studying documents or guidelines and are how they work, which is quite complicated without sufcient a key part in establishing a baseline for later evaluations technical support. and studies, following the implementation of the new tools and processes. Further analysis and discussion of interview 4.3. Trade-Ofs. Although the timetable planners know that responses follows in Sections 5 and 6. trains are sometimes late, they report that they cannot plan a timetable based on the trains running late. Engineering 4.1. Learning and Feedback Loops. Te planners explained worksthatmovealongthelineduringtheyeararedifcult that the timetabling work is split by lines, with some much- to schedule properly, as the location for the delay shifs over needed reinforcement at large stations. Tey state that they time. Creating new train paths for each scenario is chal- have a large individual responsibility, both in learning what lenging and discouraged by the capacity allocation process. is necessary and in performing quality control, that it takes Te lack of capacity, especially at stations, is mentioned by a long time to learn the details of each railway line, and several interviewees as the most difcult problem in their that there is no time for quality control. It is difcult to work. It appears to be a bottleneck-problem, rather than one transfer accumulated experience. Even though there is some education and transfer of knowledge as a line is handed of sheer volume, where past a certain threshold it becomes over, it is insufcient and quite short. Te planning method very difcult. Te planners try to handle the lack of capacity hasbeenlargelythesameforthelast20yearsorso,but in dialogue with the train operating companies, and they all the work is continuously getting more difcult, due to the describe diferent principles for doing so. Te problem of increasing number of trains and engineering works. Problems congestion is exacerbated by the short-term planners bending increasingly occur at the stations, where the capacity is the rules to ft in more trains. Negative margins are ofen insufcient.Inthepast,thereverseusedtobetrue. used for local trains on single-track, so that the scheduled Tere is no systematic evaluation or quality control: it is runtime is shorter than the calculated minimum runtime, at up to the individual planners, and they do not have time to the request of train operating companies. Te explanations perform it. Te feedback loop that exists is from operational forthisvaryfrompersontoperson,buttheplannersstate trafc control, most frequently about insufcient meeting that “it has always been like this.” 10 Journal of Advanced Transportation

4.4. Rules of Tumb. Te planners express that node sup- errors (a)–(d) is associated with this reason. Tese four errors plements are the primary way of assigning margins, but arediscussedfurtherinthetabletext.Tethreerightmost everyone describes a diferent methodology for assigning columns illustrate how the eleven reasons map onto the three them. Some give descriptions that seem to run counter to themes we have identifed. Each of these themes are discussed the few rules that are written down. Another very common in the following sections. practice is to adjust arrival and departure times at control points so that they occur at whole minutes. At some places, 5.1. Missing Tools and Support. One theme running through this is described as required for technical reasons, but it the responses is that the proper tools to perform timetable appears popular even elsewhere. Usually, seconds are added planning are missing. Tis is most clearly illustrated by up to the whole minute but sometimes subtracted. Phasing reasons (2)–(5) in Table 3. Based on the answers given, such supplements are an important tool to make the timetables toolswouldfreeuptimeandallowashifinfocusfrom feasible. Only the most experienced planner adds a minute the details to quality assurance and a better overview of the afer a scheduled stop, as per an old unwritten rule going timetable. Te planners state that they must keep track of back decades, although others were aware of the practice. which model of signalling control is present at each location Te train operating companies generally set the dwell times. and how it works. Tis is complicated and the planners Te planners say that the standard is two minutes, but they described that there is very little support available. give the impression that shorter times dominate. Local trains Te main tool used by the planners in Sweden also lacks areofengiventhesamearrivalanddeparturetimes,with functions for track planning and confict management and no scheduled dwell time, to avoid waiting unnecessarily in does not provide topographical information. Tese functions case the train is delayed or the number of passengers is were intended to be part of the current sofware tool, when it small. Dwell times in excess of two minutes are primary for was procured in the early 2000s, but the implementation of connections and phasing reasons and in rare cases for trains these modules was cancelled, allegedly because the quality of going into the mountains during holidays, when many people infrastructure data was too poor. New tools for both planning bring skis and similar equipment. and control of trafc are under development, and the planners hope that these important functions will be implemented in 5. Analysis the coming years. Te guiding document for timetable planning was men- Here we present the results of an alternate reading of the tionedbyallfourplannersbutwasnotveryhelpfulandis responses, focusing on the errors that occur in timetable plan- interpreted liberally. A new version is said to be arriving soon, ning, the reasons behind them, and three themes running but this has been said for several years. Te planners also through these reasons. state that the problem of insufcient capacity is worsened Te Swedish timetable planners described receiving daily by operational timetable planners being less rigorous in feedback from trafc control; see the excerpts in Section 3, following the guiding documents, when inserting new trains which are primarily centred on four areas: (a) crossing train into the gaps that do exist. paths at stations, (b) wrong track allocation of trains at stations, especially long trains, (c) insufcient dwell and 5.2. Role Confict. Another underlying theme is the inherent meet times at stations, and (d) insufcient headways leading role confict of timetable planners, which is best illustrated to delays spreading. To give a rough idea of the relative by reasons (6)–(9) in Table 3; an overly liberal interpretation frequency of these errors, throughout the transcripts the of the planning rules and guidelines is that train operating planners explicitly mention receiving feedback relating to (a) companies ask for shortcuts to ft more trains into the fve times, (b) nine times, (c) six times, and (d) twice. Te timetable, that train operating companies request short dwell issue of crossing train paths is mentioned frequently as a times to avoid trains waiting, and that there is no clear difcult issue: 15 times throughout the transcripts, suggesting strategy for the location and size of time supplements. that the planners are working hard on fnding and eliminating One example of this confict is how the decision to deny such errors, with partial success. a train request, because of capacity constraints, is described Based on the interviews, we have also identifed eleven as difcult and controversial. Denied requests can be, and reasons why the timetables sometimes lack quality, allowing ofen are, challenged through the formal capacity allocation the occurrence of errors. Tese reasons were not explicitly process described in Section 2.1.1, which leads to a lengthy stated or described as such but were identifed by reading and and difcult legal process of showing that everything was sorting through the transcripts. We have also identifed three done correctly and transparently. Producing a timetable, common themes that run through the list: “missing tools and which cannot in practice be executed as scheduled, and which support,” “role confict,” and “single- rather than double-loop is likely to cause delays, is not subject to the same formal learning” (trying to follow the established norms rather than procedures or complaints. trying to establish the right norms, see Argyris and Schon¨ Another example is how the more experienced planners [30]). Tis is all summarised in Table 3. focus more on discussions aimed at reaching a consensus with TelefmostcolumninTable3containsanidentifying the train operating companies than on a strict application number, used in the following sections. Te second column of the guidelines. In Sweden, the train operating companies from the lef describes the reason for lacking quality, and mostly agree between themselves and apply with detailed the centremost column identifes which of the four common timetables, which the planners only adjust when necessary Journal of Advanced Transportation 11

Table 3: Reasons why errors occur in timetables. Te four types of associated errors are (a) crossing train paths at stations, (b) wrong track allocation of trains at stations, especially long trains, (c) insufcient dwell and meet times at stations, and (d) insufcient headways leading to delays spreading. (a)-(b) make the timetable infeasible without intervention from trafc control, are therefore considered the most critical, and can be described as inadvertent mistakes. (c)-(d) systematically lead to delays occurring, increasing, and spreading and are made intentionally to accommodate the train operating companies, even if the consequences are not fully understood. Note the high extent to which these errors are focused around stations. Teme 1. Teme 3. Associated Teme 2. Role Number Reason for lacking quality, description Missing tools Single- rather than errors confict and support double-loop learning (a), (b), Insufcient time for quality assurance of (1) (c), and X timetables (d) Too many issues to keep in mind manually for (2) (a), (b) X X planners Work is becoming more difcult due to (3) (a), (b) X increasing congestion (a), (b), Congestion on stations, especially large and (4) (c), and X complex stations (d) Missing tools for track allocation and confict (5) (a), (b) X management Liberal interpretation of the planning rules and (6) (c), (d) X guidelines Train operating companies ask for shortcuts to (7) (c), (d) X ft more trains into the timetable Train operating companies want short dwell (8) (c) X times to avoid trains waiting No clear strategy for the location and size of (9) (c), (d) X X time supplements (a), (b), Poor feedback and evaluation of timetables; no (10) (c), and X routine for continuous improvement (d) (a), (b), Poor knowledge transfer to new planners; poor (11) (c), and X documentation (d) and then in dialogue with the companies, doing what they makes it difcult to compare between diferent timetables and cantosqueezethetrainsin.Forinstance,insufcientdwell to transfer the notes made about feedback and errors from times and negative margins are ofen given to local trains on one year to the next one. Tis contributes to the conditions single tracks, on request from the train operating companies. described in ONTIME [21, p. 37] which also commented on Te rationale for this difers from planner to planner, but “it the timetabling process in Sweden: “the accumulated know- hasalwaysbeendonelikethis.” how of train dispatchers and train drivers is not fed back to the timetable construction process to any larger extent.” It is 5.3. Single-Loop Learning. Te last theme is best illustrated also difcult to transfer accumulated experience: even though by reasons (1)-(2) and (9)–(11) in Table 3 and refers to the thereissomeeducationandtransferofknowledgetonew concept described in Argyris and Schon¨ [30] and Section 2. planners, it is described as insufcient and too short. Te timetable planners have a large amount of individual Te planning work in Sweden is centred on fnishing the responsibility: in the number of timetables that they must timetables in time, while keeping in mind all the technical create, in learning the relevant signalling control systems, details and constraints of diferent signalling control systems in applying the rules and guidelines, in assigning margins, androllingstock,thetopography,crossingtrainpaths,track in creating robustness, and in controlling the quality. Tey allocation, and so on. Tis is a direct parallel to the situation show that it takes a long time to learn the geography and that in the UK described by Watson [14, p. 112], where “achieve there is no time for quality control. Te task gets harder and all timetable production timescales” is listed as the number harder because the number of trains and engineering works one priority among timetable planners at Network Rail and are increasing. “error free” only as number six, in a ranked list of eight Tere is no systematic evaluation. Tis is up to each priorities. Te British timetable planners that Watson [20, p. planner, and they say that they do not have the time. 312] interviewed primarily requested “help with elimination Important preconditions change from year to year, which or reduction of the repetitive data manipulation tasks that 12 Journal of Advanced Transportation delaythemfromtacklingthe‘interesting’ confict resolution and it is important to ensure that these important functions are resource minimisation work.” included and that enough high quality data is provided for Te lack of support and proper tools means that there thesystems.Whiletheroleconfictcannotbeentirelyavoided simply is not enough time or energy lef to assess whether the by technology, new tools and processes can diminish the rules and guidelines are the best or most appropriate ones to consequences by no longer permitting things such as negative use. Tere is not enough time to consider what would make margins and very short dwell times. Tus, more egregious the timetable better, or to ensure that the errors from previous errors could, perhaps, be eliminated. Te planners would years are not repeated. If the implementation of new tools then also have more support when denying some requests can help to streamline the workfow for timetable planners, from the train operating companies, but this support could the planners could use more of their time considering how to be given in other, less technical ways. Adjusting the penalty make the timetable better and more robust. associated with excluding trains, which is severe and lacking any theoretical basis, would be one such measure that could 6. Discussion easily be implemented on a policy level. It would also be helpful to clarify the roles and respon- Tis paper is focused on how the timetable process and sibilities between the diferent parties. With a separation the decisions of timetable planners contribute to delays and between the infrastructure manager and the train operating nonpunctuality. Timetable properties have been shown to be companies, it is conceivable that both parties would like to important and that changes to the timetable are relatively assume responsibility for ensuring that the timetable has a easy, quick, and inexpensive to make, in comparison to high quality. Te train operating companies might want to be changes to infrastructure, rolling stock, and maintenance responsible, because timetables are core part of their business practices. While we show that timetable planners make errors with signifcant impacts on their customers’ experience. Te that lead to delays, we do not suggest that this is the most case is also strong for the infrastructure manager to assume important contributor to delays overall. responsibility, because it needs to coordinate trafc from It is also important to distinguish between what timetable many diferent companies, as well as engineering works. planners do, which contributes to nonpunctuality, and what However, the results of this paper indicate that neither they do to contribute to punctuality. In the interviews, it party assumes the responsibility. Clarifying the roles by is made clear that planners are really struggling to create clearlystatingwhichpartyisresponsibleforthequalityand a feasible, error-free timetable and that they need more robustness of the timetable would help make this interaction assistance. Before this is addressed, there is no time lef in more constructive. the process for them to increase robustness. Tere is clearly Te results presented in this paper suggest that both a need for tools to help generate and prove feasible timetables researchers and practitioners should focus more on identi- morequickly.Oncetheplannershavethesetoolstheycan fying and improving the relevant feedback loops, to achieve focusonimprovingtherobustness. a higher level of learning among those involved. Single-loop Te importance of new tools directed at helping generate learning is both a technical and organizational issue. Since timetables more quickly and verifying that they are feasible the tools are lacking, planners are hard-pressed merely to and error-free is also evident from the interviews. As these fnish their work. Te time is simply not sufcient to perform new tools and processes are being developed and imple- qualitycontrol.Becausethereisnosystematicreviewofthe mented, it is interesting to consider and discuss the role of quality and outcome, there is no way to begin to improve timetable planners in the future. One viewpoint that is ofen the rules and guidelines, or to create a better timetable. Tis raised is that the planners will go from drawing timetables supports the fndings in Watson [14], Hellstrom¨ [19], and to monitoring the systems that draw them. Te focus will ONTIME [21]. As the tools do not provide enough assistance, shif from creating � timetable, to creating a better timetable. thefocusis,andmustbe,oncreatingatimetablebefore However, the current system already assumes that this is creating a better timetable. Creating a better timetable is what the case. Te results of this paper suggest that planners are we imagine the timetable planners of the future will be tasked already entrusted with a large degree of discretion and are with doing, when more of the work has been automated and solely responsible for creating good, robust timetables. Te the sofware tools provide far more assistance. Rather than problemisthattheycurrentlydonothavethemeanstomeet trying to manually execute all the details, they will choose that responsibility, because the task of creating � timetable is which heuristics, goal-functions, and constraints to apply in too demanding and time-consuming. If provided with better diferent scenarios to achieve the best overall results. tools and support and if the feedback loops were improved, it Tis study focuses on Sweden. In the literature, we have wouldbepossiblefortheplannerstorisetothechallengein seen large similarities with the United Kingdom, which uses the future. As is, they do not appear to be anxious or worried the same tools and has a similarly deregulated market. Te that they will lose their jobs to automation: the impression is new tools that are currently being implemented in Sweden perhaps more of frustration that the tools do not work as they have recently been implemented, by the same supplier, in should. both Norway and Denmark. Experts from the infrastruc- Conficting train paths, track allocation, and constraints ture manager and largest train operator in the Netherlands due to diferent signalling control systems could all be [82] describe similar issues with errors causing delays and handled well through sofware, but the tools currently used infeasibility there. We believe that the planning process is do not do this. As new tools and routines are implemented, largely similar across most European countries, although Journal of Advanced Transportation 13 the level of deregulation and competition between train charges, while the Dutch successively raise the track charges operating companies for capacity may vary, as will the tools for the conficting trains until only one party is willing to run and contexts. the train. Refecting on the methodology, we found it very reward- Based on interviews with timetable planners in Swe- ingtointerviewthepeopleinvolvedintheactualworkof den, we found that errors in timetables commonly lead to planning in a structured way, and we found them to be infeasible timetables, which necessitate intervention by trafc surprisingly frank and forthcoming. We are also pleased with control, and to delays occurring, increasing, and spreading. how much more information can be extracted from the inter- Te errors we identifed are (a) crossing train paths at stations, views when the transcripts are concentrated, categorised, (b) wrong track allocation of trains at stations, especially for and sorted. It is a very time-consuming process, but in our long trains, (c) insufcient dwell and meet times at stations, experience, it makes the initial investment of conducting and and (d) insufcient headways leading to delays spreading. Te transcribing the interviews even more worthwhile. Being able situation is very reminiscent of the one described by Watson to query the transcribed material from diferent angles, rather [14] in the UK, preceding our study by almost a decade: the than being bound by the initial structure provided by the timetable planners really struggle to create a timetable to interview guide, is also very valuable and one of the key begin with, and they do not manage to produce one without methodological takeaways for us. errors. As a fnal note, the fndings in this study have been Reading through the transcripts of the interviews, we reported to both managers and experts at the Swedish have identifed eleven reasons for these errors, and running Transport Administration on several occasions. Tey have through these reasons, we have identifed and discussed three expressed a keen interest in the research, in its results, and themes: (1) missing tools and support, (2) role confict, and (3) in spreading the fndings deeper into the organization and single-loop learning. Te frst theme, that proper tools and the planning process. Te problems with the current set of support are missing, is mostly a technical issue. Te second sofware are well known internally, which is one of the reasons themeisthatofaroleconfictforplanners,whichismostly for the large and currently ongoing project of replacing it. an organizational issue. On the one hand, they must strive to More surprise has been expressed on the theme of role meet the demands of the train operating companies and, on confict, in the perceived lack of internal support for timetable theotherhand,theymustbeunbiasedandcreateatimetable planners to go against the wishes of the train operating that has a very high quality overall. Te third theme is that companies, for the beneft of the overall timetable. Te planners, both individually and as a collective, appear to be managers realise that this is a question of leadership, where stuck in single-loop learning [30], which is both a technical they can and must improve. Te issues around learning and and organizational issue. feedback also generated extensive interest and discussion on theroleoftimetableplannersinthefutureoncethesofware Appendix provides better support and on how to more systematically implement methods and routines for continuous learning Interview Guide and improvement. Note. Te questions are translated from Swedish. Questions 7. Conclusions in italic were optional, intended to follow up. Te process of timetable planning in Sweden has been mostly Introductory Questions stable since the beginning of the new millennium, with only Can you please tell me briefy about yourself and your minor changes in the processes, norms, and tools used. background? Over the next few years this will change, following a very signifcant investment in developing new tools and routines. Can you please tell me briefy about how you work Te large shif that this is brought about presents a rare with the upcoming annual timetable? opportunity to study a major transition in timetable planning. Any large investment like this should be evaluated seriously, Questions on Feedback toassesstheefectsanddrawlessonsforthefuture.Tispaper helpsestablishthebaselineforsuchanevaluation.Finally, How long have you worked here? a growing research interest into robust timetables, for more punctual train trafc, also justifes studying the timetable Can you please tell me a little about what you have planners and process. learned since you started working here? Studying the Network Statements of European railways, Can you please talk briefy about how the work has we found that the processes are largely similar; the dif- changes since then? ferenceismainlyinhowtoprioritizewhentwoormore train operating companies have requested the same train Approximately how is the work divided between slot. Whereas Sweden uses social cost estimates based on thoseofyouwhocreatetheannualtimetable? the theory of welfare economics, the British use a more Can you please give some examples of how you qualitative assessment based on eleven predefned criteria; exchange knowledge and experiences, between col- Germany selects the train that would pay the highest track leagues? 14 Journal of Advanced Transportation

Can you please explain briefy how those of you who What diference does the train type make? create the annual timetable evaluate it, afer the fact? What diference does the volume of passengers make? Can you please give some examples of how you tie What diference do the punctuality statistics make? back to earlier timetables, coming into the annual timetable for 2017? Can you please give an example of when margins are needed in a timetable? Onwhatlevelofdetailisthisfeedbackdone? Can you please briefy state what the punctuality looks Ifyoubelievetheyarerequired,wheredoyouallocate like on the lines, trains and stations you plan for? the margins? Can you please give examples of how you decide on Questions on Guidelines thesizeofmargins? Can you please talk a little about which factors come Can you please tell me about the guidelines and into play here? support you have available when you create an annual timetable? What diference does the location make? Can you please give an example of when you turned What diference does the train type make? to the guidelines? What diference does the volume of passengers make? Can you please give an example of when you used What diference do the punctuality statistics make? “Olsson’s minute”? [A running time supplement of one minute directly following a scheduled stop, named afer a timetable planner working several decades ago] Questions on the Trade-Ofs in Timetable Planning Can you please give examples of where you allocate Can you please talk a little about the trade-ofs you “node supplements”? make in your work, creating the annual timetable? How ofen do you have to allocate supplements for Can you please give some examples of difcult trade- trains using secondary tracks? ofs you have made recently? Can you please talk a little about what tools you use Trade-ofs between short and reliable journey times? when creating a timetable? Te trade-of between margins at stations or on the line? Can you please give an example of how close you stick to the timetables applied for by the train operating Te trade-of between concentrated and distributed companies? margins? How specifc are the train operating companies’ appli- Can you please give an example of where you willingly cationswithregardstodwelltimes? created a delay for a train? How binding are the train operating companies’ appli- cationswithregardstodwelltimes? Disclosure How specifc are the train operating companies’ appli- Tis is an extended version of a paper presented at the cations with regards to run times? 20th EURO Working Group on Transportation Meeting How binding are the train operating companies’ appli- (EWGT2017) in Budapest, on 4–6 September 2017 [83]. cations with regards to run times? How specifc are the train operating companies’ appli- Conflicts of Interest cations with regards to arrival times? Te authors declare that there are no conficts of interest How binding are the train operating companies’ appli- regarding the publication of this paper. cations with regards to arrival times? Acknowledgments Questions on Rules of Tumb, Heuristics Te authors would like to thank the Swedish Transport What do you typically use as a dwell time for passen- Administration for providing funding for the research, as well ger trains? as providing access to the interviewees and to the governing Can you please give examples of factors that con- documentation. Finally, the authors also thank the project tribute to your allocating a longer or shorter dwell reference group for their continuous assistance, guidance, time? and feedback. Can you please give examples of when you allocate a longer dwell time? References Can you please give examples of when you allocate a [1] Transport Analysis, Rail Trafc 2015,TransportAnalysis,Stock- shorter dwell time? holm, Sweden, 2016. Journal of Advanced Transportation 15

[2] M. Grimm, Jarnv¨ agens¨ kapacitetsutnyttjande 2016,Swedish [21] ONTIME, Benchmark analysis, test and integration of timetable Transport Administration, Borlange,¨ Sweden, 2017. tools, Delf University of Technology, Delf, Netherlands, 2014. [3]Y.Xia,J.N.VanOmmeren,P.Rietveld,andW.Verhagen, [22] Swedish Transport Administration, Improved Capacity,2017, “Railway infrastructure disturbances and train operator perfor- Available at https://goo.gl/mVkSW6. mance: Te role of weather,” Transportation Research Part D: [23] Swedish Transport Administration, Network Statement 2017, Transport and Environment,vol.18,no.1,pp.97–102,2013. Swedish Transport Administration, Borlange,¨ Sweden. [4] W.Brazil, A. White, M. Nogal, B. Caulfeld, A. O’Connor, and C. [24] I. A. Hansen and J. Pachl, Railway Timetabling & Operations, Morton, “Weather and rail delays: Analysis of metropolitan rail Eurailpress, Hamburg, Germany, 2nd edition, 2014. in Dublin,” Journal of Transport Geography,vol.59,pp.69–76, 2017. [25] F. Avelino, M. te Brommelstroet,¨ and G. Hulster, “Te Politics of Timetable Planning: Comparing the Dutch to the Swiss,” in [5] Y. Qin, X. Ma, and S. Jiang, “A Stochastic Process Approach for Proceedings of the Colloquium Vervoersplanologisch Speurwerk Modeling Arrival Delay in Train Operations,” in Proceedings of 2006, pp. 23-24, Amsterdam, Netherlands, November 2006. the Transportation Research Board 96th Annual Meeting,2017. [6] M. F. Gorman, “Statistical estimation of railroad congestion [26] D. Wood and S. Robertson, “Planning tomorrow’s railway - delay,” Transportation Research Part E: Logistics and Transporta- role of technology in infrastructure and timetable options tion Review,vol.45,no.3,pp.446–456,2009. evaluation,” in Capacity Management Systems, Computers in Railways VIII,J.Allan,R.J.Hill,C.A.Brebbiaetal.,Eds.,2002. [7]N.O.E.OlssonandH.Haugland,“Infuencingfactorsontrain punctuality - Results from some Norwegian studies,” Transport [27] M. S. Feldman and B. T. Pentland, “Reconceptualizing orga- Policy,vol.11,no.4,pp.387–397,2004. nizational routines as a source of fexibility and change,” [8] N. Olsson, A. H. Halse, P. M. Hegglund et al., Punktlighet i Administrative Science Quarterly,vol.48,no.1,pp.94–118,2003. jernbanen - hvert sekund teller, SINTEF Akademisk Vorlag, [28] D. Kahneman, P. Slovic, and A. Tversky, Judgment Under Oslo, Norway, 2015. Uncertainty: Heuristics and Biases, Cambridge University Press, [9]M.Veiseth,N.Olsson,andI.A.F.Saetermo,“Infrastructures New York, UK, 1982. infuence on rail punctuality,” WIT Transactions on Te Built [29] M. Bazerman, Judgement in Managerial Decision Making,Wiley, Environment,vol.96,pp.481–490,2007. New York, NY, USA, 4th edition, 1998. [10] J. Parbo, O. A. Nielsen, and C. G. Prato, “Passenger Perspec- [30] C. Argyris and D. A. Schon,¨ Organizational learning II: theory, tives in Railway Timetabling: A Literature Review,” Transport method, and practice, Addison-Wesley, Reading, Mass, USA, Reviews, vol. 36, no. 4, pp. 500–526, 2016. 1996. [11] P. B. L. Wiggenraad, Alighting and boarding times of passen- [31] M. Loock and G. Hinnen, “Heuristics in organizations: A review gers at Dutch railway stations,TRAILResearchSchool,Delf, and a research agenda,” Journal of Business Research,vol.68,no. Netherlands, 2001. 9, pp. 2027–2036, 2015. [12] L. Nie and I. A. Hansen, “System analysis of train operations [32] C. B. Bingham and K. M. Eisenhardt, “Rational heuristics: Te and track occupancy at railway stations,” European Journal of ’simple rules’ that strategists learn from process experience,” Transport and Infrastructure Research, vol. 1, pp. 31–54, 2005. Strategic Management Journal,vol.32,no.13,pp.1437–1464, [13]C.W.Palmqvist,N.O.E.Olsson,andL.Hiselius,“Delaysfor 2011. passenger trains on a regional railway line in Southern Sweden,” [33] G. Kirkebøen, “Decision Behaviour – Improving Expert Judge- International Journal of Transport Development and Integration, ment,”in Making Essential Choices with Scant Information,T.M. vol. 1, no. 3, pp. 421–431, 2017. Williams, K. Samset, and K. Sunnevag,˚ Eds., Chippenham and [14] R. Watson, Train Planning in A Fragmented Railway - A Eastbourne:CPIAntonyRowe,2009. British Perspective, [Doctoral, thesis], Loughborough University, [34] V.R. Vuchic, Urban Transit Operations, Planning And Economic, Loughborough,England,November2008. John Wiley & Sons, New Jersey, NJ, USA, 2005. [15] A. Kauppi, J. Wikstrom,B.Sandblad,andA.W.Andersson,¨ [35] V.A. Profllidis, Railway Management and Engineering,Ashgate, “Future train trafc control: Control by re-planning,” Cognition, Surrey,UK,4thedition,2014. Technology & Work,vol.8,no.1,pp.50–56,2006. [16] S. Tschirner, B. Sandblad, and A. W. Andersson, “Solutions to [36] N. Harris, H. Haugland, N. Olsson, and M. Veiseth, An the problem of inconsistent plans in railway trafc operation,” Introduction to Railway Operations Planning, A & N Harris, Journal of Rail Transport Planning and Management,vol.4,no. London, UK, 2016. 4, pp. 87–97, 2014. [37] A. Ceder, “Public Transport Scheduling,” in Handbook of [17] B. Sandblad, A. W. Andersson, and S. Tschirner, “Information Transport Systems and Trafc Control,J.K.ButtonandD.A. Systems for Cooperation in Operational Train Trafc Control,” Hensher, Eds., pp. 539–558, Elsevier Science ltd., Oxford, UK, Procedia Manufacturing, vol. 3, pp. 2882–2888, 2015. 1st edition, 2001. [18]E.M.Roth,N.Malsch,andJ.Multer,“UnderstandingHow [38] A. Caprara, L. Kroon, M. Monaci et al., “Chapter 3 Passenger Train Dispatchers Manage and Control Trains: Results of a Railway Optimization,” in Handbooks in Operations Research Cognitive Task Analysis,” DOT-VNTSC-FRA-98-3, National and Management Science, C. Barnhart and G. Laporte, Eds., vol. Technical Information Service, Springfeld, VA, USA, 2001. 14, pp. 129–187, 2007. [19] P. Hellstrom,¨ Problems in the Integration of Timetabling And [39] M. Veiseth, P. M. Hegglund, I. Wien, N. O. E. Olsson, and Train Trafc Control, Uppsala University, Uppsala, Sweden, Ø. Stokland, “Development of a punctuality improvement 2013. method,” TQM Journal, vol. 23, no. 3, pp. 268–283, 2011. [20] R. Watson, “Prospects for computer aided railway scheduling: [40] M. Sama,` A. Ariano, F. Corman, and D. Pacciarelli, “A variable Perspectives from users and parallels from mass transit,” Trans- neighbourhood search for fast train scheduling and routing portation Planning and Technology,vol.23,no.4,pp.303–321, during disturbed railway situations,” Computers & Operations 2000. Research,vol.78,pp.480–499,2017. 16 Journal of Advanced Transportation

[41] S. Tschirner, Te GMOC Model: Supporting Development of approach in Sweden,”SICS Technical Report 2012:11, ISSN 1100- Systems for Human Control [Ph.D. dissertation], Uppsala Uni- 3154, Swedish Institute of Computer Science, Kista, Sweden, versity, Uppsala, Sweden, 2015. 2012. [42] A. Gunnarsson, “Swedish Railway Policy in the EU Environ- [63] S. Gestrelius, M. Bohlin, and M. Aronsson, On the uniqueness ment – Railway organization and fnancing,” in Proceedings of of operation days and delivery commitment generation for train the Workshop in Stockholm, Ministry of Enterprise, Energy and timetables,presentedatRailTokyo2015. Communications Sweden, September 2013. [64] M. Aronsson, M. Forsgren, and S. Gestrelius, “Uncovered [43] Network Rail, Te Network Code, Network Rail, London, UK, capacity in Incremental Allocation,” SICS Technical Report 2017. T2017:01, ISSN 1100-3154, Swedish Institute of Computer Sci- [44] D. B. Netz, Network Statement 2018, Frankfurt am Main: DB ence, Kista, Sweden, 2017. Netz, 2017. [65]V.Svedberg,M.Aronsson,andM.Joborn,“Railway [45] ProRail, Network Statement 2017,ProRail,Utrecht,Nether- Timetabling Based on Cost-Beneft Analysis,” Transportation lands, 2016. Research Procedia,vol.22,pp.345–354,2017. [46] Swedish Transport Administration, Analysmetod och [66] E. Andersson, A. Peterson, and J. Tornquist¨ Krasemann, samhallsekonomiska¨ kalkylvarden¨ for¨ transportsektorn: ASEK “Robustness in Swedish Railway Trafc Timetables,”in Proceed- 6.0,Trafkverket,Borlange,¨ Sweden. ingsoftheRailRome2011:BookofAbstracts4thInternational [47] P.Strom,¨ “Interviewed by the corresponding author on Septem- Seminar on Railway Operations Modelling and Analysis, 2011. ber 15, 2017”. [67] A. Peterson, “Towards a robust trafc timetable for the swedish [48] R. M. P. Goverde and I. A. Hansen, “Performance indicators for southern mainline,”in Proceedings of the Computers in Railways railway timetables,”in Proceedings of the 2013 IEEE International XIII: Computer System Design and Operation in the Railway and Conference on Intelligent Rail Transportation, IEEE ICIRT 2013, Other Transit Systems,pp.473–484,2012. pp.301–306,September2013. [68] F. Khoshniyat and A. Peterson, “Robustness Improvements [49] C. W.Palmqvist, N. O. E. Olsson, and L. Hiselius, “AnEmpirical in a Train Timetable with Travel Time Dependent Minimum Study of Timetable Strategies and Teir Efects on Punctuality,” Headways,” in Proceedings of the 6th International Conference on in Proceedings of the the 7th International Conference on Railway Railway Operations Modelling and Analysis - RailTokyo,Tokyo, Operations Modelling and Analysis (Rail Lille 2017),2017. Japan, March, 2015. [50] Swedish Transport Administration, Noder i jarnv¨ agssystemet¨ , [69] S. Gestrelius, M. Aronsson, M. Forsgren, and H. Dahlberg, Trafkverket, Borlange,¨ Sweden, 2015. “On the delivery robustness of train timetables with respect to [51] Swedish Transport Administration, Riktlinjer tathet¨ mellan tag˚ , production replanning possibilities,” in Proceedings of the 2nd Trafkverket, Malmo,¨ Sweden, 2016. International Conference on Road and Rail Infrastructure,2012. [52] Swedish Rail Administration, Riktlinjer for¨ tidtabellskonstruk- [70] F. Khoshniyat, Optimization-Based Methods for Revising Train tion for¨ tag˚ pa˚ statens sparanl˚ aggningar¨ , Swedish Rail Adminis- Timetables with Focus on Robustness, Linkoping¨ University tration, Borlange,¨ Sweden, 2000. Electronic Press, 2016. [53] M. J. C. M. Vromans, Reliability of Railway Syste [Ph.D ERIM], [71] E. V. Andersson, A. Peterson, and J. Tornquist¨ Krasemann, Research in Management 62, 2005. “Quantifying railway timetable robustness in critical points,” [54] R. Rudolph, “Allowances and Margins in Railway Scheduling,” Journal of Rail Transport Planning and Management,vol.3,no. in Proceedings of the World Congress on Railway Research,pp. 3, pp. 95–110, 2013. 230–238, 2003. [72] J. Warg and M. Bohlin, “Te use of railway simulation as an [55] L. Haldeman, Automatische Analyse von IST-Fahrplanen [M.S. input to economic assessment of timetables,” Journal of Rail thesis], Quoted in Vromans (2005),ETHZurich,¨ Zurich,¨ Transport Planning and Management,vol.6,no.3,pp.255–270, Switzerland, 2003. 2016. [56] UIC, UIC Code 451-1 Timetable Recovery margins to guarantee [73] E. V. Andersson, A. Peterson, and J. Tornquist¨ Krasemann, timekeeping - Recovery margins, International Union of Rail- “Reduced railway trafc delays using a MILP approach to ways., Paris, France, 2000. increase robustness in Critical Points,” JournalofRailTransport [57] J. R. Hammerton, “Delivering solutions for a changing business Planning & Management,no.5,pp.110–127,2015. world,” Transactions on the Built Environment,vol.18,1996. [74] E. Solinen, G. Nicholson, and A. Peterson, “A Microscopic [58] H. Sipila¨ and A. Lindfeldt, Interviewed by the corresponding Evaluation of Robustness in Critical Points,” in Proceedings author on April 9, 2015. of the 7th International Conference on Railway Operations [59] M. Backman and J. Mattisson, Interviewed by the correspond- Modelling and Analysis (RailLille 2017),pp.83–103,2017. ing author on May 11, 2015. [75]M.Forsgren,M.Aronsson,P.Kreuger,andH.Dahlberg,Te [60] M. Aronsson, M. Forsgren, and S. Gestrelius, “Te Road to Maraca: a tool for minimizing resource conficts in a non-periodic Incremental Allocation & Incremental Planning,” SICS Tech- railway timetable. Presented at RailRome, 2011. nical Report 2012:09, ISSN 1100-3154, Swedish Institute of [76] M. Forsgren, M. Aronsson, S. Gestrelius, and H. Dahlberg, Computer Science, Kista, Sweden, 2012. “Using timetabling optimization prototype tools in new ways [61] P.Kreuger, M. Aronsson, J. Ekman, and T. Franzen,´ “Rail Trafc to support decision making,” WIT Transactions on the Built Requirements Engineering,” SICS Technical Report 2005:12, Environment,vol.127,2013. ISSN 1100-3154, Swedish Institute of Computer Science, Kista, [77] S.Gestrelius,M.Aronsson,M.Joborn,andM.Bohlin,“Towards Sweden, 2005. a comprehensive model for track allocation and roll-time [62] M. Forsgren, M. Aronsson, S. Gestrelius, and H. Dahlberg, scheduling at marshalling yards,” Journal of Rail Transport “Opportunities and challenges with new railway planning Planning Management,vol.7,no.3,pp.157–170,2017. Journal of Advanced Transportation 17

[78] J. T. Krasemann, “Design of an efective algorithm for fast response to the re-scheduling of railway trafc during distur- bances,” Transportation Research Part C: Emerging Technologies, vol.20,no.1,pp.62–78,2012. [79] J. T. Krasemann, “Computational decision-support for railway trafc management and associated confguration challenges: An experimental study,” Journal of Rail Transport Planning and Management,vol.5,no.3,pp.95–109,2015. [80] S. Kvale, Den kvaliativa forskningsintervjun (Te Qualitative Research Interview, in Swedish), Studentlitteratur, Lund, Swe- den, 1997. [81] M. Hammersley and P. Atkinson, Ethnography: Principles in Practice, Routledge, 3rd edition, 2007. [82] H. Olink and G. Scheepmaker, RailSys use by RailwayLab. Presented at RailSys Swedish and Scandinavian User Group 2017, at the Royal Institute of Technology in Stockholm, Sweden,2017. [83] C. W. Palmqvist, N. O. E. Olsson, L. Winslott Hiselius, and L. Winslott, “Punctuality Problems from the Perspective of Timetable Planners in Sweden,” in Proceedings of the 20th EURO Working Group on Transportation Meeting, EWGT 2017, Budapest,Hungary,September2017.

Paper 5

Dwell Time Delays for Commuter Trains in Stockholm and Tokyo

Carl-William Palmqvist a,1, Norio Tomii b, Yasufumi Ochiaib,c a Department of Technology and Science, Lund University Box 118, 221 00 Lund, Sweden 1 E-mail: [email protected] b Department of Computer Science, Chiba Institute of Technology 2-17-1 Tsudanuma, Narashino, Chiba 275-0015, Japan c Odakyu Electric Railway Co., Ltd. 1-8-3, Nishi-shinjuku Shinjuku-ku, Tokyo 160-8309, Japan

Abstract The paper analyses dwell time delays for commuter trains in Stockholm and Tokyo. In both cities, small dwell time delays of at most five minutes make up around 90% of the total delays. Therefore, it is valuable to understand and deal with these disturbances. To this end, we use high resolution data on dwell times and passenger counts from both countries over the last several years. We find that these data alone can explain about 40% of the variation in dwell time delays and produce simple models which can be used in practice to assign more appropriate dwell times. A change of 15 passengers per car, in Tokyo translates to a delay of about one second. For every 10 remaining passengers per door in Stockholm, the delay increases by about one second, and one boarding or alighting passenger per door corresponds to about 0.4 seconds of delay. We also find that trains in Tokyo are much more congested than in Sweden, and that at most stations in the latter, the exchange of passengers is modest. In both cities, the range of dwell time delays is quite narrow, with between 40 and 50 seconds separating the 5th and 95th percentiles. This indicates further that most delays, by far, are very small, and that even small adjustments to dwell times can make a big difference in the overall picture. To facilitate such improvements, key stakeholders and practitioners are closely involved with the research.

Keywords Commuter train, Dwell time, Delay, Timetable, Station, Passenger, Congestion, Urban transport, Public transport, Rail

1 1 Introduction A problem across the world is that trains are often delayed. In major cities, where the flow of trains and passengers are large, even minor delays cause considerable inconvenience for both commuters and train operating companies. This paper studies commuter trains in Stockholm and Tokyo, two cities where minor delays are ubiquitous and larger ones are not uncommon. In Stockholm, only 56 % of commuters report being satisfied with the punctuality of commuter trains, which is the lowest across all public transport options (Trafikförvaltningen Stockholms Läns Landsting, 2018). This is the single most important influencing factor for their overall satisfaction with the transportation mode. Thus, if the delays could be reduced, there is a clear potential for increasing the attractiveness and ridership of trains in Stockholm. In the Tokyo-area, approximately 38 million train journeys are made every day (Tomii, 2013). During the rush hours, it is common for a single train to carry about 2 000 passengers and for trains to depart every two minutes on each line. Thus, even the slightest delay quickly spreads between trains and affects large numbers of passengers. Because of this, trains in Tokyo are very often delayed by several minutes during the rush hours, and this affects millions of people and causes considerable costs to society. In Stockholm and Tokyo, minor delays of at most five minutes make up 96 and 97 % of delay hours respectively. In both cities, the delays mostly occur at stations in the form of dwell time delays. In Stockholm, 91 % of the total delay time is generated at stations, while the corresponding figure in Tokyo is 88 %. Thus, small dwell time delays make up a clear majority of all delays. The objective of this paper is to study the dwell time delays that occur in both railway systems. Different timetabling policies are discussed, and detailed passenger count data is used to help explain the delays in both cities. The ambition is that the study will lead to better timetable planning, which will in turn lead to a decrease in delays and improved punctuality in both cities. This will improve the daily commutes of millions of people and make public transportation more attractive.

2 Background In an overview of timetabling research, Hansen (2009) concludes that one of the key issues for high quality timetables is using realistic dwell times, and that this is often not the case, neither in practice nor in modelling. Bender, Büttner & Krumke (2013) also state that the time required for passenger boarding and alighting at stations is a critical element of overall train service performance. There is considerable empirical evidence for this across the world, of which we can only cite a few pieces here. In New Zealand, Ceder & Hassold (2015) found that one of the main causes for delays was heavy passenger load, which increased dwell times. In Norway, Olsson & Haugland (2004) studied punctuality and found that the boarding and alighting of passengers was the single most important influencing factor in congested areas. Considering stations in the Oslo area, Harris, Mjøsund & Haugland (2013) found that the delays were often small in nature, poorly recorded and not well understood. In the Netherlands, Wiggenraad (2001) studied seven stations in detail, finding that dwell times are longer than scheduled, that the scheduled dwell times were the same at peak and off-peak, and that passengers concentrated around platform access points. Focusing on the

2 station area of the Hague, Nie & Hansen (2005) also concluded that dwell times at platforms are systematically extended, partially due to the behaviour of the train personnel. Studying punctuality and delays at stations, Yuan & Hansen (2002) found that the mean excess dwell time was around 30 seconds, and that this sometimes depended on the train having arrived early and not being permitted to depart before the schedule, but also due to a lack of discipline with train drivers and conductors. In Sweden, previous research (Palmqvist, Olsson & Hiselius, 2017) shows that most delays happen during scheduled stops. Analyses of such stops reveal a wide variation in dwell times (Heinz, 2000), while interviews with timetable planners (Palmqvist, Olsson & Winslott Hiselius 2018) indicate that dwell times in Sweden are not scheduled to account for the number of passengers, rush hour traffic, and the like. In Japan, Kunimatsu, Hirai & Tomii (2009) describe the mechanism of how a dwell time is exceeded because there are many passengers. This delay increases the distance between trains and allows more people to accumulate at the next platform, so that both the congestion and delay increase further and continue to be amplified, in what they call a “snowball effect” – similar to bus bunching. They then create a simulation model to evaluate timetables, with input from smart card data, which takes these interactions into account. There is also a literature focusing more explicitly on the boarding and alighting of passengers, as well as their behaviour on platforms and in stations. Seriani, Fujiyama & Holloway (2017) have done detailed observations at metro stations in the United Kingdom to examine how queues and lanes form when boarding trains, the densities of passengers and distances between them, in the semi-circular area outside the doors. Several publications (such as Kamizuru, Noguchi & Tomii 2015; Qi, Baoming & Dewei 2008; Seriani & Fernandez 2015; Baee et al. 2012) deal with simulations of passengers moving between the train and platform to study various kinds of passenger management strategies. Fujiyama & Cao (2016) instead used smart-card data to study how long passengers spend at railway terminals, beyond simply walking from the access point to the train, a factor which influences and increases the level of congestion at stations. Finally, there is a literature on modelling dwell times. Buechmuller, Weidmann & Nash (2008) modelled dwell times in Switzerland, breaking them down into five components, and used over three million observations for calibration. D’Acierno et al. (2017) modelled how they depend on the congestion and flows of passengers. Most use track occupancy data, like Pedersen, Nygreen & Lindfeldt (2019) who studied how dwell times vary over time and running direction in Norway, Longo & Medeossi (2008) who applied their dwell time models in micro-simulation, Li, Daamen & Goverde (2015) who focused on short intermediate stops in the Netherlands, and Kecman & Goverde (2015) who were interested in real time prediction. Others, like Li, Yin & He (2018) use a combination of track occupancy data and manual observations, and this is what we do in this paper.

3 3 Method

3.1 Data Four datasets are used and combined in this paper, as outlined below. (1) Train movements in Stockholm. This data is derived from the signalling system and specifies the arrival, departure and passage times at stations. The times are truncated as to only contain minutes and hours, not seconds. Data from seven years, 2011-2017, is used, for a total of 16.6 million train movements. (2) Automatic passenger counts in Stockholm. About one eighth of commuter train cars in Stockholm are equipped with automatic passenger counters, which detect the boarding and alighting of passengers at stations. These cars are circulated among the different branches of the network, to provide data on most trains and stations from time to time. Along with the counts of passengers, the equipment also provides both run and dwell times with a precision of one second. We have 5.5 million such observations spanning the years of 2013- 2017. (3) Train movements in Tokyo. These are also generated from the signalling system from a railway company and specifies times at stations, but with a precision of seconds. The data spans six years, from 2013 to 2018, and contains 63.7 million train movements. (4) Manual passenger counts in Tokyo. At least twice every year manual passenger counts are conducted for trains at stations, to estimate the level of on-train congestion. While this is only a spot check, it is the result of considerable effort and provides valuable insights into how the number of passengers varies between trains and stations. From about 6:30 to 9:00 in the morning, staff at a handful of stations observe all trains that stop or pass by, making note of exact arrival and departure times, the train number and the number of cars, as well as the estimated congestion rate on the train. Approximately 50 trains are observed in this manner, per station, and over the years 2013-2018, over 4000 observations were made.

3.2 Combining the Data As the data from the two countries varied slightly in structure, different methods had to be used to combine them. In the Japanese case, we mainly make us of the manual observations, as these are generally of a higher quality and the more relevant ones to use, and the number of observations is sufficient for our purposes. In the timetable dwell times are set intended to correspond to the times that are measured using the manual methods: from the train stopping to starting its movement at the platform. The automatic system instead uses the occupation of track circuits, which are located further out on the respective sides, and on average registers times that are about 20 seconds longer. However, a problem with manual observations is that mistakes can sometimes be made, and we have used the larger dataset of automatic registrations to be able to detect these errors. Because of how the measurements are made, the manual ones should never be longer than the automatic ones. In such cases, we have simply excluded the observation as faulty. The connection between the two datasets was quite straightforward, as they both contain station names, dates and train numbers that match up very well and, in the end, we produced 2 684 good matches. In the future, it may be possible to do some extrapolation from these onto other train numbers or dates, but for now we have stuck with the observations themselves. For the Swedish case, the process was not so straightforward. The on-board system did

4 not contain train numbers matching those used in the dataset from the signalling system. As a first step, however, observations were merged from being door by door, to a train level, with averages across the different doors. The number of doors per train with the measuring equipment varied from one to eight, with four being the most common. Then, these trains were matched to train numbers in the larger dataset, following some assumptions: that the dates and stations were the same, that the origins and destinations match up, and that the total deviation between the arrival and departure times between the two sets was less than two minutes. We know that the coding of station names, as well as origins and destinations, sometimes differs between the two sets, but because both datasets were quite large the number of matches is quite good, even without considering such cases. Finally, we excluded any cases where there was more than one possible match, focusing on the clear and unambiguous ones. For the Swedish data, we primarily used the connection between the two datasets to filter out any observations with unusually long scheduled dwell times. The on-board system does not state the scheduled times, only the observed ones, while the infrastructure-based system only retains the minutes, not the seconds. However, we know from timetable planners that the scheduled dwell time for commuter trains in Sweden is, in most cases, 42 seconds, with a few exceptions. Some peripheral stations instead have 30 seconds, Stockholm Central Station around two minutes, and a few other cases where trainsets turn around or are either combined or separated also have longer dwell times. To get rid of the shorter times, we asked planners which stations had shorter times, and excluded observations from these. In order to get rid of the stops with longer scheduled times, we used a threshold of one minute in the larger off-board dataset. Finally, to harmonize with the Japanese data which was only collected between six and ten in the morning, on Mondays and Tuesdays in April, May and November, we filtered out any observations outside of these time intervals. This greatly reduced the amount of observations, from about 180 000 to around 3 300, but this later number is more comparable to the number of Japanese observations. The regression results vary slightly between how this filtering is done, with coefficient estimates and the R2 both shifting by a few tenths of a percentage point, but not in any major ways.

3.3 Analysing the Data In the analysis we are primarily concerned with using the information on passenger volumes to explain the variation in dwell time delays. In the Japanese data, this exists in the form of an estimated congestion rate, which is the ratio between the number of people on board a train, to the number of seats and handles for standing passengers to hold. The train cars in Tokyo have four doors and are 20 meters long, each with about 58 seats and 92 handles, and are usually combined in sets of eight or ten cars. In Stockholm, the commuter trains are composed of either six or twelve cars, each of which is about 18 m long with two doors and room for 62 sitting and 71 standing passengers. For these trains we have explicit counts of the number of boarding, alighting, and passing passengers from sensors at between one and eight doors per train. The total capacity for a full-length train is 1600 people, averaging 66.7 per door, compared to 37.5 in Tokyo. In both cities, we also have information of the incoming arrival delay, as well as dummy variables for the different stations, hours, months, and weekdays. The effects are studied using regression analysis with the software R. A number of regressions have been performed, using different combinations of the variables, their squares and interactions, and both with and without intercepts. With the regressions we essentially

5 have two purposes: (1) to estimate the degree to which we can explain the variation in the dwell time delays, using passenger data, and (2) to estimate the impact of passenger volumes on the dwell time delays, in a comprehensible way. The first point relates to the coefficient of determination, the R2 or adjusted R2, while the second has to do with coefficient estimates. To fulfil the first purpose, it is relevant to include both squared variables, as well as variable interactions, as these may well, in fact, have impacts on the delays and add to the share which we can explain. The second purpose, however, is better served by keeping the models simple and straightforward, sacrificing explanatory power for interpretability. In this paper, we will only present the coefficient estimates for the second set of models, but also mention the degree to which more complicated models can add to the picture.

4 Results and Analysis The following sections describe the two cases and presents the results of some regression analyses on the data. The first sections describe how dwell times are scheduled in both cities, what the delays are like, and how the passenger volumes differ. The fourth and fifth section describe how much of the delays can be explained by the passenger volumes and estimates that can be relevant for timetable planning.

Table 1. Percentiles describing the arrival and dwell time delay distributions across the two cities.

Percentile 5% 25% 50% 75% 95% Arrival Delay (s) Stockholm -300 0 60 60 120 Tokyo -20 3 25 63 158 Dwell Time Delay (s) Stockholm -8 -1 6 14 34 Tokyo -21 -4 5 15 30

4.1 Scheduled Dwell Times The railway companies in the two cities have different policies on scheduling dwell times. The default in Stockholm is to use 42 seconds regardless of day, time or station, with a few exceptions. Stockholm Central Station is one such exception, when the dwell times are extended, as are stops where trains are coupled, separated or turned around. At a few small stations in the periphery times are either lowered to 30 seconds or extended to handle meetings on single track sections. In most cases, however, the scheduled dwell time is 42 seconds, and these are the ones that we use in this study. In Tokyo, the dwell times are adjusted to a much greater extent. The range is quite similar to that found in Stockholm, but the variation is greater: the 5th, 25th, 50th, 75th and 95th percentile values are 40, 45, 50, 60 and 115 seconds, respectively. Almost across the board, the dwell times in Tokyo also longer than the 42 seconds used in Stockholm, and adjustments are made in five-second intervals, across different train numbers, stations, and hours.

6 4.2 Arrival and Dwell Time Delays The delay scenarios are quite similar in both cities. Arrival delays are not measured as precisely in Stockholm as in Tokyo, which makes it difficult to compare them in detail. Overall, however, in both scenarios a small share of trains arrives early, while the median is for delays of 60 seconds in Stockholm and 25 seconds in Tokyo. Arrivals are rarely delayed by more than two or three minutes, with the 95th percentile in Stockholm being 120 seconds, and in Tokyo 158 seconds. In both cases, the signalling data contains a larger share of trains arriving early than the count and combined data. Dwell time delays are remarkably similar across the two combined datasets. In Stockholm the median delay is 6 seconds, compared to 5 in Tokyo, and the 95th percentile values are 34 and 30 seconds respectively. About the same share of trains make up time during the stops, in the two cities, but in Tokyo the ones that do make up slightly more time, with the fifth percentile being 21 seconds early there, compared to 8 seconds in Stockholm. The range of dwell time delays, from the 5th to the 95th percentiles, are quite small and comparable: 42 seconds in Stockholm and 51 seconds in Tokyo. As these are commuter trains, however, stops are frequent, and the seconds add up.

4.3 Passenger Volumes Perhaps the biggest difference between the two cases is the number of passengers. In Tokyo, the trains that we study are often very congested, as seen in Table 2. The congestion rate is defined so that 100 represents all 58 seats being taken, as well as all of the 92 standing positions with handles. In the data we study, the 25th percentile has a congestion rate of 105, the median is 130, and the 95th percentile 180. The lowest we see, in the 95th percentile, is 60, corresponding to 90 passengers and a case where all seats are taken, and about an equal amount of people are standing.

Table 2. Percentiles describing the passenger volumes in either city. The figures from Stockholm are passengers per door, with either 12 or 24 doors per train. Congestion rate in Stockholm is defined per door, so that a value of 100 entails 67 passengers (31 sitting, 36 standing). In Tokyo it is normalized per car so that 100 entails 150 passengers (48 sitting, 92 standing), split across four doors.

5% 25% 50% 75% 95% Stockholm Alightings/door 0 0 2 7 21 Boardings/door 0 1 4 9 22 Passing Travellers/door 3 12 25 43 82 Congestion Rate 4 19 43 78 156 Tokyo Congestion Rate 60 105 130 150 180

7 In Stockholm the number of passengers is less extreme. Despite the observations also being made during the morning rush hours, the figures suggest that the median scenario is two people alighting, four boarding, and 25 remaining, per door, corresponding to a congestion rate of 43. When the 95th percentile values are added together, there are 104 passengers on board per door, or 208 per car, compared to 270 in Tokyo. The peak congestion rate of 156 in Stockholm is also lower, but comparable to, the peak of 180 in Tokyo. The relatively low numbers of boarding and alighting passengers depend in part on the fact that we excluded stations with longer dwell times, for reasons described in section 3.2, and thus omit both the central station and the origins of many trains, where the number of passengers is larger. Stockholm also has a more monocentric structure than Tokyo, and generally lower levels of population density and congestion. As the measurements are made automatically in Stockholm, and manually in Tokyo, there may be issues of calibration or measurement error. The engineers and planners responsible in Stockholm say that the measurements are not very accurate for high flows, and systematically gives too low values in such cases. This implies that the figures from that city are underestimated, rather than overestimated, although it is unclear to what degree. As we see in Table 2, the flows are not large at most of the stops we include in this analysis, so this is unlikely to be an issue here, and more likely to cause problems in Stockholm Central Station, which we have omitted from the analysis.

4.4 The Contribution of Passengers to Delays To estimate how much of the dwell time delays can be explained by passenger data, we use linear regression models including both squared and interaction terms. These models are impractical to use and interpret, but their adjusted R2 values provide a convenient estimate of how much information is contained in the data. We also include terms for the arrival delay and, in the case of Tokyo, scheduled dwell time, as these can be expected to interact somewhat with the passenger flows. Dummy variables for stations, weekdays, months, etc. are not used, because they do not help us estimate the influence of the passengers. Summaries of the two respective models are presented in Table 3. The data in Stockholm could explain slightly more of the variation than that in Tokyo, which is to be expected as it also contained the number of people getting on and off, rather than only counting those on- board. In both cases, about 40% of the variation is explained using the full models, and about half as much by the simple models. While the models are in no way complete, and the residuals are not normally distributed, these levels are reasonable. There are many other factors which affect delays, such as weather, train interactions like meetings and overtakes, driver behaviour, technical errors in the train or infrastructure, etc. Thus, there is no reason to

8 Table 3. Summary statistics from linear regressions in Tokyo and Stockholm. The full models include squared and interaction terms, the simple ones do not.

Full models Simple models Tokyo Stockholm Tokyo Stockholm Residual standard error 16.87 16.13 19.53 19.02 Degrees of freedom 2621 3046 2681 3297 Multiple R2 0.3951 0.4687 0.1712 0.2004 Adjusted R2 0.3806 0.4242 0.1703 0.1995 F-statistic 27.18 10.54 184.7 206.6 Coefficient estimates 63 255 3 4 p-value < 2.2e-16 < 2.2e-16 < 2.2e-16 < 2.2e-16 believe that the exchange of passengers, alone, would explain all of the variation in delays, even though it may be one of the more important causes. In this light, levels between 20 and 40% are reasonable.

4.5 Relevance to Timetable Planning To make timetables in practice, simpler models are required. We have thus also performed linear regressions without considering squared terms or interactions. The summary statistics are presented in Table 4, and they had about half as much explanatory power as the full models, with adjusted R2 values of about 0.17 in Tokyo and 0.20 in Stockholm. The coefficient estimates are found in Table 4 and can be used as a reference when allocating dwell times in the timetable planning process. In Tokyo, we find that a rise in the congestion rate of 10% corresponds to an increased dwell time delay of about one second. Discussing this result with timetable planners at the relevant company, they say that this figure is close to their intuitive sense, even though they do not have a formal model. For arrival delay and scheduled dwell time, we find small negative effects. This implies that trains that arrive late have slightly shorter dwell times than otherwise, and that the influence of the staff and driver are somewhat successful in reducing delays. Longer scheduled dwell times see slightly less delays, indicating that they include some margins, not just the minimum required time for the increased congestion rates. In Stockholm, we see a rather large effect from both alighting and boarding passengers. The estimates are about 0.4 seconds per person and door, in either direction, and these are additive. We also see an effect from congestion, as the number of passing travellers further increases the delays. Arrival delays essentially have no impact on the delays in Stockholm, partly because of limitations when combining the data in section 3.3, and the scheduled dwell times in the sample are all the same, so it is not possible to estimate any further impact from them.

9 Table 4. Coefficient estimates on the impacts on dwell time delay, from the simple linear regression models in Tokyo and Stockholm.

Estimate Std. Error t-value Pr(>|t|) Tokyo Arrival Delay -0.0479 0.0057 -8.35 <2e-16 Congestion Rate 0.1051 0.0047 22.30 <2e-16 Scheduled Dwell Time -0.0707 0.0068 -10.40 <2e-16

Stockholm Arrival Delay 0.0001 0.0000 3.970 7.33e-05 Alightings/door 0.3889 0.0438 8.881 < 2e-16 Boardings/door 0.4103 0.0453 9.060 < 2e-16 Passing Travellers/door 0.0968 0.0128 7.573 4.70e-14

5 Conclusions In conclusion, the data we use in this study can explain approximately 40% of the variation in dwell time delays in rush hours. While by no means perfect, this is high compared to other studies attempting to explain delays or punctuality with empirical data. It is also a reasonable number, considering the range of other factors which can cause delays. The number of passengers differs greatly between the two cities, with trains in Tokyo being on average much more congested, and trains in Stockholm often having empty seats even during the morning rush hours. The median values are 58 passengers per car in Stockholm compared to 195 in Tokyo, and the difference in capacity is not nearly so large. The dwell time delay distributions are quite similar across the cities: ranging from -8 to +34 seconds in Stockholm, and -20 to +30 seconds in Tokyo, as we compare the 5th and 95th percentiles. The ranges are narrow, suggesting further that seemingly small delays add up to large numbers, and underlining the importance of well-calibrated high-precision data. Arrival delays are also similar across cases, although the data from Stockholm is less precise in this regard. The early arrivals there are also more extreme, with the 5th percentile being at -300 seconds compared to -20 in Tokyo. Estimates from our simpler models, which can be used in practice, suggest that a rise in the congestion rate of 10% leads to about one additional second of dwell time delay in Tokyo. This value is close to the intuitive sense of the planners there. In Stockholm, we have measurements on the number of boarding and alighting passengers per door and find that every one of these causes about 0.4 seconds of delay, which is a relatively large number. There is also an influence from the number of passengers remaining on-board, due to crowding, and for every ten such people per door, delays increase by about one second. The results we have presented can be used in practice to help assign better dwell times. If planners know roughly how many passengers get on and off at a station, or the level of congestion, these figures can be converted to seconds, with which the dwell times can be extended or shortened. In Tokyo such modifications are common, but they are based on the planners’ experience, informed by the measured congestion rates. A model such as this can

10 help make this implicit knowledge more explicit, and to help planners who do not yet have so much experience. In Stockholm, and many other cities where dwell times are not usually adapted to the passenger flows, our results can help to start and inform such a process. By extension, this work will help to reduce the delays plaguing train operations both in the two cities of Stockholm and Tokyo, and further afield. In Stockholm, 91 % of all delays are small dwell time delays, five minutes or less, and on average they cause about 29 500 train delay hours per year. For the railway company we study in Tokyo, one of many in that city, the corresponding figures are 88 % and 69 000 train delay hours. While it is somewhat difficult to go from train delay hours to person delay hours or punctuality, clearly, the potential for improvement is considerable. Even a small decrease in these delays would have considerable benefits for both the railway operators and the passengers. Adjusting the dwell times to the actual flows of passengers is one way to achieve this.

Acknowledgements

The authors would like to thank the Swedish Transport Administration, Region Stockholm, MTR Pendeltåg and Odakyu Electric Railway Co. for providing data and for answering our questions.

References

Baee, S., Eshghi, F., Hashemi, S.M. & Moienfar, R., 2012. “Passenger Boarding/Alighting Management in Urban Rail Transportation”. Proceedings of the 2012 Joint Rail Conference (JRC2012), April 17-19, 2012, Philadelphia, Pennsylvania, USA. Bender, M., Büttner, S. & Krumke, S.O., 2013. “Online delay management on a single train line: beyond competitive analysis”. Public Transport, 5, pp. 243–266. DOI: 10.1007/s12469-013-0070-z Buchmueller, S., Weidmann, U. & Nash, A., 2008. “Development of a dwell time calculation model for timetable planning”. Presented at Computers in Railways XI, proceedings in WIT Transactions on the Built Environment, 103, pp. 525-534. Ceder, A. & Hassold, S., 2015. “Applied analysis for improving rail-network operations”. Journal of Rail Transport Planning & Management, 5, pp. 50–63. D’Acierno, L., Botte, M., Placido, A., Caropreso & Montella, B., 2017. “Methodology for Determining Dwell Times Consistent with Passenger Flows in the Case of Metro Services”. Urban Rail Transit, 3(2), pp. 73–89. DOI: 10.1007/s40864-017-0062-4. Fujiyama, T. & Cao, B., 2016. “Lengths of Time Passengers Spend at Railway Termini”. Presented at 2016 IEEE International Conference on Intelligent Rail Transportation (ICIRT), Birmingham, 2016. DOI: 10.1109/ICIRT.2016.7588723. Hansen, I.A., 2009. “Railway Network Timetabling and Dynamic Traffic Management”, 2nd International Conference on Recent Advances in Railway Engineering (ICRARE-2009), pp. 135-144. Harris, N.G., Mjøsund, C.S. & Haugland, H., 2013. “Improving railway performance in Norway”. Journal of Rail Transport Planning & Management, vol. 3(4), pp. 172-180. Heinz, W., 2000. Passagerutbyte i tåg - mätningar av av- och påstigningstider samt ansats till modell för att beskriva samband. Royal Institute of Technology, Stockholm. Kamizuru, T., Noguchi, T. & Tomii, N., 2015. ” Dwell Time Estimation by Passenger Flow Simulation on a Station Platform based on a Multi-Agent Model”. Presented at the 6th

11 International Conference on Railway Operations Modelling and Analysis (RailTokyo2015). Kecman, P. & Goverde, R.M.P., 2015. “Predictive modelling of running and dwell times in railway traffic”. Public Transport. DOI: 10.1007/s12469-015-0106-7. Kunimatsu, T., Hirai, C. & Tomii, N., 2009. “Evaluation of timetables by estimating passengers' personal disutility using micro-simulation”. Proceedings of the 3rd International Seminar on Railway Operations Modelling and Analysis (RailZurich), Zurich, 2009. Li, D., Daamen, W. & Goverde, R.M.P., 2015. ” Estimation of train dwell time at short stops based on track occupation event data”. Presented at the 6th International Conference on Railway Operations Modelling and Analysis (RailTokyo2015). Li, D., Yin, Y. & He, H., 2018. “Testing the Generality of a Passenger Disregarded Train Dwell Time Estimation Model at Short Stops: Both Comparison and Theoretical Approaches”. Advanced Transportation, Vol. 2018. DOI: 10.1155/2018/8521576. Longo, G. & Medeossi, G., 2012. “Enhancing timetable planning with stochastic dwell time modelling”. Presented at Computers in Railways XIII, WIT Transactions on the Built Environment, 127, pp. 461-471. Nie, L. & Hansen, I. A., 2005. ”System analysis of train operations and track occupancy at railway stations”, European Journal of Transport and Infrastructure Research, vol. 1, issue 5, pp. 31-54. Olsson, N. & Haugland, H., 2004. “Influencing factors on train punctuality - results from some Norwegian studies”. Transport Policy, issue 11, pp. 387-397. Palmqvist, C.W., Olsson, N.O.E. & Hiselius, L., 2017. “Delays for Passenger Trains on a Regional Railway Line in Southern Sweden”. International Journal of Transport Development and Integration, vol. 1(3), pp. 421-431. Palmqvist, C.W., Olsson, N.O.E. & Winslott Hiselius, L., 2018. “The Planners’ Perspective on Train Timetable Errors in Sweden”. Journal of Advanced Transportation, vol. 2018. DOI: 10.1155/2018/8502819. Pedersen, T., Nygreen, T. & Lindfeldt, A., 2019. “Analysis of Temportal Factors Influencing Minimum Dwell Time Distributions”. Presented at Computers in Railways XVI, WIT Transactions on the Built Environment, 181, pp. 447-458. Qi, Z., Baoming, H. & Dewei, L., 2008. ”Modeling and simulation of passenger alighting and boarding movement in Beijing metro stations”. Transportation Research Part C, 16, pp. 635-649. Seriani, S. & Fernandez, R., 2015. ”Pedestrian traffic management of boarding and alighting in metro stations”. Transportation Research Part C, 53, pp. 76-92. Seriani, S., Fujiyama, T. & Holloway, C., 2017. “Exploring the pedestrian level of interaction on platform conflict areas at metro stations by real-scale laboratory experiments”. Transportation Planning and Technology, 40(1), pp. 100-118, DOI: 10.1080/03081060.2016.1238574. Tomii, N., 2013. ”Beyond the wave of “Big Data” – How we can realize robust train operation using train operation record data?”. Japanese Railway Engineering, 179. Trafikförvaltningen Stockholms Läns Landsting, 2018. ”Årsrapport Upplevd kvalitet – SL och Waxholmsbolaget 2017”. Stockholm: Stockholms Läns Landsting. Wiggenraad, I.P.B., 2001. Alighting and boarding times of passengers at Dutch railway stations. TRAIL Research School: Delft. Yuan, J. & Hansen, I.A., 2002. “Punctuality of Train Traffic in Dutch Railway Stations”. Presented at International Conference on Traffic and Transportation Studies (ICTTS), Guijin, 2002. DOI: 10.1061/40630(255)73.

12