UPTEC STS11 033 Degree project 30 credits April 2011

A Framework for How to Make Use of an Automatic Passenger Counting System

John Fihn
Johan Finndahl

Abstract

Most modern cities today face tremendous traffic congestion, a consequence of the increasing use of private motor vehicles in cities. Public transport plays a crucial role in reducing this traffic, but to be an attractive alternative to private motor vehicles it needs to provide services that suit citizens' travel requirements.

A system that can give transit agencies rapid feedback about the usage of their transport network is the Automatic Passenger Counting (APC) system, which registers the number of passengers boarding and alighting a vehicle. Knowledge about passengers' travel behaviour can be used by transit agencies to adapt and improve their services to satisfy these requirements, but to obtain this knowledge transit agencies need to know how to use an APC system.

This thesis investigates how a transit agency can make use of an APC system. The research took place in Melbourne, where Yarra Trams, operator of the tram network, is now putting effort into how to utilise the APC system. A theoretical framework based on theories about Knowledge Discovery from Data, System Development, and Human Computer Interaction is built, tested, and evaluated in a case study at Yarra Trams. The case study resulted in a software system that can process and model Yarra Trams' APC data. The result of the research is a proposed framework consisting of steps and events that can be used as a guide by a transit agency that wants to make use of an APC system.

Teknisk-naturvetenskaplig fakultet, UTH-enheten (Faculty of Science and Technology, UTH unit)
Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0
Postal address: Box 536, 751 21 Uppsala
Telephone: 018 – 471 30 03
Fax: 018 – 471 30 00
Website: http://www.teknat.uu.se/student

Supervisors: Georges Couenon & Margaret Hamilton
Subject reviewer: Arnold Pears
Examiner: Elísabet Andrésdóttir
ISSN: 1650-8319, UPTEC STS08 000
Sponsor: Yarra Trams & RMIT

Popular Science Summary

A growing population and increased use of private motor vehicles have left most modern cities with serious traffic congestion, a problem that harms both the environment and the economy. To reduce motor traffic in cities and eliminate congestion, more people need to travel by public transport. This requires public transport to be an attractive alternative to private motor vehicles. Public transport needs to meet people's need for mobility and be adapted to their travel needs within cities.

To adapt public transport services to actual demand, information is needed about how passengers use public transport. Operators need rapid and continuous feedback on passengers' needs in order to adapt services dynamically. A system that can count passengers automatically is the Automatic Passenger Counting (APC) system. Using sensors mounted above the doors of a vehicle, an APC system can register the number of passengers boarding and alighting the vehicle.

The data collected by an APC system must, however, be stored and processed before it can be turned into information that is useful to the operator. This is an extensive process, and the experience of organisations that have previously implemented and used APC systems points to the complexity of turning data from the APC system into useful information. Organisational factors have also been shown to have a significant influence on whether the operator succeeds in using the APC systems. There is currently no guide or framework for how an organisation can go about using an APC system. This thesis therefore aims to produce a framework that can serve as a guide for an organisation that wants to use an APC system.

One organisation that recently began implementing APC systems and is now working to make them useful is Yarra Trams, operator of the tram network in Melbourne, Australia. This thesis was carried out in Melbourne, and a case study at Yarra Trams forms part of the work. A theoretical framework has been built from relevant theories, chosen on the basis of different organisations' experiences of APC systems. The theories in focus concern data handling, system development, and human-computer interaction. The theoretical framework was tested in the case study at Yarra Trams and evaluated by analysing the results and experiences in light of the theory.

The result of the study is a framework consisting of steps and guidelines that can be used by an organisation that wants to start using an APC system. Part of the result is also a software application tailored to Yarra Trams, which they can use to turn data from the APC systems into useful information.

Acknowledgement

This thesis is the result of a master's degree project in the Master of Science in Sociotechnical Systems Engineering Program at Uppsala University. The project has been performed in Melbourne and has been carried out by the authors of this thesis.

First, we want to thank Arnold Pears, our subject reviewer at Uppsala University, who introduced us to Margaret Hamilton, our supervisor at Royal Melbourne Institute of Technology. This was the very first seed of our project. Further, we want to thank Margaret Hamilton, who let us be part of RMIT and helped us with all practical matters around the project. We also want to thank Georges Couenon, our supervisor at Yarra Trams, who contributed his knowledge and critically reviewed our work. They have all given us valuable feedback on our work. We also want to thank the staff, both at Yarra Trams and RMIT, who supported and encouraged us in our work.

Finally, we want to thank our families for all their support.

Thank you!

John Fihn and Johan Finndahl
Uppsala, April 26, 2011

Contents

Glossary

1 Introduction
    1.1 Problem Formulation
    1.2 Research Objective
    1.3 Research Approach
    1.4 Limitations

2 Theoretical Background
    2.1 Automatic Passenger Counting Systems
        2.1.1 Integration with an Automatic Vehicle Location System
        2.1.2 APC data
    2.2 Knowledge Discovery from Data
    2.3 Cross Industry Standard Process for Data Mining
    2.4 Software System
        2.4.1 Software Development
    2.5 Sociotechnical Systems

3 Methodology
    3.1 Structure of the Research
    3.2 Information Gathering
    3.3 Operationalisation

4 Theoretical Framework
    4.1 Business Understanding
        4.1.1 Establish Objectives and Functionality of the Software System
    4.2 Data Understanding
        4.2.1 Design of the Software System
    4.3 Data Preparation
        4.3.1 Develop Software System for Data Preparation
    4.4 Modelling
        4.4.1 Develop Software System for Modelling
    4.5 Evaluation
        4.5.1 Evaluate Software System
    4.6 Deployment

5 Case Study
    5.1 Business Understanding
        5.1.1 The Automatic Passenger Counting System Project
        5.1.2 The Automatic Passenger Counting System
        5.1.3 Software System for APC Data
    5.2 Data Understanding
        5.2.1 Design of the New Software System
    5.3 Data Preparation
    5.4 Modelling
        5.4.1 Home
        5.4.2 View Data
        5.4.3 Reports
    5.5 Evaluation
        5.5.1 The Prototype Software System

6 Analysis
    6.1 Business Understanding
    6.2 Data Understanding
    6.3 Data Preparation
    6.4 Modelling
    6.5 Evaluation

7 Conclusions
    7.1 Future Research

Bibliography

Appendix

A Requirements
    A.1 Forward Traceability
        Functional
        Non Functional
    A.2 Backward Traceability

List of Figures

2.1 APC system components
2.2 CRISP-DM model
3.1 Theoretical framework
5.1 Actors within public
5.2 Yarra Trams information systems
5.3 Explanation of run and trips
5.4 Prototype software system design
5.5 Screenshot of Home 1
5.6 Screenshot of Home 2
5.7 Screenshot of View Data 1
5.8 Screenshot of View Data 2
5.9 Screenshot of View Data 3
5.10 Screenshot of View Data 4
5.11 Screenshot of View Data 5
5.12 Screenshot of View Data 6
5.13 Screenshot of Reports 1
5.14 Screenshot of Reports 2
5.15 Screenshot of Reports 3
5.16 Screenshot of a report exported to Excel

List of Tables

2.1 APC/AVL data sample
5.1 Fictive APC data sample with errors
5.2 Explanation of the headers in table 5.1

Glossary

APC: Automatic Passenger Counting system, a system used in public transport for counting passengers boarding and alighting a vehicle.
AVL: Automatic Vehicle Location system, a system used in public transport that identifies a vehicle's position.
AVM: Automatic Vehicle Monitoring system, a system used in public transport for monitoring a vehicle in service.
CRISP-DM: CRoss Industry Standard Process for Data Mining, a methodology for knowledge discovery from data projects.
DoT: Department of Transportation, coordinates all public transport in the state of Victoria in Australia.
ITS: Intelligent Transportation System, the umbrella term for information technologies like APC, AVL and AVM systems that are used in public transport.
KDD: Knowledge Discovery from Data, a process for extracting information from large data sets.
KDR: Operates the tram network in Melbourne under the name of Yarra Trams. A partnership of Melbourne’s train, tram and bus operators.
Route: A collection of stops that is served by a vehicle; a route starts with a terminus and ends with another terminus.
Run: A collection of planned trips for a certain vehicle.
Trip: A vehicle's journey from one terminus to another terminus where it serves certain stops along the way, usually on one route.
UCD: Data Control Unit, an onboard computer for an APC system that receives and stores data.

Chapter 1

Introduction

Cities around the world have grown at an incredible speed during the last century, and more than half of the human population now lives in urban areas. This, coupled with increasing motor vehicle ownership, has meant that most modern cities face tremendous traffic congestion. This urbanisation trend is not expected to stop in the near future; instead, the situation is anticipated to get worse and more widespread. Traffic congestion imposes a high cost on both the economy and the environment, and has become a major problem in our society. To alleviate traffic congestion in the cities, more people need to be able and willing to travel on public transport, the lifeline of today’s modern cities.

Public transport must be an attractive alternative to using private vehicles. It must meet people's requirements for mobility and suit their travel behaviour. Since urban areas are dynamic, public transport has to be managed dynamically as well. For dynamic traffic management, it is important to get rapid feedback from the network and to understand the entire transit system. Increasing demands on public transport put pressure on transit agencies to improve their operations and services. New information technology such as Intelligent Transportation Systems (ITS) can be used to meet higher demands on public transport. One ITS technology with the potential to improve operations and services within public transport is the Automatic Passenger Counting (APC) system. The APC system counts passengers alighting and boarding a vehicle, and can be used to gain knowledge about passengers’ journeys. With this knowledge it is possible to understand demand and make adjustments for the future, but several steps must be taken before the knowledge can be obtained and decisions made.

1.1 Problem Formulation

A study from the Transportation Research Board (2008) emphasises the complexity of getting knowledge from an APC system. It indicates that the main issues are associated with the preparation and modelling of the data from the APC system, and that processing and reporting software is essential. This is not the whole problem, however, and a software system alone is not sufficient to obtain complete knowledge. The complexity of knowledge discovery from the data is broader. Another study from the Transportation Research Board (2006) points out that the APC system poses numerous organisational challenges, and Gonzalez-Aranda et al. (2008) identify a lack of methodology as the underlying problem in knowledge discovery projects.

One transit agency currently dealing with the complexity of an APC system is Yarra Trams in Melbourne, Australia. Melbourne has the world’s largest tram network, with 250 km of track, almost 500 tram vehicles, and around 180 million passenger trips taken each year. Up to 80% of the tram network shares road space with other vehicles such as cars (Yarra Trams, 2009b). Yarra Trams has recently started to investigate the feasibility of an APC system in the tram network and is now putting a lot of effort into how to utilise it.

1.2 Research Objective

The purpose of the research for this thesis is to propose a framework for how a transit agency can make use of an APC system.

1.3 Research Approach

The research will be approached by the following steps:

• Build a Framework
  A framework will be built based on theoretical studies concerning characteristics identified in the problem formulation.
• Test the Framework
  The framework will be tested by applying it in a case study at Yarra Trams to find out how it can be used to provide Yarra Trams with useful information from the APC system.
• Analyse the Framework
  Using the results of the case study, the framework will be analysed to draw conclusions about how a transit agency can apply the framework to make use of an APC system.

1.4 Limitations

This research aims to propose a framework for how a transit agency can make use of an APC system. The framework is tested for one transit agency and concerns one specific organisation. This organisation only operates tram vehicles, and the framework will thereby only be tested for conditions in this specific environment. This implies that different conditions can appear when an organisation operates other vehicles in another environment. However, this research focuses on problems that can be seen as general for a transit agency that intends to utilise an APC system. The research will not cover details that are specific to Yarra Trams only. System development will be a major component of this research, but since the software solutions are customised for Yarra Trams and may not suit the general case, code and documentation will not be presented in full. Due to the time frame for this research, it will not be possible to confirm whether Yarra Trams can improve future operations and services by applying the proposed framework.

Chapter 2

Theoretical Background

This chapter describes Automatic Passenger Counting (APC) systems and relevant Intelligent Transportation Systems (ITS) within public transportation. After this, the chapter introduces theoretical concepts that are relevant for this research and gives a background to chapter four, in which the theoretical framework is composed. The process for obtaining knowledge from data is introduced and the role that software development plays in the process is outlined. Finally the relationship between technology and humans is defined.

2.1 Automatic Passenger Counting Systems

APC systems are used to count passengers in public transport. Different APC technologies exist, and they can be divided into two categories: static and on-board systems. Static systems are installed in stations and on-board systems are embedded in vehicles. In technical terms, the on-board APC system consists of a module installed inside a vehicle and a central computer in an office used for storage of the APC data. The on-board module consists of sensors and an on-board computer that converts the information registered by the sensors into passenger counts and stores them. An on-board APC system is shown in Figure 2.1. The most common counting technique in APC systems is infra-red (IR) sensing. Other techniques are pressure-sensitive mats, horizontal beams and cameras. The IR sensors are mounted above each door and only register boarding and alighting passengers while the doors are open. The stored counts are transferred from the on-board computer to the central computer at regular times for storage. Most analysis of APC data requires only a moderate data sample, which can be obtained with APC systems installed on 10-15% of a transit agency’s total vehicle fleet (Furth, Hemily, Muller and Strathman, 2006, p. 6; 18; 21).

Figure 2.1: The components of an on-board APC system and how they are connected.

2.1.1 Integration with an Automatic Vehicle Location System

Some of the usual ITS and technologies within public transportation are Automatic Vehicle Location (AVL) systems, Geographical Information Systems (GIS), Passenger Information Systems (PIS), and Automated Fare Payment systems (Ubaka and Glotzbach, 2006, p. 2-4). An APC system can be integrated with other ITS, which makes the data outcomes richer. The most common integration is with an AVL system (Furth et al., 2006, p. 3).

An AVL system determines a vehicle's position at a certain time using either the Global Positioning System (GPS) or an odometer (Saavedra, 2010, p. 2). Depending on how frequently position data is needed, there are different techniques for collecting it. The most common are polling records and stop records. With polling records, a central computer continuously requests a vehicle’s position; this is mainly used to obtain a high frequency of data points, for example for real-time information. Stop records provide the position of a vehicle and the time when it arrives at certain stops; this is the type of record used when an AVL system is integrated with an APC system (Furth et al., 2006, p. 17-19). An example of a data record from an APC system integrated with the time and stop location from an AVL system is shown in Table 2.1. The table shows the number of passengers boarding (Ins) and alighting (Outs) the vehicle during the time period from 7:06:47 until 7:36:59. The vehicle makes 10 stops during this time, with a total of 48 passengers boarding and 48 alighting. The vehicle arrives from a previous trip at the terminus at 7:06:47, starts a new trip at 7:07:12 and arrives at the end terminus of that trip at 7:36:46. A new trip starts at 7:36:59.

Arr Time   Dep Time   Stop   Ins   Outs   Route Number
7:06:47    7:07:12    1      28    0      55
7:13:48    7:14:36    2      4     0      55
7:14:52    7:15:05    3      4     4      55
7:19:13    7:19:22    4      1     1      55
7:20:20    7:20:34    5      0     15     55
7:25:17    7:25:28    6      6     4      55
7:27:07    7:27:16    7      0     2      55
7:32:49    7:35:16    8      3     6      55
7:35:51    7:36:17    9      2     8      55
7:36:46    7:36:59    10     0     8      55
Total                        48    48

Table 2.1: This table shows an APC/AVL data sample from a central computer.

APC data combined with AVL data are richer and can be used to analyse delays, passengers' waiting time, and passenger load on a vehicle. The data can also be analysed by looking at average load and waiting time to better understand how customers experience the services. This is useful for improving scheduling and customer services. Integrated APC/AVL data is a better basis for analysis aimed at alleviating traffic congestion. One common problem with integration is matching AVL and APC data, often because of poor data quality (Furth et al., 2006, p. 2; 19-20; 25-28).
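As an illustration of how stop records like those in Table 2.1 can be turned into passenger load and time spent at stops, the following is a minimal Python sketch. The names STOP_RECORDS, load_profile and dwell_seconds are hypothetical and chosen for this example only; they are not taken from the software developed in the case study.

```python
from datetime import datetime

# Stop records copied from Table 2.1:
# (arrival time, departure time, stop, ins, outs).
STOP_RECORDS = [
    ("7:06:47", "7:07:12", 1, 28, 0),
    ("7:13:48", "7:14:36", 2, 4, 0),
    ("7:14:52", "7:15:05", 3, 4, 4),
    ("7:19:13", "7:19:22", 4, 1, 1),
    ("7:20:20", "7:20:34", 5, 0, 15),
    ("7:25:17", "7:25:28", 6, 6, 4),
    ("7:27:07", "7:27:16", 7, 0, 2),
    ("7:32:49", "7:35:16", 8, 3, 6),
    ("7:35:51", "7:36:17", 9, 2, 8),
    ("7:36:46", "7:36:59", 10, 0, 8),
]


def parse_time(text):
    """Parse a clock time written as H:MM:SS."""
    return datetime.strptime(text, "%H:%M:%S")


def load_profile(records):
    """Return (stop, load) pairs, where load is the number of
    passengers on board when the vehicle leaves the stop."""
    load = 0
    profile = []
    for _arr, _dep, stop, ins, outs in records:
        load += ins - outs
        profile.append((stop, load))
    return profile


def dwell_seconds(records):
    """Return (stop, seconds) pairs with the time spent at each stop."""
    return [
        (stop, (parse_time(dep) - parse_time(arr)).total_seconds())
        for arr, dep, stop, _ins, _outs in records
    ]


if __name__ == "__main__":
    for stop, load in load_profile(STOP_RECORDS):
        print(f"stop {stop}: load {load}")
```

For the records in Table 2.1 the running load returns to zero at the last stop, since 48 passengers board and 48 alight in total.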

2.1.2 APC data

The accuracy of an APC system can vary, and the data are affected by various types of errors. For error handling it is necessary to distinguish between systematic and random errors. Systematic errors occur when an APC system consistently counts incorrectly, and they become predictable once they have been identified. Random errors are harder to identify since they occur randomly. To identify the errors it is important to validate the APC system by comparing “actual” counts with the counts registered by the APC system. The most common method for collecting the “actual” counts is to count passengers manually. It is important to understand that manual counts can also contain errors, but reality must be represented as accurately as possible to validate the APC system. A valid APC system counts passengers with an accuracy of ≥ 95%, which is measured by combining the errors in boarding and alighting passengers for one trip. Since the accuracy of an APC system is not perfect, the APC data need to undergo statistical processing (Iris-GmbH Infrared & Intelligent Sensors, 2005). Statistical processing is especially important before calculating the load of a vehicle, as errors tend to accumulate when the counts are aggregated (Furth et al., 2006, p. 55).
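The validation idea can be sketched in Python as follows. The exact accuracy formula is not given here, so the sketch assumes one plausible definition: the combined absolute boarding and alighting error for a trip, relative to the manually counted total. The function names trip_accuracy and mean_bias and all numbers in the example are hypothetical.

```python
def trip_accuracy(apc_ins, apc_outs, manual_ins, manual_outs):
    """Accuracy for one trip under an ASSUMED definition:
    1 - (|boarding error| + |alighting error|) / manual total."""
    manual_total = manual_ins + manual_outs
    if manual_total == 0:
        raise ValueError("manual counts must include at least one passenger")
    error = abs(apc_ins - manual_ins) + abs(apc_outs - manual_outs)
    return 1 - error / manual_total


def mean_bias(pairs):
    """Average signed difference (APC minus manual) over several trips.
    A value clearly different from zero hints at a systematic error."""
    if not pairs:
        raise ValueError("at least one (apc, manual) pair is required")
    return sum(apc - manual for apc, manual in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical trip: APC registers 47 ins and 49 outs,
    # manual counters observe 48 ins and 48 outs.
    accuracy = trip_accuracy(47, 49, 48, 48)
    print(f"accuracy: {accuracy:.3f}, valid: {accuracy >= 0.95}")
```

Looking at the signed bias over many trips separately from the per-trip accuracy is one way to tell a systematic undercount from random scatter, which is the distinction drawn above.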

The total number of passengers that have boarded a vehicle during a trip must equal the total number of passengers that have alighted during that trip. This means that the load of a vehicle is supposed to be zero at the end of a trip. In some cases the load calculated from the APC data is not zero at the end of a trip, which is usually caused by systematic errors in the APC system. Because of this, the data have to be corrected (Furth et al., 2006, p. 5-6; 54-55).
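One simple way to make the load return to zero at the end of a trip is to scale boardings and alightings so that both totals equal their average. The Python sketch below shows this proportional approach as an assumption for illustration; it is not necessarily the correction method used by the cited authors or in the case study, and the function name balance_trip and the example counts are hypothetical.

```python
def balance_trip(ins, outs):
    """Scale per-stop boardings and alightings so that both totals
    equal the average of the two raw totals (ASSUMED correction rule)."""
    total_in, total_out = sum(ins), sum(outs)
    if total_in == total_out:
        return [float(x) for x in ins], [float(x) for x in outs]
    if total_in == 0 or total_out == 0:
        raise ValueError("cannot balance a trip with a zero total")
    target = (total_in + total_out) / 2
    balanced_ins = [x * target / total_in for x in ins]
    balanced_outs = [x * target / total_out for x in outs]
    return balanced_ins, balanced_outs


if __name__ == "__main__":
    # Hypothetical trip where the APC system misses two alightings.
    ins = [28, 4, 4, 1, 0, 6, 0, 3, 2, 0]    # total 48
    outs = [0, 0, 4, 1, 15, 4, 2, 6, 8, 6]   # total 46
    print("raw end load:", sum(ins) - sum(outs))
    b_ins, b_outs = balance_trip(ins, outs)
    print("balanced end load:", round(sum(b_ins) - sum(b_outs), 6))
```

The balanced counts are no longer integers, which is acceptable when they are only used for aggregated statistics such as average load.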

2.2 Knowledge Discovery from Data

Knowledge Discovery from Data (KDD) is a process used to discover and present information from data to achieve knowledge. KDD is usually associated with Data Mining, since both terms refer to information extraction from data. One way of separating the two terms is to use Data Mining to describe a class of methods based on statistics, mathematics, and artificial intelligence that can be used to extract information from data, while KDD describes the whole knowledge discovery process (Han and Kamber, 2006, p. 7). In this view, Data Mining is a step within the KDD process, where algorithms and techniques are executed with the aim of exploring and extracting hidden patterns from data to achieve knowledge (Fayad, Grinstein and Wierse, 2002, p. 278; 300).

2.3 Cross Industry Standard Process for Data Mining

CRoss Industry Standard Process for Data Mining (CRISP-DM) is another process for discovering and presenting information from data. It was developed during the late nineties by a consortium of representatives from different companies active within the Data Mining area. CRISP-DM does not just describe a class of statistical methods; it describes the life cycle of a knowledge discovery from data project and is applicable both in general and at a more specific level (Chapman, Clinton, Kerber, Khabaza, Reinartz, Sharer and Wirth, 2000, p. 3-4; 9; 13-14). The different phases in the CRISP-DM process are presented below and in Figure 2.2.

• Business Understanding
  The first phase aims to give an understanding of the business and the purpose of the KDD project. This phase builds the foundation for the upcoming phases.
• Data Understanding
  The data understanding phase aims to give an understanding of the data and to identify data quality issues.
• Data Preparation
  The data preparation phase aims to make sure that the data are of the right quality before they are used.
• Modelling
  In the modelling phase various modelling techniques are applied to the data to extract information and knowledge.
• Evaluation
  The evaluation phase is performed to make sure that the outcomes of modelling the data achieve the business objectives.
• Deployment
  The deployment phase is performed when the modelling of the data satisfies both the business requirements and the purpose of the project. In this phase the developed model is implemented in the organisation.

Figure 2.2: The CRoss Industry Standard Process for Data Mining (CRISP-DM) model.

The arrows indicate the order in which the steps are supposed to be performed. In some cases it can be necessary to go back to a previous step; for example, when working with Data Understanding it can be necessary to go back to Business Understanding. The outer arrow means that even after the project is deployed it must be maintained and remain dynamic (Chapman et al., 2000, p. 3-4; 9; 13-14).
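The ordering of the phases can be sketched as a small Python data structure. The sketch simplifies the model by assuming that each phase may move forward to the next phase or back to the one before it, and that Deployment leads back to Business Understanding to represent the outer cycle; the arrows in the official CRISP-DM diagram are more specific than this. The names Phase and allowed_moves are hypothetical.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The six CRISP-DM phases in their nominal order."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELLING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


def allowed_moves(phase):
    """Return the phases reachable from `phase` under the simplifying
    ASSUMPTION of one step back, one step forward, and a restart of
    the cycle after Deployment."""
    moves = []
    if phase > Phase.BUSINESS_UNDERSTANDING:
        moves.append(Phase(phase - 1))  # go back to the previous phase
    if phase < Phase.DEPLOYMENT:
        moves.append(Phase(phase + 1))  # continue to the next phase
    else:
        moves.append(Phase.BUSINESS_UNDERSTANDING)  # outer cycle
    return moves


if __name__ == "__main__":
    for phase in Phase:
        names = ", ".join(move.name for move in allowed_moves(phase))
        print(f"{phase.name} -> {names}")
```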

2.4 Software System

To manage and analyse large amounts of data it is essential to have a software system. A software system is a set of instructions that, when executed on a computer, provides the user with functions (Agarwal, Tayal and Gupta, 2008, p. 4). A software system can be acquired by either buying or developing a solution. When acquiring a solution it is important that it can fulfil the organisation’s requirements for the software system. If the requirements cannot be fulfilled by a bought solution, it is preferable to develop a customised solution (Krishnamurthy and Saran, 2008, p. 155-158). A customised solution can, on the other hand, place higher demands on the organisation, as it requires a higher level of expertise in software development (Furth et al., 2006, p. 7).

2.4.1 Software Development

There are many different ways of developing a software system. If the general objective for the system is defined but the functional requirements are less detailed, it is preferable to begin by developing a prototype system (Agarwal et al., 2008, p. 41). A prototype system is an initial version of a software system that is used to demonstrate concepts, try out design options, and find out more about the problem and the possible solutions for the real system. An acknowledged benefit of prototypes is that stakeholders and decision makers can experiment with the system and thereby get a better understanding of the final system. This reduces costs, gives better control of them, and involves the users more closely in the design of the system (Sommerville, 2011, p. 45). A prototype is not supposed to be a fully functional system. After the prototype has been developed it can either be discarded or re-used as the basis for the real system (Krishnamurthy and Saran, 2008, p. 132-134). The prototype development can be performed in four steps (Sommerville, 2011, p. 45-46):

• Establish prototype objectives
  In the first step the objectives of the prototype are declared.
• Define prototype functionality
  In the second step the functionalities of the prototype are decided.
• Develop prototype
  When the objectives and functionalities are determined, the prototype can be developed.
• Evaluate prototype
  In the last step the prototype is evaluated and the requirements are refined for continuing with the prototype or to start the development of the real system.

2.5 Sociotechnical Systems

The term Sociotechnical Systems (STS) was originally created in the context of organisational development and work design. It was introduced in 1960 as a result of the insight that organisational outcomes are affected by both social and technical factors. The new concept challenged technological determinism, which saw technology as autonomous and something humans needed to adapt to (Mat and Silva, 2005, p. viii).

The new understanding was that an organisation consists of both social and technical systems. The humans in the organisation represent the social system, and the tools and techniques they use represent the technical system. These two systems are not independent and must be seen and managed as one system (Griffith and Dougherty, 2002, p. 205). The spread of computers in the workplace led to new views and extensions of the STS area. One growing field that considers the STS perspective is Human Computer Interaction (HCI), which attempts to understand and improve the interaction between humans and computers (Mat and Silva, 2005, p. ix).

Chapter 3

Methodology

This chapter describes the methodology behind the research of this thesis. It gives a background of the research and explains how the theoretical framework was shaped and why. The chapter also describes how the theoretical framework has been applied in the case study, and how the information has been collected.

3.1 Structure of the Research

The research has been carried out by the authors of this thesis. It has been part of one of Yarra Trams' ongoing projects, which undertakes a feasibility study to determine the viability of introducing an Automatic Passenger Counting (APC) system on the tram network in Melbourne. Feedback on the research has come from the supervisors and the subject reviewer of this thesis.

The initial step in the research was to become familiar with Yarra Trams' ongoing APC project. To understand the APC project it was necessary to first study Yarra Trams' organisation and its operations and services. Thereafter it was possible to study the APC project and identify which departments within the organisation were involved in it. A deeper study of APC systems was performed with the aim of gaining insight into the complexity of an APC system and understanding the role of the APC project within Yarra Trams' organisation. In this way, a better understanding of the research was achieved. After the initial step, it was possible to understand how the objectives of the thesis could be approached, and a literature study of the relevant fields began. The theoretical framework was shaped from the knowledge gained during the literature study.

With the objectives of the thesis in mind and the theoretical framework as a guide, the case study started. Since this research aims to result in a framework that can be applied in general APC projects, a case study approach was chosen to get an in-depth understanding of a real-life phenomenon and its contextual conditions. The case study followed the structure of the theoretical framework, where each phase consisted of different events.

A software system was developed as part of the case study. This system development was part of the framework and was performed solely by the authors of this thesis. The development could have been performed with more involvement from the stakeholders and been more iterative. Requirements for the software system could have been reviewed by all stakeholders. The reason it was not performed this way was a lack of resources and planning.

In the last two phases of the framework, the results from the APC project are supposed to be evaluated and deployed. The evaluation phase could not be performed in full and was only based on feedback from the project manager, which means that feedback from the different departments and users is not included in the report. The coverage of fulfilled requirements compared with gathered and prioritised requirements was considered when the result from the project was evaluated. Because of the status of the project and the timeframe for this research, it has not been possible to perform the deployment phase. The final step of the research was to analyse the theoretical framework using the results of the case study.

Another part of the case study was a manual validation survey of an APC system on a new tram type, which was performed by Yarra Trams with assistance from the authors of this thesis.

3.2 Information Gathering

The empirical information for this research was collected during the case study at Yarra Trams. The case study was designed to collect information about Yarra Trams as an organisation and not about the individuals involved, as the research aims to describe how an organisation can make use of an APC system. Multiple sources of information were chosen and used during the case study to get a better and more complete understanding of the case.

• Interviews
Interviews were chosen as the main source of information, with the aim of getting a better understanding of the organisation. Focus interviews are preferable when there is a need for open-ended conversations. If more detailed information is needed, in-depth interviews are a good choice (Yin, 2008, p. 107). Because of the complexity of the APC project it was hard to know all the questions beforehand, so focus interviews were performed during several meetings at Yarra Trams’ office. Besides the focus interviews, two in-depth interviews were performed with a Market Analyst and a Research Analyst from the Marketing Department at Yarra Trams. The purpose of the in-depth interviews was to gather requirements for the software system. The in-depth interview approach was chosen because more detailed information about their requirements on a software tool was needed, whereas it was possible to collect requirements from the other departments through documents and focus interviews.

• Documents
To corroborate and complement the information from a case study it is important to use organisation-specific documents (Yin, 2008, p. 103). Documents from Yarra Trams related to the case study were the second main source of information and were used to provide the case study with complementary information.

• Physical Artefacts
Another source of information was physical artefacts in terms of data collected by different IT systems at Yarra Trams. This information was essential, as understanding the data that Yarra Trams collects was an important part of the case study.

• Participant-Observation
The last source of information was collected with the participant-observation approach, in which the observer assumes a role and participates in the situation being studied (Yin, 2008, p. 112). This was done during a survey together with Yarra Trams. The purpose of the survey was to validate a newly installed APC system, which involved manually counting all the passengers that boarded and alighted a tram during several trips. This was done to get a better understanding of Yarra Trams’ services and the APC technology, as well as to understand the difficulty of counting passengers.

3.3 Operationalisation

There is no guide for how an organisation should tackle the entire process when it wants to make use of an APC system. The goal of this thesis is therefore to propose a framework that can be used as a guide when an organisation wants to make use of an APC system. To approach the research it was important first to gain knowledge about the nature of the problems related to an APC system. Only with this knowledge was it possible to know how to tackle the problem. This knowledge was achieved by studying transit agencies’ experiences from earlier APC projects. With this insight it became clear that the APC project is closely related to the process of Knowledge Discovery from Data (KDD), since the task of an APC system is to collect data and the organisation’s desire is to make use of that data. Research about the KDD process gave a better understanding of the characteristics of a knowledge discovery project.

Two other processes for knowledge discovery from data were also identified: the Cross-Industry Standard Process for Data Mining (CRISP-DM) and the Sample, Explore, Modify, Model, Assess (SEMMA) process. Of KDD, CRISP-DM, and SEMMA, CRISP-DM was the most complete and suitable process for this research, because it has a wider perspective and takes organisational factors into consideration. CRISP-DM has been adopted with success by various companies in industry, which also had an effect on the decision to apply it in this research.

It is important to be aware that CRISP-DM is a methodology for knowledge discovery projects in general and is not specifically adapted for an APC project. This means that CRISP-DM does not focus on specific problems related to an APC system. CRISP-DM was therefore used only as the base of the framework for this research. Knowing the problems associated with an APC system and the complexity of knowledge discovery projects served as encouragement to add new theory on the foundation of CRISP-DM. The theory that has been used from CRISP-DM is the process structure. The process structure consists of six different phases, each phase representing one step in the process of a knowledge discovery project. The core of each phase has been reused and complemented with additional theory about system development, KDD, Data Mining, Human Computer Interaction (HCI), and STS.

As a software system proved to be an essential component in making use of an APC system, theory about system development was added. One important aspect when developing a software system, especially when the system will mediate complex information, is the interaction between the user and the system. Previous research confirms that human interaction is essential to successfully discover knowledge from data and that visualisation can bring benefits when modelling the data (Fayad et al., 2002, p. 191). The HCI field has complemented the framework with aspects of the design of the interface of the software system.

It has been shown that many factors affect the outcomes of an APC project, and that hardware, software, and personnel issues have to be identified and resolved (Boyle, 2008, p. 4). The interaction between technical and social factors must be taken into consideration, and theory from the field of STS was therefore added to the framework.

The framework consists of the same six phases as CRISP-DM, but each phase has two layers. The bottom layer includes theory from KDD, Data Mining, and STS, and is the foundation of each phase. The top layer represents aspects of the software system, including HCI, which is an important part and has to be built upon the foundation. The first phase in the framework is Business Understanding, and the process continues in the order that the arrows indicate. The framework is illustrated in Figure 3.1. Each box in the figure represents a phase in the process of making use of an APC system.

Figure 3.1: The theoretical framework.

Findings from earlier APC studies indicate a need to identify factors that can prevent or ensure success in APC projects, and a need to examine successful strategies for making use of APC systems (Boyle, 2008, p. 4). By applying the theoretical framework and evaluating the result, it was possible to tell whether applying the framework was successful. The framework was applied in a case study, and the factors that affected the result were identified and analysed. The framework aims to serve as a guide for how an organisation can make use of an APC system.

Chapter 4

Theoretical Framework

This chapter describes the theoretical framework, each section representing one phase in the framework. Every section is divided into two parts: one explaining the fundamentals of the phase, and the other focusing on the software system.

4.1 Business Understanding

The initial phase in the process of making use of an APC system is to understand the business objectives. This is a fundamental step that has to be performed before defining the objectives of the APC project. Information about the business and its primary objective has to be gathered and understood, in order to clarify what the business should accomplish with the APC project. The business requirements for the APC project might differ, but the essential point is that they all reflect the business objectives. When the business requirements are established it is possible to determine the objectives of the APC project. With clear objectives it is possible to get an overview of the resources needed for the project in terms of personnel, finances, data, computing resources, and software systems (Chapman et al., 2000, p. 13-19; 38). According to Gonzales-Aranda et al. (2008), the most critical factor in successfully extracting valuable information and providing it to an organisation is a clear understanding of the business.

4.1.1 Establish Objectives and Functionality of the Software System

When the project objectives are clear, they form the foundation on which the objectives and functionality of the software system need to be based. To determine the objectives and functionality it is necessary to clarify the system requirements. This work is divided into the following steps (Agarwal et al., 2008, p. 54-57):

• Requirement Gathering
The system requirements need to be gathered from the stakeholders. This has to be a communication process that involves both developers and stakeholders, since all must agree on and understand the requirements. The system requirements have to be in line with the business requirements (Krishnamurthy and Saran, 2008, p. 59-61).

• Requirement Analysis
All system requirements are analysed to confirm their relevance to the goal and objectives of the APC project. Requirements have to be prioritised and classified as either functional or non-functional. Functional requirements define the functionality and behaviour of the system. Non-functional requirements specify criteria regarding the properties of the system (Agarwal et al., 2008, p. 54-57).

• Requirement Documentation
The system requirements need to be documented both in natural language, without technical terms, and in a detailed technical specification, in which the user requirements are translated into system requirements in technical terms. This specification usually includes links between the requirements so that it is possible to follow their relations (Sommerville, 2011, p. 96).

• Requirement Review
Requirement review is a manual process in which both stakeholders and developers confirm that the documented requirements do not contain anomalies (Agarwal et al., 2008, p. 58-59).

• Requirement Management
After the requirements have been reviewed and agreed upon, it is important that they are managed in case of changes. Any changes to the requirements that are agreed upon and reviewed must be controlled (Agarwal et al., 2008, p. 58-59).

When a prototype system is preferable for the APC project, the prototype objectives and functionality need to reflect the system requirements that have been prioritised in the requirement analysis step. The prototype system is not supposed to fulfil all the system requirements (Sommerville, 2011, p. 45-46).

4.2 Data Understanding

The data understanding phase aims to give an understanding of all data involved in the APC project. Without insight into the data it is hard to know which knowledge can be extracted from it. It is important that the process is carried out in an interactive way, and in this phase the people concerned within the project need to interact with the data to better understand its properties and quality (Han and Kamber, 2006, p. 36). It is initially important to get an overview of the data and confirm that all data needed to satisfy the business requirements are represented. Before proceeding in the process, a deeper understanding of the quality and format of the data must be reached. The quality of the data must be verified, which includes awareness of completeness and errors (Chapman et al., 2000, p. 13-14; 21-22).

4.2.1 Design of the Software System

As the understanding of the data grows and the functionality of the prototype system has been determined, the work with the prototype system design can begin (Sommerville, 2011, p. 46). The system design describes the practical implementation and involves decisions about technologies, modules, and components for the system. The design of the system must cover all aspects related to the required functionality (Furth et al., 2006, p. 80). The IT environment in the organisation, with its existing systems and skills, has an impact on the options available for the system design (Krishnamurthy and Saran, 2008, p. 45; 94-95).

It is the understanding of the business and the requirements on the software system that will guide the design of the database. The database design clarifies the structure in which the data is going to be stored (Davidson, 2008, p. 3). A relational database is preferable since it is constructed to store data records in different tables and to link information that belongs together. Each table consists of a set of unique columns, also known as data attributes, that separate the different pieces of information within the data (Davidson, 2008, p. 7-11).
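As an illustration of the relational structure described above, the following minimal sketch stores counts and stops in two linked tables using Python's built-in sqlite3 module. The table names, column names, and sample rows are hypothetical and are not taken from Yarra Trams' actual database design.

```python
import sqlite3


def build_schema(conn):
    """Create two linked tables; all names here are illustrative assumptions."""
    conn.executescript(
        """
        CREATE TABLE stop (
            stop_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL
        );
        CREATE TABLE apc_count (
            count_id  INTEGER PRIMARY KEY,
            stop_id   INTEGER NOT NULL REFERENCES stop(stop_id),
            vehicle   TEXT NOT NULL,
            boarding  INTEGER NOT NULL,
            alighting INTEGER NOT NULL
        );
        """
    )


def boardings_per_stop(conn):
    """Join the two tables and sum the boardings for each stop."""
    rows = conn.execute(
        """
        SELECT s.name, SUM(c.boarding)
        FROM apc_count AS c
        JOIN stop AS s ON s.stop_id = c.stop_id
        GROUP BY s.stop_id, s.name
        ORDER BY s.stop_id
        """
    )
    return rows.fetchall()


# Made-up sample data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
build_schema(conn)
conn.executemany("INSERT INTO stop VALUES (?, ?)", [(1, "Stop A"), (2, "Stop B")])
conn.executemany(
    "INSERT INTO apc_count (stop_id, vehicle, boarding, alighting) VALUES (?, ?, ?, ?)",
    [(1, "T1", 5, 0), (2, "T1", 3, 4), (1, "T2", 2, 1)],
)
print(boardings_per_stop(conn))
```

Keeping stop information in its own table means that a stop name is stored once and referenced by each count record, which is the kind of binding between related information that the relational model provides.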

4.3 Data Preparation

Before the data can be modelled they must be of the right quality and format. To succeed with data preparation it is of great importance that the previous phase has given a complete picture of the data. In this phase the issues identified in the data understanding phase have to be resolved. This usually involves correcting data, integrating data from different sources, and transforming or reducing data. The data preparation is divided into the following steps (Chapman et al., 2000, p. 24-25) (Han and Kamber, 2006, p. 37; 48-51; 61-62; 67-72):

• Select Data
The data with identified issues have to be selected.

• Clean Data
The second step is to fix quality problems in the data, which involves different techniques and algorithms depending on the quality problem. Missing values can be handled by inserting a constant that represents an unknown value, by inserting the most probable value, or by inserting the mean of the existing values for that attribute.

• Construct Data
Depending on how the data are going to be modelled, it can be necessary to transform the data or complement them with new attributes. It can also be necessary to aggregate or normalise data.

• Integrate Data
In this step, datasets that need to be integrated are combined and merged into new datasets. As different departments within the organisation can describe items in different ways, and the same values can be stored in different datasets, issues like redundancy, inconsistency, and conflicts between values must be identified and resolved. It can be necessary to go back one step with the combined dataset.

• Format Data
The final step is to make sure that the data have the right format. This can include changing the order of the dataset or more syntactic changes such as removing characters or trimming data values.
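The three ways of handling missing values mentioned in the Clean Data step can be sketched as follows. This is only a minimal illustration using Python's standard library: missing values are represented by None, "most probable value" is interpreted here as the most frequent existing value, and the function names and the default constant are assumptions made for the example.

```python
from statistics import mean, mode


def fill_with_constant(values, constant=-1):
    """Replace missing values with a constant that marks them as unknown."""
    return [constant if v is None else v for v in values]


def fill_with_most_frequent(values):
    """Replace missing values with the most frequent existing value."""
    known = [v for v in values if v is not None]
    if not known:
        return list(values)  # nothing to infer from
    most_frequent = mode(known)
    return [most_frequent if v is None else v for v in values]


def fill_with_mean(values):
    """Replace missing values with the mean of the existing values."""
    known = [v for v in values if v is not None]
    if not known:
        return list(values)  # nothing to infer from
    average = mean(known)
    return [average if v is None else v for v in values]


# Made-up boarding counts where one registration is missing.
boardings = [4, None, 8]
print(fill_with_constant(boardings))
print(fill_with_mean(boardings))
```

Which strategy is appropriate depends on the quality problem: a constant keeps the gap visible to later steps, whereas the mean or the most frequent value hides it and can distort totals if many values are missing.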

4.3.1 Develop Software System for Data Preparation

Data preparation can be done manually, but as that is an inefficient and time-consuming method, a software system is essential for preparing vast amounts of data (Han and Kamber, 2006, p. 61-62). The software system is shaped by the system design created during the Data Understanding phase.

The selected data can be stored in one or more databases. Thereafter algorithms for both the cleaning and the construction steps can be developed and applied. In the integration step the software system can access, merge, and store data from different sources. When the dataset is stored, the reformatting can be done by applying algorithms (Han and Kamber, 2006, p. 37; 48-51; 61-62; 67-72).

4.4 Modelling

When all data are prepared it is possible to model the data. The purpose of modelling the data is either to describe or to predict situations. Algorithms and techniques for modelling the data are chosen based on the business objectives (Witten and Frank, 2005, p. 83). To get deeper insight into and knowledge about some phenomena, analytical methods can be applied to the data (Haluzov, 2008, p. 24).

4.4.1 Develop Software System for Modelling

Besides the software system used for preparing data, it is important to have a software system that can model and visualise the data for the organisation. The formulas for how to model the data are derived from the system requirements (Chapman et al., 2000, p. 27). When modelling the data it is important to consider representation, interaction, and integration. The user should be able to interact with the model and see it in context (Fayad et al., 2002, p. 212).

As the outcomes from the modelling usually support decision making, the software system must be accurate and reliable, and present the output in a clear and well-explained way. The software system should improve the users’ task performance and reduce their effort. This can sometimes mean that there has to be a trade-off between a system’s functionality and its usability. There must be a fit between the information needed and the information presented, which puts a high demand on communication between developers and users. The software system should be designed for an enjoyable and satisfying interaction and should represent the real world, so that it provides affordance to capture real-world knowledge. The richness of the visualisation, the interactivity, and the adaptability of the interface affect the users’ experience of the system (Te’eni, Carey and Zhang, 2007, p. 32; 197-201; 240-243).

The information gained from modelling the data is not in itself sufficient to obtain knowledge. Visualisation, the graphical communication of information, plays a crucial role in helping the users understand the information and gain knowledge (Fayad et al., 2002, p. 6; 23). Visualisation requires a well organised interface with ordered placement, regular patterns, consistent forms, and a balanced distribution of white space and elements on the screen. The idea is to make it easier for the user to understand unfamiliar and abstract things; otherwise it might be hard to get a complete picture of the entire dataset and understand the relations within it (Te’eni et al., 2007, p. 208) (Fayad et al., 2002, p. 88). It is important to present the information with the user in mind. The requirements and the perceptual capabilities of the user must be considered (Fayad et al., 2002, p. 21; 206). A good way to succeed with this is to involve the user in the development of the software system and design the interface in iterations (Sommerville, 2011, p. 45). Guidelines for the design of the user interface are presented below (Te’eni et al., 2007, p. 208; 258-287):

• Colour is effective when designing an interface, but it has to be used with care and restraint. The use of different colours is a good way to call attention to specific information, differentiate information types, and group similar items. Symbolic colours should be used in a way that reflects their meaning.

• Objects should be represented so the user can interact with them in a way that is related to reality.

• When it comes to data input it is effective to provide the user with predetermined values, but only if the values are known and few. One rule is to minimise the user’s effort to fill in data, but when the user needs to fill in data, the user should be guided by information about expected values.

• Graphs are powerful communicators for quantitative information to summarise data, show trends, show relationships in data, and show deviations. All this is useful to support decision making.


A common way to communicate the knowledge gained from the data to the organisation is to use reports. The reports can be designed and delivered by the software system and have to follow the same rules as the design of the user interface (Krishnamurthy and Saran, 2008, p. 259).

4.5 Evaluation

When the data have been modelled, the outcomes form the basis for the evaluation. The outcomes have to be compared with the business requirements and the purpose of the project (Chapman et al., 2000, p. 14). The outcomes have to be evaluated from a broader perspective and can only be fully understood if social, psychological, environmental, and technological systems are evaluated as a whole (Griffith and Dougherty, 2002, p. 205). After the outcomes have been understood and the entire process has been evaluated, the next step has to be decided. It can either be to deploy the results from the APC project or to move back in the process in order to improve the outcomes (Chapman et al., 2000, p. 60).

4.5.1 Evaluate Software System

To decide whether the prototype is going to be discarded or re-used for development of the real system, it has to be evaluated (Krishnamurthy and Saran, 2008, p. 134). The prototype evaluation should be done with the prototype objectives in mind and not the objectives for the whole APC project. It is important that the evaluation is given the time needed so that the users can learn how to use the system before feedback is given (Sommerville, 2011, p. 46). The feedback should be a guide for future design and development of the system (Te’eni et al., 2007, p. 135).

After the evaluation is done, the next step has to be decided. It can either be to start developing the real system or to go back in the process and continue with the prototype (Agarwal et al., 2008, p. 42).

4.6 Deployment

When a decision to deploy the outcomes from the project has been taken, they need to be implemented in the organisation. A strategy for the deployment must exist, and it is important that the knowledge extracted from the data becomes part of the day-to-day business (Chapman et al., 2000, p. 32-33). The deployment involves installing the real software system, transferring data from existing systems, establishing communication with other systems, and maintaining the software system. During this step it is also important to prepare user documentation and training (Sommerville, 2011, p. 281).

Chapter 5

Case Study

In this chapter the performed case study at Yarra Trams is described. The case study was performed and structured based on the phases in the theoretical framework. Each section in the chapter represents one phase of the framework and explains how the framework was applied.

5.1 Business Understanding

The Victorian State Government, through the Department of Transport (DoT), coordinates all public transport in the state of Victoria, whose capital city is Melbourne (The Department of Transport, 2011a). To promote public transport in Melbourne, there exists a partnership named Metlink between all tram, train, and bus operators in Melbourne. The responsibilities of the partnership are to provide customers with information and services, and to instigate research in order to improve public transport (Metlink, n.d.). The partnership has a joint solution for fare collection, where the same smart card can be used for paying on trams, trains, and buses.

All tram services in Melbourne’s tram network are today operated under the trading name Yarra Trams, owned by the Victorian State Government (Yarra Trams, n.d.). The operator of Yarra Trams is responsible for operation of trams, customer service, management of staff, and maintenance of vehicles, tracks, and stations. The current operator of Yarra Trams is KDR, a consortium consisting of and Downer EDI Rail. KDR has been contracted by the Government since 2009. The partnership is initially agreed for eight years, with the possibility of another seven years of extension (The Department of Transport, 2011b). The relations between the actors in Melbourne’s public transport are shown in Figure 5.1.

The motto of KDR is to “think like a passenger”. This means seeing things from the passenger’s perspective. Only with this approach can KDR offer the best service at every stage of a journey. KDR believes that informed passengers use the network more efficiently and strives to provide passengers with the information they need to make the right choices (KDR, 2011).

Figure 5.1: Actors within public transport in Melbourne.

5.1.1 The Automatic Passenger Counting System Project

Under the franchise agreement in 2009, Yarra Trams committed to the DoT to undertake a feasibility study to determine the viability of introducing Automatic Passenger Counting (APC) systems on trams. APC systems were already installed on 9 trams when the agreement was concluded. Yarra Trams has been through several APC trials during the last 30 years. Different technologies have been tested on single trams, but Yarra Trams has chosen not to go any further than the test phase, due to accuracy and technical issues. It was not until 2008 that APC technology was installed on multiple trams and left continuously in operation. The goal of the APC system is to establish more reliable data on passenger usage of the tram network to assist Yarra Trams’ organisation in service, planning, tram procurement, and rationalisation of tram stops. The data are also intended to provide Metlink and DoT with better and more accurate information about passenger usage of the trams in Melbourne (Yarra Trams, 2010, p. 2; 9-10). Currently there is a manual patronage survey twice a year that collects data about passenger usage of the tram network (Yarra Trams, 2009a). This information serves as a decision basis for the funding Yarra Trams receives from DoT for its operations (Couenon, 2010).

The objective of the APC project is to find out how to make use of the APC system. The stakeholders in the APC project within Yarra Trams are the Operations, Marketing, Maintenance, and IT departments. The Operations department is in charge of planning and analysis of the tram services, and management of traffic offences. The Marketing department is responsible for growing the patronage and revenue of Yarra Trams as well as improving the overall customer experience. The IT department is responsible for the development and maintenance of Yarra Trams’ information systems. The Maintenance department is responsible for maintaining the tram vehicles and the tram network. Whereas the Operations and Marketing departments are interested in the APC system for improving planning and services, the Maintenance and IT departments are responsible for the APC system on a hardware and software level (Yarra Trams, 2009a).

The following main business requirements were identified (Yarra Trams, 2009a):

• The Marketing department needs to provide management with quantitative information about the usage of the tram network.

• The Operations department needs information about the load on the tram vehicles as a base for changes in the timetable and fleet allocation.

• The IT department needs the APC system and its components to be compatible with the existing IT environment, both on a hardware and software level.

• The Maintenance department needs information about the status of the APC system.

To fulfil these requirements it is not possible to rely on data from the ticketing system, since it only shows how many passengers validate their ticket. Because of the smart card system, a passenger does not need to validate their ticket when changing from one vehicle to another. Another problem with using data from the ticketing system is that it does not contain information about the number of joyriders (Couenon, 2010).

5.1.2 The Automatic Passenger Counting System

Yarra Trams has a total of 487 tram vehicles in its fleet, which consists of 8 different types of tram vehicles (Yarra Trams, 2009b). Yarra Trams has tested and approved the APC system on 3 different types of tram vehicles, where the latest tram type was validated at the end of 2010. Before a tram type is approved, the data from the APC system must be validated. The type is approved only if the average accuracy is greater than 95%. Today a total of 10 trams equipped with APC systems are in operation, which represents 2% of the tram fleet. If the APC system were installed on all trams of the types that have been approved, it would cover 230 trams, 47% of Yarra Trams’ fleet (Couenon, 2010).
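The figures above can be reproduced with a few lines of arithmetic. The sketch below uses the fleet numbers and the 95% threshold from the text; how the average accuracy is actually measured in the validation survey is not specified here, so the approval check simply averages a list of accuracy values that is assumed to be given, and the function names are hypothetical.

```python
from statistics import mean

FLEET_SIZE = 487               # tram vehicles in the fleet
EQUIPPED_TRAMS = 10            # trams with an APC system in operation
TRAMS_OF_APPROVED_TYPES = 230  # trams of the approved tram types
ACCURACY_THRESHOLD = 0.95      # average accuracy must be greater than this


def fleet_share(trams, fleet_size=FLEET_SIZE):
    """Return the share of the fleet, in percent, that a number of trams represents."""
    return 100.0 * trams / fleet_size


def tram_type_approved(accuracies):
    """Approve a tram type only if the average accuracy exceeds the threshold.

    `accuracies` is an assumed input: one accuracy value (0..1) per validation trip.
    """
    return mean(accuracies) > ACCURACY_THRESHOLD


print(round(fleet_share(EQUIPPED_TRAMS)))           # about 2 percent of the fleet
print(round(fleet_share(TRAMS_OF_APPROVED_TYPES)))  # about 47 percent of the fleet
```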

The supplier of the APC systems is the French company Acorel. The selection of this company was based on Acorel’s previous successful experiences with APC systems on other tram networks (Yarra Trams, 2010, p. 12-13). Acorel supplies both hardware and software for APC systems, where the hardware is the physical counting system that is installed in each tram vehicle, and the software is a data analysis tool to be used in the office by the organisation. Yarra Trams has tested the software tool and chosen not to acquire it, because it is not suitable for the organisation (Couenon, 2010).

Yarra Trams’ information systems related to the APC system are shown in Figure 5.2 and described below.

Figure 5.2: Yarra Trams’ information systems related to the APC system.

• Automatic Passenger Counting (APC) System
The APC system consists of IR sensors built into the tram roof above each door, an onboard computer named UCD (Data Concentration Unit), and a central computer for the APC system at Yarra Trams’ office. Each IR sensor uses a combination of dual active and passive IR sensing. The active sensing consists of two IR light curtains whose light is reflected by the passengers as they board or alight and is registered by the sensor. The passive sensing is used to detect the thermal energy emitted by passengers (Yarra Trams, 2010, p. 40) (Acorel, 2010). The IR sensors are coupled with door switches and only register boarding and alighting passengers when the doors are open. Each sensor is connected to the UCD, where the registrations are converted and stored as counts. All counts are stamped with additional information from the AVL system, such as time, stop location, route, vehicle number, and service number. Once a day, the onboard data are transferred to the central computer at Yarra Trams’ office using General Packet Radio Service (GPRS) technology through a modem. When the central computer retrieves the data it generates Comma-Separated Values (CSV) files and stores them in a database (Yarra Trams, 2009a) (Couenon, 2010).

• Automatic Vehicle Monitoring (AVM) System
This system is used for monitoring and communication with trams in service. AVM collects realtime data about the trams from the AVL system and stores it in a database (Yarra Trams, 2009a).

• Automatic Vehicle Location (AVL) System
This is a positioning system that provides information about a tram’s position. The position is based on coordinates from the Global Positioning System (GPS) and a distance measured by the odometer on the tram (Yarra Trams, 2009a). When the tram is surrounded by tall buildings it can be problematic to get the correct GPS coordinates. The AVL system then uses the odometer to estimate the position based on how far the tram has travelled. The AVL system receives information about which service a tram operates from the AVM system (Yarra Trams, 2009c) (Couenon, 2010).

• Hastus
This is a system used for scheduling of tram services. Hastus gets offline feedback from the AVM system on how well the actual services matched the scheduled services (Couenon, 2010).

• Tram Tracker
This is a prediction system that predicts when a tram is going to serve a specific stop. The predictions are based on realtime data from AVM and offline scheduled data from Hastus. Each stop in the tram network has a unique number, and these numbers are stored in a database (Yarra Trams, 2009a).

• Maximo
This is an asset management system used to keep track of information about a tram vehicle. The system gets information from the AVM system, which can include the status of a vehicle. Based on this, the system generates work orders for maintenance (Couenon, 2010).
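The fallback behaviour described for the AVL system, where the odometer is used when GPS coordinates are unreliable, can be sketched as a simple rule. This is a simplified illustration and not the AVL system's actual algorithm: positions are reduced to a distance along the route in metres, a missing GPS fix is represented by None, and all names are hypothetical.

```python
def estimate_position(gps_distance_m, last_known_m, odometer_delta_m):
    """Estimate a tram's position as a distance along its route.

    gps_distance_m:   position derived from GPS, or None if no reliable fix
    last_known_m:     last position that was considered reliable
    odometer_delta_m: distance travelled since that position, from the odometer
    """
    if gps_distance_m is not None:
        return gps_distance_m
    # No reliable GPS fix (e.g. between tall buildings): fall back on how far
    # the tram has travelled since the last known position.
    return last_known_m + odometer_delta_m


print(estimate_position(None, 1200.0, 150.0))
print(estimate_position(1400.0, 1200.0, 150.0))
```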

5.1.3 Software System for APC Data

To make use of the data collected by the APC system, Yarra Trams needs a software system (Couenon, 2010). The different needs for a software system that were agreed on after communication between stakeholders, developers, and the project manager are summarised below. The requirements that were prioritised for the prototype system by the project manager are marked in bold (Yarra Trams, 2009a) (Chabas, 2010) (Holmes, 2010) (Couenon, 2010) (McRobbie, 2010). General functionality is prioritised over specific functionality, and functionality that is less resource demanding and has high relevance for the business is also prioritised. For the complete document of the software requirements that were confirmed, classified, and prioritised after being analysed, see Appendix A.

• System requirements from both Marketing and Operations:

– Provide information about which tram stops serve the most passengers.
– Provide information extraction for alighting, boarding, and passenger load, by filtering of date, time, route, stop, vehicle, and run/trip.
– Provide visualisation in the form of graphs of the extracted information.
– Provide a comparison between the actual load of a route and the load capacity for that route.

• System requirements from both Marketing and Maintenance:

– Provide the quality of the data to see accuracy and track the status of the APC system.

• System requirements from Marketing:

– Provide extracted information to be exported into different formats.
– Provide information in reports with data and graphs, generated by the software tool.
– Provide the coverage of the actual data sample compared to the total data sample.

• System requirements from Operations:

– Provide a forecast of the future passenger load on the tram network.
– Provide information about service punctuality according to the scheduled time.

• System requirements from IT:

– Be accessible for all users through a web based user interface.
– Be developed on the Microsoft .NET platform.

5.2 Data Understanding

The tram network that Yarra Trams operates consists of almost 1800 unique stops, which are served by 28 routes. All routes have one up and one down direction with different stops. Yarra Trams plans the tram services in “runs”, where one run represents the schedule for one tram. The run schedule declares at which times the tram is in service and which route it operates. One run consists of several trips, where one trip represents a tram’s journey from a start terminus to an end terminus of a route (Couenon, 2010). The relation between run and trips is illustrated in Figure 5.3.

Figure 5.3: One run consisting of six trips taken by one tram.

APC systems approved by Yarra Trams count passengers with a maximum error of ±5% for each sensor per stop. This means that the total error at one stop can exceed ±5% when the data from all sensors are summed (Yarra Trams, 2009a). Only when the average error is calculated over a longer time period is the total error within ±5% per stop (Yarra Trams, 2010).

The validation of the APC system that was conducted at the end of 2010 for a new tram type compared the counts from a manual survey with the counts from the APC system installed on that tram vehicle. The validation showed that the total error of the installed APC system at stop level ranged from -7% to +16%, and that the average error for the whole validation survey was +3%. This means that the APC system is approved for the tram type, as the average error is within ±5%, but a statistical correction must still be applied before the data are useful (Yarra Trams, 2011).

Table 5.1 shows a fictive example of APC data merged with AVL data, representing one trip taken by one tram. The data in the selection represents some of the identified issues, which are described further down. Explanation of the headers in Table 5.1 can be found in Table 5.2.

Vehicle  Time     Stop  Ins  Outs  Route  Stop Type  Direction  Run
002122   9:04:48  0001    0     1  055    1          0          E007
002122   9:05:19  9999    2     1  055    0          2          E007
002122   9:07:42  0003    0     1  055    0          0          E007
002122   9:09:50  0004    2     0  055    0          0          E007
002122   9:10:19  0004    0     1  055    0          0          E007
002122   9:10:58  0004    0     0  055    0          0          E007
002122   9:12:20  0006    1     0  0      0          0          E007
002122   9:13:05  0007    0     1  055    0          0          0
002122   9:13:40  0008    0     1  055    0          0          E007
002122   9:17:02  0009    3     1  055    0          0          E007
002122   9:18:51  0010    1     0  067    0          0          E007
002122   9:21:08  0011    0     1  055    0          0          E007
002122   9:22:16  0012    0     1  055    0          0          E007
002122   9:22:51  0013    8     6  055    1          1          E007
Total:                   17    15

Table 5.1: Fictive APC data sample with errors.

Name       Description
Vehicle    Tram identification number.
Time       Time of arrival.
Stop       Stop identification number.
Ins        Number of passengers boarding the tram.
Outs       Number of passengers alighting the tram.
Route      Route identification number.
Stop Type  Indicates whether it is a normal stop or a terminus stop. Stop Type = 0 is a normal stop and Stop Type = 1 is a terminus stop.
Direction  Gives the direction of the trip. Direction = 0 is the up direction and Direction = 1 is the down direction.
Run        Run identification number.

Table 5.2: Explanation of the headers in Table 5.1.

The quality issues illustrated in Table 5.1 are described below:

• The sensors may fail to register passengers at stops. The sensors may also count passengers several times if they are standing in the doorway when the doors are opened. This results in a difference between the total Ins and Outs at the end of a trip.

• The APC system may not recognise at which stop the tram has stopped; this results in Stop = 9999.

• The APC system may not recognise which route it is serving; this results in Route = 0.

• The APC system may not recognise which route it is serving and assume that it is serving a different route than it actually is; this results in an incorrect route number.

• The APC system may not recognise the ongoing direction of the trip; this results in Direction = 2.

• The APC system may not recognise which run it is operating; this results in Run = 0.

• The APC system creates one data record for each time the doors are opened; this may result in multiple records per stop.
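The sentinel values listed above lend themselves to an automatic check. The following is a minimal sketch, not the thesis implementation: the record type `ApcRecord`, the functions `flag_issues` and `count_imbalance`, and the idea of passing in an `expected_route` (for example taken from the schedule) are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Sentinel values as described in the list above.
UNKNOWN_STOP = "9999"
UNKNOWN_ROUTE = "0"
UNKNOWN_DIRECTION = 2
UNKNOWN_RUN = "0"


@dataclass
class ApcRecord:
    """Hypothetical record type holding a subset of the Table 5.1 columns."""
    stop: str
    ins: int
    outs: int
    route: str
    direction: int
    run: str


def flag_issues(records, expected_route):
    """Return (index, issue) pairs for the quality issues visible in the records."""
    issues = []
    for i, rec in enumerate(records):
        if rec.stop == UNKNOWN_STOP:
            issues.append((i, "unknown stop"))
        if rec.route == UNKNOWN_ROUTE:
            issues.append((i, "unknown route"))
        elif rec.route != expected_route:
            issues.append((i, "unexpected route"))
        if rec.direction == UNKNOWN_DIRECTION:
            issues.append((i, "unknown direction"))
        if rec.run == UNKNOWN_RUN:
            issues.append((i, "unknown run"))
        # Several door openings at one stop produce consecutive records.
        if i > 0 and rec.stop != UNKNOWN_STOP and rec.stop == records[i - 1].stop:
            issues.append((i, "repeated stop"))
    return issues


def count_imbalance(records):
    """Total Ins minus total Outs; non-zero hints at missed or double counts."""
    return sum(r.ins for r in records) - sum(r.outs for r in records)


# A few fictitious rows in the style of Table 5.1.
sample = [
    ApcRecord("0001", 0, 1, "055", 0, "E007"),
    ApcRecord("9999", 2, 1, "055", 2, "E007"),
    ApcRecord("0004", 2, 0, "055", 0, "E007"),
    ApcRecord("0004", 0, 1, "055", 0, "E007"),
    ApcRecord("0006", 1, 0, "0", 0, "E007"),
    ApcRecord("0007", 0, 1, "055", 0, "0"),
    ApcRecord("0010", 1, 0, "067", 0, "E007"),
]
```

Calling `flag_issues(sample, "055")` lists each problem together with the index of the affected record, so that later preparation steps can decide whether to repair or discard it.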

The following lists describe identified quality issues that cannot be seen in Table 5.1.

The following factors have been identified as resulting in incorrect data:

• A malfunction of the doors may prevent the APC system from knowing that the doors are closed. The result is that the APC system continues to count movements near the doors while the tram is moving.

• The APC system does not always have up-to-date information about which stops belong to a route; this results in data records with an incorrect stop code.

The following have been identified as factors that affect the completeness of the APC data:

• The APC system only registers stops where the doors are opened. Stops that the tram did not serve are therefore missing from the data records, which leads to an incomplete history of the tram’s journey.

• The APC system only registers the stop code for the end terminus and not for the start terminus of a route, as the end terminus in one direction is at the same geographical location as the start terminus for that route in the opposite direction. The end and start terminus have two different stop codes, which leads to a missing start terminus.

• The APC system does not get any information about the trip number from the AVL system, which makes it problematic to divide the tram’s activity into trips.

To fulfil the system requirements and objectives of the APC project, more data needs to be collected from other sources and combined with the APC data. The following information is identified as missing:

• All the stop identification numbers on the network with corresponding name and locations.

• Complete information about routes, served by the particular tram, with corresponding names and stops for both directions.

• Tram vehicle information including vehicle type and passenger capacity for all the trams that have an APC system installed.

• Complete schedules for the runs that are operated by the tram.

• Trip number for all trips that have been served by the tram.

The following discrepancies are identified in data from the APC, AVM, and Tram Tracker systems:

• The run number is stored in the AVM system and the Tram Tracker system as “E-7”, but the APC system stores it as “E007”.

• Yarra Trams has two different stop codes for all stops, one old and one new. The AVM system uses the old, the APC system uses the new, and the Tram Tracker system uses both the old and new.

5.2.1 Design of the New Software System

The overall design of the new prototype software system is shown in Figure 5.4. It consists of two modules, one for data preparation and one for data modelling. The platform for the system is Microsoft .NET, and the different technologies and programming languages for each module are shown in Figure 5.4.

Figure 5.4: Prototype software system design.

The data preparation module consists of two different databases, one for storage of raw data integrated from different sources, and one for storage of prepared data. The database for prepared data is a relational database, where data from different data sources are stored to keep track of the relations within the data.

The data modelling module is based on the Model-View-Controller (MVC) architecture. By using the MVC architecture the system is structured and developed in three separate layers: model, view, and controller. The model represents the domain for the system and holds all the business logic, such as items, operations, and rules that are meaningful for the application. The model is the connection to the data sources. The view is the user interface where the information from the model is presented. The controller layer is the link between the model and the view; it holds all the application logic, such as the processes that run when the user makes a request. With the MVC architecture it is possible to modify one layer without affecting the other layers, which encourages iterative development (Sanderson, 2008, pp. 56-58).
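The separation of the three layers can be illustrated with a very small sketch. It is written in Python for brevity and is not the prototype's C# code; the names `StopModel`, `render_table`, and `TopStopsController`, as well as the request format, are hypothetical.

```python
class StopModel:
    """Model: holds the data and the rules that are meaningful for the domain."""

    def __init__(self, boardings):
        self._boardings = dict(boardings)  # stop name -> number of boardings

    def top_stops(self, n):
        """Return the n stops with the most boardings, busiest first."""
        ranked = sorted(self._boardings.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:n]


def render_table(rows):
    """View: presents model output; it knows nothing about where the data came from."""
    return "\n".join(f"{name}: {count}" for name, count in rows)


class TopStopsController:
    """Controller: interprets the user's request and connects model and view."""

    def __init__(self, model, view):
        self._model = model
        self._view = view

    def handle(self, request):
        n = int(request.get("n", 10))  # user input arrives as text
        return self._view(self._model.top_stops(n))


controller = TopStopsController(StopModel({"A": 5, "B": 9, "C": 1}), render_table)
```

Because the controller only depends on the `top_stops` method and on a callable view, either layer could be replaced (for example by a chart view) without touching the other, which is the property the paragraph above attributes to MVC.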

The platforms and tools for the development are presented below:

• Microsoft SQL Server is a platform for data management and analysis (Microsoft, 2011a).

• SQL Server Reporting Services (SSRS) is a tool for creating and managing reports (Microsoft, 2011c).

• SQL Server Integration Services (SSIS) is a platform for integration and transformation of data (Microsoft, 2011b).

• Microsoft Visual Studio (MSVS) is an integrated development environment (Microsoft, 2010).

5.3 Data Preparation

The data preparation module is designed to extract data from both the APC system and the Tram Tracker system, prepare it, and then store it in the database for prepared data. The identified missing data needed for the software system are secured by the integration with the Tram Tracker system. One problematic piece of information that remains unsolved is the trip number. The trip numbers that the Tram Tracker system contains have been stored for scheduled services and not for the individual trams that actually performed a service. Because of that, it is not possible to use those trip numbers in this solution, and the trip number that is added during the preparation is a unique number generated by the database. The discrepancy in the run number between the Tram Tracker system (“E-7”) and the APC system (“E007”) is solved by replacing the zeros in the middle of the run number from the APC system with a dash, so that both sources use the same format.
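The run number formats “E007” and “E-7” can be reconciled with a small conversion function. The sketch below is an illustration in Python rather than the Transact-SQL used in the prototype; the function names, the assumed pattern (letters followed by a zero-padded number), and the padding width of three digits are assumptions derived only from the single example given in the text.

```python
import re

# Assumed pattern: one or more letters, optional zero padding, then the number.
_PADDED_RUN = re.compile(r"^([A-Za-z]+)0*(\d+)$")


def padded_run_to_dashed(run):
    """Convert a zero-padded run number such as 'E007' to the dashed form 'E-7'.

    Returns None for values that do not match the assumed pattern,
    for example the sentinel '0' used when the run is unknown.
    """
    match = _PADDED_RUN.match(run)
    if match is None:
        return None
    prefix, number = match.groups()
    return f"{prefix}-{number}"


def dashed_run_to_padded(run, width=3):
    """Convert a dashed run number such as 'E-7' to the padded form 'E007'."""
    prefix, _, number = run.partition("-")
    if not prefix.isalpha() or not number.isdigit():
        return None
    return f"{prefix}{int(number):0{width}d}"
```

Normalising one source to the other's format before joining makes the run number usable as a join key; which direction is chosen matters less than applying it consistently.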

The solution for data preparation is built in SQL Server and SQL Server Integration Services (SSIS), and uses Transact-SQL.

The data preparation module performs the following steps:

1. Extract data from the Tram Tracker database and store it in the database for prepared data.

2. Extract APC data from the CSV files that the APC system produces and store it in the database for raw data.

3. Extract APC data from the database with raw data, and create data sets that represent complete trips for the trams and store them in the database for prepared data. The trips are created by dividing the data from one tram into different trips by adding a trip number to all the stops the tram made between a start terminus and end terminus on a route. To divide the data into trips, the trips first have to be identified. Trips are identified by the following steps:

(a) Identify the start terminus of a trip and retrieve the route information. Store the stop in a new data set, with the stop code for the start terminus.
(b) Loop through the data records in sequential order and check whether the stop record:
    • matches the next stop on the route – then add it to the data set.
    • is the same as the previous stop – then merge the stops into one stop.
    • is later on than the next stop on the route – then add all the stops that are missing before adding the stop.
    • has 9999 as stop code – then add the stop code for the next stop according to the sequential order of the route.
    • has a stop code that does not belong to the route – then discard the data set and search for a start terminus, go to step a.
    • is the end terminus for the route – then add it to the data set and add the trip number for all added stops.
(c) After the trip has been identified, store it in the database for prepared data and restart the loop. As the end terminus also is the start terminus for a new trip, go to step b.

4. Correct errors that can occur when the APC system counts passengers boarding and alighting a tram. This is done by assuming that the load of the tram should be zero at the end of a trip. If the load of the tram is not zero at the end terminus, the APC system has failed to count some boarding or alighting passengers. The missed counts are therefore distributed to the stops that the tram served. The correction is done in the following way:

(a) Remove the alighting passengers from the start terminus of the trip, as passengers alighting the tram at the start terminus belong to the previous trip.
(b) Remove the boarding passengers from the end terminus of the trip, as passengers boarding the tram at the end terminus belong to the next trip.
(c) Calculate the total number of passengers on the trip, which corresponds to the mean value of counted boarding passengers and alighting passengers.
(d) Correct the counted boarding passengers by distributing the difference between the total number of passengers and the sum of all counted boarding passengers to individual stops. Stops with more boarding passengers are more strongly corrected.
(e) Correct the counted alighting passengers by distributing the difference between the total number of passengers and the sum of all counted alighting passengers to individual stops. Stops with more alighting passengers are more strongly corrected.
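The trip identification in step 3 can be sketched as follows. This is a simplified illustration in Python, not the Transact-SQL of the prototype, and it rests on several assumptions: records are `(stop, ins, outs)` tuples for one tram, `route` is the ordered list of stop codes for a single direction with each code occurring once, and the start terminus code is present in the records. Unlike step (c), the sketch does not continue directly into the opposite direction; it simply searches for the next start terminus. The name `identify_trips` is hypothetical.

```python
UNKNOWN_STOP = "9999"


def identify_trips(records, route):
    """Split (stop, ins, outs) records into complete trips along `route`."""
    trips = []
    current = None  # stops of the trip being built; None while searching
    for stop, ins, outs in records:
        if current is None:
            # (a) search for the start terminus
            if stop == route[0]:
                current = [[stop, ins, outs]]
            continue
        last_stop = current[-1][0]
        pos = route.index(last_stop)
        if stop == UNKNOWN_STOP:
            # unknown stop: assume the next stop in route order
            stop = route[pos + 1]
        if stop == last_stop:
            # doors opened again at the same stop: merge the counts
            current[-1][1] += ins
            current[-1][2] += outs
        elif stop in route[pos + 1:]:
            # add stops that were passed without opening the doors
            for missing in route[pos + 1:route.index(stop)]:
                current.append([missing, 0, 0])
            current.append([stop, ins, outs])
        else:
            # stop does not fit the route: discard and search again
            current = [[stop, ins, outs]] if stop == route[0] else None
            continue
        if current[-1][0] == route[-1]:
            # end terminus reached: the trip is complete
            trips.append(current)
            current = None
    return trips


route = ["0001", "0002", "0003", "0004", "0005", "0006"]
records = [
    ("0001", 3, 0),
    ("9999", 2, 1),  # becomes 0002
    ("0004", 1, 2),  # 0003 is filled in with zero counts
    ("0004", 0, 1),  # merged with the previous record
    ("0006", 0, 3),  # 0005 is filled in; end terminus closes the trip
]
trips = identify_trips(records, route)
```

In the prototype a trip number generated by the database would then be attached to every stop of each returned trip; here the position of the trip in the returned list plays that role.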

After the data have been prepared in these steps, on average 40% of the APC data are useful for modelling.
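The count correction in step 4 can be illustrated with a short sketch. The text only states that stops with more passengers are corrected more strongly; the sketch assumes a strictly proportional distribution, which is one possible reading, and it leaves the corrected values as fractions because no rounding rule is given. The function name `correct_counts` is hypothetical.

```python
def correct_counts(ins, outs):
    """Balance boardings and alightings of one trip (lists ordered by stop).

    Returns two lists of corrected counts whose sums are equal.
    """
    ins = list(ins)
    outs = list(outs)
    outs[0] = 0   # (a) alightings at the start terminus belong to the previous trip
    ins[-1] = 0   # (b) boardings at the end terminus belong to the next trip
    total = (sum(ins) + sum(outs)) / 2  # (c) mean of counted boardings and alightings

    def rescale(counts):
        counted = sum(counts)
        if counted == 0:
            # nothing to distribute proportionally; leave the counts unchanged
            return [float(c) for c in counts]
        # (d)/(e) proportional distribution: larger counts receive larger corrections
        return [c * total / counted for c in counts]

    return rescale(ins), rescale(outs)


new_ins, new_outs = correct_counts([4, 2, 0], [1, 1, 3])
```

In this example six boardings and four alightings remain after steps (a) and (b), so the trip total is taken as five; the boardings are scaled down and the alightings scaled up until both sum to five, and the load at the end terminus becomes zero.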

5.4 Modelling

Once the data have been prepared and stored in the database for prepared data, the data can be modelled. The modelling module is a web-application built on the MVC architecture and the .NET platform. It can access the prepared APC data and provide information to different users.

The interface of the modelling module is where the interaction between the user and the software system takes place. In terms of the MVC architecture of the modelling module, the View presents data from the Model on the interface. The Controllers mediate between the Model and the View. The Controllers handle the inputs from the user through the interface and instruct the Model to perform tasks based on those inputs. The Controllers receive the outcomes from the Model and hand them over to the View, which presents them to the user on the interface.

The View is built on the following languages and libraries: JavaScript, CSS, HTML, AJAX, jQuery, and SSRS. The Model and the Controllers are built in the C# language.

The interface of the modelling module consists of three different parts:

• Home This is the first page the user sees. It aims to give the user an overall picture of the usage of the tram network and present key information.

• View Data This page provides the user with the ability to request information through various filtering options.

• Reports This page enables the user to create and view reports.

5.4.1 Home

Figures 5.5 and 5.6 show screenshots of the start page (Home). The Home page presents a snapshot of the ten top stops across the tram network, where the stops are presented in a data table and visualised with a bar chart and on a map.

Figure 5.5: Screenshot of Home showing the distribution of the passengers on the ten top stops.

Figure 5.6: Screenshot of Home displaying the location of the ten top stops on a map.

5.4.2 View Data

In View Data, the user has the ability to filter by route, stop, date, and time. The filtering options work from left to right: when one or more routes are chosen, the stops belonging to those routes are shown in the next field, see Figure 5.7. After stops, date (see Figure 5.8), and time segment are chosen, the submit button to the right submits the request.

Figure 5.7: Screenshot of View Data showing route filtering options.

Figure 5.8: Screenshot of View Data showing date options.

The result from the filtering can be viewed in four different tabs.

• The first tab presents the data in full, which means that if a tram has served a stop twice during a run it is presented twice.

• The second tab, see Figure 5.9, is similar to the first but presents the aggregated data, which means that if a tram has served a stop twice the records are merged and presented as a total.

• The third tab presents the data in bar charts, see Figure 5.10. The user can interact with the bar charts and by clicking on a bar, additional information is shown.

• The fourth tab presents the data visualised on a map, where the stops are coloured differently depending on activity (total ins + total outs), see Figures 5.11 and 5.12. The user can interact with the map by zooming, navigating, changing theme, and clicking on stops to see additional information.

Figure 5.9: Screenshot of View Data showing the data after filtering.

Figure 5.10: Screenshot of View Data showing ins and outs for filtered stops in a bar chart.

Figure 5.11: Screenshot of View Data showing filtered stops on a map with different colours indicating the difference of passengers.

Figure 5.12: Screenshot of View Data with detailed information about a stop, which appears when a user has clicked on a stop.

5.4.3 Reports

The Reports page allows the user to create reports. The available reports are: ten top stops, three top routes (see Figure 5.13), total ins on a route (see Figure 5.14), and load of a route. After creating a report, the user can view it on the interface. It is also possible to export the report into different formats and save it locally on the user’s computer, see Figures 5.15 and 5.16. Both tables and graphs are used to present information. Some of the reports are created from input given by the user. The expected format of the input is shown beside the input field.

Figure 5.13: Screenshot of Reports showing a report over the 3 top routes.

Figure 5.14: Screenshot of Reports showing how the user can create a report over total ins of a route by choosing a route and direction.

Figure 5.15: Screenshot of Reports showing how a user can export a report.

Figure 5.16: Screenshot showing a report that has been exported into Excel. The report is showing the load of a route.

5.5 Evaluation

The objective of the APC project was to find out how to make use of the APC system: to gain insight into the potential of the APC system and to understand how it can be used within Yarra Trams. Moreover, it was to identify and resolve the gaps and constraints that the APC system entails before it is usable.

The outcomes of the project show the potential of the APC system and illustrate the different possibilities when using the APC system. Some of the gaps and constraints that the APC system causes have been identified and resolved (Couenon, 2010).

The data from the APC system are integrated with data from the Tram Tracker system, which also includes data from the AVL system. The data are corrected and preprocessed in the preparation module, and the modelling module models the data and presents information to the users of the software system, which makes it possible to use the data from the APC system (Couenon, 2010).

The next step in the APC project has several dimensions. One is to continue experimenting with modelling the data in the software system. This will continue to show the potential outcomes of the APC system and give Yarra Trams further understanding of the data and constraints. Yarra Trams is also going to install APC systems on more trams so that 20% of their total tram fleet is equipped (Couenon, 2010).

5.5.1 The Prototype Software System

The prototype system for making use of Yarra Trams’ APC data is seen as a success for several reasons. The first reason is that it integrates the APC system and the Tram Tracker system and creates usable data sets. The second reason is that the software tool demonstrates the different uses for the APC data in a good way. The last reason is that the system is developed on the .NET platform, which is familiar to Yarra Trams and gives them the flexibility that they requested (Couenon, 2010).

The prototype system fulfils 70% of the prioritised requirements that were gathered from the stakeholders, see Appendix A. Even though not all functionality was implemented because of the time frame for development, it was important that Yarra Trams got a system that the IT department can now maintain and continue to develop. The risk of not developing the system in the existing .NET environment would have been to end up with a system that no one would have maintained or continued using and developing. It is now possible to re-use the prototype and continue to develop it into the fully functional system that the organisation needs. This is planned to be achieved by the following steps (Couenon, 2010):

1. Continue to test and validate the output information from the prototype system to identify and fix limitations and quality issues with the system before the information is used.

2. Introduce the system to the users/stakeholders so they can get an understanding for the potential of the APC system, and so they can be part of the evaluation process.

3. Resolve hardware issues in terms of server allocation and storage capacity to ensure the serviceability of the system before it can be evaluated by the whole organisation.

4. Evaluate the system by introducing it into the day-to-day operations in the organisation.

5. Start developing new functions and reports from feedback that is received from the evaluation.

6. Explore more complex analysis techniques for new functionality in the system when the organisation has accepted the system and is using it as a natural part of the day-to-day operations.

7. Set up communications towards external partners that also are interested in the APC data that Yarra Trams collects.

The prototype system that has been delivered is flexible and easy to manage for Yarra Trams. The software system will only be used as a real system when Yarra Trams can be sure that it delivers the expected quality (Couenon, 2010).

Chapter 6

Analysis

6.1 Business Understanding

The main purpose of Yarra Trams’ business is to provide public transportation in Melbourne, which entails a range of different responsibilities. To be able to offer the passengers the best service at every stage of a journey, it is important that Yarra Trams understand how the passengers use their services, which they strive for by trying to see the tram network from the passengers’ point of view. The ongoing work and projects within Yarra Trams must be in line with the objectives of the business. The goal of introducing Automatic Passenger Counting (APC) systems on the trams is in line with the objectives of Yarra Trams’ business. The goal of the APC project is to establish more reliable data regarding the number of passengers travelling on the tram network. These data could be transformed into information describing how the passengers are using the services and give Yarra Trams knowledge about their customers’ needs. This could be useful to improve services and planning, and to meet the responsibilities that come with operating the tram network in Melbourne. Before the APC systems can be implemented, integrated, and used by the organisation, it is important to undertake a feasibility study. The ongoing feasibility study at Yarra Trams aims to determine the viability of introducing APC systems on the trams, which, together with the purpose of using the APC systems, is in line with the objectives of Yarra Trams’ business. This thesis project is closely tied to the feasibility study as it investigates how an organisation can make use of an APC system. The result of this thesis aims to support Yarra Trams in how to make use of the APC systems, so they can get more reliable data on how the tram network is used. Therefore, this thesis project is also in line with the objectives of Yarra Trams’ business.

The fact that passengers do not necessarily need to validate their smart card ticket when they change from another vehicle (tram, bus, train), together with the fact that joyriders exist, makes it difficult to rely on data from the ticket system to get information about the number of passengers on a tram. It is possible to obtain more reliable data from the APC system regarding the number of passengers on a tram. By comparing data from the APC system with data from the ticket system, Yarra Trams could get information about how many passengers validate their tickets, which could be an indicator of the number of joyriders or people boarding from a connecting vehicle. This information would bring more knowledge about how the passengers are using Yarra Trams’ services, and by understanding the decisions taken by the passengers, Yarra Trams will approach their motto: think like a passenger.

The APC feasibility study involves more actors than just Yarra Trams, and it is important to understand which commitments and relations Yarra Trams have with those actors. Information about the usage of the tram network is the decision basis for the funding Yarra Trams gets from DoT. This information is today established from the manual surveys, performed twice a year, but it could be extracted and established more often from the APC systems. This relation to DoT, and the exchange of information, means that DoT can place high demands on the reliability of the information that Yarra Trams provides based on the APC systems. Yarra Trams also has a relation with the actors involved in the partnership Metlink, which means that Yarra Trams, depending on the commitments, needs to provide Metlink with information based on data from the APC systems.

Based on the objectives of this thesis project, most of the resources for the project were identified. The stakeholders for this thesis project were identified as the Marketing, Operations, IT, and Maintenance departments at Yarra Trams. They were identified on the basis of their function within Yarra Trams and their relevance to the APC feasibility study. Even if actors outside the Yarra Trams organisation are indirectly involved in the APC project, it is actors within the Yarra Trams organisation that will directly use information based on data from the APC system. Therefore, it is reasonable to involve actors from the Yarra Trams organisation from the beginning but still consider how the other actors may affect or be affected by the APC project. A main resource in the APC project is data, which from the beginning was only identified as data from the APC system. It was not until an understanding of the APC data was achieved that it became clear that data from more systems were required. It was not sufficient to just understand the business and its objectives to be able to identify all resources needed.

The decision to develop an in-house software system to handle all data was taken to get a software system suitable for Yarra Trams. The software system provided by Acorel did not fulfil Yarra Trams’ requirements, and was therefore not acquired. With their own software system Yarra Trams get more freedom in terms of system management. Since the software system is part of a feasibility study, it is good to have an in-house software system so Yarra Trams can more easily experiment with different functionalities and find out the potential of the APC system.

The main requirements from the Marketing, Operations, Maintenance, and IT departments are in line with the business objectives, and are therefore a suitable base for gathering the more specific software system requirements. The software system requirements, see Appendix A, gathered and confirmed through a communication process between the stakeholders, project manager, and system developers, can directly or indirectly be tied to the business requirements. The software requirements reflect the objective to get more reliable data that can describe the usage of the tram network, and give Yarra Trams the possibility to experiment with how data from the APC system could be used and presented.

The software requirements are prioritised by how many of the user requirements could be satisfied by a function, which reflects the relevance of the requirements to the objective of the project, and by the resources they require in terms of data and time. This way of prioritising increases the possibility of fulfilling user requirements and reduces the risk that functional requirements that are less relevant to the goal of the project are developed before requirements that are more relevant. The non-functional requirement, to develop the software system on the Microsoft .NET platform, is well prioritised. Development of the software system on another platform than the other systems at Yarra Trams could have resulted in integration problems, both from a technical and a social aspect. From a technical aspect, it could have been hard to integrate different systems because of incompatibility. From a social aspect, it could have been hard to use and maintain the software system because of lack of knowledge in the technical environment. The software system was integrated with Yarra Trams’ existing systems and the IT department can now continue the development since they have the in-house skills they need in the technical environment. Documentation of requirements has been used as a basis when reviewing and managing the requirements. This was only done by the system developers and the project manager but should have involved the stakeholders to make sure that the requirements did not contain any anomalies.

6.2 Data Understanding

The knowledge that was gained during the business understanding phase contributed to an understanding of Yarra Trams’ business objectives, Yarra Trams’ organisation and relevant actors, the tram network, and the information systems. This knowledge made it easier to understand which data sources were needed for the project, and implies that the data understanding phase needs to be preceded by the business understanding phase.

Outcomes from the APC system in terms of data can differ depending on how the data are collected, which implies that it is necessary to validate the APC system and the outcomes before making use of the data. New conditions can affect the APC system even after the system has been approved in a validation. The organisation needs to be aware of this and it is important that they are alert to anomalies in the data. Each time the conditions are changed, some kind of validation needs to be performed to be able to understand how the new conditions affect the data. To understand the involved data and which information can be extracted from the data, it is important to understand how the data are collected. The validation that was performed in this project, as a manual survey, can be seen as an interactive approach to get a better understanding of how the data are collected. The participation contributed to a better understanding of the problems with the APC system and the errors in the data. The survey was also a good complement for getting a better understanding of the data and the quality issues.

Since the prototype software system is designed as a module-based solution, there is a clear distinction between the data that are presented to the user and the data that contain errors. The design of the prototype software system has in several ways opened up for Yarra Trams to continue the development. The software system was built on the .NET framework to be compatible with the existing information systems and to fit the existing development skills. The design of the modelling module is based on the MVC structure, which aims to make it easier to maintain the software system and add new functionality to it. The database is designed as a relational database, which reflects the real-world relationships that exist in the tram network. As the design of the software system makes it easy to implement new functions, it is easier for Yarra Trams to experiment with new functionality.

6.3 Data Preparation

The previous phase identified quality issues in the data, which was necessary in order to fix the quality issues that stood in the way of fulfilling the requirements. By preparing data in the preparation module of the software system, manual preparation steps are eliminated. The automatic preparation process makes the data preparation faster and reduces the risk of human error when preparing the data.

The data preparation phase uses the steps that the theoretical framework suggests for preparing data, but it turned out that the order of the steps needed to be changed. Data from both the APC system and the Tram Tracker system needed to be selected, constructed, formatted, and integrated before the data could be cleaned. This indicates that the steps taken to prepare the data in this phase cannot always follow, and do not always need to follow, the order that the framework indicates.

The two data attributes, trip number and passenger load of a vehicle, are constructed in the preparation phase since they are necessary to fulfil the system requirements. The preparation module integrates data from the APC system with data from the Tram Tracker system to be able to calculate and fill the data attributes. The format data step handles data inconsistency in the run number between these two different data sources. Without this solution, the integration step would not have been possible to perform, and it would not have been possible to provide the required information.
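How a passenger load attribute can be constructed from boarding and alighting counts is sketched below. This is a minimal illustration that assumes the load after each stop is the running sum of boardings minus alightings along a trip; the function name and the sample counts are hypothetical, and the actual construction in the preparation module may differ.

```python
from itertools import accumulate


def passenger_load(boardings, alightings):
    """Return the on-board load after each stop of one trip.

    Assumes the vehicle starts the trip empty and that both lists are
    ordered by stop sequence.
    """
    net = [b - a for b, a in zip(boardings, alightings)]
    return list(accumulate(net))


# Hypothetical counts for a trip with five stops.
ins = [10, 5, 0, 3, 0]
outs = [0, 2, 4, 6, 6]
load = passenger_load(ins, outs)  # [10, 13, 9, 6, 0]
```

The sketch also shows why the stops must first be placed in the right order on the right trip: the load at a stop depends on every count recorded earlier on the same trip.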

The clean data step is performed by applying two different algorithms that handle the errors and completeness issues of the data, which is necessary before the data can be used in the prototype software system. The first algorithm divides the data from trams equipped with an APC system into trips, corrects the unidentified stops, and adds the unserved stops. The second algorithm handles the accuracy errors from the APC system by correcting the number of boarding and alighting passengers on the trip, and constructs the passenger load. With these two algorithms, the prototype software system can provide higher quality data for the Yarra Trams organisation. The result after preparing the data shows that some data are discarded because of quality issues. This indicates either that the cleaning step is not optimal, or that the errors from the APC system cannot be handled in the data preparation phase.
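To make the idea of correcting boarding and alighting counts concrete, the sketch below shows one simple balancing approach. It is not the algorithm used in the prototype, whose details are not reproduced here: it assumes that the total boardings and total alightings on a trip should be equal, scales both towards their average, and signals that a trip should be discarded when it cannot be balanced. The function name and the sample counts are hypothetical.

```python
def balance_trip(boardings, alightings):
    """Scale boardings and alightings so that their trip totals agree.

    Returns (balanced_ins, balanced_outs) as lists of floats, or None
    when one of the totals is zero and the trip cannot be balanced.
    """
    total_in = sum(boardings)
    total_out = sum(alightings)
    if total_in == 0 or total_out == 0:
        return None  # nothing to scale against; the caller may discard the trip
    target = (total_in + total_out) / 2
    balanced_ins = [b * target / total_in for b in boardings]
    balanced_outs = [a * target / total_out for a in alightings]
    return balanced_ins, balanced_outs


# Hypothetical trip where the APC system counted 20 ins but only 18 outs.
result = balance_trip([10, 6, 4], [0, 8, 10])
```

Proportional scaling is deliberately simplistic. It can, for example, still leave a negative load at an individual stop, which is one reason why some trips may have to be discarded rather than corrected.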

6.4 Modelling

The separation between the preparation module and the modelling module reduces the risk that unreliable and incorrect data are presented to the user. All quality issues with the data are fixed in the preparation module before the data are stored in the database for prepared data. The prepared data stored in the database are still raw material and need to be converted into information before they are useful.

The modelling module of the software system converts the data into descriptive information about how passengers use Yarra Trams' services. This information will be the basis for decisions, which places high demands on it being reliable, correct, and presented in a way that supports the user in interpreting it and taking decisions. The information can be a basis for decisions regarding tram procurement and rationalisation of stops and trams, and it supports Yarra Trams in meeting its responsibilities.

The modelling module consists of a web-based user interface that provides the users with a representation of the activity on the tram network. The interface is organised into tabs and uses the same structure for all pages, which makes it easier for the user to navigate and find the required information. The elements and patterns on the different pages are regular and presented in a consistent way, which also supports the user in navigating and understanding where the relevant information is located. Generous white space, together with information presented and explained in tables, charts, and maps, helps the user to understand the information. The user can interact with the user interface, which increases the user's freedom of choice and allows the user to see detailed information only when it is necessary.

The user interface supports the user in interpreting information through the use of colours. The colours are used sparingly, mainly as a distinctive indicator of the activity at the stops. The colours make it easier to see the breaches on the network, which was one of the business requirements. The visualisation of stops on the map is a representation of reality, and helps the user to understand how the information relates to reality. The filtering functionality allows the user to choose data without requiring that the user knows everything about the tram network. The stops that belong to the selected routes are shown, so the user does not need to know this information themselves. It is only when the users want to create reports that they need to fill in information themselves, and then they are guided by information about the expected format. The user interface illustrates the relations between variables by using graphs, which makes it easier for the user to understand relations between boarding, alighting, and total passenger activity at a stop or on a route. Showing quantitative information in graphs makes it easier to compare different stops and different routes. As the Marketing department required, the modelling module provides the user with an option to generate and export different reports. Reports can be useful when the Marketing department needs to provide other parts of the organisation with information, or if they need to provide DoT and Metlink with information about the usage of the tram network. The reports are designed by the same principles as the interface, where graphs and different colours are used to make distinctions.
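The kind of aggregation that lies behind comparing stops, or behind a report of the busiest stops, can be sketched as follows. The example is illustrative only: the record layout, the stop names, and the function name are assumptions, and the prototype itself was developed on the .NET platform rather than in Python.

```python
from collections import Counter


def top_stops(records, measure="ins", n=10):
    """Rank stops by the summed value of one measure ("ins" or "outs").

    Each record is a dict holding a stop identifier and counts for one
    stop visit; returns a list of (stop, total) pairs, largest first.
    """
    totals = Counter()
    for rec in records:
        totals[rec["stop"]] += rec[measure]
    return totals.most_common(n)


# Hypothetical prepared records for three stops.
records = [
    {"stop": "A", "ins": 12, "outs": 3},
    {"stop": "B", "ins": 4, "outs": 9},
    {"stop": "A", "ins": 8, "outs": 5},
    {"stop": "C", "ins": 15, "outs": 1},
]
ranking = top_stops(records, measure="ins", n=2)  # [("A", 20), ("C", 15)]
```

Filtering by route, direction, or time segment would amount to selecting a subset of the records before the ranking is computed.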

6.5 Evaluation

The last phase aimed to evaluate whether the outcomes of the project fulfilled the project objectives. An evaluation has been performed, but only a smaller one with the project group and the project leader. This means that a larger evaluation with all stakeholders involved in the APC project needs to be performed before Yarra Trams can take a decision about the next step for the project. The prototype software system demonstrates that it is possible to make use of the APC system to gain knowledge about the usage of Yarra Trams' services. Even though the future of the software system has been outlined in the smaller evaluation, a full evaluation of the software system involving the Marketing, Operation, IT, and Maintenance departments still needs to be performed. The prototype software system fulfils 70% of the prioritised requirements, so more prioritised requirements could be developed before the full evaluation is performed. It is important that the evaluation considers social, psychological, and technical aspects to fully understand the outcomes of the project. Feedback from a larger evaluation will guide Yarra Trams in the continued development of the software system and the continuation of the APC project.

Chapter 7

Conclusions

Since the tram network in Melbourne shares 80% of its road space with other vehicles, Yarra Trams' services play a crucial role in reducing traffic congestion in Melbourne. If more people choose to travel by tram instead of by car, traffic congestion could be reduced since the same number of passengers would be transported on less road space. This requires a well-functioning tram network with excellent services. The result of this project shows how Yarra Trams can make use of an APC system. The software system demonstrates possible outcomes from the APC system. It provides Yarra Trams with information about the usage of the tram network, which can be useful for improving services and for future planning. As mentioned earlier in this thesis, at least 10-15% of the fleet should be equipped with APC systems before an organisation can get a moderate data sample for analysis of the tram network. This means that Yarra Trams needs to equip more trams with APC systems to collect a reasonable amount of data. The prerequisites are good: if Yarra Trams installs APC systems on all the trams of the types that have been approved, it would cover 47% of Yarra Trams' fleet. If Yarra Trams installs APC systems on 50 trams, 10-15% of the fleet will be equipped. This, together with a complete software system for analysis of the data, could help Yarra Trams improve its services.

The aim of this thesis is to propose a framework for how a transit agency can make use of an APC system. This research has shown that it takes time to understand how to make use of an APC system. It is time-consuming to identify and resolve all gaps and constraints involved in an APC project, and it requires experimentation to realise how an APC system can be used within an organisation. It is therefore important to have a framework that describes and guides a transit agency in how to make use of an APC system.

The following conclusions are important to have in mind when using the framework in an APC project:

Business Understanding

• Before starting an APC project, it is necessary to clarify what objectives the business has and how an APC project could satisfy these objectives. This should permeate the whole APC project.

• The purpose of an APC project needs to be clear from the beginning, as requirements from different parts of the organisation need to be in line with it.

• It has proved hard to understand the range and possibilities of an APC project beforehand, yet such an understanding is important when identifying resources for the project. It is therefore necessary to be flexible and open-minded about allocating new resources to the project if needed.

• An understanding of all information systems within the organisation is necessary, as data from those systems in combination with APC data can create useful information. Even if not all information systems are involved in the project, it is still good to have this understanding to be able to design the software system with possible future features in mind.

• It is important to consider external actors that can be involved in and affected by the APC project.

• There is a lot to gain if the software system is designed and developed on a platform that is already used in the organisation.

Data Understanding

• It is necessary to identify which data are needed to achieve the goal of the APC project.

• Without a complete understanding of the data involved in the project, it is hard to estimate the quality of the data and choose a good quality assurance solution.

• It is important to understand how the involved information systems collect and store data to be able to fully understand the data.

• It is important to get a complete picture of all quality issues in the data and identify the source of each problem. Errors can be handled in a preparation module, but it might be worth fixing the source of the problem to avoid the errors.

• Before an APC system is used it is necessary to validate it. Data from the APC system can be affected by external factors.

Data Preparation

• An APC project involves large amounts of data that need to be prepared and stored in a good way before they can be used.

• APC data by itself is not sufficient; it needs to be complemented with other data to create useful information. It is important to place the APC data in a temporal and spatial dimension.

• Good data preparation algorithms are necessary to ensure the quality of the data, and to make as much data useful as possible.

• It is important to discard data with errors, since they may otherwise result in misleading information.

Modelling

• Visualisation can help the user to understand the data. It is important to put the data in a context by using graphs or maps, to make it easier for the user to interpret the information and see patterns and trends.

• Functionality in the software system for creating reports is important to make it easier to spread information.

• The gap between the information needed and the information presented should be as small as possible.

• The user interface of the software system needs to be user friendly, and it is important to consider interaction and visualisation.

Evaluation

• An evaluation of the APC project needs to be performed to confirm that the purpose and goal of the project have been achieved.

• The software system and the project need to be evaluated as a whole before it is possible to determine the next step of the project.

• Both organisational and technical aspects need to be considered during the evaluation to get a complete understanding of the results.

Another aspect that is important to have in mind when using the framework is that there is not always a clear border between the phases of the framework. Each phase has its own purpose, but the work in one phase is tightly coupled with the work in other phases. It is therefore good to be flexible when using the framework and let some work in different phases overlap to increase the possibilities of achieving the goal of the project.

7.1 Future Research

As the framework was only tested at one transit agency, it needs to be tested in more organisations to verify its validity. A continuation of this research could be to apply the framework in more cases. This would verify the validity of the framework and show how it could be modified to help transit agencies make use of APC systems in their business. Another direction for future research could be to focus on the passengers' perspective and investigate which information from an APC system could bring value to the passengers. By providing the passengers with information about the usage of the tram network, it might be possible to affect their travel behaviour and avoid overcrowded tram vehicles.

61 Bibliography

Printed

Agarwal, B. B., Tayal, S. P. and Gupta, M. (2008). Software Engineering & Testing, Jones and Bartlett Publisher, Boston, United States of America.

Boyle, D. (2008). TCRP Synthesis 77: Passenger Counting Systems A Synthesis of Tran- sit Practice, Transportation Research Board, Washington, United States of America.

Davidson, L. (2008). Pro SQL server 2008 relational database design and implementa- tion, Apress, Berkley, United States of America.

Fayad, U., Grinstein, G. G. and Wierse, A. (2002). Information Visualization in Data Mining and Knowledge Discovery, Morgan Kaufmann Publishers, San Francisco, United States of America.

Furth, P. G., Hemily, B., Muller, T. H. J. and Strathman, J. G. (2006). TCRP Report 113: Using Archived AVL-APC Data to Improve Transit Performance and Management, Transportation Research Board, Washington, United States of America.

Gonzalez-Aranda, P., Menasalvas, E., Millan, S., Ruiz, C. and Segovia, J. (2008). To- wards a methodology for data mining project development: The importance of ab- straction, Studies in Computational Intelligence (118): 165-178.

Griffith, T. L. and Dougherty, D. J. (2002). Beyond socio-technical systems: introduc- tion to the special issue, J. Eng. Technol. Manage. (19): 205-216.

Haluzov, P. (2008). Effective data mining for a transportation information system, Acta Polytechnica 48(1): 24-29.

Han, J. and Kamber, M. (2006). Data Mining, Concepts and Techniques, Morgan Kauf- mann Publishers, San Francisco, United States of America.

Krishnamurthy, N. and Saran, A. (2008). Building Software: A Practitioner’s Guide, Auerbach Publications, Taylor & Francis Group, Boca Raton, United States of America.

Mat, J. L. and Silva, A. (2005). Requirements Engineering for Sociotechnical Systems.

62 Saavedra, M. (2010). An Automated Quality Assurance Procedure for Archived Tran- sit Data from APC and AVL Systems, PhD thesis, University of Waterloo, Waterloo, Ontario, Canada.

Sanderson, S. (2008). Asp.Net Mvc Framework Beta Preview, Apress, Berkley, United States of America.

Sommerville, I. (2011). Software Engineering, 9 edn, Pearson Education, Addison- Wesley, Boston, United States of America.

Te’ eni, D., Carey, J. and Zhang, P. (2007). Human Computer Interaction: Developing Effective Organizational Information Systems.

Ubaka, I. and Glotzbach, G. (2006). ITS/APTS Architecture Guide: To Help FLORIDA Transit Systems Comply with the National ITS Architecture Consistency Policy, Florida Department of Transportation, Tallahassee, United States of America.

Witten, J. H. and Frank, E. (2005). Data Mining, Practical Machine Learning Tools and Techniques, Morgan Kaufmann Publishers, San Francisco, United States of America.

Yarra Trams (2009c). Vehicle location maintenance system - specification. Yarra Trams (2010).

Automated passenger counting system feasibility study. Yarra Trams (2011). Auto- matic passenger counting system trial on b class.

Yin, R. K. (2008). Case Study Research: Design and Method, 4 edn, SAGE Publications, Inc, Los Angeles, United States of America.

Electronic

Acorel (2010). Automatic people counting system, Available at: http://www.acorel.com/wp-content/uploads/ ACOREL-Automatic-People-Counting- System-Retail.pdf. [Accessed 3 February 2011].

Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Sharer, C. and Wirth, R. (2000). Crisp-dm 1.0 step-by-step data mining guide, Available at: http://www. crisp-dm.org/CRISPWP-0800.pdf. [Accessed 14 January 2011].

Iris-GmbH Infrared & Intelligent Sensors (2005). Irma infrared motion analyzer auto- matic passenger counter (apc) system: Technical documentation, Available at: http://www.irisgmbh.de/iris-GmbH us/pdf/KunDok IRMA4-Accuracy-20060912-e.pdf. [Accessed 15 January 2011].

KDR (2011). Think like a passenger, Available at: http://www.kdrmelbourne.com.au/ index.php?id=46. [Accessed 1 February 2011].

63 Metlink (n.d.). About metlink, Available at: http://www.metlinkmelbourne.com.au/ about-metlink/. [Accessed 1 February 2011].

Microsoft (2010). Microsoft visual studio 2010, Available at: http://www.microsoft. com/visualstudio/en-us/. [Accessed 7 February 2011].

Microsoft (2011a). Business intelligence, Available at: http://www.microsoft.com/ bi/productsbi/. [Accessed 7 February 2011].

Microsoft (2011b). Sql server integration services, Available at: http://msdn. microsoft.com/en-us/library/ms141026.aspx. [Accessed 7 February 2011].

Microsoft (2011c). Sql server reporting services, Available at: http://msdn. microsoft.com/en-us/library/ms159106.aspx. [Accessed 7 February 2011].

The Department of Transport (2011a). Public transport, Available at: http://www. transport.vic.gov.au/doi/internet/transport.nsf. [Accessed 1 February 2011].

The Department of Transport (2011b). Public transport partnership agreements, Avail- able at: http://www.transport.vic.gov.au/DOI/Internet/transport. nsf/AllDocs/ 19D8E6C7F444848DCA256E3E00162879?OpenDocument. [Accessed 1 February 2011].

Yarra Trams (2009a). Automated passenger counting system - requirements. Yarra Trams (2009b). Facts & figures, Available at: http://yarratrams.com.au/ desktopdefault.aspx/tabid-47//74 read-117/. [Accessed 16 January 2011].

Yarra Trams (n.d.). History of Yarra Trams, Available at: http://www.yarratrams. com.au/desktopdefault.aspx/tabid-60/38 read-103/. [Accessed 1 February 2011].

Interviews and Meetings

Chabas, D. (2010). Research analyst at Yarra Trams. Interview, 2010-10-12. Yarra Tram Head Office.

Couenon, G. (2010). Project Manager - ICT at Yarra Trams. Regular meetings every second week, 2010-09-07 - 2011-02-07. Yarra Tram Operation Centre.

Holmes, J. (2010). Market analyst at Yarra Trams. Interview, 2010-10-18. Yarra Tram Head Office.

McRobbie, M. (2010). Application development team leader at Yarra Trams. Meeting, 2010-10-01. Yarra Tram Operation Centre.

Appendix A Requirements

Forward and backward tracing between user and system requirements. User requirement identifiers begin with "U" and system requirement identifiers with "S". The number indicates an ID for each requirement. Prioritised requirements are marked with "1", and the others with "0". The source of each requirement is marked with Main for Maintenance, Mark for Marketing, Oper for Operations, and IT for the IT department. If the requirement is fulfilled it is marked with "1", otherwise with "0".

A.1 Forward Traceability

Functional

| ID | User requirements | Forward traceability | Priority | Source | Fulfilled |
|---|---|---|---|---|---|
| U1 | Provide the TramTracker ID for each stop in the tram network. | S1, S2 | 1 | Mark, Oper, Main | 1 |
| U2 | Provide all the tram numbers with related tram class. | S3, S4 | 0 | Mark, Oper, Main | 0 |
| U3 | Provide all tram numbers in sequential ranges. | S5 | 0 | Mark | 0 |
| U4 | Provide pre-defined time segments categorised by early, am peak, mid morning, lunchtime, afternoon, pm peak, evening, late. | S6, S7 | 1 | Mark | 1 |
| U5 | Provide the stops allocated to each route in both directions. | S8, S9, S12 | 1 | Mark, Oper | 1 |
| U6 | Provide the date of installation or removal of stops in the list of stops. | S10, S11, S12 | 1 | Mark, Oper | 0 |
| U7 | Provide all the routes with assigned sub routes. | S13, S14 | 1 | Mark, Oper | 0 |
| U8 | Provide the capacity for each tram class. | S15, S16 | 0 | Mark, Oper | 0 |
| U9 | Provide filtering of ins, outs and load by Date (everyday, week, weekday, weekdays, weekend, month, quarter). | S17, S18, S19, S70 | 0 | Mark, Oper | 0 |
| U10 | Provide filtering of ins, outs and load by Time (half-hours, hours, peak, off-peak, early, am peak, mid-morning, lunchtime, afternoon, pm peak, evening, late). | S20, S21, S70 | 1 | Mark, Oper | 1 |
| U11 | Provide ability for the users to change and define their own time segments. | S22 | 0 | Mark | 0 |
| U12 | Provide filtering of ins, outs and load by Route with choice of selecting one or multiple routes. | S23, S24, S70 | 1 | Mark, Oper | 1 |
| U13 | Provide filtering of ins, outs and load by Directions (up, down) with choice of selecting one or both. | S23, S24, S70 | 1 | Mark, Oper | 1 |
| U14 | Provide filtering of ins, outs and load by Stops with choice of selecting one or multiple stops on single or combined routes. | S25, S26, S70 | 1 | Mark | 1 |
| U15 | Provide filtering of ins, outs and load by Vehicles (tram number, tram class). | S27, S28, S29, S70 | 0 | Mark, Oper | 0 |
| U16 | Provide filtering of ins, outs and load by Run and Trip ID. | S30, S31, S32, S70 | 0 | Mark, Oper | 0 |
| U17 | Provide comparison of the actual load on routes with capacity to see which routes that are overcrowded and at which location. | S15, S16, S33, S34 | 0 | Mark, Oper | 0 |
| U18 | Provide comparison of the number scheduled trips with the number of APC trips and calculate the coverage. | S35, S36, S37 | 0 | Mark | 0 |
| U19 | Provide the capacity breaches on the network. | S34, S70 | 0 | Mark, Oper | 0 |
| U20 | Forecast the load in the future based on growth rate. | S38, S39 | 0 | Oper | 0 |
| U21 | Provide information about punctuality, how many stops that have been served on the scheduled time. | S35, S40, S41 | 0 | Oper | 0 |
| U22 | Provide the 10 top stops (ins, outs, load) across the network. | S42, S43, S48, S70 | 1 | Mark, Oper | 1 |
| U23 | Provide the 10 top stops (ins, outs, load) of a specific route or routes. | S44, S45, S48, S70 | 1 | Mark, Oper | 0 |
| U24 | Provide the 5 and 10 top routes (ins, outs, load) on the network. | S46, S47, S48, S70 | 1 | Mark, Oper | 0 |
| U25 | Provide the status of the system, i.e the amount of unknown stops and unknown locations. | S49, S50, S70 | 0 | Mark, Main | 0 |
| U26 | Provide all data set with a quality tag that describes the confidence and accuracy of the data. | S51, S52, S70 | 1 | Mark, Main | 0 |
| U27 | Provide information about how many samples of data that is used and from which samples. | S53 | 1 | Mark | 0 |
| U28 | Provide visualisation in both bar graphs and line graphs. | S54, S55 | 1 | Mark, Oper | 1 |
| U29 | Provide visualisation of the ins, outs and load on a map in form of a load snake. | S56, S70 | 1 | Mark, Oper | 1 |
| U30 | Allow the users to export data and information into PDF, Excel and CSV format. | S57, S58 | 1 | Mark | 1 |

Allow the users to create following reports:

| ID | User requirements | Forward traceability | Priority | Source | Fulfilled |
|---|---|---|---|---|---|
| U31 | Total ins by route with bar graph and table. | S59, S60, S70 | 1 | Mark | 1 |
| U32 | Total ins by route for previous week with bar graph. | S59, S61, S70 | 1 | Mark | 0 |
| U33 | Sum of off-peak ins for a defined period with bar graph and line graph. | S59, S62, S70 | 1 | Mark | 0 |
| U34 | Top routes of the previous 12 months with bar graph. | S59, S63, S70 | 1 | Mark | 1 |
| U35 | 10 top stops of the tram network with bar graph. | S59, S64, S70 | 1 | Mark | 1 |
| U36 | 10 top stops by route with bar graph. | S59, S65, S70 | 1 | Mark | 0 |
| U37 | On board load on routes with line graph. | S59, S66, S70 | 1 | Mark | 1 |
| U38 | Capacity versus on board load on routes with line graphs. | S59, S67, S70 | 1 | Mark | 0 |
| U39 | Provide automatic sending of reports through e-mail. | S68, S69 | 0 | Mark | 0 |

Non Functional

| ID | User requirements | Forward traceability | Priority | Source | Fulfilled |
|---|---|---|---|---|---|
| U40 | Accessible by multiple users through a web-interface | | 1 | IT | 1 |
| U41 | Developed in .NET environment, SQL Server, C# and AJAX | | 1 | IT | 1 |

68 A.2 Backward Traceability

ID System requirements Backward Priority Fulfilled traceabil- ity

S1 All the stops with the corresponding U1 1 1 TramTrackerID must be available in the database. S2 The user interface must provide the U1 1 1 user to see all the stops with Tram- TrackerID. S3 All tram numbers and related tram U2 1 1 classes must be available in the database. S4 The user interface must provide the U2 0 0 user to see all tram numbers and re- lated tram classes. S5 The user interface must provide the U3 0 0 user with a list of all the tram numbers in sequential ranges. S6 All predefined time segments must be U4 1 1 available in the view of the software tool. S7 The user interface must provide the U4 1 1 user with all the predefined time seg- ments. S8 All routes in both directions with al- U5 1 1 located stops must be available in the database. S9 The user interface must provide the U5 1 1 user to see all routes in both directions with allocated stops. S10 Each stop must have an corresponding U6 1 1 installation date in the database. S11 If a stop has been removed that stop U6 1 1 must have a removal date in the database. S12 The user interface must provide all U5, U6 1 0 stops of a route, even the ones that has been removed. S13 If a route is a sub route of another route U7 1 0 the database must contain that infor- mation.

69 S14 The user interface must provide the U7 1 0 user with information if a route is a sub route of another route. S15 All tram classes must have information U8, U17 0 0 about the capacity in the database. S16 The user interface must provide the U8, U17 0 0 user with the capacity for each tram class. S17 The date of when the ins, outs, and load U9 1 1 was collected must exist in the database together with the data itself. S18 The date in the database must contain U9 1 1 weekday, e.g. Monday, Tuesday. S19 The user interface must provide the U9 0 0 user ability to choose date by everyday, week, weekday, weekdays, weekend, month, quarter. S20 The time of when the ins, outs, and load U10 1 1 was collected must exist in the database together with the data itself. S21 The user interface must provide the U10 1 1 user ability to choose time by half- hours, hours, peak, off-peak, early, am peak, mid-morning, lunchtime, after- noon, pm peak, evening, late. S22 The software tool must have a function U11 0 0 where the users can change and define own time segments. S23 The ins, outs, and load must be con- U12, U13 1 1 nected with a route in the database. S24 The user interface must provide the U12, U13 1 1 user ability to choose data by route with choice of selecting multiple routes. S25 The ins, outs, and load must be con- U14 1 1 nected with a stop in the database. S26 The user interface must provide the U14, U15 1 1 user ability to choose stop with choice of selecting multiple stops. S27 The ins, outs, and load must be con- U15 1 1 nected with a tram number and tram class in the database. S28 The user interface must provide the U15 0 0 user ability to choose tram number with choice of selecting multiple tram numbers.

70 S29 The user interface must provide the U15 0 0 user ability to choose tram class with choice of selecting multiple tram classes. S30 The ins, outs, and load must be con- U16 0 1 nected with a runID and tripID in the database. S31 The user interface must provide the U16 0 0 user ability to choose runID with choice of selecting multiple runIDs. S32 The user interface must provide the U16 0 0 user ability to choose tripID with choice of selecting multiple tripIDs. S33 The user interface must provide the U17 0 0 user ability to choose to see the capacity compared to the actual load. S34 The user interface must provide the U17, U19 0 0 user ability to see which routes and at which stops the capacity breached. S35 All the scheduled trips must exist in the U18, U21 0 0 database. S36 The software tool must have a function U18 0 0 that counts the amount actual APC- data and the amount actual non APC trips to calculate the coverage. S37 The user interface must always show U18 0 0 the coverage of the selected data. S38 The software tool must have a function U20 0 0 that calculates the predicted load based on historical data/growth rate (regres- sion). S39 The user interface must provide the U20 0 0 user ability to choose to see the pre- dicted load. S40 The software tool must have a function U21 0 0 that compares the actual stop time with the scheduled stop time to see punctu- ality. S41 The user interface must provide the U21 0 0 user ability to choose to see punctual- ity. S42 The software tool must have a function U22 1 1 that gets the 10 top stops across the net- work based on ins, outs or load.

71 S43 The user interface must provide the U22 1 1 user ability to see the 10 top stops across the network with choice of ins, outs or load. S44 The software tool must have a function U23 1 1 that gets the 10 top stops based on ins, outs or load of a route. S45 The user interface must provide the U23 1 0 user ability to see the 10 top stops of a route or routes with choice of ins, outs or load. S46 The software tool must have a function U24 1 0 that gets the 5 top routes based on ins, outs or load of a route. S47 The user interface must provide the U24 1 0 user ability to see the 5 top routes on the network with choice of ins, outs or load. S48 The database must pre calculate a sum- U22, U23, 1 0 mary table with the 10 top stops across U24 the network, 10 top stops of a route, 5 and 10 top routes of the network based on ins, outs or load. S49 The software tool must have a function U25 0 0 that gets all stops that has been fixed (stop code=9999). S50 The user interface must provide the U25 0 0 user with ability to see the status of the system based on stops that has been fixed. S51 The user interface must provide the U26 1 0 user with a quality tag for each data sample. S52 Each table in the database must have a U26 1 0 quality column that indicates the qual- ity of the data. S53 The user interface must provide the U27 1 0 user ability to see how many samples of data that is used and from which sam- ples. S54 The software tool must use visualisa- U28 1 1 tion libraries so the information from the data can be showed in bar graphs. S55 The software tool must use visualisa- U28 1 1 tion libraries so the information from the data can be showed in line graphs.

S56 | The software tool must use visualisation libraries so the ins, outs and load can be shown on a map in the form of a snake. | U29 | 1 | 1
S57 | The software tool must have functions that export the data samples into Excel, PDF or CSV files. | U30 | 1 | 1
S58 | The user interface must provide the user with the ability to choose to export the data samples into Excel, PDF or CSV files. | U30 | 1 | 1
S59 | The user interface must provide the user with the ability to choose to create reports. | U31, U32, U33, U34, U35, U36, U37, U38, U39 | 1 | 1
S60 | The user interface must provide the user with the ability to create a report of total ins by route with bar graph and table. | U31 | 1 | 1
S61 | The user interface must provide the user with the ability to create a report of total ins by route for the previous week with bar graph. | U32 | 1 | 0
S62 | The user interface must provide the user with the ability to create a report of the sum of off-peak ins for a defined period with bar graph and line graph. | U33 | 1 | 0
S63 | The user interface must provide the user with the ability to create a report of top routes of the previous 12 months with bar graph. | U34 | 1 | 1
S64 | The user interface must provide the user with the ability to create a report of the 10 top stops of the tram network with bar graph. | U35 | 1 | 1
S65 | The user interface must provide the user with the ability to create a report of the 10 top stops by route with bar graph. | U36 | 1 | 0
S66 | The user interface must provide the user with the ability to create a report of on board load on routes with line graph. | U37 | 1 | 1
S67 | The user interface must provide the user with the ability to create a report of capacity versus on board load on routes with line graphs. | U38 | 1 | 0
S68 | The software tool must have a function that sends the created report to a specific mail address. | U39 | 0 | 0

S69 | The user interface must provide the user with the ability to send the created report to a specific mail address. | U39 | 0 | 0
S70 | The database must fix or reject data samples based on trip validity rules to make sure that the data presented to the user are valid. | U9, U10, U12, U13, U14, U15, U16, U19, U22, U23, U24, U25, U26, U29, U31, U32, U33, U34, U35, U36, U37, U38 | 1 | 1
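
To make a few of the computational requirements concrete, the following Python sketch illustrates one possible reading of S36 (coverage), S38 (predicted load by regression) and S42/S44 (top stops). It is not the implementation from the case study. The record layout, the field names (`has_apc`, `route`, `stop`, `ins`, `outs`, `load`) and the function names are hypothetical, and the choices to define coverage as the share of trips with APC data, to rank stops by summed values, and to use an ordinary least-squares line are assumptions.

```python
from collections import defaultdict

# Hypothetical data layout: plain dicts. All field names are illustrative
# assumptions, not the schema used in the case study.


def coverage(trips):
    """S36 (assumed reading): share of scheduled trips that have APC data."""
    if not trips:
        return 0.0
    apc_trips = sum(1 for trip in trips if trip["has_apc"])
    return apc_trips / len(trips)


def top_stops(samples, measure="ins", n=10, route=None):
    """S42/S44 (assumed reading): the n stops with the highest summed measure.

    With route=None all routes are included (network-wide ranking).
    """
    if measure not in ("ins", "outs", "load"):
        raise ValueError("measure must be 'ins', 'outs' or 'load'")
    totals = defaultdict(int)
    for sample in samples:
        if route is None or sample["route"] == route:
            totals[sample["stop"]] += sample[measure]
    # Highest total first; ties are broken by stop name for a stable order.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:n]


def predict_load(history, period):
    """S38 (assumed reading): least-squares line through (period, load) pairs,
    evaluated at a future period."""
    if len(history) < 2:
        raise ValueError("need at least two historical points")
    xs = [x for x, _ in history]
    ys = [y for _, y in history]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all historical points share the same period")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in history)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * period


if __name__ == "__main__":
    trips = [{"has_apc": True}, {"has_apc": False}, {"has_apc": True}]
    samples = [
        {"route": "R1", "stop": "A", "ins": 12, "outs": 3, "load": 40},
        {"route": "R1", "stop": "B", "ins": 5, "outs": 9, "load": 36},
        {"route": "R2", "stop": "A", "ins": 7, "outs": 2, "load": 20},
    ]
    print(coverage(trips))
    print(top_stops(samples, measure="ins", n=10))
    print(predict_load([(1, 100), (2, 110), (3, 120)], 4))
```

In a database-backed tool, the ranking in `top_stops` would more naturally be a precalculated summary table, as S48 requires. The in-memory version here only shows the logic.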
