Route Choice Behaviour at Mass Events Stated- versus Revealed Preferences of Pedestrian Route Choices at SAIL 2015 I.M. Galama University of Technology Delft

ROUTE CHOICE BEHAVIOUR AT MASS EVENTS
STATED- VERSUS REVEALED PREFERENCES OF PEDESTRIAN ROUTE CHOICES AT SAIL AMSTERDAM 2015

by

I.M. Galama

in partial fulfilment of the requirements for the degree of

Master of Science in Transport, Infrastructure and Logistics

at the Delft University of Technology to be defended publicly on 18th of February 2016

Supervisor: Prof. dr. ir. S.P. Hoogendoorn, TU Delft
Thesis committee: Dr. ir. W. Daamen, TU Delft
Dr. J.A. Annema, TU Delft
Drs. M. Hünneman, SAIL

PREFACE

This master thesis was conducted on behalf of the Amsterdam Institute for Advanced Metropolitan Solutions (AMS Institute) and as the final part of the master Transport, Infrastructure and Logistics at the Delft University of Technology. It is based on both an online survey and revealed data, which were collected at SAIL Amsterdam 2015. The aim of this thesis is to gain more knowledge about the route choice behaviour of pedestrians at mass events.

Before proceeding with the findings, I would like to thank my supervisors Serge Hoogendoorn, Winnie Daamen, Jan Anne Annema and Mark Hünneman for their feedback and help during our regular meetings and the fruitful discussions at the more informal moments. I thank the AMS Institute for facilitating the research at SAIL, Edwin and Peter for their help during the event itself, and the members of the PED-meetings for their substantive feedback. Furthermore, I would like to thank everyone who has actively contributed to the content of my thesis. Special thanks to Fieke, Hidde, Margot, Anouk, Matthijs and Lidewij for their valuable contribution to the brainstorm session. Many thanks to all respondents who filled in the online survey and those who participated in the GPS-tracker research at SAIL, fortunately too many to mention by name.

Besides, I would like to thank those who helped me relax and put things in perspective, alongside the hard work of writing this thesis. Dear friends, family, Mmilka, team-mates and fellow graduation students, thanks for the lovely moments, coffees, matches, drinks and dinners together. Finally, special thanks to my parents, my sisters Josien and Nienke, and Willem, who truly helped me with their unconditional support and with finalizing the last bits and pieces.

Hopefully you will enjoy reading this thesis, and get inspired for your own future explorations.

I.M. Galama Delft, February 2016


SUMMARY

Nowadays, mass events are a recurring phenomenon all over the world. However, information about the behaviour of pedestrians during these mass events is scarce. The years of experience of crowd managers are currently essential for managing events. This information, however, is not always adequate and lacks quantitative measures. As previous mass events have shown, they can bring huge safety risks and can cause major incidents (e.g. the Love Parade in Duisburg). Therefore, more knowledge about crowd behaviour is required to manage mass events and to prevent such disasters from happening in the future. The aim of this master thesis was to provide more insight into the behaviour of pedestrians at mass events. The focus has been on their route choice behaviour, as this plays a major role in the management of crowds. This insight was provided by both stated preference (SP) and revealed preference (RP) research, since both have their pros and cons. Besides, the comparison of both research methods was expected to give interesting results. SAIL Amsterdam 2015 was used as a case study and as the base for the research. SAIL has grown to become the largest public event in the Netherlands and the largest free nautical event in the world.

The objective of this thesis was twofold. On the one hand, based on two methods of data collection, SP (online survey) and RP (GPS-trackers), factors of influence on pedestrian route choices were studied and their corresponding parameter values were estimated with Multinomial Logit (MNL) models. On the other hand, since the two methods both analyse SAIL, the findings of the methods (and thus the quality of the SP and RP methods) were compared. The objective led to two research questions:

1. Which attributes, and to what extent, influence the stated (SP) or the revealed preferences (RP) of pedestrian route choice behaviour at the mass event SAIL?

2. How do the attributes that influence the pedestrian route choice behaviour at the mass event SAIL correspond for stated (SP) and revealed preference (RP) studies?

Firstly, a literature review was conducted. It showed that pedestrian behaviour can be assessed by means of choice behaviour, categorized into the strategic, tactical and operational level. Route choice behaviour is part of the tactical level and depends on the choice maker, the choice set (alternative routes), the attributes of the alternatives and the decision rules of the choice maker. Multiple attributes of influence were found in literature, but their influence at mass events had, to the knowledge of the author, not been studied thoroughly. Discrete choice models are widely used for researching the choice behaviour of pedestrians. In this thesis an MNL model was used, as this is a reliable model for pedestrian route choice behaviour and a good model to give a first insight into the attributes. Secondly, a brainstorm session was held in order to detail which attributes are of influence at mass events, resulting in a list of attributes. These attributes, combined with those resulting from the literature review, were assessed in a Multi Criteria Analysis (MCA). This led to nine possible attributes, of which six were selected for further analysis, as they were usable in both the SP and RP research. These attributes are attractions (tall ships, music stages), crowdedness, signs (or following the main route), road size, trees and water.
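The MNL model form referred to above can be illustrated with a minimal Python sketch. It is not the Biogeme specification used in this thesis: the function names, the two-attribute coding and the parameter values are hypothetical and serve only to show how linear utilities are turned into choice probabilities and how the log-likelihood that is maximised during estimation is built up.

```python
import math


def mnl_probabilities(alternatives, betas):
    """Return MNL choice probabilities for one choice set.

    alternatives: list of attribute vectors, one per route (hypothetical coding).
    betas: list of taste parameters, same length as each attribute vector.
    """
    # Systematic utility V_i = sum_k beta_k * x_ik (linear in parameters).
    utilities = [sum(b * x for b, x in zip(betas, attrs)) for attrs in alternatives]
    # Subtracting the maximum utility avoids overflow and leaves probabilities unchanged.
    v_max = max(utilities)
    exp_v = [math.exp(v - v_max) for v in utilities]
    total = sum(exp_v)
    return [e / total for e in exp_v]


def log_likelihood(choice_sets, chosen, betas):
    """Sum of the log-probabilities of the chosen alternatives."""
    return sum(
        math.log(mnl_probabilities(alts, betas)[c])
        for alts, c in zip(choice_sets, chosen)
    )


# Hypothetical example: two routes described by (crowded, water) dummies.
routes = [[1, 0], [0, 1]]
betas = [-1.0, 0.5]  # illustrative values only, not estimates from this thesis
probs = mnl_probabilities(routes, betas)
```

With these illustrative values the quiet route along the water receives the larger probability, because its utility is higher. An estimation routine would search for the parameter values that maximise `log_likelihood` over all observed choices.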

At the start of the SP research, a pilot survey was conducted. The main goals were to find prior values for the attributes and to test whether the respondents understood the choices they were asked to make. Respondents had to select their preferred route based on two photos, each containing different attributes. After the pilot, an efficient final survey was designed, containing eight choice sets. The final SP survey had 177 respondents. Compared to the Dutch average, the average age was relatively low, the education level relatively high and the household income relatively low. Presumably, most respondents were students from the Delft University of Technology. It is not known whether this sample is representative of the visitors of SAIL. Within this homogeneous group of respondents, significant correlations between their characteristics and experience at mass events were found. However, due to the lack of variety in the sample, these characteristics were not considered in the route choice models.


The SP results show that the extent of influence of each attribute differs per MNL model. The most accurate model (ρ² of 0.189) showed that five out of six attributes were significant: attractions (1.160), crowdedness (-1.870), road size (0.253), trees (-0.193) and water (0.334). The attribute crowdedness was significant in all MNL models and always had the most strongly repellent value. Much variation was found for the models including attractions and signs. The parameter for signs was most often negative, which was unexpected based on literature. These varying results might be mainly driven by the inaccurate design of the survey choice sets. A more thorough pilot survey could circumvent this issue in the future.
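To give a feeling for the size of the parameter values reported above, the following Python sketch applies them to a single hypothetical choice between two routes. The 0/1 coding of the attribute levels and the composition of the two routes are assumptions made for illustration only; the actual level coding of the survey is described in the chapters on the SP design.

```python
import math

# Parameter values reported above for the most accurate SP model.
# The signs attribute was not significant in that model and is left out.
BETAS = {
    "attractions": 1.160,
    "crowdedness": -1.870,
    "road_size": 0.253,
    "trees": -0.193,
    "water": 0.334,
}


def utility(levels):
    """Linear utility of one route; `levels` maps attribute name to its level."""
    return sum(BETAS[name] * level for name, level in levels.items())


def binary_choice_probability(route_a, route_b):
    """Probability of choosing route A over route B in a binary logit."""
    diff = utility(route_a) - utility(route_b)
    return 1.0 / (1.0 + math.exp(-diff))


# Hypothetical photo pair (assumed 0/1 coding): route A shows an attraction
# but is crowded, route B is quiet and has no attraction.
route_a = {"attractions": 1, "crowdedness": 1}
route_b = {"attractions": 0, "crowdedness": 0}
p_a = binary_choice_probability(route_a, route_b)
```

Under these assumptions the utility of route A is 1.160 - 1.870 = -0.710, so route A would be chosen in roughly one third of the cases: the negative effect of crowdedness outweighs the positive effect of the attraction.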

During SAIL, the RP research was held. One hundred GPS-trackers were distributed daily at Amsterdam Central station to visitors of SAIL. A total of 322 trips were collected over five days, of which 155 took place in the main area of interest for this thesis (station to Kop van Java). The participants were mostly elderly couples. Most trips (60-70%) followed the main indicated route when they started at the station. More differentiation was found in the routes when the pedestrians returned to the station. On the Java-eiland the return trips were roughly equally distributed over the alternative routes: pedestrians either went back via the Javakade (the main route on the way up), kept following the main route, or took a short cut somewhere on the island. From the Verbindingsdam to the station most trips (> 50%) took a short cut somewhere at the Veemkade. The other trips either took public transportation (> 10%), returned via the Veemkade, kept following the main route, or took a specific short cut at the Vriesseveem.

The attributes of influence were estimated in multiple MNL models, distinguished by four different Origin Destination (OD)-pairs in the area of interest. For each alternative route the six attribute values were determined via two methods: binary and scalar. The binary method assigned zeros and ones to the alternatives, but was found to be too arbitrary and gave no significant parameter values in the models. The scalar method estimated the share of each attribute in the route's total length. For crowdedness, the WiFi sensors were used to estimate the pedestrian splits per route.

A large variation was found between the results for the way up and the way back. This change in behaviour cannot be explained fully in this study and is interesting to research in future RP studies. Crowdedness was the only attribute that was significant in most models, particularly on the way up, and had a highly attractive value (the opposite of what was found for the SP data). Some of the attributes seem to be strongly correlated. In the studied area at SAIL the tall ships (defined as attractions) and water are correlated, as they are always on the same route, and crowdedness was often seen along these attractions. It is therefore difficult to attribute the pedestrians' choices to any single one of these attributes.

Besides, the Alternative Specific Constants (ASCs) were found to be very dominant in the model estimations, due to the lack of variety in the data and the alternatives. More attributes are significant when the ASCs are not included in the models. Since the ASCs capture the unobserved attributes of each alternative, there might be other attributes of influence on the route choices. Especially on the way back, factors such as the routes taken previously might influence the route choices.
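The idea behind the scalar method, the share of an attribute in a route's total length, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure that was actually followed: the representation of a route as a list of segments, the function name `attribute_shares` and all lengths are hypothetical.

```python
def attribute_shares(segments, attributes):
    """Share of a route's total length on which each attribute is present.

    segments: list of (length_in_metres, set_of_attribute_names) tuples.
    attributes: attribute names for which a share is reported.
    """
    total = sum(length for length, _ in segments)
    if total <= 0:
        raise ValueError("route must have a positive total length")
    return {
        name: sum(length for length, present in segments if name in present) / total
        for name in attributes
    }


# Hypothetical route of three segments (lengths and attributes are invented).
route = [
    (400.0, {"water", "attractions"}),
    (100.0, {"trees"}),
    (500.0, set()),
]
shares = attribute_shares(route, ["water", "attractions", "trees"])
```

In this invented example water and attractions each cover 40% of the route and trees 10%. Because water and attractions occur on exactly the same segment, their shares are identical, which illustrates the kind of correlation between attributes noted above.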

When the results of two methods are aligned, strong conclusions may be drawn. Unfortunately, no direct answer could be given to the main research questions in this thesis, due to the differences between the SP and RP results. However, it may be concluded that new insights into pedestrian route choices were generated, based on these differences. It is of interest to note that SP and RP data collection methods can vary this much in their outcomes. The method that is used as a regulatory decision tool might influence the final decision completely. Regulatory bodies are advised to look critically at the methodology of data collection before fully trusting its results. The main drivers of this difference were expected to be the design of the SP survey and the lack of variety in the RP data. This emphasizes the difficulty of collecting valuable data with RP or SP methods.

The SP and RP data are discussed extensively. The advantage of an SP study over an RP study is the possibility to design variation in the alternative choices. This is an advantage, as the limited variety in the RP data restricted the results that were obtained. However, for SP studies the way in which the respondents interpret the design of the survey can vary, which could bias the results. For example, in the SP survey it was difficult to portray a realistic sense of the crowd. Perhaps new technologies such as virtual reality may give a more realistic view. For an RP study the level of detail of the Origin Destination (OD)-pairs affects the results. The four pairs used here were fairly coarse, resulting in a low level of detail in the outcomes for route choice behaviour. For future research it is suggested that smaller parts of the trips are analysed.

CONTENTS

Preface
Summary
List of Figures
List of Tables
1 Introduction
    1.1 Problem statement
    1.2 Scope
    1.3 Methodology
    1.4 Report Structure
2 Literature Review
    2.1 A Journey
    2.2 Pedestrians
    2.3 Pedestrian Behaviour
    2.4 Discrete Choice Models
    2.5 Data Collection Methods
    2.6 Conclusions
3 Design of the Stated Preference Route Choice Survey
    3.1 Pilot Survey
        3.1.1 Brainstorm Session
        3.1.2 Selection of the Attributes and Levels
        3.1.3 Design of the Pilot Survey
        3.1.4 Results of the Pilot survey
    3.2 Final Survey Design
    3.3 Conclusions
4 Results of the Stated Preference Survey
    4.1 Quantitative Analysis of the Survey
    4.2 Choice Model Estimation of the Survey Data
    4.3 Conclusions
5 Case Study: SAIL Amsterdam 2015
    5.1 Before the Case Study
    5.2 Expectations of the Case Study
    5.3 Overview of the Days at SAIL
    5.4 Conclusions
6 Data Processing and Quantitative Analysis of the GPS Data
    6.1 Processing of the GPS Data From SAIL
    6.2 Statistical Analysis of the GPS Data
    6.3 Conclusions
7 Factors of Influence on Route Choices at SAIL
    7.1 Alternative Routes in the SAIL Area
    7.2 Quantify and Estimate Attributes at SAIL
        7.2.1 Binary Method
        7.2.2 Scalar Method
    7.3 Conclusions
8 Conclusions, Discussion and Recommendations
    8.1 Conclusions
    8.2 Discussion
        8.2.1 Stated Preference Survey
        8.2.2 Revealed Preference Data
    8.3 Recommendations
Bibliography
A Brainstorm Session
B Multi Criteria Analysis
C SP Pilot Survey
    C.1 Ngene file: Orthogonal Design
    C.2 Biogeme Model: Pilot Survey
    C.3 Online SP Survey
    C.4 Results Biogeme
D SP Final Survey
    D.1 Final Survey: Efficient Design
    D.2 Eight choice sets
E Addition to: quantitative analysis stated preference survey
F Data Processing: Matlab
G Revealed Preference additional information

LIST OF FIGURES

1.1 Visualisation of the research approach of this master thesis

2.1 Basic example of a fundamental diagram for pedestrians.
2.2 Hierarchical representation of route choice behaviour of pedestrians

3.1 Photo fragments of the introduction movie for the brainstorm session
3.2 Route choice question 2 of the final survey.
3.3 Route choice question 6 of the final survey.

4.1 Nationality distribution of the survey respondents.
4.2 Age and gender distribution of the 177 survey respondents.
4.3 Education level and annual household income distribution of the survey respondents.
4.4 Statistics of the answers to the mass event related questions 1 and 2.
4.5 Statistics of the answers to the mass event related questions 3, 4 and 5.
4.6 Respondents route choice question 5 of the survey.
4.7 Respondents route choice question 6 of the survey.

5.1 Map of the SAIL area where the locations of the WiFi-sensors are indicated.
5.2 Veemkade (photo by author, August 19, 2015)
5.3 Temporary bridges on the Javakade (photo by author, August 20, 2015)
5.4 Pawns divide multiple modalities (photo by author, August 20, 2015)
5.5 A crowded Veemkade on Saturday afternoon (photo by author, August 22, 2015)
5.6 Trees at the Javakade provide shade for the pedestrians (photo by author, August 22, 2015)
5.7 Different type of staff members at SAIL (photos by author, August, 2015)
5.8 Overview of the different types of signs at SAIL (photos by author, August, 2015)
5.9 Overview of the matrix signs at SAIL (photos by author, August, 2015)

6.1 Flow diagram of the process done in Matlab
6.2 Origin (O) and Destination (D) numbers.
6.3 Overview of the number of trips recorded with the GPS-trackers.
6.4 Age, gender and group size distribution over the 155 trips sample.
6.5 Alternative routes for OD12 and OD23 (station to Kop van Java).
6.6 Alternative routes for OD21 and OD32 (Kop van Java to station).
6.7 Route shares of OD12 and OD21.
6.8 78 Trips at main route OD12 divided by their turning points.
6.9 Route shares of trips at OD23 and OD32.
6.10 Trips from participants younger than 45 years.
6.11 Trips from participants who are 45 years or older.
6.12 Trips on Thursday August 20th
6.13 Trips on Friday August 21st
6.14 Trips on Saturday August 22nd
6.15 Trips on Sunday August 23rd
6.16 Trips where the total time is less than 3 hours
6.17 Trips where the total time is equal or more than 3, but less than 6 hours
6.18 Trips where the total time is equal or more than 6, but less than 9 hours.
6.19 Trips where the total time is equal or more than 9 hours

7.1 Attraction zones at the SAIL-area.
7.2 Main routes which are indicated by the signs in the SAIL-area.
7.3 Narrow streets are indicated with blue arrows in the map of the SAIL-area.

A.1 Factors of influence by participant 1
A.2 Factors of influence by participant 2
A.3 Factors of influence by participant 3
A.4 Factors of influence by participant 4
A.5 Factors of influence by participant 5
A.6 Factors of influence by participant 6
A.7 White Board with the retrieved attributes of the brainstorm session.

C.1 Overview of the pilot online SP survey made in GoogleForms.

D.1 Choice set 1
D.2 Choice set 2
D.3 Choice set 3
D.4 Choice set 4
D.5 Choice set 5
D.6 Choice set 6
D.7 Choice set 7
D.8 Choice set 8

F.1 Points Of Interest (POI) at the SAIL-area
F.2 Visualization of the removal of outliers in Matlab.
F.3 Visualization of multiple smoothing methods for trip 300.
F.4 Trip Plotter layout
F.5 Overview of the Matlab structure per Trip i

G.1 78 Trips at main route OD12 divided by their turning points.
G.2 Start trip in the morning, end trip in the afternoon
G.3 Start trip in the morning, end trip in the evening.
G.4 Start trip in the afternoon, end trip in the afternoon.
G.5 Start trip in the afternoon, end trip in the evening.
G.6 Start trip in the evening, end trip in the evening.
G.7 Position of the WiFi-sensors.

LIST OF TABLES

2.1 Attributes found in literature which could be of influence on the route choice of pedestrians.

3.1 Attributes which are used for the SP survey.
3.2 Ngene Pilot Survey, orthogonal design.
3.3 Statistical information of the respondents of the pilot survey.
3.4 Parameter estimates of all attributes
3.5 Ngene final survey, efficient design.

4.1 Correlations table of the socio-demographic- plus mass-event characteristics of the respondents.
4.2 Correlations table of the socio-demographic, mass-event and 8 route choice questions.
4.3 The distribution of answers on the 8 route choice questions (177 respondents).
4.4 Parameter estimates of ten MNL choice models in Biogeme.

6.1 Cross table of trips which walked from the station to the Verbindingsdam and back.
6.2 Cross table of trips which walked from the Verbindingsdam to the Kop van Java and back.
6.3 Correlations between the routes of each OD-pair.
6.4 Ferry North > Sumatrakade: Verbindingsdam > station (OD21)
6.5 Ferry North > Sumatrakade: Kop van Java > Verbindingsdam (OD32)
6.6 Correlations of socio demographic characteristics versus route choice at each OD-pair
6.7 Correlations between start- and end-day part of the trip versus route choices.

7.1 The discrete values of the attributes. They are specified for each route at each OD-pair.
7.2 Parameter value estimates, wherefore the binary data is used to estimate MNL choice models.
7.3 The scalar values of the attributes (minus crowdedness).
7.4 Parameter value estimates for the scalar estimates.
7.5 Parameter value estimates without ASC values.

A.1 General information about the brainstorm session
A.2 Overview of the ranking of the attributes.

B.1 Multi Criteria Analysis of attributes.

C.1 Results of the pilot survey.
C.2 Results of the pilot survey which were estimated in Biogeme.

D.1 Ngene final survey, efficient design.

E.1 Cross table of gender and highest education level.
E.2 Cross table of gender and frequency of event visits per year.
E.3 Cross table of household income and age of the participants.
E.4 Cross table of household income and group composition of the participants.
E.5 Cross table of frequency of mass-event visits per year and age of the participants.

G.1 Correlations of Full trips at OD12 (station all the way to the Verbindingsdam)
G.2 Full trips: frequencies of chosen routes
G.3 Correlations half trips OD12
G.4 Half trips: frequencies of chosen routes
G.5 Explanation of the scalar attributes.
G.6 The splits which are used for the crowdedness split estimations.
G.7 Crowdedness attribute values, per trip and per OD-pair.

1 INTRODUCTION

During mass events in city centres in the Netherlands, thousands of visitors gather in and around these centres. Examples of these events are the yearly returning King's Day and the succession of Willem-Alexander on the 12th of September 2013, both attracting approximately 700,000 visitors to the city centre of Amsterdam [1,2]. From the 19th until the 23rd of August 2015 a similar mass event was organized in the city centre of Amsterdam: SAIL Amsterdam (SAIL). Since its first edition in 1975, SAIL has grown to become the largest public event in the Netherlands and the largest free nautical event in the world. Every five years, about 600 ships navigate along the North Sea Canal before mooring in and around the IJ-haven in Amsterdam [3]. This year the organisation of SAIL expected approximately 2 million visitors, distributed over the five days of the event [4]. The municipality of Amsterdam decided to start a project on crowd behaviour throughout the city's network. This is done in collaboration with the Amsterdam Institute for Advanced Metropolitan Solutions (AMS Institute), the organisation of SAIL and DAT.Mobility. SAIL has been used as a test case for a real-time crowd monitoring system. The eventual aim of this test case is to expand the system into a decision support system for crowd management during future events. This thesis uses parts of the data (GPS and WiFi) gathered at SAIL as the base for its research.

1.1. PROBLEM STATEMENT

Mass events are received warmly by the inhabitants of the Netherlands. However, they also bring safety risks for the visitors and can cause major incidents, as has been experienced in the past. Recent examples are Pukkelpop (Kiewit, Belgium) in 2011 [5] and the Love Parade (Duisburg, Germany) in 2010 [6], which caused 5 and 21 fatalities, respectively. These incidents in Duisburg and Kiewit, like other disasters [7], involved large crowds and a lack of sufficient information for adequate measures. Nowadays the years of experience of the crowd managers are the major source of information. This information, however, is not always adequate and lacks quantitative measures. Sufficient information could be obtained via multiple methods. Using a system such as the real-time crowd monitoring system, which was tested at SAIL, could help to gain real-time information to distribute flows more equally over the available network. To make this real-time monitoring system more accurate and more useful in practice, new knowledge of the behaviour of the crowd (or pedestrians) is needed. Knowledge of how pedestrians choose their routes is lacking in particular. Besides, this new knowledge could also be used for measures taken beforehand, before the mass event takes place.

Knowledge gap
The influence of different factors on the route choice behaviour of pedestrians has been studied thoroughly in various studies. In these studies the methods of collecting data differ (as do, sometimes, their outcomes). Stated preference (SP) and revealed preference (RP) are the most widely used methods, and they are commonly used to predict the influences on the route choice behaviour of pedestrians. The influence of the built environment on route choices was studied, for example, by Bafatakis [8] and by Korthals Altes and Steffen [9], and Borgers and Timmermans [10] analysed the route choices of pedestrians in inner-city shopping centres. On the other hand, Zomer [11] investigated, during the Vierdaagsefeesten in Nijmegen, what the influence of information measures was on the activity choice behaviour during a mass event.


However, route choice behaviour at mass events in city centres has not yet been studied thoroughly. And since different methods could provide differing results, both methods could be used to compare their outcomes. To the knowledge of the writer, the comparison between the results of both SP and RP data has not yet been made for pedestrian route choice models, or in a similar field of study. That knowledge gap will be filled in this thesis.

Objective
This thesis analyses both the SP and the RP for attributes of influence on pedestrians' route choices at the mass event SAIL in the city of Amsterdam. For example, the influence of varying situations of crowdedness (different Levels of Service [12, 13]) on the route choice behaviour will be analysed. On the one hand, for both methods (SP and RP) influences are studied and their corresponding parameter values are estimated in an MNL model. On the other hand, since the two methods both analyse the same case (SAIL), the findings of the methods will be compared.

Research questions
The two research questions that are answered in this thesis are:

1. Which attributes, and to what extent, influence the stated (SP) or the revealed preferences (RP) of pedestrian route choice behaviour at the mass event SAIL?

2. How do the attributes that influence the pedestrian route choice behaviour at the mass event SAIL correspond for stated (SP) and revealed preference (RP) studies?

1.2. SCOPE

This thesis was conducted to find attributes of influence on pedestrians' route choice behaviour at mass events and explored whether SP and RP studies give varying results for these attributes. Given the complexity and extent of the total network at SAIL, and in order to find the value of influence on the route choices within the given time of this thesis, the orange route area was selected for further analysis. This was the main area at SAIL: the Java-eiland and the streets around the Veemkade were part of this area. Other parts of the SAIL area are not considered in this thesis. For pedestrian behaviour a distinction can be made between three different levels [14], enumerated below. This thesis focuses on the tactical level and, within that level, on the route choice behaviour of pedestrians. The reader should note that measures on the other levels could also influence pedestrian behaviour. For example, on the strategic level, the initial decision to go to SAIL influences the pedestrian behaviour. However, this thesis does not take this into account.

• Strategic: activity set choice
• Tactical: activity scheduling, activity area choice, route choice
• Operational: walking, waiting, performing activity, trajectory choice

Contributions to science and society
This thesis makes both scientific and societal contributions. Scientifically, more insight is obtained into the route choice behaviour of pedestrians at mass events. Above all, in terms of methodology, this thesis shows the comparison between SP and RP studies and reveals their pros and cons. Socially, the results of this thesis could be deployed for crowd management measures at the next edition of SAIL in 2020. In addition, the findings could be a resource for measures at other mass events and could eventually contribute to more safety within crowds.

1.3. METHODOLOGY

Figure 1.1 gives a visual overview of the research approach of this thesis. In the following paragraphs the figure is explained in more detail.

• Trigger: In this first phase the problem is identified and the objective and the research questions are formulated. It indicates the purpose of this master thesis.

• Literature Review: The literature review gathers information about the thesis topic. It gives insight into the knowledge gaps, methods and techniques, and hypotheses are formed. Furthermore, the attributes of the alternatives are defined and a selection is made of the attributes that are within the scope of this thesis.

Figure 1.1: Visualisation of the research approach of this master thesis. [Flow diagram with eight steps: 1. Trigger (problem identification); 2. Literature review (identify gaps and method); 3. Brainstorm, design (pilot) survey and collect data; 4. Analyse data and estimate models (stated preference); 5. Set-up case study SAIL; 6. Collect GPS data; 7. Analyse data and estimate models (revealed preference); 8. Conclusions and recommendations.]

• Stated Preference: The main goal of the SP research (by means of an online survey) was to find out the stated reasons for pedestrians to choose their routes at a mass event. A brainstorm session was held to determine the main attributes of influence on route choices at mass events. Based on this brainstorm and literature, six attributes were selected for the design of the pilot survey. The prior estimates of this pilot survey were used to design the final survey. In a choice model the parameters for each attribute of influence are estimated.

• Revealed Preference: The main goal of the RP research is to find out what the relative distribution of the crowd is and what the reasons for this distribution are: what percentage of the pedestrians decides to choose route a and what percentage decides to choose route b? Once the data is collected, it can be processed. A few values need to be retrieved. First, the route splits should be found for multiple OD-pairs. Simultaneously, the attributes of influence need to be quantified for these routes. The attribute values can then show what influence they have on the route choice of pedestrians. This is again captured in a choice model, for which the parameters need to be estimated.

• Conclusions: Finally, the conclusions are drawn. Once the data from both the SP and the RP research are reviewed, they can be compared. This chapter answers the research questions, elaborates on the discussion and gives recommendations for the future.
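The route splits per OD-pair mentioned in the revealed preference step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Matlab processing used for the GPS data: the function name `route_splits`, the route labels and the trip counts are hypothetical, and each trip is assumed to be already assigned to one OD-pair and one route.

```python
from collections import Counter, defaultdict


def route_splits(trips):
    """Compute the share of trips per route within each OD-pair.

    trips: iterable of (od_pair, route) tuples, one per observed trip.
    """
    counts = defaultdict(Counter)
    for od_pair, route in trips:
        counts[od_pair][route] += 1
    return {
        od_pair: {route: n / sum(routes.values()) for route, n in routes.items()}
        for od_pair, routes in counts.items()
    }


# Hypothetical observations: OD labels follow the naming used in this thesis,
# while the routes and counts are invented for illustration.
observed = [("OD12", "main")] * 3 + [("OD12", "shortcut")] + [("OD21", "main")] * 2
splits = route_splits(observed)
```

In this invented example three out of four trips on OD12 follow the main route, giving a split of 0.75 against 0.25 for the short cut. Splits of this kind, together with the attribute values per route, form the input for the choice model estimation.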

1.4. REPORT STRUCTURE

The report structure is based on the methodology cycle in Figure 1.1. Each block represents one chapter. Chapter 2 focusses on the literature and methods related to this thesis. Chapters 3 and 4 describe how the stated preference survey was designed and analysed. Chapter 5 gives an overview of the case study and the preparations which were done beforehand. In Chapters 6 and 7 the collected data is processed and analysed and choice models are estimated. Finally, Chapter 8 presents the final conclusions, discussion and recommendations, and compares the results of both data collection methods.

2 LITERATURE REVIEW

This chapter describes the existing literature and current theories on what influences the route choice behaviour of pedestrians at mass events. It starts, in Section 2.1, with the journey itself and how it is explained in theory. Then, the main theories on pedestrians and their walking and route choice behaviour are presented in Sections 2.2 and 2.3. Section 2.4 gives an overview of current route choice models and Section 2.5 explains the SP and RP data collection methods. The information relevant for this thesis is summarised in Section 2.6.

2.1. A JOURNEY
A commonly used urban transportation model is the four-step model by De Dios Ortuzar and Willumsen [15], which consists of four main steps that describe the choices people make when travelling. Conceptually, this model consists of nodes and links, wherein each node has a certain demand and supply. The nodes are connected via links, which allow the demand and/or supply to be transported. The four steps are enumerated and explained below.

1. Trip generation: This determines the frequency of trips made by people who originate or depart from a zone. The purpose of each trip is also defined in this first step;
2. Trip distribution: This matches the origins and destinations of all trips and shows where the major connections are;
3. Mode choice: This determines the modality which is used to fulfil each given trip;
4. Route choice: Once the modality is known for each trip, a route can be chosen which is linked to the given transport network.

Not all steps of the four-step model are taken into account in this thesis, although it is useful to know how the different steps influence the route choice. The mode choice can be left aside, because the thesis focuses only on pedestrians. Trip generation determines where the pedestrians originate from. The purpose of the trip is in this case leisure, because the case study is done at an event, where people do not commute but enjoy the event. Different locations at the event could therefore function as intermediate destinations for leisure purposes (for example tall ships at SAIL, food stands or music stages). The route choice is the focus of this thesis, which investigates the factors that make a certain route more attractive than another.

2.2. PEDESTRIANS
In traffic flow theory, the main principle is that traffic can be in one of multiple states: free flow or congestion. In addition, some theorists believe that there is a third state, namely synchronized flow. These theories are explained by the so-called fundamental diagrams. The fundamental diagram relates flow q (in pedestrians per second) to density k (in pedestrians per square metre). This theory is used to predict what happens if there is congestion and it gives insight into how to solve traffic-related problems. In the last decade, these theories for vehicles were transformed into a theory for pedestrians. In the same way, the fundamental diagrams can predict what happens in pedestrian flows. A basic example of a fundamental diagram for pedestrians is shown in Figure 2.1. The figure shows that the flow initially increases when the density of pedestrians increases; this is called the free flow stage. Once the flow reaches its maximum, the capacity (q_c), the speed of the flow decreases and the congested stage is reached. The speed keeps decreasing until the jam density (k_j) is reached and no more persons are able to enter the area.

Theories on the fundamental diagram for pedestrians differ even more strongly than for vehicles, due to the large variation in q_c and k_j. These measures vary strongly because of individual pedestrian characteristics and external conditions. Influences on the fundamental diagram are for example: age, culture, gender, shy away distance (distance to walls, other pedestrians, objects, etcetera), outside temperature, travel purpose, type of infrastructure and walking direction [13, p.40]. Traffic flow theory is not the main focus of this thesis, but it gives some insight into the movement of pedestrians. It is relevant to know that a higher density (a denser crowd) eventually leads to a lower speed. This behavioural aspect of pedestrians could influence their route choices.

Figure 2.1: Basic example of a fundamental diagram for pedestrians, with flow q (P/s) plotted against density k (P/m²).
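The shape described for Figure 2.1 can be reproduced with a minimal sketch. It assumes a linear (Greenshields-type) speed-density relation, which is not taken from this thesis, and the free speed and jam density are illustrative values only. Flow is expressed here per metre of walkway width.

```python
def flow(density, free_speed=1.34, jam_density=5.4):
    """Flow q = k * v(k) in P/(m s), for density k in P/m2.

    Assumption (not from the thesis): speed falls linearly from
    free_speed at k = 0 to zero at the jam density.
    """
    if not 0.0 <= density <= jam_density:
        raise ValueError("density must lie between 0 and the jam density")
    speed = free_speed * (1.0 - density / jam_density)
    return density * speed


def critical_density(jam_density=5.4):
    """Density at which flow is maximal under the linear assumption."""
    return jam_density / 2.0


def capacity(free_speed=1.34, jam_density=5.4):
    """Maximum flow q_c, reached at the critical density."""
    return flow(critical_density(jam_density), free_speed, jam_density)


if __name__ == "__main__":
    for k in (0.5, 1.5, 2.7, 4.0, 5.4):
        print(f"k = {k:.1f} P/m2 -> q = {flow(k):.2f} P/(m s)")
```

Below the critical density the flow rises with density (free flow); above it the flow falls until it is zero at the jam density (congestion). Other speed-density relations would change the numbers but not this qualitative pattern.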

Level of Service. Another theoretical concept in pedestrian studies is the Level of Service (LOS). This term is related to density: it provides a quantitative framework for the crowdedness at a certain location. The LOS scale ranges from LOS A to LOS F, where level A indicates a pedestrian density ≤ 0.18 P/m² or a flow < 0.27 P/(m·s), and level F indicates a pedestrian density ≥ 1.33 P/m² or a varying flow [16].

2.3. PEDESTRIAN BEHAVIOUR
In addition to the traffic flow perspective, the movement of pedestrians can be assessed by means of their choice behaviour. Hoogendoorn and Bovy [14] categorized the behaviour of pedestrians in three levels: strategic, tactical and operational. At the strategic level, pedestrians choose an activity set and the departure or arrival time of a certain trip. Once the activity set is known, the order in which the activities take place is chosen. This is called activity scheduling and is done at the tactical level. Next, the area in which to perform the activities is chosen and a route to this area is defined. Both are also done at the tactical level and depend on the network topology and timetables. Then, at the operational level, pedestrians walk, wait, perform an activity and interact with public transport. These aspects depend on the local geometry, obstacles, etcetera. As Daamen [13] states in her Ph.D. thesis, these choice processes are most commonly performed simultaneously, but the consecutive steps are shown in Figure 2.2. Indeed, at the tactical level, pedestrian route and activity choices can be made simultaneously, as Hoogendoorn and Bovy showed [14]. Given the scope of this research, this thesis assumes that route and activity choices are made consecutively, so that they can be analysed separately.

ROUTE CHOICE BEHAVIOUR
According to Ben-Akiva and Lerman [17], individual (route) choice behaviour depends on the four items enumerated below. This holds for every modality in which individuals make their own route choice, such as pedestrians, bicycles and vehicles. This thesis applies the theory to pedestrians, although pedestrian flows are very complex compared to vehicle flows.

Figure 2.2: Hierarchical representation of route choice behaviour of pedestrians [13] with the three behavioural levels [14]. Strategic level: activity set. Tactical level: activity scheduling, activity area choice, route choice. Operational level: trajectory choice.

1. Choice maker;
2. Choice set;
3. Attributes of alternatives;
4. Decision rules.

The choice maker is the person or group of people who choose the route. The route choice depends on the choice maker, because socio-economic characteristics of the pedestrian could influence the route choice. The set of alternatives from which the choice maker can choose is called the choice set. As Pagliara and Timmermans [18] state, it is very important to pick the right choice set to avoid biased parameters in the choice model. Important aspects are the size of the set, the composition of the set (joint distribution of attributes) and the spatial structure of the set (degree of overlap and crossings) [19]. The attributes of the alternatives describe the characteristics of the alternatives in the choice set in terms of attractiveness. The decision rules describe the internal procedure the choice maker follows to arrive at the final choice.

Since the goal of this thesis is to find the attributes that have the most influence on the route choice behaviour of pedestrians during mass events, the main focus lies on the choices made by individuals and the attributes of the alternatives they choose between. The next section gives an overview of possible attributes of the alternatives.

FACTORS INFLUENCING THE ROUTE CHOICE BEHAVIOUR
As stated in the previous section, the choice behaviour of pedestrians depends, among other things, on the attributes of the alternatives. To determine the attributes that influence the route choice of pedestrians during mass events, the literature was studied extensively. Below, the commonly found attributes are described per category: environmental, mass-event, socio-economic, natural environment and other influences. An overview of all these attributes and their expected influence is given in Table 2.1. An attractive influence means that pedestrians are more willing to choose a route because of the presence of the attribute. A repulsive influence means the opposite. A varying influence means that it depends on the situation (or alternative) whether the attribute has a repulsive or an attractive influence.

Environmental influences on route choices. Related to the path or road of an environment, multiple influences on the route choices of pedestrians can be named. First of all, the number of available routes is an important factor [20], because without alternative routes no choice can be made. Another influence on the route choice behaviour of pedestrians can be the type of road [21], in terms of materialisation and the modalities that use the road. A pedestrian-only road or a road with a high-quality walking surface could, for example, be more attractive to walk on than a sidewalk next to a large road for vehicles. In addition, as Korthals Altes and Steffen [9] state in their study on the "Experience and route choice in the city centre of Delft", the road width and the height-width ratio are also of influence. Guo and Loo [22] also found a significant influence of the road width on route choice, but due to the lack of variation in the case study areas (Hong Kong and New York) this effect could be smaller than measured. The number of turns, or the complexity or directness of a route, influences the route choice [9, 14, 21, 23]. A complex route could have either a positive or a negative influence: pedestrians could tend to take the easiest route, but the more 'explorative' pedestrian (e.g. children) could find it more interesting to explore more complex routes. Intersections or crossings are assumed to influence route choices by Guo and Loo [22] and Bovy and Stern [21]: because of their disutility, in the form of potential waiting times or hazardous situations, pedestrians tend to avoid them. Similar findings are reported by Hoogendoorn and Bovy [14] for other obstacles on the path.

Related to the buildings in an environment, the building type influences the route choice according to Bafatakis, Guo and Loo, and Bovy and Stern [8, 21, 22], whereby the year in which the building was built makes a difference (e.g. a distinction between modern and old buildings). Building density [9, 21], land use along the route (commercial or residential areas) [8, 21–23] and the visual pleasantness or aesthetics of a building (e.g. varying facades or colours) [9, 21] are also found to be factors of influence. Landmarks (e.g. churches, bridges, statues, notable facade elements or colours, squares) can, according to Korthals Altes and Steffen [9], influence route choices and especially play a role in the orientation of pedestrians.

Other environmental influences include the topography: hilly topographies (e.g. bridges and slopes) tend to be avoided by pedestrians, due to the extra effort needed to conquer them [21, 22]. Vegetation, in terms of grass, bushes and trees, is of influence according to Bafatakis [8], Korthals Altes and Steffen [9] and Hill [23]. Trees in particular can provide shelter in varying weather conditions [21], in terms of shade and dryness, and are found to have a positive influence on the route choice of pedestrians. The presence of water (fountains, canals, rivers) is explored by Korthals Altes and Steffen [9] and categorized as a positive aesthetic influence of the environment. In addition, they analysed the ambiance of an environment, taking into account the quality of the environment (e.g. litter) and the other pedestrians on the streets.

Mass-event influences on route choices. Multiple mass-event related influences are explained in this paragraph. One of the first factors is the trip purpose; as the four-step model by De Dios Ortuzar and Willumsen [15] shows, this is included in the first step: trip generation. In this case the trip purpose of (almost) all pedestrians is leisure, because they are visiting a mass event to have fun. Attractions along the route can be seen as a stimulation of the environment and are found in multiple studies to have a major influence on the route choice of pedestrians [14, 21, 22, 24]. Mass events naturally bring multiple attractions to an event area (e.g. pop stages, markets, etcetera). Besides, these attractions bring many visitors to the location of the event. Therefore, crowdedness is an influence that can be related to the mass-event influences on route choices. As Daamen [13] states: "Even if the progress on a direct route is relatively slow, still the choice for a longer route (in distance) is seldom made." This is a very interesting statement, which can be analysed in the upcoming chapters based on the data retrieved in the case study.

During mass events, multiple measures are taken to give information about the main attractions, routes, crowdedness, etcetera. Zomer [11] found that these information measures can have a large influence on the route choices of pedestrians and could even steer or spread the crowd over the network. The time of the trip and the group size are also relevant during a mass event. Crowds could become tired during the day and pick another route when they are tired. Groups, on the other hand, take other decisions than individuals [20]. Besides, pedestrians could show herding behaviour, which is a decision style that can influence the route choice [21].

Next, at events there is an organization or crowd management team that gives certain directions to the crowd and therefore has a major influence on the route choices of pedestrians. Their main motives for taking decisions, the crowd management basics, are formulated by Hoogendoorn [25] in a set of golden rules. Two of them could influence pedestrians' route choice directly:

• Distribute flows over network: ensure using under-utilized parts of the network (e.g. using guidance). This measure is an extra factor of influence on pedestrians' route choices;
• Limit inflow if needed: keep number of pedestrians in (critical) facility below critical number (e.g. by perimeter control). This measure could oblige pedestrians to take another route at some parts of the network.

Other influences on route choices. Other general influences on route choices are, first, related to socio-economic factors. Age, gender, life cycle, income level, education, household structure, race, profession, length of residence and family could all be of influence. Second, there are the natural environment factors: noise and air pollution, day or night, weather conditions [21] and time of the day or week [24] could all influence the route choice. Finally, travel time [14, 21] and travel distance [14, 23] are considered to be among the major influences on route choices. These two (related) factors will significantly influence experiments if they are included.

Table 2.1: Attributes found in literature which could be of influence on the route choice of pedestrians at mass events.

Attribute | Expected influence on route choice | Source

Environmental factors
Traffic mixture: cars, public transportation, bicycles, pedestrians | repulsive influence | [21]
Width of road or side walk | attractive influence | [9, 21, 22]
Intersections or crossings | repulsive influence | [21, 22]
Topography: bridges, slopes | repulsive influence | [21, 22]
Building type: old building | attractive influence | [8, 21, 22]
Building density: high density | attractive influence | [9, 21]
Land use along the route: commercial use | attractive influence | [8, 9, 21–23]
Complexity of a route or directness: a more direct route | attractive influence | [9, 14, 21, 23]
Lighting | attractive influence | [21]
Visual pleasantness or aesthetics (f.e. varying facades) | attractive influence | [9, 21]
Landmarks or possibilities for orientation | attractive influence | [9]
High quality of the walking surface and environment | attractive influence | [21]
Vegetation, greenery | attractive influence | [8, 9, 23]
Number of routes available: many alternatives | attractive influence | [20]
Presence of obstacles, high number of crossings, discontinuation | repulsive influence | [14, 26]
Safety and shelter during poor weather conditions | attractive influence | [21, 26]
Presence of water fountains, canals, rivers | attractive influence | [9]

Mass-event factors
Attractions along the route, stimulation of the environment | attractive influence | [14, 21, 22, 24, 26]
Crowdedness | varying influence | [13, 14, 26]
Road & traffic information (f.e. (matrix) signs, social media, etc.) | attractive influence | [11, 21, 23, 27, 28]
Trip purpose | varying influences | [21, 23]
Group size and composition | varying influences | [20]
Herding | attractive influence | [21]
Crowd management who give directions | attractive influence | [25]

Socio-economic factors
Age, gender | varying influence | [21–23]
Income level | limited influence | [21]
Education level | limited influence | [21]
Household structure, family | limited influence | [21]
Nationality | varying influence | [21]
Profession | limited influence | [21]
Decision style or habit | always the same route | [21, 27]
Familiarity with or visibility of the environment | varying influence | [23, 24, 28]

Natural environment factors
Noise- and air pollution | repulsive influence | [21, 26]
Day/night | varying influences | [21]
Time of the day or week | varying influences | [24]
Bad weather conditions | covered streets chosen | [21]

Other factors
Travel time: lower | attractive influence | [14, 21, 29]
Travel distance: shorter | attractive influence | [14, 23, 26]

2.4. DISCRETE CHOICE MODELS
Discrete choice models describe decision making as a process of choices among alternatives. Within discrete choice theory there are multiple types of models which could be used to describe the route choice behaviour of pedestrians. This section gives a short explanation of the models that are most applicable for this thesis.

Deterministic versus Probabilistic Models. Deterministic models are mathematical models which always give the same outcome if the input values are the same. An example is a driver who always chooses the shortest path based on the minimization of a single variable such as time or distance, found with Dijkstra's algorithm [30]. However, the behavioural limitations of this approach have motivated the development of probabilistic models [31], which describe a system based on stochastic variables and are closer to pedestrian behaviour. An example is a route choice of drivers that is determined by drawing a random number from a probability distribution. This thesis uses probabilistic models to describe the route choice behaviour of pedestrians, because it is based on stochastic variables extracted from revealed and stated preference studies.
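The contrast between the two model types can be made concrete with a minimal sketch. The network, its link weights and the route probabilities are hypothetical and serve only as illustration; they are not data from this thesis.

```python
import heapq
import random


def shortest_path(graph, origin, destination):
    """Dijkstra's algorithm: return (cost, path) of the minimum-cost route."""
    queue = [(0.0, origin, [origin])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == destination:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    raise ValueError("destination is not reachable from origin")


def sample_route(routes, probabilities, rng):
    """Probabilistic choice: draw one route from a given distribution."""
    return rng.choices(routes, weights=probabilities, k=1)[0]


# Hypothetical network; link weights are walking times in minutes.
NETWORK = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"D": 5.0},
    "C": {"B": 1.0, "D": 8.0},
    "D": {},
}

if __name__ == "__main__":
    # Deterministic: the same input always gives the same route.
    print(shortest_path(NETWORK, "A", "D"))
    # Probabilistic: repeated draws can give different routes.
    rng = random.Random(2015)
    print([sample_route(["A-B-D", "A-C-B-D"], [0.3, 0.7], rng) for _ in range(5)])
```

The deterministic function returns the same route on every call, whereas the sampled route varies between draws according to the assumed probabilities.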

Utility Theory. Utility theory states that alternatives are characterised by their utility, the gain that the decision maker obtains by choosing that alternative. This utility is based on the attributes that influence the choice. The main assumption in these models is that pedestrians make a subjectively rational choice between alternatives, trading off a good performance on one attribute (e.g. short time) against a poor performance on another (e.g. high cost). Either Random Utility Maximization (RUM), where the decision maker chooses the alternative with the highest utility, or Random Regret Minimization (RRM), where the alternative with the lowest regret (dis-utility) is chosen, can be used. The utility theory is summarized in Equations 2.1 and 2.2. Several empirical studies [21, 23] have shown that applying RUM to pedestrian route choices can be justified. Therefore, it is also applied in this thesis.

$$U_{ij} = V_{ij} + \varepsilon_{ij} \tag{2.1}$$

$$V_{ij} = \sum_{k} \beta_k X_{ijk} \tag{2.2}$$

• $U_{ij}$ = total utility (i = individual, j = alternative)

• $V_{ij}$ = structural utility, the sum of all utility parts

• $\varepsilon_{ij}$ = random utility

• $\beta_k$ = model parameter for attribute k, which is the same for all individuals (this coefficient needs to be estimated)

• $X_{ijk}$ = attribute level (k = attribute)
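As a minimal sketch of Equation 2.2, the structural utility of two routes can be computed as follows. The attribute names, coefficient values and attribute levels are hypothetical placeholders, not estimates from this thesis.

```python
def structural_utility(betas, attributes):
    """V_ij = sum over k of beta_k * X_ijk for one individual and alternative."""
    if betas.keys() != attributes.keys():
        raise ValueError("betas and attributes must cover the same attributes")
    return sum(betas[k] * attributes[k] for k in betas)


# Hypothetical coefficients (beta_k) and attribute levels (X_ijk).
BETAS = {"crowdedness": -0.8, "trees": 0.3, "water": 0.5}
ROUTE_A = {"crowdedness": 1, "trees": 1, "water": 0}
ROUTE_B = {"crowdedness": 0, "trees": 0, "water": 1}

if __name__ == "__main__":
    print("V(route A) =", structural_utility(BETAS, ROUTE_A))
    print("V(route B) =", structural_utility(BETAS, ROUTE_B))
```

Only the structural part of Equation 2.1 is computed here; the random term would be added on top of it and is what makes the resulting choice model probabilistic.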

There are multiple types of RUM models, whose characteristics are briefly explained below. For more information on discrete choice models the reader is referred to, among others, Train [32] and Ben-Akiva and Lerman [17].

Multinomial Logit. The Multinomial Logit (MNL) model is the most widely used utility model in discrete choice theory. Its main assumption is that the unobserved attributes are uncorrelated over the alternatives [32]. The MNL gives the probability that alternative j is chosen (see Equation 2.3). The probability $P_{ij}$ increases when the utility $V_{ij}$ increases, or when the utility of the other alternatives ($V_{im}$) decreases.

$$P_{ij} = \frac{e^{V_{ij}}}{\sum_{m \in S_i} e^{V_{im}}} \tag{2.3}$$

• $P_{ij}$ = the probability that individual i chooses alternative j

• $V_{ij}$ = utility of alternative j for individual i

• $S_i$ = choice set of the alternatives m of individual i
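Equation 2.3 can be evaluated with a few lines of code. The sketch below is an illustration only; the utility values are hypothetical, and subtracting the maximum utility before exponentiating is a standard numerical precaution that does not change the probabilities.

```python
import math


def mnl_probabilities(utilities):
    """P_ij = exp(V_ij) / sum over m of exp(V_im), for one choice set."""
    if not utilities:
        raise ValueError("the choice set must contain at least one alternative")
    v_max = max(utilities.values())
    # Shifting all utilities by the same constant leaves the ratios unchanged.
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    total = sum(exp_v.values())
    return {alt: value / total for alt, value in exp_v.items()}


if __name__ == "__main__":
    # Hypothetical structural utilities of two routes.
    for alt, p in mnl_probabilities({"route_a": -0.5, "route_b": 0.5}).items():
        print(f"{alt}: {p:.3f}")
```

Raising the utility of one route, or lowering that of the other, increases its choice probability, and the probabilities over the choice set always sum to one.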

Other RUM Models. In reality, alternatives or attributes can be correlated, and multiple models are designed to deal with this shortcoming of MNL.

Latent Class (LC) and Mixed Logit (ML) models are designed to capture heterogeneity in a less and a more advanced way, respectively. Heterogeneity may, for example, be produced by taste variations across groups of pedestrians in the data set, or arise when the choice sets considered by individuals vary [31].

Nested Logit (NL) and Cross-Nested Logit (CNL) models can cope with correlated alternatives. In these models the alternatives are allowed to be part of one (NL) or multiple (CNL) nests.

C-Logit or Path Size models assume that an overlapping path may not be perceived as a distinct alternative. One alternative could contain links which are shared by multiple alternatives. These models describe the correlation along overlapping paths.

Because of its ease of use, the MNL model is used to analyse the SP and RP data of this thesis. Other models could have been interesting for finding heterogeneity or correlations between alternatives and attributes, but significant attributes have to be found first. In addition, Chapters 4 and 7 show that the best estimated MNL models already give a fairly good fit.

2.5. DATA COLLECTION METHODS
As shown in the methodology, two methods (SP and RP) are used to collect data for estimating discrete choice models. SP data is based on responses to hypothetical travel situations in a survey context. RP data is based on actual choice situations of travellers and can be acquired with multiple techniques (for example surveys, counting equipment, GPS-trackers, etcetera). In recent years the emphasis has shifted from stated preference data to stated choice data, which is most similar to revealed preference data. However, concerns remain that actual choices and stated choices may be measuring different aspects of behaviour. These differences may be mainly important in forecasting, where the amount of variation in actual behaviour becomes critical [33]. Both methods could also be used to strengthen the results and conclusions, and to use the 'best of both worlds' [34]. This thesis uses both methods to compare the results of SP and RP, to find any similarities or differences between them, and to use both methods as a stronger foundation for conclusions on the topic.

It would be ideal, for forecasting purposes, to derive the essential information from SP data and to combine this with the RP data reflecting the actual conditions. The most desirable situation is thus to combine the stronger features of RP and SP data. Bradley and Daly [33] explain in their paper how this combination can be achieved. Unfortunately, this did not seem possible with the data sets collected for this thesis.

2.6. CONCLUSIONS
The literature shows that the research and methodologies used to find the attributes that influence route choices differ. Many attributes are found to influence the route choices of pedestrians. However, the influences during mass events in city centres have not been researched frequently, so they are a focus of this thesis. To find out in more detail which attributes have the most influence at mass events, a brainstorm session was conducted first, which is explained in the next chapter, Section 3.1.1. Furthermore, it is found that both stated and revealed preference data have their pros and cons, and that there are concerns about the aspects in which stated and actual choices could differ. By applying both methods (SP and RP), this thesis aims to provide insight into the similarities in their results. For the estimation of the choice models an MNL model is used, because its application to pedestrian route choice behaviour is justified in the literature and it is a good model for gaining first insights into the attributes.

3 DESIGN OF THE STATED PREFERENCE ROUTE CHOICE SURVEY

This chapter describes the online stated preference survey of this thesis. The aim of the SP survey is to investigate the factors that influence pedestrian route choices at mass events. To this end, the SP survey asks people to choose their preferred route based on two pictures they see. First, the pilot design is described in Section 3.1, wherein the attributes (via a brainstorm session) and levels are defined and a first design for the online survey is made. Based on the first parameter estimates of the pilot survey, a more efficient final SP survey is designed, as described in Section 3.2. The conclusions and findings of this final SP survey are discussed briefly in Section 3.3 and elaborated further in the following chapter.

3.1. PILOT SURVEY
A pilot SP survey is conducted to find prior values for the attributes and to test whether the participants understand the questions and photo choices. For the design of the pilot survey the steps of Molin [35] are used as a basis. He states that designing a survey can be done in three steps:

1. Selection of attributes;
2. Choose the attribute levels;
3. Combine the attribute levels into profiles.

Molin [36] describes the main challenges in creating an SP survey. First, it is important to create sufficient variation in the choice situations so that the intended utility functions can be estimated. This has to be done in such a way that the estimated parameters are reliable and have small standard errors. Next, the choice task should not exhaust participants, so the survey cannot be too extensive. Besides, the choice situations should resemble real-world choice situations as much as possible to increase validity.

For the pilot design, the number of attributes and the number of levels per attribute have to be chosen. A brainstorm session is organized to choose the main attributes. With these attributes a sequential orthogonal design is made in the program Ngene. A sequential design can typically only be generated in cases where each utility function has the same attributes with the same levels [37, p.72], so the same number of levels has to be chosen for each attribute: this thesis chooses 6 attributes, each with two levels. The number of choice sets that are possible with 6 attributes of 2 levels each is 2^6 = 64. This is too many to ask in a survey. Therefore, the total number of choice sets used is 12, and Ngene is used to design these 12 choice sets.
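The count of 2^6 = 64 can be verified by enumerating the full factorial of level combinations, as in the sketch below. The level coding (0 and 1) is a placeholder. The reduction to 12 choice sets was made in Ngene and is not reproduced here.

```python
from itertools import product


def full_factorial(n_attributes, n_levels):
    """Enumerate every combination of attribute levels (coded 0..n_levels-1)."""
    return list(product(range(n_levels), repeat=n_attributes))


if __name__ == "__main__":
    design = full_factorial(n_attributes=6, n_levels=2)
    print(len(design))  # number of level combinations in the full factorial
    print(design[:3])   # first few combinations, one tuple per combination
```

Each tuple holds one level per attribute, so adding a seventh two-level attribute would double the number of combinations, which illustrates why the number of attributes and levels has to be limited.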

3.1.1. BRAINSTORM SESSION
The goal of the brainstorm session in this thesis is to find out which attributes are most important for the route choices of pedestrians during an event. This session is used as input, besides the reviewed literature, to select the attributes for the design of the SP survey. A brainstorm is regularly used to produce a large number of ideas with a group of participants and is usually carried out at the beginning of the idea generation. Brainstorming was invented by Osborn around 1930 and consists of three steps [38]:


1. Diverging from the problem: Begin with a problem statement and generate lots of ideas;
2. Inventorying, evaluating and grouping ideas: An overview is created of the possible solution space and whether more ideas are needed;
3. Converging: choosing a solution.

A small group of 6 students was asked to participate in this brainstorm session of 45 minutes. The set-up of the brainstorm was guided by the three steps. A detailed description of the brainstorm can be found in Appendix A; the section below gives a short summary of the session.

1. Diverging from the problem. The brainstorm session started with an explanation of the problem statement. The group was asked to imagine themselves in Amsterdam at a mass event. A short introduction movie was shown to the participants to ensure they knew what kind of mass event the brainstorm was about. Figure 3.1 shows some photo fragments of this movie.

Figure 3.1: Photo fragments of the introduction movie for the brainstorm session

Once the movie was shown, the participants were asked to write down their individual opinions about influences on route choices. Next, a discussion was held to compare everyone's ideas and to generate some more. A few supporting questions were asked to start up the brainstorm, such as: How do you choose your routes at such a mass event? What elements in the city or in your environment determine the routes you choose? Does your route choice differ if you are in a group, familiar with the location, or tired at the end of the day? And how? To ensure the group was not steered too much in one direction, the participants were left very free to come up with ideas.

2. Inventorying, evaluating and grouping ideas. Once many ideas were generated, they needed to be organized. All ideas were gathered and different categories were specified. The categories the participants came up with were: External; Location specific; Personal preferences; Knowledge. These categories gave a clear overview of the generated ideas and led to some new ideas on missing attributes, which had not been named before.

3. Converging: choosing a solution. In the last step, all the attributes named in the previous stages of the brainstorm session are converged into a solution to the question: What are the most important attributes that influence the route choice of pedestrians? Table A.2 in Appendix A gives an overview of the categories and how the attributes are ranked. The attributes which were considered most influential by the six participants of the brainstorm are listed below, in order of (expected) importance. Numbers 8 and 9 contain multiple attributes, because these were ranked equally. Furthermore, some additional explanation is given to relate these attributes to the ones found in the literature.

1. Final destination;
2. Time pressure or in a hurry (or trip purpose and travel time);
3. In-between locations (or attractions on the route);
4. Avoid busy areas (or crowdedness);
5. Familiarity with the area;
6. Signs;
7. Information via mobile phones;
8. a) Activity program or planning; b) Most attractive route; c) Speed, faster if you are alone;
9. a) Weather conditions; b) Shelter for bad weather conditions; c) Go with the crowd (or herding).

3.1.2. SELECTION OF THE ATTRIBUTES AND LEVELS

This paragraph uses the literature discussed in Chapter 2 and the results of the brainstorm session explained in Paragraph 3.1.1 to select the attributes which are considered most influential for this thesis, by means of a Multi Criteria Analysis (MCA). The attributes are selected based on three criteria: 1) Is the attribute measurable in the RP at SAIL and in the SP survey? 2) Did the attribute receive at least two votes at the brainstorm and is it expected to have influence at the mass event? or 3) Was the attribute not named in the brainstorm, but named in the literature and expected to have influence at the mass event? The entire MCA table is shown in Appendix B, where these criteria are given credits. The feasible attributes, which were selected as suitable by the MCA, are enumerated below.

• Width of road
• Landmarks or possibilities for orientation
• Vegetation
• Presence of water
• Attractions along the route
• Crowdedness
• Information or signs
• Crowd management
• Time of the day or week

Selection for the Stated Preference Survey. Of the attributes found in the MCA, landmarks and attractions are expected to have the same influence at SAIL, because the tall ships (attractions) at the event also function as a possibility for orientation (landmark). Therefore they are combined into one attribute: attractions. Besides, information provided, for example, on a mobile phone or in flyers is hard to analyse in the SP survey. Therefore, only signs are used as an attribute. Next, crowd management and time of the day or week are both found hard to interpret in the SP survey, because in the RP situation crowd managers have a very direct influence through what they say to the crowd, and time or day is expected to be of little influence at SAIL. It is possible to include them, but due to the limited number of attributes, they are left aside. Furthermore, the environmental attributes expected to be most influential are also considered in the SP survey: road width; vegetation (or trees); and the presence of water, because the case study area is surrounded by a lot of water. These six attributes (Table 3.1) will be used for the further design of the SP survey. The next paragraph explains in more detail how this survey is designed. Besides, this thesis uses two levels per attribute in the SP survey. This means that each attribute has two options to choose from (e.g. vegetation or trees: Level 1, trees are present; Level 2, trees are not present). This number is chosen to ensure that not too many alternatives are generated with Ngene and to keep the SP survey simple and doable for participants.

3.1.3. DESIGN OF THE PILOT SURVEY

For the pilot survey an orthogonal design is chosen. This assumes that the attribute levels are not correlated over the different profiles. For example: a high quality modality is combined an equal number of times with a high price as with a low price. For the final survey design, these correlations will be distinguished and filtered. The software Ngene is used to design the different questions or profiles for the survey. An Ngene model (.ngs) was made to generate alternatives for 12 choice sets for the pilot survey design. This model can be found in Appendix C.

Table 3.1: Attributes which are used for the SP survey.

Attributes       Level 1    Level 0
1. Attractions   Yes        No
2. Crowdedness   Crowded    Less crowded
3. Signs         Yes        No
4. Road width    Wide       Narrow
5. Trees         Yes        No
6. Water         Yes        No

Table 3.2: Ngene Pilot Survey, orthogonal design. The meaning of 1 and 0 at each alternative can be found in Table 3.1.

Choice  alt1.                                                 alt2.
moment  Attractions Crowdedness Signs Road size Trees Water   Attractions Crowdedness Signs Road size Trees Water
1       1           1           1     1         1     1       0           0           1     0         0     1
2       0           0           0     1         1     0       1           0           0     0         0     0
3       0           1           0     1         0     1       0           0           0     1         1     0
4       0           1           1     1         0     0       1           0           0     1         0     1
5       1           1           1     0         0     0       0           1           0     0         1     0
6       1           0           1     1         1     0       1           1           0     0         1     1
7       1           1           0     0         1     1       0           0           1     0         1     1
8       0           0           1     0         1     1       0           1           0     1         0     1
9       1           0           0     1         0     1       1           1           1     1         1     1
10      0           1           0     0         1     0       0           1           1     1         0     0
11      0           0           1     0         0     1       1           1           1     0         0     0
12      1           0           0     0         0     0       1           0           1     1         1     0
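The orthogonality property described in Paragraph 3.1.3 can be checked directly on Table 3.2. The sketch below is an illustration only, not the procedure Ngene uses internally: it transcribes the twelve choice sets, counts how often each attribute takes level 1, and computes the largest absolute Pearson correlation between attribute columns within each alternative. The names `PILOT_DESIGN`, `level_counts` and `max_abs_correlation` are introduced here for illustration.

```python
from itertools import combinations

# Table 3.2, one row per choice moment: six levels for alternative 1, then six
# for alternative 2 (Attractions, Crowdedness, Signs, Road size, Trees, Water).
PILOT_DESIGN = [
    [1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1],
    [1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0],
]


def column(design, index):
    """Return one attribute column of the design as a list."""
    return [row[index] for row in design]


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def level_counts(design):
    """Number of times each column takes level 1 (level balance)."""
    return [sum(column(design, j)) for j in range(len(design[0]))]


def max_abs_correlation(design, columns):
    """Largest absolute pairwise correlation among the given columns."""
    return max(
        abs(pearson(column(design, i), column(design, j)))
        for i, j in combinations(columns, 2)
    )


balance = level_counts(PILOT_DESIGN)
alt1_max_corr = max_abs_correlation(PILOT_DESIGN, range(0, 6))
alt2_max_corr = max_abs_correlation(PILOT_DESIGN, range(6, 12))
print("level-1 counts per column:", balance)
print("max |r| within alt1:", alt1_max_corr)
print("max |r| within alt2:", alt2_max_corr)
```

For this table every attribute appears six times at each level, and the attribute columns within each alternative are pairwise uncorrelated, which is what the orthogonal design is meant to deliver.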

Survey Design. The survey is made in the online survey program Google Forms. The survey consists of three parts: 1) twelve choice sets, where for each set the participant has to choose his or her preferred route based on two photos; 2) personal experience at events; 3) personal or socio-demographic details.

3.1.4. RESULTS OF THE PILOT SURVEY

Fourteen responses were obtained for the pilot survey. The statistical information of the answers on the personal experience at events and personal details of participants can be found in Table 3.3. The answers on the choice moment part can be found in Appendix C, Table C.1.

Estimate prior values in Biogeme. This paragraph explains how the software Biogeme is used to estimate the prior values (or first estimation of the parameters) for a new design in Ngene, which will be used for the final survey. In Appendix C two examples of the Biogeme file are shown and the results of Biogeme for all runs are shown in Table C.2. It is shown that the attributes Attractions, Crowdedness and Signs have a significant influence in the choice model. Water is only significant when it is run as the only attribute (see run #6), and trees and road size are not significant. The estimated parameter values of the attributes are given in Table 3.4. If the value is negative the attribute has a repellent influence and if it is positive it has an attractive influence on the choice. The signs and sizes of the parameter values are mostly as expected. Attractions is the major attractive influence and crowdedness is the major repellent influence, which seems logical for a mass event. In addition, signs has a quite high attracting influence as well, while water is quite repellent. Road size and trees have little influence. Water is the only attribute whose sign is a bit odd: maybe water is repellent for some people, but it could also be expected to attract because of its pleasant scenery. For the design of the final survey, these prior parameter values can be used to make an efficient design in Ngene.

Table 3.3: Statistical information of the respondents of the pilot survey.

Number of respondents                  14
Average age                            37.0
Gender                                 36% male, 64% female
Nationality                            100% Dutch
Has been at a mass-event               100% Yes, 0% No
Frequency of visits [times per year]   21% 0-1, 57% 1-2, 21% >2
Group composition last event           7% alone, 36% family, 57% friends
Who decided the route                  21% participant himself, 29% group member, 50% together

Table 3.4: Parameter estimates of all attributes, based on the choice model estimation for the pilot survey for all parameters together.

Attribute        Parameter value
1. Attractions   1.26
2. Crowdedness   -1.53
3. Signs         0.827
4. Road Size     0.167
5. Trees         0.086
6. Water         -0.616

3.2. FINAL SURVEY DESIGN

Based on the pilot survey, as described in the previous section, a final survey is designed. This section describes the design of this survey and the adaptations which are made compared to the pilot survey. For the final survey design, the prior values are used to generate an efficient design of the survey. Ngene is run multiple times to find out the influence of the different prior values. All prior values are rounded to one decimal. In the first run the prior values for β1 to β6 are included, using Biogeme runs 1 to 6 (Appendix C, Table C.2). In the second run the prior values for β1 to β6 are included, using Biogeme run 7. In the third run the prior values for β1 to β3 are included, which are the significant parameters according to the pilot survey results. For each run, the survey design with the lowest D-error is chosen as an option for the final survey design. The three remaining designs (one from each of the three runs) were compared based on the attributes present. The third run seemed to have plausible combinations of attributes and did not include an odd choice set. Therefore this third run, with priors for only the significant attributes, is used for the design of the final survey. Furthermore, the efficient design consists of 8 questions instead of 12, which makes it more convenient for the respondents to fill in the survey. The minimum number of questions (or rows) is 7, which is defined in Equation 3.1 [37, p.72]:

\[ \text{minimum \# questions} = \frac{\text{\# parameters}}{(\text{\# alternatives} - 1)} = \frac{7}{(2 - 1)} = 7 \tag{3.1} \]

The design of the final survey can be seen in Table 3.5. For this design the prior values for β1 to β3 are included, because the design based on these prior values seemed sufficient. The values were: β1 = 1.3; β2 = -1.5; β3 = 0.8. For β4, β5 and β6 no priors were used, because they turned out to be not significant in the pilot survey. It is still expected that they have a certain influence, so they are taken into account in the final survey.

Table 3.5: Ngene final survey, efficient design. The meaning of 1 and 0 at each alternative can be found in Table 3.1.

Choice  alt1.                                                 alt2.
moment  Attractions Crowdedness Signs Road size Trees Water   Attractions Crowdedness Signs Road size Trees Water
1       1           1           0     0         0     0       0           0           1     1         1     1
2       0           0           1     0         1     0       1           1           0     1         0     1
3       0           1           1     1         1     1       0           0           0     0         0     0
4       0           0           0     1         1     0       1           1           1     0         0     1
5       1           0           0     1         0     1       1           1           1     0         1     0
6       1           1           1     1         0     0       0           0           0     0         1     1
7       1           0           0     0         1     1       0           0           1     1         0     0
8       0           1           1     0         0     1       1           1           0     1         1     0
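To make the D-error criterion mentioned above concrete, the sketch below evaluates it for the design in Table 3.5. It assumes the common textbook definition for an MNL model with one respondent: the D-error is the determinant of the asymptotic variance-covariance matrix (the inverse of the Fisher information matrix) raised to the power 1/K, with K the number of parameters. Ngene may apply additional normalisations, so the value need not match the software output. The priors β1 = 1.3, β2 = -1.5 and β3 = 0.8 come from the text; setting β4 to β6 to zero is an assumption made here because no priors were used for them. All function names are hypothetical.

```python
import math

# Table 3.5, one row per choice moment: six levels for alternative 1, then six
# for alternative 2 (Attractions, Crowdedness, Signs, Road size, Trees, Water).
FINAL_DESIGN = [
    [1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1],
    [0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0],
]

# Priors from the text; the zeros for beta4..beta6 are an assumption.
PRIORS = [1.3, -1.5, 0.8, 0.0, 0.0, 0.0]


def probability_alt1(x1, x2, beta):
    """Binary logit probability of choosing alternative 1."""
    v1 = sum(b * x for b, x in zip(beta, x1))
    v2 = sum(b * x for b, x in zip(beta, x2))
    return 1.0 / (1.0 + math.exp(v2 - v1))


def information_matrix(design, beta):
    """Fisher information of a two-alternative MNL: sum of p1*p2*d*d'."""
    k = len(beta)
    info = [[0.0] * k for _ in range(k)]
    for row in design:
        x1, x2 = row[:k], row[k:]
        p1 = probability_alt1(x1, x2, beta)
        diff = [a - b for a, b in zip(x1, x2)]
        weight = p1 * (1.0 - p1)
        for i in range(k):
            for j in range(k):
                info[i][j] += weight * diff[i] * diff[j]
    return info


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def d_error(design, beta):
    """det(inverse information)^(1/K); infinite if the design is singular."""
    det_info = determinant(information_matrix(design, beta))
    if det_info <= 0.0:
        return math.inf
    return det_info ** (-1.0 / len(beta))


final_d_error = d_error(FINAL_DESIGN, PRIORS)
print("D-error of the final design under the assumed priors:", final_d_error)
```

A lower value indicates smaller expected standard errors for the given priors, which is why the design with the lowest D-error is preferred within each Ngene run.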

Two out of the eight stated preference route choice questions which were asked in the final survey are shown below. The participant had to pick either route choice A or B. In Figure 3.2 choice A differs from choice B because the main route is indicated by signs, there are trees, it is less crowded and the street is narrower, while choice B has attractions (tall ships and a music stage) and water. In Figure 3.3 choice A differs from choice B because it is more crowded, there are attractions and signs, and the road is wider, while choice B has trees and water and is less crowded than choice A. Some extra information was given below the figures to explain whether there are event attractions along the route, because in reality the attractions are more present (easier to hear and see) than can be shown in the figures. This design will be elaborated on further in Chapter 4, wherein the statistics of the respondents and the results of the estimated choice model will be given.

Figure 3.2: Route choice question 2 of the final stated preference survey.

Figure 3.3: Route choice question 6 of the final stated preference survey.

3.3. CONCLUSIONS

This chapter gave the results of the brainstorm and the design of the orthogonal pilot survey and the efficient final survey. Six attributes were selected for the designs of the SP survey (both pilot and final), based on the discussion, literature and expected influence at mass events. In the results of the pilot survey it was seen that three out of six attributes were significant: attractions, crowdedness and signs. These are all mass event related attributes, and their influence is found to be logical, since the respondents are introduced to the mass event situation first. It is expected that the final survey will give a more reliable value for each parameter and therefore a better insight into the influence of the attributes, since the number of respondents is much higher. The next chapter gives the results of this final survey.

4 RESULTS OF THE STATED PREFERENCE SURVEY

This chapter elaborates on the results of the online stated preference survey, of which the design was explained in the previous chapter (Table 3.5). The goal of this chapter is to get insight into the results of the stated preference survey and to find out whether there are any significant relations between route choices, but also between the socio-demographic characteristics of the respondents and their stated preferences. First a quantitative analysis is conducted in Section 4.1 to give insight into the experience at mass events and the socio-demographic characteristics of the 177 respondents. Next, a choice model is estimated in Section 4.2 to gain insight into the attributes that influence the route choices.

4.1. QUANTITATIVE ANALYSIS OF THE SURVEY

To conduct a quantitative or statistical analysis the program SPSS is used. The goal of this statistical analysis is to get a general insight into the characteristics of the respondents of the online survey. A distinction is made between the socio-demographic characteristics and the answers to the mass-event related questions.

Socio-demographic characteristics of the 177 respondents. There were multiple questions about the socio-demographic characteristics of the 177 respondents of the online survey, which are explained below. The ratio of male to female respondents is 46% versus 54%, as is shown in Figure 4.2. Nationality had a ratio of approximately 75% Dutch and 25% foreign respondents (see Figure 4.1), among which four dominant nationalities could be distinguished: Indians (4%), Greeks (3.4%), Chinese (2.8%) and Germans (2.3%). The remaining 12.5% of foreign respondents was roughly equally distributed over Iranians, French, Irish, Mexicans, Russians, Belarusians, Portuguese, Brazilians, Canadians, Palestinians, Koreans, Lithuanians, Sri Lankans, Latvians, Spaniards, Hungarians, Italians, Finns, Ukrainians and Turks.

Figure 4.1: Nationality distribution of the survey respondents.


Figure 4.2: Age and gender distribution of the 177 survey respondents (46% male 54% female).

The most common age category is 25 to 34 years and the runner-up is 18 to 24 years (Figure 4.2). The high peak around these ages can be explained by the sender of the survey, who is in the same age category. It can furthermore be seen that the most common education level is Master's level, while the household income is generally low (Figure 4.3). It is expected that a high education level relates to a high income level. However, the respondents are still quite young and most likely include a high number of students (who are one of the largest groups that received the survey). Presumably this sample contains many students from Delft University of Technology. This group does not have a high income (yet). Compared to the Dutch population this sample is not very representative, since the average age of the Dutch population is around 39 years [39], its education level is generally lower [40] and its household income is higher on average [41]. Since this SP research will be compared with the RP data from SAIL, it would have been interesting to know whether this sample is representative of the SAIL visitors. Unfortunately this is unknown, because the data about the SAIL visitors is not as extensive.

Figure 4.3: Education level and annual household income distribution of the survey respondents.

Mass-event related characteristics. Next to the socio-demographic questions, five questions were asked which are related to the respondents’ experience at mass-events. Firstly, it was asked whether the respondent has visited a mass-event in a city centre before. As Figure 4.4a shows, not all respondents have visited a mass event. The frequency of visits per year is shown in Figure 4.4b. Whether the respondents do or do not prepare before they go to a mass-event is shown in Figure 4.5a: approximately 25% do not prepare before visiting, while 75% do. The respondents could choose one or multiple ways of preparing; 64% prepared in multiple ways.

(a) Answer to the question: "Have you ever visited a mass-event within a city centre?"

(b) Answer to the question: "What is the average number of mass-event you visit within a year?"

Figure 4.4: Statistics of the answers to the mass event related questions 1 and 2.

Next, it was asked who decided the routes that were taken at the event itself, to find out what the respondents’ perceptions of this are. Mostly the answer was "We decided together with the group" (Figure 4.5b). However, it is expected that in reality this is different, because it is frequently seen that there is one (or more) dominant person(s) who ’leads’ a group. The final mass event related question asked about the group composition at the respondent's last event visit. As Figure 4.5c shows, the dominant group composition was a visit with friends (69%). This is expected to be a reliable distribution, considering that the respondents to the survey are on average relatively young.

(a) Answer to the question: "How do you prepare before you go to a mass event?"

(b) Answer to the question: "Who is deciding what routes you take at a mass event?"

(c) Answer to the question: "What was your groups composition at the last mass event you visited?"

Figure 4.5: Statistics of the answers to the mass event related questions 3, 4 and 5.

Correlations. This paragraph gives an overview of the correlations which are found in the survey. Table 4.1 shows the correlations between the socio-demographic characteristics and the mass event characteristics. The significant correlations (at multiple significance levels: 10%, 5% and 1%) are marked bold in the table. It is visible that many significant correlations are found for these categories. The characteristics that are significant at the strictest level (1%) are explained below. In Appendix E these significant correlations are explained in cross tables.

Gender correlates significantly with highest education level and frequency of event visits per year. It is found that the female respondents of the survey generally have a lower education level (Table E.1). Furthermore, it is shown that females visit mass-events more frequently than men (Table E.2). Household income correlates significantly with age and group composition. In addition, age correlates significantly with group composition. It is seen that the higher the age of the respondent, the higher the household income (Table E.3), which seems logical. In addition, it is seen that the lower the household income (and so also the lower the age), the higher the chance that the respondent visited the mass-event with a group of friends, while older respondents (or those with a higher income) mostly visited with family (Table E.4). Furthermore, it is seen that frequency of mass-event visits per year correlates significantly with gender, age and group composition. Table E.5 shows that the higher the frequency of visits, the younger the respondent (and also the more likely it is that the group composition consists of friends instead of family).

Table 4.1: Correlations table of the socio-demographic- plus mass-event characteristics of the respondents. The significant correlations are marked bold. *Correlation is significant at the 10% level (2-tailed). **Correlation is significant at the 5% level (2-tailed). ***Correlation is significant at the 1% level (2-tailed).

Column order: Gender; Age; Nationality (Nation.); Highest educ. level (Educ.); Household income (Income); Visited mass event (Visited); Freq. event visits per year (Freq.); Group composition (Group); Who decided routes? (Decided); Prepare for event (Prepare).

                                      Gender    Age       Nation.   Educ.     Income    Visited   Freq.     Group     Decided   Prepare
Gender                Pears. Cor.     1         .052      -.140*    .297***   .140*     -.097     -.196***  -.051     -.069     -.125*
                      Sig. (2-tail.)  -         .494      .063      .000      .062      .198      .009      .502      .360      .098
Age                   Pears. Cor.     .052      1         -.016     .136*     .612***   .012      -.224***  -.224***  -.077     -.033
                      Sig. (2-tail.)  .494      -         .837      .070      .000      .878      .003      .003      .308      .660
Nationality           Pears. Cor.     -.140*    -.016     1         .157**    -.063     .018      -.035     -.063     -.001     .039
                      Sig. (2-tail.)  .063      .837      -         .037      .402      .813      .643      .404      .986      .605
Highest educ. level   Pears. Cor.     .297***   .136*     .157**    1         .186**    .045      -.147*    -.030     .067      -.033
                      Sig. (2-tail.)  .000      .070      .037      -         .013      .554      .051      .689      .375      .666
Household income      Pears. Cor.     .140*     .612***   -.063     .186**    1         -.077     -.127*    -.323***  -.019     -.109
                      Sig. (2-tail.)  .062      .000      .402      .013      -         .307      .093      .000      .799      .150
Visited mass event?   Pears. Cor.     -.097     .012      .018      .045      -.077     1         .473***   .571***   .336***   .132*
                      Sig. (2-tail.)  .198      .878      .813      .554      .307      -         .000      .000      .000      .080
Freq. event visits    Pears. Cor.     -.196***  -.224***  -.035     -.147*    -.127*    .473***   1         .289***   .138*     .072
                      Sig. (2-tail.)  .009      .003      .643      .051      .093      .000      -         .000      .066      .343
Group composition     Pears. Cor.     -.051     -.224***  -.063     -.030     -.323***  .571***   .289***   1         .132*     .247***
                      Sig. (2-tail.)  .502      .003      .404      .689      .000      .000      .000      -         .079      .001
Who decided routes?   Pears. Cor.     -.069     -.077     -.001     .067      -.019     .336***   .138*     .132*     1         -.056
                      Sig. (2-tail.)  .360      .308      .986      .375      .799      .000      .066      .079      -         .461
Prepare for event     Pears. Cor.     -.125*    -.033     .039      -.033     -.109     .132*     .072      .247***   -.056     1
                      Sig. (2-tail.)  .098      .660      .605      .666      .150      .080      .343      .001      .461      -
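The significance stars in Table 4.1 follow from the size of each correlation and the sample size. As a rough cross-check, the sketch below converts a Pearson correlation r into a t statistic, t = r·sqrt((n − 2)/(1 − r²)), and approximates the two-tailed p-value with the standard normal distribution. Two assumptions are made here: n = 177 for every pair (the number of valid answers per pair is not reported), and the normal approximation in place of the exact t distribution used by SPSS, so the p-values differ slightly from the table. The function names are hypothetical.

```python
import math

N_RESPONDENTS = 177  # assumption: every pair uses all 177 respondents


def t_statistic(r, n):
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def two_tailed_p(t):
    """Two-tailed p-value, large-sample normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))


def stars(r, n=N_RESPONDENTS):
    """Significance marker as used in Table 4.1 (10%, 5% and 1% levels)."""
    p = two_tailed_p(t_statistic(r, n))
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


# A few entries from Table 4.1.
examples = {
    "gender / highest educ. level": 0.297,
    "nationality / highest educ. level": 0.157,
    "gender / nationality": -0.140,
    "gender / age": 0.052,
}
for label, r in examples.items():
    p = two_tailed_p(t_statistic(r, N_RESPONDENTS))
    print(f"{label}: r = {r:+.3f}, approx. p = {p:.3f} {stars(r)}")
```

With these assumptions the markers reproduce the ones printed in the table for the four example pairs.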

In addition to these ten questions about the characteristics of the respondent, he or she had to answer eight multiple choice questions about his or her route choice preference. These questions were based on figures of the routes, and their results are reviewed in the next section. Table 4.2 first shows the correlations between the socio-demographic plus mass-event characteristics and these eight route choice questions. As the table shows, there are a few significant correlations between the socio-demographic or mass-event characteristics and the answers that the respondents gave to the eight route choice questions. These significant correlations are marked bold. Two significant correlations are found for the household income of the respondent: route choice questions 2 and 6 (significant at the 5% and the 1% level, respectively). Another correlation (at the 10% level) is found between question 4 and the group composition at the last event the respondent visited. A correlation (at the 10% level) is found between gender and question 7. A last correlation (at the 10% level) is found between nationality and question 8. It is counter-intuitive that household income correlates so strongly with two of the eight questions, because there is no easy explanation for this correlation. It is not known how the level of household income could influence the answers given in the stated choice survey. However, there is one very dominant household income group (≤ 10,000), which consists of almost 50% of the respondents. This could have influenced the high significance. The other correlations could be coincidence, because they are not highly significant (only at 10%). The eight route choice questions and the respondents’ answers are discussed further in the next section, where the choice models are estimated.

Table 4.2: Correlations table of the socio-demographic- plus mass-event characteristics versus the eight route choice questions of the respondents. The significant correlations are marked bold. *Correlation is significant at the 10% level (2-tailed). **Correlation is significant at the 5% level (2-tailed). ***Correlation is significant at the 1% level (2-tailed).

                                                         Q1        Q2        Q3        Q4        Q5        Q6        Q7        Q8
Gender                              Pearson Correlation  -.042     .010      -.049     -.085     -.078     -.054     .127*     .084
                                    Sig. (2-tailed)      .578      .899      .520      .261      .299      .475      .092      .269
Age                                 Pearson Correlation  .049      -.121     .015      -.103     -.067     .070      .046      .102
                                    Sig. (2-tailed)      .521      .107      .843      .173      .373      .353      .546      .178
Nationality                         Pearson Correlation  -.021     .020      -.067     .107      .117      .011      -.026     -.137*
                                    Sig. (2-tailed)      .780      .789      .377      .155      .121      .888      .731      .068
Highest educational level           Pearson Correlation  -.064     -.074     -.012     .025      .112      -.044     -.006     .020
                                    Sig. (2-tailed)      .401      .326      .873      .742      .138      .558      .937      .788
Household income                    Pearson Correlation  .102      -.185**   .032      -.123     -.105     .214***   -.013     .079
                                    Sig. (2-tailed)      .177      .014      .675      .104      .164      .004      .868      .295
Did you ever visit a mass-event?    Pearson Correlation  -.058     -.013     .026      .038      -.031     .052      -.059     .015
                                    Sig. (2-tailed)      .442      .860      .733      .615      .683      .488      .432      .844
Frequency event visits (per year)   Pearson Correlation  .015      .004      .078      -.046     -.097     .080      -.041     -.003
                                    Sig. (2-tailed)      .842      .956      .302      .540      .199      .288      .590      .968
Group composition                   Pearson Correlation  -.076     -.058     -.091     .140*     .019      .041      -.038     -.034
                                    Sig. (2-tailed)      .313      .443      .227      .063      .798      .586      .612      .655
Who decided routes?                 Pearson Correlation  -.083     -.082     -.051     -.037     .005      .062      -.002     -.096
                                    Sig. (2-tailed)      .273      .276      .497      .626      .949      .415      .976      .205
Prepare for event                   Pearson Correlation  .042      -.043     .076      .055      -.056     .017      .122      -.053
                                    Sig. (2-tailed)      .582      .571      .318      .466      .460      .818      .105      .480

4.2. CHOICE MODEL ESTIMATION OF THE SURVEY DATA

The previous section showed that the additional questions (socio-demographic and mass-event related) have no noticeable influence on the answers to the eight route choice questions. Therefore, this section elaborates on the eight route choice questions, for which respondents had to make a route choice based on two photos, each showing one optional route (see Chapter 3 for the explanation of these questions). The answers to these route choice questions are analysed with the program Biogeme. The goal of this analysis is to find the level of influence of all six attributes. Each attribute is first estimated separately to give an in-depth analysis of each factor. In the end, multiple attributes are combined to find out more about the interplay of the attributes.

In Table 4.3 the answer distribution of the 177 respondents is shown. Remarkable is the growing percentage of answers "Makes no difference", while the shares of answers A and B in the later questions are not per se more equally distributed. A reason for this could be fatigue of the respondents: they no longer see the difference between the photos very clearly and pick this option. On the other hand, question five shows the highest share towards one answer (93.2% for answer A) compared to the other questions, while the "Makes no difference" share is also lowest here. This seems quite logical.

Table 4.3: The distribution of answers on the 8 route choice questions (177 respondents).

Question   A       B       Makes no difference
1          24.0%   74.0%   1.7%
2          48.6%   50.3%   1.1%
3          19.2%   79.2%   1.7%
4          59.3%   39.5%   1.1%
5          93.2%   6.2%    0.6%
6          35.0%   62.7%   2.3%
7          71.8%   25.4%   2.8%
8          26.6%   65.5%   7.9%

The photos of question five are shown in Figure 4.6. In this figure a wide street with few other pedestrians and with attractions represents choice A (left photo), while a narrow alley with a dense crowd, attractions and a sign with "route" on it represents choice B (right photo). Furthermore, water is visible in choice A, while trees are present in choice B. The strong tendency to choose A could be interpreted as respondents trying to avoid the crowd, especially in narrow streets, even though signs suggest another main route. The choice model estimations will quantify this statement.

Figure 4.6: Respondents route choice question 5 of the survey.

For the SP data an MNL model is used to estimate the attribute parameters. In Chapter 2, Section 2.4 an overview was given of the usable discrete choice models. As the MNL model turns out to be justified for pedestrian behaviour and is sufficient to find the first parameter estimates, this model is applied for further analysis in this section.

Table 4.4: Parameter estimates of ten MNL choice models in Biogeme. Each row represents one estimation. The parameters which are found to be significant are coloured blue and their p-values are marked (* significant at the 1% level).

MNL #           Attractions  Crowdedness  Signs        Road size    Trees        Water        Final log-  Adjusted
                β1           β2           β3           β4           β5           β6           likelihood  ρ2
1      value    -0.037       -            -            -            -            -            -957.061    -0.001
       p-value  0.55         -            -            -            -            -
2      value    -            -0.875       -            -            -            -            -865.833    0.094
       p-value  -            0.00*        -            -            -            -
3      value    -            -            -0.614       -            -            -            -895.141    0.064
       p-value  -            -            0.00*        -            -            -
4      value    -            -            -            0.158        -            -            -952.93     0.003
       p-value  -            -            -            0.00*        -            -
5      value    -            -            -            -            0.083        -            -956.06     0.000
       p-value  -            -            -            -            0.13         -
6      value    -            -            -            -            -            0.240        -947.356    0.009
       p-value  -            -            -            -            -            0.00*
7      value    -            -0.780       -0.478       0.188        -            0.284        -817.544    0.142
       p-value  -            0.00*        0.00*        0.00*        -            0.00*
8      value    1.180        -1.890       0.0242       0.254        -0.193       0.335        -771.348    0.188
       p-value  0.00*        0.00*        0.76         0.00*        0.01*        0.00*
9      value    1.160        -1.870       -            0.253        -0.193       0.334        -771.394    0.189
       p-value  0.00*        0.00*        -            0.00*        0.01*        0.00*
10     value    -0.281       -            -0.690       -            -            -            -886.881    0.071
       p-value  0.00*        -            0.00*        -            -            -

(- = attribute not included in the model)

Results MNL model estimations. In Table 4.4 the estimations of ten MNL models are shown. Each row represents one estimation. Attributes are first estimated separately, then the significant attributes are combined and the model is slowly built up, until the highest ρ2 value is found. First, each attribute is estimated separately in an MNL choice model. Equations 4.1 and 4.2 show the utility functions for the two available alternatives (route choice A or B) of MNL model number 1, which only considers β1, the parameter of the attribute attractions. For MNL models 2 to 6 the equations look similar, only using other β’s. The data retrieved from the stated preference survey is used to estimate the parameter value of each attribute.

\[ U_1 = \beta_1 \times \text{Attractions}_1 \tag{4.1} \]
\[ U_2 = \beta_1 \times \text{Attractions}_2 \tag{4.2} \]

In which:
\(U_i\) = utility of alternative \(i\)
\(\beta_k\) = parameter of attribute \(k\)

In MNL models 1 to 6 the adjusted ρ2-values are quite low, so the models do not fit the data very accurately. Furthermore, attractions has a low β-value (-0.037), while it was expected that attractions have a highly attractive influence (high positive value). Signs also has an unexpected value: it was expected to have an attractive influence (positive value), but it has a strongly negative value (-0.614) in MNL model 3. The other values are as expected.

Secondly, the significant attributes of MNL models 1 to 6 are estimated together in another model (MNL #7). It is seen that the parameter values do not differ much from the individually estimated attributes. They have a maximum β-difference of 0.136 (for the attribute signs) and the positive or negative signs are the same. This means that they have the same kind of influence: they are either repulsive or attracting. Besides, the adjusted ρ2-value is remarkably higher than in the previous models.

Thirdly, a parameter estimation is conducted for all attributes together (MNL #8), since four out of six attributes were already found significant. In Equations 4.3 and 4.4 the utility functions for the two alternatives are given for this model. It is seen in the results that both attractions and trees turn out to be significant in this model. It is very remarkable that the value of attractions has increased strongly and now has a positive sign; this seems odd. In addition, signs is not significant any more and has a very different value (0.0242) compared to before. Also, based on literature, trees was expected to have an attractive influence, while in this model it has a repellent influence (negative value). On the other hand, the ρ2 is higher than the ρ2-values of the previous models, so this model fits the SP data more accurately.

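\[ U_1 = \beta_1 \times \text{Attractions}_1 + \beta_2 \times \text{Crowdedness}_1 + \beta_3 \times \text{Signs}_1 + \beta_4 \times \text{Road size}_1 + \beta_5 \times \text{Trees}_1 + \beta_6 \times \text{Water}_1 \tag{4.3} \]
\[ U_2 = \beta_1 \times \text{Attractions}_2 + \beta_2 \times \text{Crowdedness}_2 + \beta_3 \times \text{Signs}_2 + \beta_4 \times \text{Road size}_2 + \beta_5 \times \text{Trees}_2 + \beta_6 \times \text{Water}_2 \tag{4.4} \]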
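As an illustration of how Equations 4.3 and 4.4 translate into choice probabilities, the sketch below applies the MNL #8 estimates from Table 4.4 to route choice question 5. The attribute levels are taken from row 5 of Table 3.5, under the assumption that alternative 1 corresponds to choice A and that the 1/0 coding of Table 3.1 was also used in the estimation. The helper names are hypothetical.

```python
import math

# MNL #8 parameter estimates (Table 4.4).
BETA_MNL8 = {
    "attractions": 1.180,
    "crowdedness": -1.890,
    "signs": 0.0242,
    "road_size": 0.254,
    "trees": -0.193,
    "water": 0.335,
}

# Question 5, attribute levels from Table 3.5 (assumed: alt1 = A, alt2 = B).
ROUTE_A = {"attractions": 1, "crowdedness": 0, "signs": 0,
           "road_size": 1, "trees": 0, "water": 1}
ROUTE_B = {"attractions": 1, "crowdedness": 1, "signs": 1,
           "road_size": 0, "trees": 1, "water": 0}


def utility(levels, beta):
    """Systematic utility: sum of parameter times attribute level."""
    return sum(beta[name] * levels[name] for name in beta)


def mnl_probabilities(utilities):
    """Logit choice probabilities, shifted by the maximum for stability."""
    highest = max(utilities)
    exps = [math.exp(u - highest) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


u_a = utility(ROUTE_A, BETA_MNL8)
u_b = utility(ROUTE_B, BETA_MNL8)
p_a, p_b = mnl_probabilities([u_a, u_b])
print(f"U_A = {u_a:.3f}, U_B = {u_b:.3f}")
print(f"P(A) = {p_a:.3f}, P(B) = {p_b:.3f}")
```

Under these assumptions the predicted share for choice A is about 0.93, which is in line with the 93.2% of respondents who chose A in Table 4.3.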

Then, the attributes that were significant in choice model 8 are taken into account in the next model (MNL #9). It is visible that the positive and negative signs are the same and that the β-values hardly vary. The ρ2 is also slightly higher, and actually turns out to be the highest of all MNL models. So this model has the most accurate fit to the data.
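Because MNL #9 is MNL #8 with the parameter for signs fixed at zero, the two models are nested and their final log-likelihoods in Table 4.4 can be compared with a likelihood ratio test. This check is not part of the analysis reported above; it is a small sketch added for illustration. For one restricted parameter the test statistic is compared with a chi-square distribution with one degree of freedom, whose upper-tail probability equals erfc(sqrt(LR/2)). The function name is hypothetical.

```python
import math


def likelihood_ratio_test_1df(ll_restricted, ll_unrestricted):
    """LR statistic and p-value for one restricted parameter (chi-square, 1 df)."""
    lr = 2.0 * (ll_unrestricted - ll_restricted)
    # Upper-tail probability of chi-square(1): P(X > lr) = erfc(sqrt(lr / 2)).
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return lr, p_value


# Final log-likelihoods from Table 4.4: MNL #9 (without signs) vs. MNL #8.
lr_stat, p_value = likelihood_ratio_test_1df(-771.394, -771.348)
print(f"LR = {lr_stat:.3f}, p = {p_value:.2f}")
```

The statistic is far below the 5% critical value of 3.84, so dropping signs does not worsen the fit noticeably, which agrees with the p-value of 0.76 for signs in MNL #8.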

Drawbacks of the models. Due to the strange and varying values for the attributes attractions and signs, another model is estimated (MNL #10) wherein only the β’s of these two attributes are included. The model results vary again and give other values than before. The model fit is again very low (0.071), but both attributes are found to be significant. It is possible that the survey is designed in such a way that the attributes are not combined in the most efficient way. This could lead to the unreliable results found here for the β-values of attractions and signs. It was expected that the pilot study, its prior values and the efficient design with Ngene would overcome this drawback and produce reliable choice sets. The design of the choice sets for the survey nevertheless seems to have some faults, which strongly affect attractions and signs.

4.3. CONCLUSIONS

This chapter elaborated on the results of the stated preference survey, in which questions were asked about the personal characteristics of the respondents, their experience at mass events and their route choice preferences. The sample turned out not to be very representative of the Dutch population (the reference group, since 75% of the respondents were of Dutch nationality): the average age of this sample is lower, the education level higher and the household income lower. Presumably most respondents were students from Delft University of Technology. It should therefore be kept in mind that this is a very specific sample. Furthermore, most respondents had visited mass events before and about 75% of them prepare before going to a mass event. Friends were the dominant group composition and over 50% state that the routes were chosen together with the group. It is, however, expected that actual behaviour differs from this stated answer, because it is frequently seen that one (or more) dominant person(s) 'leads' a group. The correlations (Table 4.2) showed no noticeable influence of the respondents' characteristics on their answers to the route choice questions, so these characteristics are not taken into account in the model estimations. The MNL model with the most accurate fit (MNL #9 in Table 4.4) showed that five out of six attributes were significant: attractions (1.160), crowdedness (-1.870), road size (0.253), trees (-0.193) and water (0.334). This means that attractions are the most attractive attribute when choosing a route, while crowdedness is, according to this survey, the most repellent attribute. Road size, trees and water are somewhat attractive or repellent, but they have less influence than attractions and crowdedness.
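To illustrate how these coefficients translate into choice probabilities, the sketch below applies the standard binary logit formula, P(A) = exp(U_A) / (exp(U_A) + exp(U_B)), to the MNL #9 values reported above. The 0/1 attribute coding and the two example routes are assumptions made for illustration only; they are not the attribute levels of the survey design.

```python
import math

# Coefficients of MNL #9 as reported in the text.
BETAS = {
    "attractions": 1.160,
    "crowdedness": -1.870,
    "road_size": 0.253,
    "trees": -0.193,
    "water": 0.334,
}


def utility(attributes):
    """Linear-in-parameters utility: sum of beta_k * x_k over all attributes."""
    return sum(BETAS[name] * attributes.get(name, 0.0) for name in BETAS)


def choice_probabilities(route_a, route_b):
    """Binary logit probabilities for two alternative routes."""
    u_a, u_b = utility(route_a), utility(route_b)
    # Subtract the largest utility before exponentiating for numerical stability.
    u_max = max(u_a, u_b)
    exp_a, exp_b = math.exp(u_a - u_max), math.exp(u_b - u_max)
    total = exp_a + exp_b
    return exp_a / total, exp_b / total


# Hypothetical choice set (0/1 coding is an assumption, not the survey design):
# route A has attractions but is crowded, route B has neither.
route_a = {"attractions": 1, "crowdedness": 1}
route_b = {"attractions": 0, "crowdedness": 0}
p_a, p_b = choice_probabilities(route_a, route_b)
print(f"P(A) = {p_a:.3f}, P(B) = {p_b:.3f}")
```

With these assumed attribute levels the crowded route with attractions is chosen in roughly one third of the cases, because the negative crowdedness coefficient outweighs the positive attractions coefficient.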

Figure 4.7: Respondents route choice question 6 of the survey, and the shares of choices of the 177 respondents (2% choose ’Makes no difference’)

Furthermore, it was unexpected that signs are not significant in the route choice behaviour of pedestrians. It could be that the respondents did not perceive the signs as the most dominant factor for choosing A or B in the survey. As Figure 4.7 shows, 63% of the respondents chose the right figure, where there are no attractions or signs, but where it is less crowded. Presumably, crowdedness is so dominant that the signs or attractions have less influence or are not taken into account at all. This was also found in the comments of some of the respondents: "I generally would prefer the less crowded routes". However, signs is significant when it is estimated in other attribute combinations (MNL #3 and #7), and the β-values for signs (and also for attractions) differ strongly between the models. It is possible that the survey was designed in such a way that the attributes are not combined in the most efficient way. This could lead to the false results found here for the β-values of attractions and signs.

5 CASE STUDY: SAIL AMSTERDAM 2015

This chapter describes the Revealed Preference (RP) study. This RP study is done to reveal the behaviour of pedestrians, focusing on their route choices, during the mass event SAIL Amsterdam 2015. This event took place from the 19th until the 23rd of August 2015. Hill (1982) states that as a source of data collection at a case study, "Tracking is a form of ’simple observa- tion’ ..in which the observer has no control over the behaviour of sign in question and plays an unobserved, and non-intrusive role in the research situation." This RP study uses a new form of tracking: whereas Hill used an undercover investigator to track passers-by, this thesis uses the newest technologies, such as WiFi sensors and GPS-trackers, to collect data. A further difference is that in Hill's case the 'participants' did not know that they were participating, while in the case study at SAIL the tracking devices (GPS-trackers) are handed out to the participants. First, this chapter describes the preparations that were made before SAIL was held. It describes the measurement equipment that is used and gives a plan of approach for the event itself. Next, the expected outcome of this RP study is given. Third, a general overview is given of the days at SAIL on which the RP study was executed. Lastly, the quick findings made during SAIL are discussed.

5.1. BEFORE THE CASE STUDY

As was reviewed in the literature (Section 2.5), multiple possibilities for data collection exist. This thesis uses SAIL to collect RP data and to research the route choice behaviour of pedestrians at this case study. This section describes the preparation work done for the RP data collection, the available measurement equipment and the expected usability of the RP data.

Overview of the measurement equipment This RP study uses different measurement equipment for the collection of revealed preference data. One hundred GPS trackers are handed out to visitors of the event to track their routes during one day. This means that at least four hundred data sets can be collected over four days of the event. If visitors return their GPS tracker before the end of the day, it can be handed out to another visitor, so that an extra data set is collected on that day. Therefore, it is expected that more than four hundred data sets will be collected. The data of these GPS-trackers will be used to find out which routes the visitors of SAIL choose. The GPS trackers are bought at My GPS Tracker via the website mygpstracker.nl. The battery of a tracker lasts three to fourteen days, which is sufficient for the maximum of one day for which each tracker is handed out. The accuracy of the trackers is 5 metres, which is sufficient to see which route the participant takes. Besides the GPS trackers, the RP study uses WiFi-sensors and counting cameras to find out what happens at a given time at a certain location. The WiFi-sensors are able to track all persons who have enabled WiFi on their mobile devices. Each WiFi sensor recognizes a given IP-address, which makes it possible to reconstruct possible routes of these visitors. The cameras are able to count the number of persons at a given location at a certain time. In this thesis these images will be used to find the share of pedestrians who chose a certain route. The locations of the WiFi-sensors can be found in Figure 5.1. There are enough sensors to calculate the required shares, which will be used to measure the crowdedness at the mass event.


[Map legend: WiFi sensor (21x); counting camera (Telcamera, 8x)]

Figure 5.1: Map of the SAIL area where the locations of the WiFi-sensors are indicated.

Plan of approach, during the 5 days of SAIL The SAIL programme was different for all five days. On the first day (August 19, 2015) the SAIL-in parade took place, meaning that all tall ships entered Amsterdam via the Noordzeekanaal. From 6 pm till 11 pm the ships were docked in the IJ-haven. On the second till the fifth day of SAIL (August 20 till 23, 2015), the tall ships were open for visits from 10 am till 11 pm. In addition, multiple events were held all around the city on these days. The tall ships departed on the last day (August 23, 2015) during the closing event SAIL-out. During the event itself a team of students and staff of the Delft University of Technology and the AMS Institute collaborated to collect the needed data, including the data required for this thesis as described before. Multiple tasks were carried out during these days. The GPS-trackers had to be prepared: the team tested whether they worked and ensured a connection between the computer and the GPS signal. The GPS trackers were handed out from an AMS Institute stand at Amsterdam Central station, where the team asked for an e-mail address and a mobile phone number to ensure the return of the trackers. In addition, the participants had to fill in a form with a few personal details: age, gender and group size. Once the GPS-trackers were handed out, the team kept track of the GPS-data and ensured that there were no problems during use and that people did not forget to bring the trackers back. Finally, the team walked around the SAIL area to make notes on multiple aspects: the weather conditions, the general atmosphere of the event, special incidents, etcetera.

5.2. EXPECTATIONS OF THE CASE STUDY

The following paragraphs give the expected outcome of the revealed preference data that was to be collected at SAIL Amsterdam 2015, based on the literature review and the brainstorm session that was held.

Expected route choices The expectation was that most visitors would try to follow the main route of the event. If the event became more crowded, it was expected that more people would leave the main route and try to find quieter routes. However, the main expectation was that they would try to stay close to the attractions of the event. On the other hand, once the visitors were tired and wanted to go home, it was expected that they would choose the fastest route back to their car, the train station or their homes. In this case they probably do not care much about the scenery of the area or the attractions, but prefer the fastest or shortest route home.

Expected crowdedness The main crowds are expected around the main attractions: the stages, the markets and the tall ships. Especially around the stages it is expected that people will stand still and enjoy the music, which could lead to large crowds. In addition, congestion is expected in some areas where the capacity of the road decreases rapidly (e.g. at the Javakade, where small bridges need to meet the capacity of the wide road before them, and at the Veemkade, where construction works block the road at a certain point).

Data processing The GPS trackers are linked to the 'My GPS Tracker' database, in which each GPS tracker can be followed in real time during SAIL. Afterwards the database will be converted to a Matlab file (.mat) containing the GPS IDs, times, and latitude and longitude coordinates. Each choice moment has a latitude and longitude coordinate and each chosen route has its own coordinates. In Matlab, the data file with the GPS IDs should first be linked to the coordinates of each choice moment (1 till 13). Once it is known how many GPS IDs pass each choice moment and at which time, the chosen routes in the data file can also be linked to the coordinates of the choices. This results in a share of GPS IDs which choose path a and another share which choose path b at each choice moment. After the calculation of the shares of each path, it is possible to analyse crowdedness at different moments during the event.
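The linking of GPS IDs to choice moments and paths can be sketched as follows. This is a minimal Python illustration and not the Matlab processing used in the thesis; the matching radius, the function names, the one-reference-point-per-path simplification and the example coordinates are all assumptions.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def passes(track, point, radius_m):
    """True if any (lat, lon) fix of the track lies within radius_m of the point."""
    return any(haversine_m(lat, lon, point[0], point[1]) <= radius_m for lat, lon in track)


def route_shares(tracks, choice_point, path_points, radius_m=25.0):
    """Share of GPS IDs per path among the IDs that pass the choice point.

    tracks: {gps_id: [(lat, lon), ...]}
    path_points: {path_label: (lat, lon)}, one reference coordinate per path.
    IDs that pass the choice point but match no path are left out of the shares.
    """
    counts = Counter()
    for track in tracks.values():
        if not passes(track, choice_point, radius_m):
            continue
        for label, ref in path_points.items():
            if passes(track, ref, radius_m):
                counts[label] += 1
                break  # assign each GPS ID to the first matching path only
    total = sum(counts.values())
    return {label: counts[label] / total for label in path_points} if total else {}


# Made-up coordinates for one choice moment with two paths.
choice_point = (52.3770, 4.9000)
path_points = {"a": (52.3775, 4.9020), "b": (52.3760, 4.9020)}
tracks = {
    "T1": [(52.3770, 4.9000), (52.3775, 4.9020)],
    "T2": [(52.3770, 4.9001), (52.3760, 4.9020)],
    "T3": [(52.3770, 4.9000), (52.3775, 4.9021)],
    "T4": [(52.3800, 4.9100)],  # never passes the choice point
}
shares = route_shares(tracks, choice_point, path_points)
print(shares)
```

The radius of 25 metres is an arbitrary choice that is deliberately larger than the stated 5 metre accuracy of the trackers; in practice it would have to be tuned to the street layout around each choice moment.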

5.3. OVERVIEW OF THE DAYS AT SAIL

This section gives an overview of the progress and of noticeable situations at SAIL on all five days of the mass event.

Day 1: SAIL-in and Veemkade The first day of SAIL Amsterdam 2015 was Wednesday the 19th of August. It differs from the other days, because on that day the tall ships enter the city of Amsterdam from the North Sea canal. It was also the first day of this Revealed Preference study and therefore the first day of data collection. Unfortunately, due to some errors at the server of the company My GPS Tracker, the GPS-trackers did not work on the first day. The team, which consisted of a professor, multiple teachers, and PhD and graduate students of the TU Delft, handed out 5 GPS-trackers on the first day, but just one (tracker #20) out of five gave sufficient results. However, the results of this tracker are not very useful for this study, because the participants, an elderly couple, watched some ships at the central station and afterwards headed into the city centre of Amsterdam. This was indeed typical for the SAIL-in visitors: most of them stood along the shores and watched the tall ships passing by. There was therefore not much walking on this first day and not many route choices were made. The ferry from the central station towards the north side of the IJ was very crowded (as was also seen on the other four days), and police needed to bring structure to the access and egress process of the ferries by placing some fences.

Figure 5.2: Veemkade (photo by author, August 19, 2015)

Furthermore, after the shifts for handing out the GPS-trackers (around 10 pm), the team walked a small part of the orange route on the Veemkade until the Verbindingsdam. During this walk it could be seen that the event area was still a bit crowded. Figure 5.2 gives an impression of the first evening. People could walk freely on most parts of the route. However, some bottlenecks were experienced at the Veemkade, where the road was narrowed by entrances or queues to the tall ships, tables and food stands.

Day 2: Java-eiland The second day, a tour was made by the author of this study to get a clear overview of the event and its ambiance. The orange route of the event was partly followed and some noticeable things were seen during this tour.

Figure 5.3: Temporary bridges on the Javakade (photo by author: August 20, 2015)

First, it was noticed that some extra bridges were built at the Javakade to cross the small canals. Figure 5.3 shows these bridges and it also shows that most people, in a not very crowded situation, use the original bridge. My explanation for this would be the uncomfortably high step onto the temporary bridges and the herding behaviour of people, who just follow their predecessor. It is expected that once it is more crowded, more pedestrians will use the temporary bridges to avoid queueing. However, this cannot be observed with the GPS-trackers, because their accuracy is sufficient to identify the route, but not the specific location or path on a street. A second thing was seen at the Sumatrakade, where people needed to stay at the left (or right) side of a row of pawns. This actually worked pretty well and other modes were still able to use the other part of the road (see Figure 5.4). Furthermore, it was seen on this day that the main routes were indicated by some (small) signs and larger matrix signs, and that the majority of pedestrians walked in these directions. Just a few pedestrians took a short cut or walked against the main flow.

Figure 5.4: Pawns divide multiple modalities (photo by author, August 20, 2015)

Day 3: Purple route On the third day of SAIL the author took the Buiksloterweg ferry. First, a walk towards the Eye museum was made. Then, the Willem-I lock was crossed and the purple route was followed. The ferry at the IJ-plein was taken to return to the central station. A GPS-tracker was taken along to track the whole route. It was remarkable to see that the north side of the IJ was appreciably less crowded than the south side at the orange route. Pedestrians were able to walk freely on most parts of the purple route and more alternative paths could be taken. However, a noticeable thing was seen on the Nordwal, where almost all pedestrians walked on the dike, although it was pretty crowded and an extra path was made on the grass field just next to the dike. A logical reason for this (herding) behaviour is that the tall ships and the IJ could only be seen from the dike and were not clearly visible from the alternative path next to it. Once the author stood on the grass field to take some photos and videos, more people followed and dared to take the alternative path. It is expected that in a more crowded situation on the dike, when people start to feel more uncomfortable, the alternative path will be taken more frequently. Another noticeable thing was that the part of the purple route that leads back to the central station was deserted. There are three possible reasons for this: the route was badly indicated, because a small alley had to be entered before the signs were even visible; pedestrians returned via the ferry towards the orange route; or pedestrians returned via the same path as they came from, the Nordwal, and were able to see the IJ again. At the north to south ferries large queues were formed. Queues to cross the IJ were also seen at the south quay. It was said that the ferries had trouble crossing the IJ frequently, because they had to manoeuvre between the ships which sailed on the water, resulting in a lack of ferry capacity. Once returned at the south side of the IJ, a parade was seen at the Ruijterkade: multiple vehicles, marine staff and music bands marched in a parade towards the station. Next, a bottleneck was experienced at the Ruijterkade, just at the north side of the central station. People were getting uneasy and felt uncomfortable, because they could not walk freely. The reason for this bottleneck could be the narrowing of the path, but a sign was also placed in the middle of the Ruijterkade, as the figure shows, which created the opportunity for pedestrians to stand still around this sign and thereby block a part of the road for the large flow of pedestrians. At the GPS stand it was noticed that on this Friday a lot of people visited SAIL around 7 pm, just after they returned from work. In addition, the GPS-trackers were handed in quite late: a lot of them were returned just before midnight.

Day 4: Veemkade and Java-eiland The fourth day of SAIL, a Saturday, was expected to be the most crowded, because people have a day off and the weather, just like on all the other days, was very nice: sunny and around 30 degrees. On this day the orange route was walked by the author of this study and a GPS-tracker was taken along to track the route. The Veemkade, the starting point of the tall ships, was very crowded (see Figure 5.5). People were moving slowly and queues were formed to access the tall ships. The flow of pedestrians consisted of different segments: the major flow of pedestrians who walked in an eastern direction; a minor flow of pedestrians who walked in a western direction; and, at the edges of the path, people who stood still to queue, rest or watch the tall ships.

Figure 5.5: A crowded Veemkade on Saturday afternoon (photo by author, August 22, 2015)

At the Javakade pedestrians were resting in the shade of the trees to cool down (see Figure 5.6). This caused some minor congestion on the path, but this part of the orange route was less crowded than the Veemkade. On the other side of the Java-eiland, at the Sumatrakade, the same pawns were used as described for day 2. These still functioned pretty well in separating the different modalities.

Figure 5.6: Trees at the Javakade provide shade for the pedestrians (photo by author, August 22, 2015)

Once returned on the mainland, it turned out that the trams did not work, due to a failure of the electricity cables. Buses were deployed to transport the pedestrians from the end of the Veemkade (at station Rietlandpark) to the central station. The public transport was very crowded at this moment.

Figure 5.7: Different type of staff members at SAIL (photos by author, August, 2015)

Day 5: SAIL-out The fifth day was the last day of SAIL. On this day the ships started to sail out of the IJ-haven at 3 pm. The day finished around 6 pm, when all tall ships had left the harbour. The last GPS-tracker was handed in around 7 pm.

A general thing that was noticed over the five days of SAIL is the huge number of staff members who were visible at the event. Many people wore a SAIL t-shirt as if they were part of the group, but what role each person had was not always clear. Figure 5.7 gives an overview of these different kinds of staff members at SAIL. Another general remark can be made on the overload of signs in the event area. Figure 5.8 gives an overview of the different kinds of route signs which were present at SAIL. These signs could have influenced the chosen routes, while on the other hand the overload of signs could also give contradictory suggestions for routes. Figure 5.9 gives an overview of the different matrix signs and of the changes that were made to the messages they showed.

Figure 5.8: Overview of the different types of signs at SAIL (photos by author, August, 2015)

Figure 5.9: Overview of the matrix signs at SAIL (photos by author, August, 2015)

5.4. CONCLUSIONS

A few quick conclusions or findings can be drawn after the RP data collection was finished on Sunday August 23; they are summed up below. These conclusions are based on the author's experience at SAIL. A first remark can be made on the locations of the attractions in the SAIL area. At every location where multiple GPS-trackers were 'standing still', a main attraction was located. For example, at the entrances of the most popular tall ships there was a queue to enter the ship. Furthermore, people gathered at the music stages and stood still to wait for the ferries. Secondly, the main routes which were distributed by the SAIL organisation were followed by the majority of pedestrians. The same applied to the main direction of the flows. Only at the north side of the purple route did people tend to take another path. Thirdly, pedestrians tended to follow their predecessor, as could be seen at the temporary bridges and paths which were created especially for the SAIL event. It is important to note that the weather was quite similar over the five days (sunny and dry) and that there were no major incidents (or panic situations) at SAIL. This means that these two factors are unlikely to have influenced the route choices of the pedestrians or the retrieved data. These quick conclusions will be verified by reviewing the GPS data. The outcome and processing of this GPS data is explained in the next chapters, in which more quantitative results are given.

6 DATA PROCESSING AND QUANTITATIVE ANALYSIS OF THE GPS DATA

The previous chapter gave insight into the case study at SAIL Amsterdam 2015 and showed how the Revealed Preference data was collected with the GPS trackers at the event itself. This chapter elaborates on the GPS data of this case study. Section 6.1 gives an overview of the steps which are taken to process the raw data into a readable, analysable format. Then, in Section 6.2, a statistical analysis is conducted to find correlations between the socio-demographic characteristics and the route choice behaviour of the participants. Lastly, conclusions are drawn in Section 6.3.

6.1. PROCESSING OF THE GPS DATA FROM SAIL

The raw data which is retrieved at SAIL should be processed further, into a more readable format, before any conclusions can be drawn from the data. The software program Matlab is used for this. The data processing started with multiple input entities. These entities had to be transformed into the required output information in multiple process steps. Figure 6.1 shows a flow diagram of all the steps which are taken in Matlab to process the data. The input, process and output boxes are explained separately in the following paragraphs. Appendix F elaborates in more detail on each grey box in the figure and on the technical steps which are taken in Matlab.

Input data For the analysis in Matlab four different types of input were used. Firstly, the raw data: the data retrieved from the server of MyGPStracker, consisting of one large matrix describing the routes people followed. This data file contains the GPS-tracker ID, longitude, latitude and time. Secondly, a Google Forms table was made at SAIL, containing the characteristics of each participant of the experiment. The characteristics which were used for this thesis are: Trip ID, GPS ID, departure time, arrival time, gender, age and group size. Next, a points of interest list (POI-list) was made to determine which routes the participants took in the SAIL orange route area. The locations of these POI are shown in Appendix F, Figure F.1. Finally, the GPS-data was checked manually. Routes were added when the GPS-data missed some data points (see the next paragraph for an explanation).

Data Processing The input described in the previous paragraph is the starting point of the process phase. The main goal of this process is to understand the input data and to be able to analyse it and draw conclusions from it. This paragraph briefly explains the major steps. The raw data consisted of one file for all trips. One trip is defined here as starting at the time a person collected a GPS-tracker at the stand and ending at the time this person handed in the same GPS-tracker at the stand. Furthermore, the GPS data contained errors, which can be separated into two categories. First, some trips had large time gaps between two subsequent points, because the GPS-trackers did not always work correctly during the trip. Second, some subsequent points in the data made too large steps, both in time and distance, because of an incorrect measurement of the GPS-tracker (for example caused by reflections on buildings). These two types of errors were solved both by manually adding some points to the data, if it was clear that a trip followed a certain route, and by smoothing the data with a moving average method. Lastly, when the data was smoothed and completed, the routes of each trip were analysed.

[Flow diagram. Input: raw data (GPS server); trip data (Google Forms); points of interest list; manual removal of errors. Process: make data per trip; trip plotter; smoothen the data; find routes; make structure per trip. Output: characteristics per trip; number of trips per route.]

Figure 6.1: Flow diagram of the process done in Matlab
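The two error-handling steps named in the data processing paragraph, detecting large time gaps and smoothing with a moving average, can be sketched as follows. This is a Python illustration under assumptions, not the Matlab code of the thesis: the window size, the gap threshold, the (time, latitude, longitude) tuple layout and the function names are all hypothetical.

```python
def moving_average(values, window=5):
    """Centred moving average; the window shrinks near the ends of the series."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        segment = values[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def smooth_track(track, window=5):
    """Smooth latitude and longitude separately; timestamps stay unchanged.

    track: list of (time_s, lat, lon) tuples ordered by time.
    """
    times = [t for t, _, _ in track]
    lats = moving_average([lat for _, lat, _ in track], window)
    lons = moving_average([lon for _, _, lon in track], window)
    return list(zip(times, lats, lons))


def time_gaps(track, max_gap_s=60):
    """Index pairs (i, i + 1) of consecutive fixes more than max_gap_s apart."""
    return [(i, i + 1) for i in range(len(track) - 1)
            if track[i + 1][0] - track[i][0] > max_gap_s]


# Made-up trip: one outlier in latitude and one long gap before the last fix.
track = [(0, 52.0, 4.0), (10, 52.0, 4.0), (20, 52.3, 4.0), (30, 52.0, 4.0), (200, 52.0, 4.0)]
smoothed = smooth_track(track, window=3)
gaps = time_gaps(track, max_gap_s=60)
print(smoothed[2], gaps)
```

A moving average only dampens an outlier such as the third fix; it does not remove it. The gaps reported by `time_gaps` would be candidates for the manual completion described above.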

Output The output of this whole process consists of the characteristics of each trip. For each trip the gender, age and group size of the participant are known. Furthermore, it is known which OD-pairs each trip crossed and, if so, which route was taken for a specific OD-pair. Figure 6.2 gives the numbers of the origins (O) and destinations (D). These OD-pairs were selected because they form logical borders between different areas. Number 1 is Amsterdam Central station, number 2 is the Verbindingsdam and number 3 is the Kop van Java. When this thesis refers to trips on OD12, it means trips which start at origin number 1 and arrive at destination number 2.
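The OD-pair notation can be made concrete with a small sketch. It assumes that each trip has already been reduced to the ordered list of OD numbers it visited (1 = Amsterdam Central station, 2 = Verbindingsdam, 3 = Kop van Java); that reduction, the function name and the example trip are hypothetical and not taken from the thesis.

```python
# The four OD-pairs analysed in the thesis: OD12, OD21, OD23 and OD32.
VALID_OD_PAIRS = {(1, 2), (2, 1), (2, 3), (3, 2)}


def crossed_od_pairs(visited_nodes):
    """Return the OD-pairs crossed by a trip, in the order they were crossed."""
    # Collapse repeated visits to the same node, e.g. [1, 1, 2] becomes [1, 2].
    nodes = []
    for node in visited_nodes:
        if not nodes or nodes[-1] != node:
            nodes.append(node)
    return [f"OD{o}{d}" for o, d in zip(nodes, nodes[1:]) if (o, d) in VALID_OD_PAIRS]


# Hypothetical trip: station, Verbindingsdam, Kop van Java and back.
example = crossed_od_pairs([1, 1, 2, 3, 3, 2, 1])
print(example)  # ['OD12', 'OD23', 'OD32', 'OD21']
```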

Figure 6.2: Origin (O) and Destination (D) numbers. In total four OD-pairs are formed: OD12, OD21, OD23 and OD32.

6.2. STATISTICAL ANALYSIS OF THE GPS DATA

This section explains the first quantitative analysis that is done on the RP data. First, some general remarks and numbers are given about the data. Secondly, a statistical analysis is conducted in the program SPSS. A few questions are formulated, which are statistically tested with the RP data gathered at SAIL.

Sample size of the RP Data Figure 6.3 gives an overview of the number of trips which were captured during the RP case study at SAIL. Two bars are shown in this figure. The first bar is the total sample size of 322 trips, which represents the total number of GPS-trackers that were handed out at the AMS Institute stand. From this total sample, 155 trips crossed the area at SAIL on which this thesis focuses: the orange route area. This area is outlined in white in the figure. The remaining 167 trips either did not visit the orange route area, or were lacking too many data points of the GPS-trackers. Especially on the first day there were some problems with keeping the GPS-trackers active, so that some participants walked with a GPS-tracker that was not functioning.

[Bar chart: SAIL area, 322 trips; orange route area, 155 trips]

Figure 6.3: Overview of the number of trips recorded with the GPS-trackers.

Figures 6.4a and 6.4b show the characteristics of the sample of 155 trips. In summary, 25 percent more males participated, the average age of the participants is 46 years and the most common group size is 2. Overall it can be said that the typical participants of this case study, but also the typical visitors of SAIL, were elderly couples. It is seen that the older the participant, the more likely it is a man. However, it should be noted that most participants were couples, so the groups consisted of both men and women and only the man carried the GPS-tracker. The age and gender distribution looks different for the RP data than previously shown for the SP data (Table 4.2), where the peak lay around 25 years. For this RP data it is also unknown whether the sample is representative of the SAIL visitors, because the data about the SAIL visitors was not as extensive.

(a) Age and gender distribution. Average age is 46 years. 25% more male participants. (b) Group size distribution. Average group size is 3.

Figure 6.4: Age, gender and group size distribution over the 155 trips sample.

Questions The four questions which are formulated to analyse the RP-data of SAIL are enumerated below. Each question will be answered by conducting a statistical analysis in SPSS.

1. It is expected that pedestrians follow the main routes which were indicated in the SAIL-area, especially in the area of the orange route. Which routes are chosen by pedestrians who walked in the orange route-area?

2. Another expectation is that previously taken routes influence the next routes that are chosen. This question will be answered for the pedestrians who took the ferry from the north to the Sumatrakade. They are expected to take the main route in reverse (a route via the Javakade and the Veemkade), because most attractions (tall ships, food and stages) are found on this route. Which routes are chosen to the station by pedestrians who arrive at the Sumatrakade via the northern ferry?

3. It is not expected that gender or age has a large influence on the route choice of pedestrians. It could, however, be that older pedestrians take shorter routes, due to their physical characteristics. Furthermore, it is expected that group size has an influence on the route choice of pedestrians, because large groups are expected to walk more slowly and may have a more specific goal to reach. Do age, gender or group size influence the route choice?

4. Another question is whether the day and the time of day play a role in the route choices pedestrians made. For example, at very crowded times (such as Saturday during the day) it is expected that more people take routes other than the main route, due to the crowdedness. Does the time or day of the trip influence the route choices?

1. Which routes are chosen by pedestrians who walked in the orange route-area? Figures 6.7a to 6.9b show the route shares of the different types of routes for the four analysed origin-destination pairs: station > Verbindingsdam (OD12); Verbindingsdam > station (OD21); Verbindingsdam > Kop van Java (OD23); Kop van Java > Verbindingsdam (OD32). The shares of the OD-pairs station-Verbindingsdam-station and Verbindingsdam-Kop van Java-Verbindingsdam are explained separately in the following paragraphs. For trips which went both back and forth between one OD-pair, a total of 79 trips is found for OD12-21 and a total of 51 trips for OD23-32 (Tables 6.1 and 6.2). Figures 6.5 and 6.6 show the alternative routes which are distinguished in the GPS-data for this thesis. The real routes found in the GPS-data are simplified: not all short-cut routes are taken into account separately (short cuts are taken at many crossing streets), but they are combined into one route choice. This short-cut route is located at the place where most short cuts were taken in reality, so this assumption does not differ too much from reality.

Figure 6.5: Alternative routes for OD12 and OD23 (station to Kop van Java). [Map legend: main route, alternative route, short-cut to main route, short-cut; points 1, 2 and 3 are marked on the map.]

Station > Verbindingsdam > Station: Table 6.1 shows a cross table of the two OD-pairs OD12 and OD21. It shows, for each route picked on the first OD-pair, which route is picked on the second OD-pair. Five types of routes are shown in the figures: the main route, which is the main route indicated by SAIL in the right direction; the alternative route, which is the opposite route and the main route for the counterflow of pedestrians; the short cut to the main route, which is the short cut from the Piet Heinkade to the Veemkade via the Vriesseveem; the short cut, which is any other cut from or to the main route via another street; and public transport, which is the share of trips that took either bus or tram from around the Verbindingsdam area.

Figure 6.6: Alternative routes for OD21 and OD32 (Kop van Java to station). [Map legend: main route, alternative route, short-cut to main route, short-cut; points 1, 2 and 3 are marked on the map.]

It can be seen that the main route is picked most frequently from the station to the Verbindingsdam, while on the return trip to the station the short cuts are chosen most frequently. Furthermore, a share of the trips takes the short cut to the main route, both forth and back. In addition, more than ten percent take public transport back to the station.

(a) OD12: Station > Verbindingsdam (b) OD21: Verbindingsdam > Station

Figure 6.7: Route shares of trips at OD12 and OD21.

The figures above take into account all trips that walked on a certain route. In reality, however, not all trips that take the main route at OD12 complete the full route to the Verbindingsdam: some trips turn around partway along the main route. Figure 6.8b shows multiple points on the main route of OD12 and Figure 6.8a shows the share of trips that reached each point. Most of the trips walk the full length of the route. A large share (17%) walks as far as point B, where a square with some trees starts and, a bit further, multiple supermarkets and shops are present. Participants may have perceived this as a logical point to turn around. Appendix G contains a more extensive analysis of trips which returned halfway at OD12 (26 trips) versus trips which walked the full main route (52 trips). Overall, this extended analysis shows that trips which returned halfway tend to return via the same route more frequently than trips which walked the full route.

(a) Shares at main route OD12 (b) Turning points at main route OD12 (points A to F)

Figure 6.8: 78 Trips at main route OD12 divided by their turning points.

Table 6.1: Cross table of trips which walked from the station to the Verbindingsdam (OD12) and back to the station (OD21).

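| OD12 (rows) / OD21 (columns) | Alternative route | Main route | Short cut to main route | Short cut | Public transport | Total | Share |
|---|---|---|---|---|---|---|---|
| Main route | 12 | 3 | 3 | 26 | 5 | 49 | 62% |
| Alternative route | 0 | 0 | 0 | 0 | 1 | 1 | 1% |
| Short cut to main route | 0 | 5 | 0 | 8 | 1 | 14 | 18% |
| Short cut | 0 | 4 | 0 | 8 | 3 | 15 | 19% |
| Total | 12 | 12 | 3 | 42 | 10 | 79 | |
| Share | 15% | 15% | 4% | 53% | 13% | 100% | |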
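The shares in Table 6.1 follow directly from the counts. As a minimal sketch (the function and variable names are illustrative and not part of the thesis workflow), the row and column shares can be recomputed as follows:

```python
# Counts from Table 6.1: rows are the OD12 route, columns the OD21 route.
COLUMNS = ["Alternative route", "Main route", "Short cut to main route",
           "Short cut", "Public transport"]
CROSS_TABLE = {
    "Main route":              [12, 3, 3, 26, 5],
    "Alternative route":       [0, 0, 0, 0, 1],
    "Short cut to main route": [0, 5, 0, 8, 1],
    "Short cut":               [0, 4, 0, 8, 3],
}

def grand_total(table):
    """Total number of trips in the cross table."""
    return sum(sum(row) for row in table.values())

def row_shares(table):
    """Share of all trips per OD12 route (row total / grand total)."""
    total = grand_total(table)
    return {route: sum(row) / total for route, row in table.items()}

def column_shares(table, columns):
    """Share of all trips per OD21 route (column total / grand total)."""
    total = grand_total(table)
    sums = [sum(row[i] for row in table.values()) for i in range(len(columns))]
    return {name: s / total for name, s in zip(columns, sums)}

od12_shares = row_shares(CROSS_TABLE)
od21_shares = column_shares(CROSS_TABLE, COLUMNS)
for route, share in od21_shares.items():
    print(f"OD21 {route}: {share:.0%}")
```

Rounded to whole percentages, these values agree with the Share row and column of the table.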

Verbindingsdam > Kop van Java > Verbindingsdam: For these OD-pairs it is striking that on the way forth almost 70 percent choose the main route and just 20 percent choose the alternative route, while on the way back the choices are spread almost equally over the main route (via the Sumatrakade), the alternative route (via the Javakade) and short cuts. So on the way forth, people tend to take the main route more frequently than on the way back.

(a) OD23: Verbindingsdam > Kop van Java (b) OD32: Kop van Java > Verbindingsdam

Figure 6.9: Route shares of trips which walked at OD23 or OD32.

Table 6.2: Cross table of trips which walked from the Verbindingsdam to the Kop van Java (OD23) and back to the Verbindingsdam (OD32).

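| OD23 (rows) / OD32 (columns) | Alternative route | Main route | Short cut | Total | Share |
|---|---|---|---|---|---|
| Main route | 7 | 17 | 11 | 35 | 69% |
| Alternative route | 9 | 0 | 1 | 10 | 20% |
| Short cut | 3 | 1 | 2 | 6 | 12% |
| Total | 19 | 18 | 14 | 51 | |
| Share | 37% | 35% | 27% | 100% | |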

It is also remarkable to see that on the Java-eiland more people follow the main route on the way back than from the Verbindingsdam to the central station. Apparently the main route on the Java-eiland is more attractive than the main route from Verbindingsdam to station.

2. Which routes are chosen to the station by pedestrians who arrive at the Sumatrakade via the northern ferry? Before answering the question, the correlations for all OD-pairs are shown in Table 6.3. Significant correlations are found between OD12 and OD23, between OD12 and OD32, and between OD23 and OD32. This means that the previously taken route influences the route that is chosen at the following OD-pair. The specific case of route choices after a pedestrian egressed the northern ferry at the Sumatrakade is explained next. When a pedestrian takes the ferry from the north to the Sumatrakade, it is more likely that the alternative route is taken on OD21 (see Table 6.4). This route is the main route as suggested by the SAIL organisation, but in the opposite direction. This finding could mean that the attractions on these first routes are very attractive to pass, so a pedestrian who did not pass them before still wants to pass them. Furthermore, Table 6.5 shows that especially the short cuts are taken more frequently, compared to pedestrians who did not take the ferry (48% versus 26%). This can be explained by the fact that pedestrians prefer to see the Javakade: they either take the alternative route (via the Javakade) when they egress from the ferry, or they take a short cut to the Javakade.

Table 6.3: Correlations between the routes of each OD-pair. *Correlation is significant at the 0.05 level (2-tailed). **Correlation is significant at the 0.01 level (2-tailed).

| | | OD12 | OD21 | OD23 | OD32 |
|---|---|---|---|---|---|
| OD12 | Pearson Correlation | 1 | .068 | .182* | -.184* |
| | Sig. (2-tailed) | | .404 | .024 | .022 |
| OD21 | Pearson Correlation | .068 | 1 | .034 | .061 |
| | Sig. (2-tailed) | .404 | | .674 | .454 |
| OD23 | Pearson Correlation | .182* | .034 | 1 | .345** |
| | Sig. (2-tailed) | .024 | .674 | | .000 |
| OD32 | Pearson Correlation | -.184* | .061 | .345** | 1 |
| | Sig. (2-tailed) | .022 | .454 | .000 | |
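Table 6.3 reports Pearson correlations between the route choices at the OD-pairs. The sketch below shows how such a coefficient is computed, assuming each route is coded as an integer per trip. The coding scheme and the sample values are hypothetical and do not reproduce the thesis data; the significance test is left out.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical coded choices per trip (1 = main route, 2 = alternative
# route, 3 = short cut to main route, 4 = short cut).
od12_choice = [1, 1, 3, 4, 1, 2, 1, 4]
od23_choice = [1, 1, 2, 3, 1, 2, 1, 1]
print(round(pearson_r(od12_choice, od23_choice), 3))
```

Because the route codes are categories rather than quantities, the size of such a coefficient depends on the coding that is chosen, which is one reason to interpret it with care.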

Table 6.4: Ferry North > Sumatrakade: Verbindingsdam > station (OD21)

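| OD21 | Alternative route | Main route | Short cut to main route | Short cut | Public transport | Total (share) |
|---|---|---|---|---|---|---|
| No Ferry | 13 (15%) | 15 (17%) | 5 (6%) | 45 (51%) | 11 (12%) | 89 (80%) |
| Ferry Noord > Sumatrakade | 10 (45%) | 1 (5%) | 0 (0%) | 10 (45%) | 1 (5%) | 22 (20%) |
| Total | 23 (21%) | 16 (14%) | 5 (5%) | 55 (50%) | 12 (11%) | 111 (100%) |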

Table 6.5: Ferry North > Sumatrakade: Kop van Java > Verbindingsdam (OD32)

| OD32 | Alternative route | Main route | Short cut | Total (share) |
|---|---|---|---|---|
| No Ferry | 21 (39%) | 19 (35%) | 14 (26%) | 54 (70%) |
| Ferry Noord > Sumatrakade | 9 (39%) | 3 (13%) | 11 (48%) | 23 (30%) |
| Total | 30 (39%) | 22 (29%) | 25 (32%) | 77 (100%) |

3. Does age, gender or group size have influence on the route choice? As Table 6.6 shows, the socio-demographic characteristics of the pedestrians have little influence on the route choices: none of the correlations is significant at the 0.05 level. Group size was expected to influence the route choices. However, not many large groups participated in the GPS-tracker test (the dominant group size was 2), so this influence could probably not be detected in this RP dataset.

Table 6.6: Correlations of socio demographic characteristics versus route choice at each OD-pair

| | | OD12 | OD21 | OD23 | OD32 |
|---|---|---|---|---|---|
| Gender | Pearson Correlation | -.069 | .051 | -.093 | -.050 |
| | Sig. (2-tailed) | .394 | .532 | .249 | .538 |
| Age | Pearson Correlation | -.129 | .044 | -.057 | -.031 |
| | Sig. (2-tailed) | .109 | .587 | .483 | .706 |
| Groupsize | Pearson Correlation | -.060 | -.017 | .009 | .016 |
| | Sig. (2-tailed) | .457 | .831 | .915 | .841 |

Figures 6.10 and 6.11 show that the routes the participants took in the SAIL area differ by age. Participants of 45 years or older more often took a boat trip on the IJ and tended to visit the city centre more frequently. Participants under 45 years tended to go more frequently to other parts of the SAIL area, for example the NDSM-werf: this group is a more frequent user of the ferry to the north-west side of the IJ (towards the NDSM-werf).

Figure 6.10: Trips from participants younger than 45 years old (45 trips).

Figure 6.11: Trips from participants who are 45 years or older (90 trips).

4. Does the time or day of the trip influence the route choices? For this analysis each day of the data (20-08-2015 until 23-08-2015) is divided into three parts: morning (9:00 to 12:00), afternoon (12:00 to 18:00) and evening (18:00 to 24:00). Table 6.7 shows the correlations between each day-part (per day) and the choices made for each OD-pair. Both the start and the end day-part show a significant correlation for OD32. How this value should be interpreted is still uncertain. Furthermore, Appendix G shows the trips separated by the start and end day-parts of the trip. It shows that trips which start in either the afternoon or the evening, and end in that same day-part, do not include boat trips.
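The division into day-parts can be sketched as a simple classification of a timestamp. Whether a boundary hour (12:00 or 18:00) belongs to the earlier or the later part is not stated in the text; the half-open intervals below are an assumption, as are the function names.

```python
from datetime import datetime

def day_part(timestamp):
    """Classify a timestamp as 'morning', 'afternoon' or 'evening'.

    Assumed boundaries: 09:00 <= morning < 12:00 <= afternoon < 18:00
    <= evening < 24:00. Times before 09:00 return None.
    """
    hour = timestamp.hour
    if 9 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if hour >= 18:
        return "evening"
    return None

def single_day_part_trip(start, end):
    """True if a trip starts and ends in the same day-part on the same day."""
    return (start.date() == end.date()
            and day_part(start) is not None
            and day_part(start) == day_part(end))

start = datetime(2015, 8, 22, 13, 30)
end = datetime(2015, 8, 22, 17, 45)
print(day_part(start), day_part(end), single_day_part_trip(start, end))
```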

Table 6.7: Correlations between start- and end-day part of the trip versus route choices. The day is divided in three parts: morning (9:00 to 12:00), afternoon (12:00 to 18:00) and evening (18:00 to 24:00). * Correlation is significant at the 0.05 level (2-tailed).

| | | OD12 | OD21 | OD23 | OD32 |
|---|---|---|---|---|---|
| Started day part | Pearson Correlation | 0.026 | 0.076 | -0.073 | -.175* |
| | Sig. (2-tailed) | 0.747 | 0.348 | 0.368 | 0.029 |
| Ended day part | Pearson Correlation | 0.019 | 0.07 | -0.063 | -.173* |
| | Sig. (2-tailed) | 0.819 | 0.387 | 0.435 | 0.031 |

Day of the week influence. In addition to the time of day, the day of the week is taken into account. For each day the trips are plotted on the map: Thursday August 20th, Friday August 21st, Saturday August 22nd and Sunday August 23rd. A clear difference can be seen between Sunday and the other three days: on Sunday all the trips chose their routes around the IJ. An obvious reason could be that this was the last day of SAIL: the day of SAIL-out, when all the tall ships left Amsterdam. This made it a very interesting day to watch the watersides and see the tall ships leave. It is also visible that on this Sunday the Javakade is more crowded than the Sumatrakade, while on the other three days the trips seem more equally distributed. The SAIL-out could also be the reason for this phenomenon, because the most interesting tall ships were situated between the Veemkade and the Javakade.

Figure 6.12: Trips on Thursday August 20th (26 trips).

Figure 6.13: Trips on Friday August 21st (50 trips).

Figure 6.14: Trips on Saturday August 22nd (47 trips).

Figure 6.15: Trips on Sunday August 23rd (32 trips).

Total trip time influence. As Ben-Akiva et al. already showed for cars in their conclusion on "Modelling inter urban route choice behaviour" [42, p.328], route choice preferences vary significantly by trip length, trip purpose and trip frequency. For this thesis frequency is less of an influence, but the length or duration of the trip was expected to influence the route choices of the participants. Figures 6.16, 6.17, 6.18 and 6.19 show the trips for different total trip times. The trips are split every three hours, because this turned out to be the time frame after which changes in the trips are clearly visible. The first figure shows the trips with a total time of less than 3 hours, the second the trips between 3 and 6 hours, the third the trips between 6 and 9 hours, and the last the trips of 9 hours or more. A clear distinction can be made between these four figures. Trips under 3 hours do not go very far: they do not reach the Java-eiland or the north side of the IJ. Trips over 3 hours start to take ferries to visit the north side of the IJ, visit the Java-eiland and start taking pleasure boat trips on the IJ. Trips over 6 hours start visiting the north-western side (the NDSM-werf) of the SAIL area more regularly: they take the most western ferry. Trips over 9 hours start visiting the city centre more frequently.
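The split into four duration classes can be sketched with a sorted list of class boundaries. The class labels and function name are illustrative; the boundaries follow the figure captions (a trip of exactly 3 hours falls in the second class), and the trip counts are those stated in the captions of Figures 6.16 to 6.19.

```python
from bisect import bisect_right
from datetime import timedelta

CLASS_EDGES_HOURS = [3, 6, 9]
CLASS_LABELS = ["under 3 h", "3 to 6 h", "6 to 9 h", "9 h or more"]

def duration_class(duration):
    """Return the duration class of a trip given as a timedelta."""
    hours = duration.total_seconds() / 3600
    # bisect_right places a value equal to an edge in the higher class.
    return CLASS_LABELS[bisect_right(CLASS_EDGES_HOURS, hours)]

# Trip counts per class as stated in the captions of Figures 6.16 to 6.19.
TRIPS_PER_CLASS = dict(zip(CLASS_LABELS, [11, 76, 53, 15]))

print(duration_class(timedelta(hours=2, minutes=59)))
print(duration_class(timedelta(hours=3)))
print(sum(TRIPS_PER_CLASS.values()))
```

The four class counts add up to the 155 trips used in the analysis.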

Figure 6.16: Trips where the total time is less than 3 hours (11 trips).

Figure 6.17: Trips where the total time is equal or more than 3, but less than 6 hours (76 trips).

Figure 6.18: Trips where the total time is equal or more than 6, but less than 9 hours (53 trips).

Figure 6.19: Trips where the total time is equal or more than 9 hours (15 trips).

6.3. CONCLUSIONS

It can be concluded that 155 trips of the retrieved data were useful for the analysis of the SAIL area around the orange route. Many trips took the main route (the orange route) when they started their trip. When pedestrians returned to the station, more differentiation was found in the routes. Furthermore, the previous part of a trip influences the next part: if the ferry from the north was taken to the Sumatrakade, it is more likely that these pedestrians take the Javakade and the Veemkade. A reason for this could be the attractions (tall ships, music stages, food stands, etcetera), which are mainly found on this route. For Sunday it is noticeable that all trips took place alongside the water. This could be explained by the SAIL-out event, which took place on that day. Trips which lasted only one day-part (started and ended in the afternoon or in the evening) did not take a leisure boat trip. Visitors under 45 years were more frequent visitors of the north-west side (NDSM-werf) of the SAIL area, which is indeed an area aimed at a younger audience. Furthermore, the trip time and start time of the trips influenced the areas where the trips took place. Trips under three hours mainly visited the Veemkade and returned (just) before they crossed the Verbindingsdam. Trips over three hours did reach the Java-eiland, and a larger share of the trips over six hours visited the north-west side (NDSM-werf) of the SAIL area. Finally, trips over nine hours, which could be considered the day visitors, almost all took a boat trip and visited the city centre of Amsterdam more frequently.

The reasons for the chosen routes are analysed further in the next chapter, in which multiple MNL route choice models are estimated to find significant parameters and their corresponding extent of influence.

7 FACTORS OF INFLUENCE ON ROUTE CHOICES AT SAIL

The previous chapter described the quantitative analysis of the RP data and showed which routes were chosen most frequently. This chapter uses the RP data from SAIL to find the reasons for the route choices that the participants made. First, the alternative routes are explained in Section 7.1. Second, the possible factors of influence (or attributes) have to be determined for the case study area at SAIL. This is done in Section 7.2, which explains for each chosen attribute how it is quantified. Once the quantitative presence of these attributes is specified, choice models are estimated to determine the level of influence of each attribute. Each attribute is first estimated separately for each OD-pair, after which the significant models are combined. Finally, conclusions are drawn in Section 7.3.

7.1. ALTERNATIVE ROUTES IN THE SAIL AREA

For each OD-pair that is examined in the SAIL area (OD12, OD21, OD23 and OD32), multiple alternatives (or routes) might be chosen. This thesis includes four categories of routes, which were named in the previous chapter: main route, alternative route, short cut to main route and short cut. The figures and counts of these routes were shown in Chapter 6. It should be noted that the short-cut routes are in reality not always the same: each pedestrian could have taken another street as short cut. Only these four route categories are specified, to simplify the assignment for this thesis and because the data sample lacks variety and quantity. This thesis neglects the possible alternative public transport, since mode choice is not within its scope. Each of the remaining four alternatives is linked to the attributes. For example, the main route from the station to the Verbindingsdam (OD12) passed a lot of attractions (the tall ships), while the alternative route for this OD-pair passed no attractions. How this is quantified exactly is explained in the following section.

7.2. QUANTIFY AND ESTIMATE ATTRIBUTES AT SAIL

This section links the chosen routes at SAIL to the six attributes which were defined earlier in this thesis: attractions, crowdedness, signs, water, trees and road width. First the methodology is explained, then the attributes are estimated one by one. To relate the six attributes to the chosen routes at the event, each attribute has to be specified based on the environment and/or the situation at SAIL. There are several ways to link these attributes to the real situation, but it is not known which way is best. Each OD-pair is estimated separately for all the attributes, since the previous chapter showed that the choices at each OD-pair were quite divergent. Possible methods to link each attribute to the routes are described in the next paragraphs. The general approach starts with the specification of different methods, per attribute, to link the attribute to the routes or the situation at SAIL. Then, choice models are estimated and the significant parameters of each method are identified. Finally, the method which yields the highest significance of the attribute is used to estimate multiple attributes in one choice model. A distinction is made between what this thesis calls a binary and a scalar method to quantify the attributes for each alternative. They differ in whether or not they take the length of a route into account.


7.2.1. BINARY METHOD

First, a binary method is used to quantify the attributes for each alternative route (Table 7.1). This means that an attribute is either present (1) or not present (0) on a given route. Table 7.1 gives an overview of these discrete values for each attribute. The following paragraphs explain these binary values.

Table 7.1: The discrete values of the attributes. They are specified for each route at each OD-pair.

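| OD-pair | Alternative | Route choice | Attractions | Crowdedness | Signs | Road Size | Trees | Water |
|---|---|---|---|---|---|---|---|---|
| OD12 | 1 | Main route | 1 | 1 | 1 | 1 | 0 | 1 |
| | 2 | Alternative route | 0 | 0 | 0 | 1 | 1 | 0 |
| | 3 | Short cut to main route | 1 | 1 | 1 | 0 | 0 | 1 |
| | 4 | Short cut | 0 | 0 | 0 | 0 | 0 | 0 |
| OD21 | 1 | Alternative route | 1 | 1 | 0 | 1 | 0 | 1 |
| | 2 | Main route | 0 | 0 | 1 | 1 | 1 | 0 |
| | 3 | Short cut to main route | 1 | 1 | 1 | 0 | 0 | 1 |
| | 4 | Short cut | 0 | 0 | 0 | 0 | 0 | 0 |
| OD23 | 1 | Main route | 1 | 1 | 1 | 1 | 1 | 1 |
| | 2 | Alternative route | 0 | 0 | 0 | 1 | 1 | 1 |
| | 3 | Short cut | 0 | 0 | 0 | 0 | 0 | 1 |
| OD32 | 1 | Alternative route | 1 | 1 | 0 | 1 | 1 | 1 |
| | 2 | Main route | 0 | 0 | 1 | 1 | 1 | 1 |
| | 3 | Short cut | 0 | 0 | 0 | 0 | 0 | 1 |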

Binary quantify attractions. The first method for specifying attractions at the SAIL area is binary: there either are attractions (value 1) or there are no attractions (value 0). Several kinds of objects are considered attractions in this thesis. First of all, the tall ships are considered major attractions, because this is what the SAIL event is all about. Secondly, there are multiple music stages in the SAIL area, which also attracted visitors of the event. Lastly, food and drink stands are also considered attractions, because many people take a break during their visit to SAIL and therefore go to a food or drink stand. Figure 7.1 shows which routes are given a value 1 or a value 0 in the binary method. At SAIL the major attractions are found at the Veemkade and the Javakade. The other parts of the route, the Sumatrakade and the Piet Heinkade, have no or few attractions. When a short cut is taken, attractions are given a value of zero, because the data show that most short cuts lead away from the attractions (Veemkade or Javakade), so these pedestrians are assumed to have chosen the route based on another attribute.

[Map labels: Sumatrakade, Javakade, Veemkade, Piet Heinkade]

Figure 7.1: Attraction zones at the SAIL-area. Blue means that there are major attractions on this route and is given value 1 in Table 7.1. Grey means no or little attractions on the route and is given value 0.

Binary quantify crowdedness and signs. One method to include crowdedness in the choice model is to estimate it based on the experiences our team had at SAIL. This simple method is not very precise, but it is easy to implement. Because of the low confidence, it is applied in a discrete manner. The main routes for OD12 and OD23 were generally the most crowded and are given the value one. The short cuts were seen to be used as a 'get away' from the crowd, so they are given a value of zero. For signs, a value of one means that the chosen route follows, for most of its length, the route indicated by the main signs. Figure 7.2 shows the main direction of the signs. If the chosen route is in line with this direction, the value of 1 is given in Table 7.1.

Figure 7.2: Main routes which are indicated by the signs in the SAIL-area.

Binary quantify road size, trees and water. Figure 7.3 indicates the roads which are considered narrow streets in this thesis. If a route crossed a narrow street, the value of 0 is given; if it did not cross any narrow street, the value of 1 is given. If trees are dominantly present along a route, it is given a value of one; if not, a value of zero. Routes with trees are the Piet Heinkade (main route OD21), the Javakade (main route OD23) and the Sumatrakade (main route OD32). The SAIL area contains a lot of water, which makes it quite difficult to specify this attribute and to give it a useful value. Especially on the Java-eiland there is water along almost every route, so it will be difficult to find the effect of water on the route choice. At OD12 and OD21 there is somewhat more differentiation between water and no water. If there is dominantly water alongside the route, a value of 1 is given to this attribute.

Figure 7.3: Narrow streets are indicated with blue arrows in the map of the SAIL-area.

CHOICE MODEL ESTIMATION

First, each attribute is estimated separately per OD-pair in the choice models. In total twenty-four MNL (multinomial logit) choice models are estimated. Equations 7.1 to 7.4 give the utility functions of the choice model for the attribute attractions, specified for OD-pairs 12 and 21 (MNL #1 and #5 in Table 7.4). The utility functions for the other five attributes look similar, but have other betas (parameter values) and other attribute values. There are four alternatives (alternative routes) from which the pedestrian could choose at OD12 and OD21: the main route (U1), the alternative route (U2), the short-cut to the main route (U3) and the short-cut (U4). For OD23 and OD32 the pedestrian could choose from three alternatives: the main route (U1), the alternative route (U2) and the short-cut (U4). The alternative specific constants (ASCs) are added to capture the value of the unobserved part of the utility of each alternative. For one alternative (the short cut) the ASC is set to zero.

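\begin{align}
U_1 &= \mathit{ASC}_1 + \beta_1 \times \mathit{Attractions}_1 \tag{7.1} \\
U_2 &= \mathit{ASC}_2 + \beta_1 \times \mathit{Attractions}_2 \tag{7.2} \\
U_3 &= \mathit{ASC}_3 + \beta_1 \times \mathit{Attractions}_3 \tag{7.3} \\
U_4 &= \beta_1 \times \mathit{Attractions}_4 \tag{7.4}
\end{align}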
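To illustrate how these utility functions translate into route shares, the sketch below applies the standard multinomial logit formula, P_i = exp(U_i) / sum_j exp(U_j), to the four utilities. The parameter values are copied from row MNL #1 of Table 7.2 and the attraction values from Table 7.1; the function name is illustrative, and the thesis itself estimated the models in Biogeme rather than with code like this.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for a list of utilities."""
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

ROUTES = ["main route", "alternative route",
          "short cut to main route", "short cut"]
# Binary attraction values for OD12 (Table 7.1), in the order of ROUTES.
attractions = [1, 0, 1, 0]
# Illustrative parameter values from row MNL #1 of Table 7.2.
asc = [0.90, -3.09, -0.55, 0.0]  # the ASC of the short cut is fixed to zero
beta_attractions = 0.35

# Equations 7.1 to 7.4: U_i = ASC_i + beta_1 * Attractions_i.
utilities = [a + beta_attractions * x for a, x in zip(asc, attractions)]
probabilities = mnl_probabilities(utilities)

for route, p in zip(ROUTES, probabilities):
    print(f"{route}: {p:.3f}")
```

With these values the main route receives the highest predicted probability, in line with the observation that most pedestrians follow the main route at OD12.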

Table 7.2: Parameter value estimates of the MNL choice models estimated with the binary data. Each parameter is estimated individually in Biogeme (each row represents one estimation; each cell gives the value with the p-value in parentheses). The parameters which are found to be significant are marked (*).

| MNL # | OD-pair | Attribute (beta) | Main route ASC1 | Alternative route ASC2 | Short cut to main route ASC3 | Beta | Final log-likelihood | Adjusted ρ2 |
|---|---|---|---|---|---|---|---|---|
| 1 | 12 | Attractions (β1) | 0.90 (1.00) | -3.09 (0.00*) | -0.55 (1.00) | 0.35 (1.00) | -108.438 | 0.313 |
| 2 | 12 | Crowdedness (β2) | 0.90 (1.00) | -3.09 (0.00*) | -0.55 (1.00) | 0.35 (1.00) | -108.438 | 0.313 |
| 3 | 12 | Signs (β3) | 0.9 (1.00) | -3.1 (0.00*) | -0.6 (1.00) | 0.4 (1.00) | -108.438 | 0.313 |
| 4 | 12 | Road size (β4) | 1.9 (1.00) | -2.5 (1.00) | -0.2 (0.53) | -0.6 (1.00) | -108.438 | 0.313 |
| 5 | 12 | Trees (β5) | 1.3 (0.00*) | -1.6 (0.14) | -0.2 (0.53) | -1.6 (1.00) | -108.438 | 0.313 |
| 6 | 12 | Water (β6) | 0.9 (1.00) | -3.1 (0.00*) | -0.6 (1.00) | 0.4 (1.00) | -108.438 | 0.313 |
| 7 | 21 | Attractions (β1) | 0.29 (1.00) | -1.23 (0.00*) | -1.46 (1.00) | -1.16 (1.00) | -106.901 | 0.184 |
| 8 | 21 | Crowdedness (β2) | 0.29 (1.00) | -1.23 (0.00*) | -1.46 (1.00) | -1.16 (1.00) | -106.901 | 0.184 |
| 9 | 21 | Signs (β3) | -0.87 (0.00*) | 0.05 (1.00) | -1.34 (1.00) | -1.29 (1.00) | -106.901 | 0.184 |
| 10 | 21 | Road size (β4) | -0.17 (1.00) | -0.53 (1.00) | -2.62 (0.00*) | -0.70 (1.00) | -106.901 | 0.184 |
| 11 | 21 | Trees (β5) | -0.87 (0.00*) | -0.62 (0.99) | -2.62 (0.00*) | -0.62 (0.99) | -106.901 | 0.184 |
| 12 | 21 | Water (β6) | 0.29 (1.00) | -1.23 (0.00*) | -1.46 (1.00) | -1.16 (1.00) | -106.901 | 0.184 |
| 13 | 23 | Attractions (β1) | 1.4 (0.96) | 1.1 (0.10) | - | 1.4 (0.96) | -36.993 | 0.403 |
| 14 | 23 | Crowdedness (β2) | 1.4 (0.96) | 1.1 (0.10) | - | 1.4 (0.96) | -36.993 | 0.403 |
| 15 | 23 | Signs (β3) | 1.4 (1.00) | 1.1 (1.00) | - | 1.4 (1.00) | -36.993 | 0.403 |
| 16 | 23 | Road size (β4) | 1.5 (1.00) | -0.2 (1.00) | - | 1.3 (1.00) | -36.993 | 0.403 |
| 17 | 23 | Trees (β5) | 1.5 (1.00) | -0.2 (1.00) | - | 1.3 (1.00) | -36.993 | 0.403 |
| 18 | 23 | Water (β6) | 2.8 (0.00*) | 1.1 (0.10) | - | 0.0 (1.00) | -36.993 | 0.403 |
| 19 | 32 | Attractions (β1) | 0.6 (0.40) | 1.0 (0.01*) | - | 0.6 (0.47) | -58.285 | 0.055 |
| 20 | 32 | Crowdedness (β2) | 0.6 (0.40) | 1.0 (0.01*) | - | 0.6 (0.47) | -58.285 | 0.055 |
| 21 | 32 | Signs (β3) | 1.3 (0.00*) | 0.5 (0.16) | - | 0.5 (0.29) | -58.285 | 0.055 |
| 22 | 32 | Road size (β4) | 0.5 (1.00) | 0.2 (1.00) | - | 0.8 (1.00) | -58.285 | 0.055 |
| 23 | 32 | Trees (β5) | 0.5 (1.00) | 0.2 (1.00) | - | 0.8 (1.00) | -58.285 | 0.055 |

Results MNL estimation - binary method. The results of these twenty-four MNL models are shown in Table 7.2. Multiple parameters are found significant (marked *), all of which are ASC values and not betas. The ρ2-values for OD12 and OD23 are quite high (0.313 and 0.403), while the values for OD21 and OD32 are lower (0.184 and 0.055). A higher ρ2 means that the estimation fits the data set more accurately, so the behaviour of the pedestrians can be estimated more accurately by that model. This was also seen in the statistical analysis: on the way forth (OD12 and OD23) most pedestrians tended to follow the main routes, while on the way back (OD21 and OD32) their choice behaviour was less uniform. The values for the attributes attractions, crowdedness and signs are all positive for OD12, OD23 and OD32, which means that these attributes attract the pedestrians. For OD21 these same three attributes all have a negative value, which means that they repel the pedestrians on these routes. This can be explained by the fact that OD21 really is a return route. It might be that most pedestrians do not want to see the attractions again and tend to take the less crowded or the fastest routes back. The signs are also not followed exclusively. The attributes road size and trees have negative values for OD12 and OD21, while at OD23 and OD32 these attributes have positive values. The values for the attribute trees could be explained by the location of the trees in the area.

At the Veemkade (the main route for OD12) there are almost no trees, while there are many trees alongside the Piet Heinkade (the alternative route for OD12). The Veemkade is chosen most frequently, so the trees are found to be repellent. On the Java-eiland the trees are more equally distributed (both the Javakade and the Sumatrakade have trees), so there they are found to be positive rather than negative for the route choices. At OD12 the attribute water is quite similar to the attribute attractions, because most attractions are found alongside or in the water. Therefore they have the same values for OD12 and OD21. At OD23 and OD32 this attribute is rather hard to estimate, because the Java-eiland, where these OD-pairs are located, only contains roads along the water. For OD23 the estimated beta value is zero, and for OD32 the choice model could not be run, because too many iterations were needed to find an estimate of the MNL model. In the next section a scalar method is used to calculate the attribute values per route. The results of this scalar method are compared with the binary method in the conclusions of this chapter. No combinations of attributes are estimated for the binary method, because this method is considered too arbitrary.
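The adjusted ρ2 values in Table 7.2 express how much the estimated model improves on a model without information. A minimal sketch, assuming the common definition 1 - (LL_final - K) / LL_null with K estimated parameters and an equal-shares null model, is given below. The exact definition used by Biogeme in this thesis is not stated in this section, and the number of observations (118) is an assumed value for illustration, not a figure from the thesis.

```python
import math

def null_log_likelihood(n_observations, n_alternatives):
    """Log-likelihood when every alternative is equally likely."""
    return n_observations * math.log(1.0 / n_alternatives)

def adjusted_rho_squared(ll_final, ll_null, n_parameters):
    """Adjusted rho-squared: 1 - (LL_final - K) / LL_null."""
    if ll_null >= 0:
        raise ValueError("the null log-likelihood must be negative")
    return 1.0 - (ll_final - n_parameters) / ll_null

# OD12 model from Table 7.2: final log-likelihood -108.438, four estimated
# parameters (three ASCs and one beta), four alternatives.
ASSUMED_N_OBSERVATIONS = 118  # assumption for illustration only
ll_null = null_log_likelihood(ASSUMED_N_OBSERVATIONS, 4)
rho2 = adjusted_rho_squared(-108.438, ll_null, 4)
print(round(rho2, 3))
```

A model that does no better than equal shares and has no parameters gives a value of zero, and better-fitting models move the value towards one.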

7.2.2. SCALAR METHOD

Since no significant betas were found with the binary method for quantifying the attributes, a different approach is used to quantify the attributes for each route. This approach is called the scalar method in this thesis. For the attributes attractions, signs, trees and water a similar approach is used. For road size and crowdedness slightly different methods are used, which are therefore explained separately.

Quantify attractions, signs, trees and water. For the attributes attractions, signs, trees and water, a similar approach is used to quantify the attribute values. Instead of giving each route a binary value, the scalar method looks at each route in more detail and gives a value based on the percentage of the route distance that runs alongside the attribute. Instead of generalizing each route into one value (as was done for the short cut, where all different short cuts were given the same value), each route can be given its own number that represents a more accurate share of the attribute along that specific route. The value is the ratio of the length of the route where the attribute is present to the total length of the route. For example: the main route of OD12 has a total length of 2950 metres and the attractions (tall ships) alongside this route are measured over 2000 metres. So the attractions value or ratio is in this case 2000 / 2950 = 0.68. For the attribute signs, the length that pedestrians walked on the main route is taken as share of the total route length. So even for pedestrians who accessed the main route at a different location, the length walked on the main route is counted.
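The ratio described above can be written as a small helper. This is a minimal sketch; the function name is illustrative and the input check is an added assumption.

```python
def attribute_ratio(attribute_length_m, route_length_m):
    """Share of the route length along which the attribute is present."""
    if route_length_m <= 0:
        raise ValueError("route length must be positive")
    if not 0 <= attribute_length_m <= route_length_m:
        raise ValueError("attribute length must lie between 0 and the route length")
    return attribute_length_m / route_length_m

# Main route of OD12: 2000 m of attractions on a route of 2950 m.
print(round(attribute_ratio(2000, 2950), 2))  # 0.68
```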

Quantify road size. For the attribute road size, the decision point of the routes is taken as a measure. The width of the road at these decision points (in metres) is taken as the attribute value. For OD12 the decision point is considered to be the crossing of the Piet Heinkade and the Vriesseveem. Pedestrians who go to the West at this point take the main route, to the North the short-cut to the main route and to the East the alternative route. For the fourth choice of OD12, the short-cut (which is equal to the fourth choice of OD21), the most frequently used short-cut zone is taken, where the road width is around 10 metres. For OD21 the first three choices are based on the decision point at the southernmost point of the Verbindingsdam, where pedestrians take the Piet Heinkade for the main route and the Veemkade for the alternative route. For OD23 and OD32 the widths of the Javakade (main route OD23), the Sumatrakade (main route OD32) and an intersecting street are used for the quantification of the road width.

Quantify crowdedness. For the quantification of crowdedness, WiFi-sensors are used. Appendix G gives more explanation of the calculations for this attribute. There were multiple WiFi-sensors in the SAIL area. To estimate the attribute crowdedness, this thesis uses the data of a few of the WiFi-sensors (Figure G.7): depending on the route which is taken, the split of that route is assigned as the crowdedness split (Table G.6). For example, the route split is determined for the intersection between the Piet Heinkade and the Vriesseveem, because at this point a decision can be made for OD12: go to the main route, the short cut to the main route or the alternative route. The difficulty of this attribute is that it varies over time. For example, as was seen at the event itself, Thursday afternoon was less crowded than Saturday afternoon. This variation over time is taken into account in the quantification of the crowdedness for each route (Table G.7). It should be noted that the data of the WiFi-sensors was not always consistent, and some data is missing for certain periods. For example, if at OD21, choice 1, the data from WiFi-sensor pairs 1326>1295 and 1326>1299 is known but the data from WiFi-sensor pair 27>21 is zero, then the split for this route choice will be zero, while the split for choice 2 could have a value. Because of this multiplication, illogical splits could occur. However, because finding these faulty values is a time-consuming process, this matter is ignored in this thesis. It is assumed that the data is sufficient for further model estimations.
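The effect of a missing sensor value on a multiplied split can be illustrated with a short sketch. The exact calculation is described in Appendix G and is not reproduced here: the assumption below is that a branch split is the flow on one branch divided by the total flow over all branches at a decision point, and that a route split is the product of the splits along the route. The flow counts are hypothetical, and returning None for missing data is a possible safeguard, not what was done in the thesis.

```python
from math import prod

def branch_split(flows, branch):
    """Share of the counted flow at one decision point that takes `branch`.

    `flows` maps a sensor pair (for example "1326>1295") to a count.
    Returns None when no flow was counted at all.
    """
    total = sum(flows.values())
    if total == 0:
        return None
    return flows[branch] / total

def route_split(splits):
    """Multiply the splits along a route; None if any split is missing."""
    if any(split is None for split in splits):
        return None
    return prod(splits)

# Hypothetical counts at the first decision point of a route.
first_point = {"1326>1295": 30, "1326>1299": 10}
first_split = branch_split(first_point, "1326>1295")

# A zero at the second point makes the whole product zero ...
print(route_split([first_split, 0.0]))
# ... whereas flagging the value as missing makes the problem visible.
print(route_split([first_split, None]))
```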

Table 7.3: The scalar values of the attributes (excluding crowdedness), specified for each route of each OD-pair. See Appendix G, Table G.5 for a more detailed explanation of how these values are determined, and Table G.7 for the attribute values of crowdedness.

OD-pair  Choice  Explanation              Attractions  Signs  Road size [m]  Trees  Water
OD12     1       Main route               0.68         1      24.5           0.05   0.76
         2       Alternative route        0            0.26   34.5           0.64   0
         3       Short cut to main route  0.70         0.98   16             0.07   0.70
         4       Short cut                0.44         0.67   10             0.07   0.42
OD21     1       Alternative route        0.68         0.19   12.5           0.05   0.76
         2       Main route               0            1      26             0.64   0
         3       Short cut to main route  0.70         0.26   16             0.07   0.70
         4       Short cut                0.42         0.53   10             0.07   0.42
OD23     1       Main route               0.67         1      9              0.63   1
         2       Alternative route        0.29         0.23   40             0.35   1
         3       Short cut                0.38         0.53   6.5            0.5    1
OD32     1       Alternative route        0.67         0.23   16             0.63   1
         2       Main route               0.29         1      21             0.35   1
         3       Short cut                0.38         0.59   6.5            0.5    1

CHOICE MODEL ESTIMATION

As with the previous, binary method, the attributes in the scalar method are first estimated individually per OD-pair. The same utility functions are used as shown in Equations 7.1 to 7.4. Table 7.4 shows the results of the twenty-seven estimated MNL choice models. In addition, the choice models are estimated without ASC values; these results are shown in Table 7.5. Equations 7.5 to 7.8 show the utility functions for the first choice model estimated without ASC values (MNL #28).

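$$U_1 = \beta_1 \times \mathrm{Attractions}_1 \tag{7.5}$$
$$U_2 = \beta_1 \times \mathrm{Attractions}_2 \tag{7.6}$$
$$U_3 = \beta_1 \times \mathrm{Attractions}_3 \tag{7.7}$$
$$U_4 = \beta_1 \times \mathrm{Attractions}_4 \tag{7.8}$$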
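As an illustration of how Equations 7.5 to 7.8 translate into choice probabilities, the sketch below applies the standard MNL formula to the OD12 attractions values of Table 7.3, using the parameter value reported for MNL #28 in Table 7.5. The function name is hypothetical, and the sketch only evaluates the model; it is not the estimation procedure used in this thesis.

```python
import math

def mnl_probabilities(utilities):
    """Return multinomial logit choice probabilities for a list of utilities."""
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

# Attractions values of the four OD12 alternatives (Table 7.3).
attractions = [0.68, 0.0, 0.70, 0.44]
beta_attractions = 3.89  # estimate reported for MNL #28 (Table 7.5)

# Utility of each alternative according to Equations 7.5 to 7.8.
utilities = [beta_attractions * a for a in attractions]
probabilities = mnl_probabilities(utilities)
for choice, p in enumerate(probabilities, start=1):
    print(f"choice {choice}: {p:.3f}")
```

Because the model has no ASC values, the predicted ranking of the alternatives follows the ranking of the attractions values directly.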

Results MNL estimation individual attributes - scalar method

Table 7.4 shows fewer significant parameters than were found for the binary method, although for that method the ASC values were the only significant parameters. The scalar method yields one significant beta: the attribute crowdedness for OD12 (value varying between 3.4 and 5.69). This is a high value and suggests that the crowd is a strong attractor in route choice at SAIL. For the other OD-pairs the crowdedness parameter is also positive and fairly high (1.3, 3.5 and 1.7), but it is not significant there. This might mean that people tend to follow each other and go where the crowd is, especially on the way forth, at OD12 (value 3.4) and OD23 (value 3.5). People who visit a mass event for leisure purposes may tend to visit the locations where most other people are going and so follow the crowd. This is also called herding behaviour.

The alternative specific constants (ASC) say something about the alternatives themselves. At OD12, ASC1 (alternative 1, the main route along the Javakade) has the most positive value for all attributes, so this route is the most attractive, while ASC2 (alternative 2, the alternative route along the Piet Heijnkade) has the most negative value for all attributes, so this route is the least attractive. For OD21 the short cut has a value of zero (because it has no ASC), while the other values are all negative, which means that the short cut is the most preferred alternative. For the scalar method the adjusted ρ² values are highest for OD12 and OD23, varying from 0.269 to 0.339. For OD21 the values are somewhat lower, varying from 0.134 to 0.163. For OD32 the values are much lower, varying from -0.081 to -0.03 (these are negative and can be interpreted as zero).
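The negative adjusted ρ² values can be reproduced with a short sketch, assuming the common definition adjusted ρ² = 1 - (LL(β) - K) / LL(0), where K is the number of estimated parameters. The null log-likelihood is not reported in this section; the value below is taken from MNL #56 in Table 7.5, whose only parameter is estimated at 0.00, so that its final log-likelihood should equal that of a model with equal choice probabilities. Both the definition and this choice of null value are assumptions.

```python
def adjusted_rho_squared(ll_final, ll_null, n_params):
    """Adjusted rho-squared: 1 - (LL(beta) - K) / LL(0)."""
    return 1.0 - (ll_final - n_params) / ll_null

# Assumed null log-likelihood for OD32 (final log-likelihood of MNL #56).
LL_NULL_OD32 = -83.495

# MNL #23 and #24 (Table 7.4): two ASC values plus one beta, so K = 3.
rho_mnl23 = adjusted_rho_squared(-83.009, LL_NULL_OD32, n_params=3)
rho_mnl24 = adjusted_rho_squared(-82.299, LL_NULL_OD32, n_params=3)
print(round(rho_mnl23, 3), round(rho_mnl24, 3))
```

Under these assumptions the sketch returns the values listed in Table 7.4 for both models. The adjusted ρ² becomes negative when the gain in log-likelihood is smaller than the penalty K for the added parameters, which is why such values can be read as a fit of zero.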

Table 7.4: Parameter value estimates of the MNL choice models estimated with the scalar data. The parameters which are found to be significant are coloured blue and are marked (*).

ASC1: main route; ASC2: alternative route; ASC3: short cut to main route; β1: attractions; β2: crowdedness; β3: signs; β4: road size; β5: trees; β6: water; Final LL: final log-likelihood; Adj. ρ²: adjusted ρ².

MNL # OD           ASC1    ASC2    ASC3    β1      β2      β3      β4      β5      β6      Final LL   Adj. ρ²
1     12  value    0.96    -2.55   -0.52   1.22                                            -108.438   0.313
          p-value  1.00    1.00    1.00    1.00
2     12  value    0.59    -2.63   -0.20           3.4                                     -103.064   0.346
          p-value  0.06    0.01*   0.53            0.01*
3     12  value    0.86    -2.61   -0.57                   1.18                            -108.438   0.313
          p-value  1.00    1.00    1.00                    1.00
4     12  value    2.26    -1.39   0.22                            -0.07                   -108.438   0.313
          p-value  1.00    1.00    1.00                            1.00
5     12  value    1.23    -2.32   -0.20                                   -1.35           -108.438   0.313
          p-value  1.00    1.00    0.53                                    1.00
6     12  value    0.84    -2.58   -0.54                                           1.22    -108.438   0.313
          p-value  1.00    1.00    1.00                                            1.00
7     12  value    -       -2.36   -               5.69                                    -106.103   0.339
          p-value          0.03                    0.00*
8     12  value    -       -2.15   -       0.39    5.53                                    -106.032   0.333
          p-value          0.07            0.70    0.00*
9     12  value    0.36    -2.210  -0.45   0.95    3.4                                     -103.064   0.339
          p-value  1.00    1.00    1.00    1.00    0.01*
10    12  value    0.98    -0.52   -0.25   0.4     3.4     0.46    -0.06   -0.32   0.48    -103.064   0.315
          p-value  1.00    1.00    1.00    1.00    0.01*   1.00    1.00    1.00    1.00
11    21  value    -0.80   -1.36   -2.32   -0.29                                           -109.989   0.169
          p-value  1.00    1.00    1.00    1.00
12    21  value    -0.99   -1.25   -2.4            1.32                                    -109.873   0.170
          p-value  0.00*   0.00*   0.00*           0.58
13    21  value    -0.78   -1.36   -2.33                   0.26                            -109.989   0.169
          p-value  1.00    1.00    1.00                    1.00
14    21  value    -0.57   0.71    -1.67                           -0.12                   -109.989   0.169
          p-value  1.00    1.00    1.00                            1.00
15    21  value    -0.88   -0.94   -2.4                                    -0.52           -109.989   0.169
          p-value  1.00    1.00    0.00*                                   1.00
16    21  value    -0.76   -1.37   -2.31                                           -0.33   -109.989   0.169
          p-value  1.00    1.00    1.00                                            1.00
17    23  value    1.95    0.45    -       0.52                                            -44.944    0.318
          p-value  1.00    1.00            1.00
18    23  value    1.5     0.14    -               3.53                                    -43.408    0.340
          p-value  0.01*   0.80                    0.17
19    23  value    1.79    0.60    -                       0.66                            -44.944    0.318
          p-value  1.00    1.00                            1.00
20    23  value    2.06    -0.15   -                               0.02                    -44.944    0.318
          p-value  1.00    1.00                                    1.00
21    23  value    2.07    0.44    -                                       0.20            -44.944    0.318
          p-value  1.00    1.00                                            1.00
22    23  value    2.1     0.41    -                                               0.0     -44.944    0.318
          p-value  0.00*   0.44                                                    1.00
23    32  value    0.134   -0.123  -       0.05                                            -83.009    -0.030
          p-value  1.00    1.00            1.00
24    32  value    -1.04   -0.33   -               1.67                                    -82.299    -0.022
          p-value  0.25    0.34                    0.18
25    32  value    0.12    -0.09   -                       -0.08                           -83.009    -0.030
          p-value  1.00    1.00                            1.00
26    32  value    0.16    -0.11   -                               0.00                    -83.009    -0.030
          p-value  1.00    1.00                                    1.00
27    32  value    0.14    -0.12   -                                       0.04            -83.009    -0.030
          p-value  1.00    1.00                                            1.00

Results MNL estimation combined attributes - scalar method

A few combinations of attributes are made to estimate different choice models. These twelve models are also shown in Table 7.4. All use the significant attributes from the previously estimated models. The parameter value of crowdedness (β2) is still the only significant one. In addition, this value changes when different models are estimated: in MNL #7 and #8 it is 5.69 and 5.53 respectively. The added attributes 'get' their value from attributes other than crowdedness. This is indeed shown by the beta of attractions (β1), which decreases when new attributes are added. Furthermore, the adjusted ρ² values are similar to those shown before, so the same conclusions can be drawn from these new models: the models for the way forth (OD12 and OD23) fit better than the models for the way back to the station (OD32 and OD21). The behaviour of the pedestrians could therefore be changing when they return to the station or return home. Finally, the MNL estimation with all attributes together shows that all attributes are correlated, meaning that they all influence each other.

Results MNL estimations without ASC - scalar method

The ASC values are expected to be very dominant in the choice model estimations, because the routes seen at SAIL are presumed to lack variety between the alternatives. Therefore, the models are also estimated without ASC values to find the influence of the individual attributes. In these estimations (Table 7.5) more parameter values are significant. This indeed shows that the unobserved attributes of the routes, captured in the ASC values, have a very strong influence. Another reason for this change could be the lack of variety in the RP data: very little spread was found over the alternatives, and therefore a high preference for certain routes. In these new MNL models the attribute signs is positive and significant at OD12 and OD23 (MNL #30 and #45). The combination of attractions and crowdedness also gives significant values, where crowdedness is seen to be more attractive than the attractions themselves. However, when more attributes are combined (MNL #35 and #36), the attributes signs and attractions change sign, which seems odd. Presumably attractions is more dominant than signs in MNL #35, and water is more dominant than attractions in MNL #36. The same is seen in MNL #50, where signs has a strongly repellent value, while it had an attractive value in the individual estimation. The best-fitting models on the way up are MNL #34 for OD12 and MNL #49 for OD23, in which both attractions and (at the 10% level) crowdedness are significant. On the way back the attributes fail to predict the route choices: the model fits are very low, with ρ² values of (almost) zero. These models in particular show that the way back cannot be predicted with these parameter values.
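Whether an added attribute improves a model can also be checked with a likelihood-ratio test, which is not part of the analysis in this thesis and is shown here only as a hedged illustration. The sketch compares MNL #34 with MNL #35, assuming that MNL #35 is MNL #34 with the attribute signs added (one extra parameter). The function name is hypothetical.

```python
import math

def lr_test_one_df(ll_restricted, ll_unrestricted):
    """Likelihood-ratio statistic and p-value for one added parameter."""
    statistic = 2.0 * (ll_unrestricted - ll_restricted)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))
    return statistic, p_value

# Final log-likelihoods from Table 7.5 (OD12).
ll_mnl34 = -108.222  # attractions + crowdedness
ll_mnl35 = -108.219  # assumed: the same model with signs added

statistic, p_value = lr_test_one_df(ll_mnl34, ll_mnl35)
print(f"LR = {statistic:.3f}, p = {p_value:.2f}")
```

The log-likelihood barely changes between the two models, so under these assumptions the added attribute does not improve the fit, in line with the high p-value reported for signs in MNL #35.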

Table 7.5: Parameter value estimates of the MNL choice models estimated with the scalar data, without ASC values. The parameters which are found to be significant are coloured blue and are marked (*).

ASC1: main route; ASC2: alternative route; ASC3: short cut to main route; β1: attractions; β2: crowdedness; β3: signs; β4: road size; β5: trees; β6: water; Final LL: final log-likelihood; Adj. ρ²: adjusted ρ².

MNL # OD           ASC1    ASC2    ASC3    β1      β2      β3      β4      β5      β6      Final LL   Adj. ρ²
28    12  value    -       -       -       3.89                                            -132.013   0.187
          p-value                          0.00*
29    12  value    -       -       -               7.4                                     -111.251   0.314
          p-value                                  0.00*
30    12  value    -       -       -                       3.37                            -128.613   0.208
          p-value                                          0.00*
31    12  value    -       -       -                               -0.01                   -163.274   -0.004
          p-value                                                  0.23
32    12  value    -       -       -                                       -10.0           -128.461   0.209
          p-value                                                          0.38
33    12  value    -       -       -                                               3.73    -128.613   0.208
          p-value                                                                  0.00*
34    12  value    -       -       -       1.87    5.62                                    -108.222   0.326
          p-value                          0.01*   0.00*
35    12  value    -       -       -       2.24    5.66    -0.34                           -108.219   0.320
          p-value                          0.64    0.00*   0.94
36    12  value    -       -       -       -1.97   4.67                            3.68    -108.219   0.320
          p-value                          0.56    0.00*                           0.29
37    21  value    -       -       -       -0.285                                          -136.911   -0.005
          p-value                          0.28
38    21  value    -       -       -               -0.246                                  -137.234   -0.007
          p-value                                  0.88
39    21  value    -       -       -                       0.18                            -137.081   -0.006
          p-value                                          0.47
40    21  value    -       -       -                               -0.09                   -126.187   0.073
          p-value                                                  0.00*
41    21  value    -       -       -                                       -0.93           -135.067   0.009
          p-value                                                          0.05*
42    21  value    -       -       -                                               -0.27   -136.912   -0.005
          p-value                                                                  0.30
43    23  value    -       -       -       5.46                                            -46.466    0.325
          p-value                          0.00*
44    23  value    -       -       -               8.79                                    -51.360    0.255
          p-value                                  0.00*
45    23  value    -       -       -                       2.85                            -48.094    0.302
          p-value                                          0.00*
46    23  value    -       -       -                               -0.03                   -65.837    0.049
          p-value                                                  0.00*
47    23  value    -       -       -                                       8.05            -50.616    0.266
          p-value                                                          0.00*
48    23  value    -       -       -                                               0.00    -70.311    -0.014
          p-value                                                                  1.00
49    23  value    -       -       -       3.83    4.47                                    -43.783    0.349
          p-value                          0.00*   0.10
50    23  value    -       -       -       10.0    3.71    -3.15                           -43.422    0.340
          p-value                          1.00    0.06    0.00*
51    32  value    -       -       -       0.67                                            -83.038    -0.007
          p-value                          0.29
52    32  value    -       -       -               0.38                                    -82.909    0.007
          p-value                                  0.33
53    32  value    -       -       -                       -0.36                           -83.012    -0.006
          p-value                                          0.33
54    32  value    -       -       -                               0.00                    -83.461    -0.012
          p-value                                                  0.79
55    32  value    -       -       -                                       0.99            -83.013    -0.006
          p-value                                                          0.33
56    32  value    -       -       -                                               0.00    -83.495    -0.012
          p-value                                                                  1.00

7.3. CONCLUSIONS

For the binary estimated attributes, almost all MNL model estimations were found to be insufficient to predict the behaviour of the pedestrians at SAIL. For these MNL models, the only significant parameters were the ASC values. The chosen routes thus show preferences that are not explained by the six binary attribute values. Next, the attribute values were estimated with a so-called scalar method. These scalar attributes also showed little significance when ASC values were included in the models: only the attribute crowdedness is significant, for one OD-pair (OD12). The ASC values were expected to be very dominant. One reason could be the unspecified attributes that the ASC values capture, but the most likely cause is the lack of variety in the RP data: the alternatives vary very little in their attributes, while the route choices lean strongly towards one route (at least at OD12 and OD23). When the ASC values were left out of the MNL models, many more significant attributes were indeed found. However, some strange values still occur when the attributes are combined. For example, signs and attractions become negative in MNL #35 or #36, when they are combined with other attributes. A reason for this could again be the data sample. For all MNL models, large differences in model fit (ρ² values) were found between the OD-pairs. The ρ² is higher for the 'first' OD-pairs, which most people took at the beginning of their trip (OD12 and OD23), while it is much lower for the 'later' OD-pairs, the last part of the trips (OD32 and OD21). It might be that pedestrians change their behaviour once their trip goal changes during the day: if they want to go home, they may simply want to take the shortest or fastest route back to the station.

This difference was even stronger in the MNL models estimated without ASC values. This could mean that the six attributes specified for this thesis estimate the way forth more precisely than the way back, because when the ASC values are left out, the fit of the models for OD32 and OD21 is even lower. This finding suggests that other attributes have more influence on the way back. For example, the previously taken route could be of influence, which was already found to be a significant factor in the statistical analysis.

8 CONCLUSIONS, DISCUSSION AND RECOMMENDATIONS

In Section 8.1 conclusions are drawn first, based on the main findings of this research. Next, Section 8.2 gives the discussion, in which the author's interpretation of the conclusions is given and the SP and RP studies are reflected on. Finally, Section 8.3 gives recommendations for both practical and theoretical purposes.

8.1. CONCLUSIONS

When the results of two methods are aligned, strong conclusions may be drawn. Unfortunately, no direct answer can be given to the main research questions of this thesis, because the SP and RP results diverge. However, these differences do generate new insights into pedestrian route choices. Keeping in mind that both datasets contain uncertainties and have limitations, which are discussed in Section 8.2, the conclusions can be grouped into two areas: 1) how the attributes influence pedestrian route choices, and 2) what influences the different outcomes of the SP and RP data.

Which attributes, and to what extent, influence the stated (SP) or the revealed preferences (RP) of pedestrian route choice behaviour at the mass event SAIL?

The overall estimations have to be interpreted with some caution, as the results differ strongly with the methodology and models used. Conclusions for the SP and RP results are therefore given separately.

The SP results show that the extent of influence of each attribute differs per model. The attribute crowdedness is significant in all MNL models and always has the most repellent value. The attributes road size, trees and water have relatively little influence in all models, with trees having the least influence overall. Much variation is found for the models including attractions and signs; signs is most often found to be negative, which was unexpected based on the literature. These strange results might be driven mainly by the inaccurate design of the survey choice sets, as elaborated further in the discussion.

For the RP data gathered at SAIL, a large variation is found between the results for the way forth and the way back. The way forth runs from Amsterdam Central station to the Kop van Java; the way back is the reverse. Large differences are found both in the estimated attribute values and in the model fits. In general it may be concluded that the way forth can be estimated more accurately (with these attributes) than the way back. Crowdedness is the only attribute that is significant in most models, particularly on the way forth, and it has a highly attractive value. More attributes are significant when the Alternative Specific Constants (ASC) are left out of the models. Especially on the way forth the attributes are better able to estimate the route choices, and both attractions and signs are highly attractive. On the way back, the attributes have less influence. The ASC had a strong influence on the estimations of the MNL models. Since the ASC capture the unobserved attributes of each alternative, other attributes might influence the route choices. Based on the statistical analysis, previously taken routes are expected to influence the next route choice. The way back in particular might be influenced by this attribute, since the existing attributes were unable to explain these routes.

How do the attributes that influence the pedestrian route choice behaviour at the mass event SAIL correspond for stated (SP) and revealed preference (RP) studies?

Although this conclusion is obtained with data that is difficult to compare, it is of interest to note that SP and RP data collection methods can vary this much in their outcomes. The method used as a regulatory decision tool might therefore completely determine the final decision. The SP model estimations show significant values for crowdedness, in which the attribute is repellent. The RP model estimations show the opposite: crowdedness is a major attractive attribute and highly significant. This difference might be driven by the difference in perceptions between the SP and the RP analyses [43]. The SP survey was filled in at home, where participants are expected to prefer a less crowded situation. In reality pedestrians tend to follow each other, as the RP study showed for SAIL. In addition, the different samples might contribute to these different results: the RP sample consists of older people, while the SP sample contains mainly students and young people. However, the main drivers of this difference are expected to be the design of the SP survey and the lack of variety in the RP data. This emphasizes the difficulty of collecting valuable data with RP or SP methods; the discussion elaborates further on these data collection methods.

8.2. DISCUSSION

This section gives the author's interpretation of the findings and elaborates on the uncertainties and limitations of both data sets.

8.2.1. STATED PREFERENCE SURVEY

The advantage of an SP study over an RP study is the possibility to design variation into the alternatives. This is an advantage because the limited variety in the RP data restricted the results that could be obtained. However, in SP studies respondents can interpret the design of the survey in different ways, which could bias the results.

Design of the SP survey

Several aspects of the SP survey design can be reflected on. First, the length should be considered carefully when designing a survey, as the attention of the respondents decreases with length. The survey in this thesis did not reveal all the data that was expected, so a longer survey that reveals more relevant data is advised. Secondly, the final survey left individuals free to make different assumptions when answering the questions, which decreases the value of the data obtained. A more thorough pilot survey could circumvent this issue. Thirdly, the photo questions depicted a very crowded situation, as it was possible to look far ahead. In reality, crowdedness is expected to be strongly related to the space directly surrounding the pedestrians. As long as this space feels comfortable to the pedestrian, crowdedness might not be experienced as unpleasant. In an SP survey it is very difficult to portray a realistic sense of the crowd. New technologies such as virtual reality may give a more realistic view.

Assumptions and interpretation of SP data

It was assumed beforehand that the pilot study, its prior values and the design with Ngene would produce an efficient design with realistic choice sets. Unfortunately, the survey design turned out to be inefficient nonetheless, resulting in varying models. This is expected to have led to the unrealistic parameter values found in this thesis for attractions and signs. To prevent this in future research, it is recommended to conduct a more extensive pilot survey or a second pilot survey.

8.2.2. REVEALED PREFERENCE DATA

To interpret the results of this thesis in further detail, additional remarks need to be made. The collected RP data has limited variety and sometimes lacks information; this is discussed first. Secondly, the assumptions that have been made throughout this thesis are elaborated.

Variety and correlations in the collected RP data

The RP data lacks variety in the alternatives: the attribute values hardly differ between the alternatives, while the route choices are very homogeneous. For example, at the Java-eiland water is present everywhere, so it is very difficult to find the influence of water on the route choices, as the attribute level 'no water' is missing. This results in parameters of low significance for the related route choice models. In addition, some of the attributes seem to be strongly correlated. In the studied area at SAIL the tall ships (defined as attractions) and water are correlated, as they are always on the same route, and crowdedness was often seen along these attractions as well. It is therefore difficult to attribute the pedestrian choice to any one of these attributes (crowdedness, attractions or water). Because of this expected connection between attributes, and the lack of data containing samples where these attributes are not connected, the choice models have to be interpreted carefully, keeping the above effects in mind.
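The correlation between attractions and water can be illustrated with the attribute values of Table 7.3. The sketch below computes a Pearson correlation over the alternatives of one OD-pair. With only three or four alternatives per OD-pair this is an indication rather than a statistical test, and the helper function is hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns None when either series has no variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # a constant attribute cannot be correlated or identified
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

# Attribute values per alternative (Table 7.3).
od12_attractions = [0.68, 0.0, 0.70, 0.44]
od12_water = [0.76, 0.0, 0.70, 0.42]
od23_attractions = [0.67, 0.29, 0.38]
od23_water = [1.0, 1.0, 1.0]  # water is present on every OD23 route

r_od12 = pearson(od12_attractions, od12_water)
r_od23 = pearson(od23_attractions, od23_water)
print(r_od12, r_od23)
```

For OD12 the two attributes move almost in lockstep, while for OD23 water does not vary at all, so in neither case can the data separate the effect of water from that of attractions.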

Assumptions and interpretation of the data

By choosing an MNL model it is assumed that the unobserved attributes are uncorrelated over the alternatives [32]. However, as explained above, correlations were already found between the observed attributes, e.g. attractions and water. As it is extremely difficult to prove that the unobserved attributes are all uncorrelated over the alternatives, the choice for the MNL model might be incorrect, and the results could therefore be validated further with a different model. Models that capture overlap between the alternatives, such as the Path Size model, could fit the RP data better, since the alternative routes overlap in their paths. However, overlap is expected to have little influence for this data sample, because the path size factor [31] is found to be very similar for the different alternatives. Other models, such as Latent Class or Mixed Logit models, might be able to capture the differences between the way forth and the way back of the pedestrians' routes, in which taste variation could have played a role. However, especially in the first part of the journey a homogeneous group is seen. These models are therefore expected to have little influence on the results for this sample, due to its missing variety, although estimating new models is the only way to be certain.
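To make the notion of a path size factor concrete, the sketch below computes it for a hypothetical set of three routes, assuming the basic formulation in which each link contributes its share of the route length divided by the number of alternatives that use it. Whether this is the exact formulation of [31] is an assumption, and the link names and lengths are invented for illustration.

```python
def path_size_factors(routes, link_lengths):
    """Path size factor per route: sum over links of (l_a / L_i) / N_a."""
    # Count how many alternatives use each link.
    usage = {}
    for links in routes.values():
        for link in set(links):
            usage[link] = usage.get(link, 0) + 1
    factors = {}
    for name, links in routes.items():
        route_length = sum(link_lengths[link] for link in links)
        factors[name] = sum(
            link_lengths[link] / route_length / usage[link] for link in links
        )
    return factors

# Hypothetical network: link lengths in metres, link "a" is shared by two routes.
link_lengths = {"a": 200.0, "b": 600.0, "c": 700.0, "d": 300.0}
routes = {"main": ["a", "b"], "alternative": ["a", "c"], "short cut": ["d"]}

ps = path_size_factors(routes, link_lengths)
for name, value in ps.items():
    print(f"{name}: {value:.3f}")
```

A route without any overlap has a factor of 1, and the factor decreases as more of the route is shared. In a Path Size model the logarithm of this factor enters the utility function, so similar factors across alternatives, as reported for this sample, change the predicted probabilities very little.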

As explained in the conclusion, the six selected attributes are not ideal for predicting the route choices at SAIL. The varying outcomes and the lack of significant values could be improved by re-evaluating the set of attributes: first by finding other attributes that influence the route choices, and second by re-evaluating how the attribute values are determined, as these also influence the results. Unfortunately there is no explicit 'recipe' for this process, and it is therefore suggested that more research is conducted in this area. Especially for attributes that are harder to quantify (e.g. visual pleasantness), multiple methods are applicable.

In choosing the orange route area at SAIL, a specific area has been defined. This area has boundaries and is split into four Origin-Destination (OD) pairs. Variations of these choices are expected to lead to different outcomes and are therefore discussed in this section. First of all, the boundaries limit the analysed trips to those exclusively inside this area, so the choice models derived in this thesis only account for the parts of trips within this area. Some information, for instance the reason for exiting the area, is not taken into account. Secondly, the level of detail of the OD-pairs affects the results. Currently the four pairs are fairly coarse, resulting in a low level of detail in the outcomes for route choice behaviour. For future research it is suggested that smaller parts of the trips are analysed.

The model fits differ strongly between the OD-pairs. On the way up, from the station to the Kop van Java, the model fit was high, while on the way back the values are much lower. The differences in these values might be explained by a change in the behaviour of the pedestrians. On the way up the route choices of the pedestrians are predicted more accurately and their behaviour is consistent, while on the way back the behaviour varies more between pedestrians. Some pedestrians want to return as fast as possible, while others choose to stay longer. This change in behaviour cannot be explained fully in this study and is interesting to research in future RP studies.

8.3. RECOMMENDATIONS Multiple recommendations may be given based on the findings of this thesis, both for theory and for practice.

Theoretical recommendations

• The estimation of other choice models, as elaborated in the discussion, could be used to compare the results in search of the best model fit and the most reliable weights. However, the data sample should be more extensive and contain more variety to find reliable results with these more advanced models. These models might improve the model fit and could give more precise information on the influence of individual characteristics and on continuous heterogeneity.

• When the SP and RP analyses are more aligned in their results, a combined choice model for the SP and RP data would be ideal to make the most of both methods. Before this combined model can be estimated, the attributes with the most influence on route choices at mass events should be found and new studies should be conducted. It is therefore recommended to carry out this analysis after more research has been conducted on the attributes of influence.

• Other attributes could be estimated for this data set, to find significant influences on the route choices of pedestrians at mass events. The statistical analysis showed that previously chosen routes influence subsequent route choices. This factor could be researched more thoroughly in future studies. In addition, the influence of herding or crowdedness, which was found to be significant in the SP and RP studies, should be researched more thoroughly and validated for other data samples and in other case studies.

• A choice model might be estimated for the whole SAIL area once more significant attributes are found, including the whole trip instead of a selection of OD-pairs. This may predict the choice behaviour of pedestrians more accurately. Conversely, it could be interesting to analyse smaller parts of the area and look more specifically at 'decision points' instead of routes. Zooming in on these points could provide more insight into the route choices and may give more reliable results, because more variety in the data could be present at this level of detail.

• More research could be conducted on the quantification of attributes in RP studies, especially for attributes whose quantification is open to discussion. For example, the length of a route is easier to quantify than visual pleasantness. It is recommended to use different quantification methods to determine the attribute values, which could be applied to the same case and assessed on their quality.

• The design of SP surveys could be more extensive in future research. First, it is recommended to conduct a more elaborate pilot survey or multiple pilot surveys. Second, by using the newest technologies such as virtual reality, more realistic results might be found in crowdedness research, since the situation more closely resembles reality.

Practical recommendations

• It is seen at SAIL that pedestrians have a strong tendency to start herding and do not think thoroughly about the routes they choose. Herding behaviour should be kept in mind when a city is designed and transformed for a mass event. On the one hand, the main routes should be dimensioned sufficiently wide to provide the needed capacity. On the other hand, crowd managers, information measures and signs should anticipate this herding behaviour as well. When, for example, the crowd density is too high and measures have to be taken to spread the crowd, the herding behaviour could in some way be used to spread the crowd quickly. How this will work in practice should be analysed in new experiments or case studies.

• It is unknown whether the findings and conclusions of this thesis are applicable to the organization of other mass events. SAIL is a very specific event, centred around the IJ-haven, where attractions are very close to each other. At mass events such as King's Day in Amsterdam, the layout of the event area is completely different. New data should be collected and compared with the findings at SAIL to give more insight into the general influences on route choices at mass events.

• Approximately 10% of the pedestrians took public transport (bus or tram) from the Verbindingsdam back to the station. This number could be taken into account when planning the public transport capacity for the next editions of SAIL.

• The results in this thesis do not provide clear-cut policy suggestions. However, it is of interest to note that SP and RP data collection methods can vary this much in their outcomes. This means that the method used as a regulatory decision tool could completely determine the decisions. The use of SP data in the case of mass events is therefore questioned, given the contradictory RP findings for crowdedness. The scope and main limitations of this thesis, which are elaborated in the discussion, should be kept in mind. However, the large differences are not to be ignored. Regulatory bodies are advised to look critically at the methodologies for data collection before their outcomes are fully trusted.

BIBLIOGRAPHY

[1] COT, Grenzen verlegd. Evaluatie Koninginnedag Amsterdam 2012, Tech. Rep. (Instituut voor Veiligheids- en Crisismanagement, 2012).

[2] COT, De Troonwisseling 2013. Feestelijk, open, ongestoord en veilig. Een terugblik, Tech. Rep. (Instituut voor Veiligheids- en Crisismanagement, 2013).

[3] SAIL Amsterdam, About sail, www.sail.nl, accessed on: May 28th, 2015.

[4] SAIL Amsterdam, Aanvraag evenementenvergunning SAIL Amsterdam 2015, Tech. Rep. (SAIL Amsterdam, 2015).

[5] Boon L., Doden op pukkelpop door noodweer, Nrc (2011).

[6] Unknown, Meeste doden loveparade geidentificeerd, De Telegraaf (2010).

[7] NOS, Menigten in het nauw: levensgevaarlijk, www.nos.nl (2010).

[8] Bafatakis C., Influence of urban characteristics on pedestrian route choice. A stated preference approach, Master’s thesis, Delft University of Technology (2014).

[9] Korthals Altes H.J. and Steffen C., Beleving en routekeuze in de binnenstad van Delft, edited by Onderzoeksinstituut voor Stedenbouw, Planologie en Architectuur (Delftse Universitaire Pers, 1988).

[10] Borgers A. and Timmermans H., A model of pedestrian route choice and demand for retail facilities within inner-city shopping areas, Geographical Analysis (1986).

[11] Zomer L.B., Managing Crowds. Pedestrian Activity Choice Behavior at Mass Events. A Case Study on the Effectiveness of Information Measures during the Vierdaagsefeesten Nijmegen 2013, Master’s thesis, Delft University of Technology (2013).

[12] Transportation Research Board, Highway capacity manual, HCM 2000, Tech. Rep. (National Research Council, Washington D.C., 2000).

[13] Daamen W., Modeling passenger flow in public transport facilities, Ph.D. thesis, TU Delft (2004).

[14] Hoogendoorn S.P. and Bovy P.H.L., Pedestrian route-choice and activity scheduling theory and models, Transportation Research Part B 38, 169 (2004).

[15] de Dios Ortuzar J. and Willumsen L.G., Modeling Transport (John Wiley & Sons, Ltd, 2011).

[16] Transportation Research Board, The Highway Capacity Manual, Tech. Rep. (National Academy of Sciences, Washington, 2000).

[17] Ben-Akiva, M. and Lerman, S. R., Discrete Choice Analysis: Theory and Application to Travel Demand (MIT Press, Cambridge, Massachusetts., 1985).

[18] Pagliara F. and Timmermans H.J.P., Choice set generation in spatial contexts: a review, Transportation Letters: The International Journal of Transportation Research (2009).

[19] Antonini, G., A discrete choice modeling framework for pedestrian walking behavior, Tech. Rep. (Ecole Polytechnique Federale de Lausanne, 2005).

[20] Daamen W. and Hoogendoorn S.P., Experimental research of pedestrian walking behavior, (2003).

[21] Bovy P.H.L. and Stern E., Route choice: Wayfinding in transport network (Kluwer academic publishers, 1990).


[22] Guo Z. and Loo B.P.Y., Pedestrian environment and route choice: evidence from New York City and Hong Kong, Transport Geography 28, 124 (2013).

[23] Hill M.R., Spatial structure and decision-making aspects of pedestrian route selection through an urban environment, Ph.D. thesis, University of Nebraska (1982).

[24] Ton D., Navistation. A study into the route and activity location choice behaviour of departing pedestrians in train stations, Master’s thesis, TU Delft (2014).

[25] Hoogendoorn S.P., The living crowd... from crowd dynamics to principles for crowd management, (2013) p. 26.

[26] Seneviratne P.N. and Morrall J.F., Analysis of factors affecting the choice of route of pedestrians, Transportation Planning and Technology, 10:2, 147-159 (1985).

[27] Ramos G.M., Frejinger E., Daamen W. and Hoogendoorn S.P., A revealed preference study on route choices in a congested network with real-time information, in Travel behaviour research: Current foundations, future prospects (13th International conference on travel behaviour research, Toronto, 2012).

[28] Andresen E., Haensel D., Chraibi M. and Seyfried A., Wayfinding And Cognitive Maps For Pedestrian Models, Tech. Rep. (Bergische Universitat Wuppertal, 2015).

[29] Borgers A.W.J., Kemperman A.D.A.M. and Timmermans H.J.P., Pedestrian behaviour in downtown shopping areas: differentiating between hedonic and utilitarian shoppers, in Proceedings of the RARRS-conference, Orlando (2005).

[30] Dijkstra, E.W., A note on two problems in connexion with graphs, Numerische Mathematik (1959).

[31] Ben-Akiva M. and Bierlaire M., Discrete choice methods and their applications to short term travel decisions, Chapter for the Transportation Science Handbook (1999).

[32] Train K., Discrete Choice Methods with Simulation (Cambridge University Press, 2003).

[33] Bradley M.A. and Daly A.J., Estimation of logit choice models using mixed stated-preference and revealed-preference information, in 6th International Conference on Travel Behaviour Quebec (1991).

[34] Simoes P., Barata E. and Cruz L., Combining observed and contingent travel behaviour: The best of both worlds? Journal of Forest Economics (2013).

[35] Molin E.J.E, Spm3610 keuzemodellen 1, (2013).

[36] Molin E.J.E, Lecture 2 orthogonal experimental designs, www.blackboard.tudelft.nl (2014).

[37] Ngene 1.1.2 User Manual & Reference Guide, Choice Metrics Pty Ltd (2014).

[38] Tassoul M., Creative Facilitation, 3rd ed. (VSSD, 2009).

[39] Centraal Bureau voor de Statistiek, Leeftijdsopbouw nederland 2015, www.cbs.nl (2016).

[40] Centraal Bureau voor de Statistiek, Hoogst behaald onderwijsniveau, www.onderwijsincijfer.nl (2014).

[41] Centraal Bureau voor de Statistiek, Gemiddeld inkomen; particuliere huishoudens naar diverse kenmerken, www.cbs.nl (2016).

[42] Ben-Akiva M., Bergman M.J., Daly A.J. and Ramaswamy R., Modelling inter urban route choice behavior, in Ninth International Symposium on Transportation and Traffic Theory (1984).

[43] Wardman M., A comparison of revealed preference and stated preference models of travel behaviour, Journal of transport economics and policy (1988).

A BRAINSTORM SESSION

This appendix gives an overview of the brainstorm session. First, the general information is given. Next, the individual perceptions are shown. Finally, the perceptions of the whole group after the brainstorm are shown and the attributes are ranked.

Table A.1: General information about the brainstorm session

Date | June 29, 2015
Time | 13:00-13:45
Location | Piet Opstal vergaderruimte, Faculty of Civil Engineering
Participants | Fieke Beemster, Hidde Janssen, Margot Overvoorde, Anouk Pelser, Matthijs de Ruyter, Lidewij van Twillert
Facilitator | Ilse Galama

Individual results

Each participant was asked to write down their personal thoughts on what influences route choices during events. The results can be seen in Figures A.1 to A.6.

Figure A.1: Factors of influence by participant 1

Group results

After the individual review, the participants were asked to share their thoughts and make a list of all influences on a flip-over. Once the list was complete, the participants categorized all attributes and then ranked them, see Figure A.7 (each participant had their own colour). The results of this group part of the brainstorm can be seen in Table A.2, in which each participant individually chose the top 5 attributes of most influence (where #1 is of most influence). In the overall ranking, each #1 counted for 5 points, #2 for 4 points, #3 for 3 points, #4 for 2 points and #5 for 1 point.
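The point scheme above can be illustrated with a short Python sketch. It is not part of the original analysis; the function names are hypothetical, and the use of a dense ranking (equal totals share one position) is an assumption inferred from the tied positions in Table A.2.

```python
# Points per rank position as stated in the text: #1 = 5 pt, ..., #5 = 1 pt.
POINTS_PER_RANK = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}


def total_points(ranks):
    """Sum the points of all individual rank positions given to one attribute."""
    return sum(POINTS_PER_RANK[rank] for rank in ranks)


def overall_ranking(rank_lists):
    """Rank attributes by total points.

    Assumption: equal totals share a position (dense ranking) and
    attributes without points stay unranked (position None).
    """
    totals = {name: total_points(ranks) for name, ranks in rank_lists.items()}
    distinct = sorted({t for t in totals.values() if t > 0}, reverse=True)
    position = {t: i + 1 for i, t in enumerate(distinct)}
    return {name: (t, position.get(t)) for name, t in totals.items()}


# Individual rank positions read from Table A.2 (subset of the attributes).
example = {
    "Final destination": [1, 1, 1, 2],
    "Time pressure or in a hurry": [1, 1, 2, 3],
    "In-between destination": [5, 2, 2],
    "Avoid busy areas": [3, 2],
    "Sounds": [],
}
result = overall_ranking(example)
```

For this subset, the totals and positions agree with Table A.2, for example 19 points and position 1 for the final destination.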


Figure A.2: Factors of influence by participant 2

Figure A.3: Factors of influence by participant 3

Figure A.4: Factors of influence by participant 4

Figure A.5: Factors of influence by participant 5

Figure A.6: Factors of influence by participant 6

Figure A.7: White Board with the retrieved attributes of the brainstorm session.

Table A.2: Overview of the ranking of the attributes.

Attributes | Ranking p.p. | Total points | Overall ranking

External
Weather | 5 | 1 | #9
Sounds | | 0 |

Location specific influences
Final destination | 1, 1, 1, 2 | 19 | #1
In-between destination | 5, 2, 2 | 9 | #3
Stages/locations | | 0 |
Signs | 4, 4 | 4 | #6
Weather (covered or not?) | 5 | 1 | #9

Personal preferences/goal/behaviour
Time pressure or in a hurry | 1, 1, 2, 3 | 17 | #2
Go with the crowd | 5 | 1 | #9
Take another route than the crowd | | 0 |
Sounds attract people | | 0 |
Dark/dirty alleys | | 0 |
Weather | | 0 |
Short-term choices: go left/right by what you see | | 0 |
Speed, faster if you are on your own | 4 | 2 | #8
If aimlessly: then choose the most nice route/spot | | 0 |
Most attractive route | 4 | 2 | #8
Most easy route | | 0 |
Avoid busy areas | 3, 2 | 7 | #4
Clothing (heels, f.e.) | | 0 |
With whom are you? | 5 | 1 | #9
Drunk? | | 0 |

Knowledge about/by
Mobile Phone | 3 | 3 | #7
Familiarity with the area | 3, 3 | 6 | #5
Follow someone (when you don't have knowledge) | | 0 |
Signs | | 0 |
Activity program/planning | 4 | 2 | #8
Application (f.e. as they have for the Efteling) | | 0 |

B MULTI CRITERIA ANALYSIS

This appendix shows the Multi Criteria Analysis (MCA) used to select attributes, in which the attributes from literature are connected to the attributes of the brainstorm session. The criteria are: 1. Measurable at RP (SAIL) and SP (0=no, 1=yes); 2. Points given at brainstorm; 3. Expected influence at mass-event (0=low, 4=high). In the numbering of these criteria, which is also the column order of Table B.1, attributes are selected if they satisfy a or b: a. Crit.1 = 1, Crit.2 ≥ 2 and Crit.3 ≥ 2; b. Crit.1 = 1 and Crit.3 ≥ 3. By this MCA, 9 attributes are found to be feasible for further analysis. This is discussed in Section 3.1.2.
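As an illustration, the selection rule can be written as a small Python function. This is a sketch, not the procedure used in the thesis: it assumes that criterion 1 is the 0/1 measurability flag, criterion 2 the brainstorm points and criterion 3 the expected influence (the column order of Table B.1), and that a blank brainstorm score counts as 0 points. The function name is hypothetical.

```python
def is_selected(measurable, points, influence):
    """Return True if an attribute passes selection rule a or b.

    measurable: criterion 1 (0 = no, 1 = yes)
    points:     criterion 2, brainstorm points (None when no score is listed)
    influence:  criterion 3, expected influence (0 = low ... 4 = high)
    """
    points = 0 if points is None else points  # assumption: blank means 0
    rule_a = measurable == 1 and points >= 2 and influence >= 2
    rule_b = measurable == 1 and influence >= 3
    return rule_a or rule_b


# Rows taken from Table B.1: (criterion 1, criterion 2, criterion 3).
rows = {
    "Width of road": (1, None, 3),
    "Vegetation": (1, 2, 3),
    "Crowdedness or LOS": (1, 7, 4),
    "Safety and shelter for poor weather conditions": (1, 1, 2),
    "Number of routes available": (0, 19, 2),
}
selection = {name: is_selected(*crit) for name, crit in rows.items()}
```

In this reading, an attribute with many brainstorm points but no way of measuring it at SAIL, such as the number of routes available, is not selected.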

Table B.1: Multi Criteria Analysis of attributes.

# | Attributes literature | Attributes brainstorm | Crit. 1 | Crit. 2 | Crit. 3 | Selection

Environmental factors
1 | Type of road | - | 1 | - | 1 |
2 | Width of road | - | 1 | - | 3 | x
3 | Length of road | - | 1 | - | 1 |
4 | Intersections or crossings | - | 1 | - | 0 |
5 | Topography: bridges, slopes | - | 1 | - | 1 |
6 | Building type | Most attractive route | 0 | 2 | 2 |
7 | Building density | - | 1 | - | 1 |
8 | Land use along the route | - | 1 | - | 2 |
9 | Number of turns or complexity of a route or directness | Most easy route | 1 | 0 | 2 |
10 | Lighting | - | 1 | - | 2 |
11 | Visual pleasantness or aesthetics (f.e. varying facades or colours) | Most attractive route | 0 | 2 | 2 |
12 | Landmarks or possibilities for orientation | - | 1 | - | 3 | x
13 | Road type: quality of the walking surface, materialization | - | 1 | - | 2 |
14 | Vegetation | Most attractive route | 1 | 2 | 3 | x
15 | Visibility | - | 1 | - | 2 |
16 | Number of routes available | Final destination | 0 | 19 | 2 |
17 | Presence of obstacles, number of crossings, continuation | - | 1 | - | 2 |
18 | Safety and shelter for poor weather conditions | Weather (covered or not?) | 1 | 1 | 2 |
19 | Presence of water: fountains, canals, rivers | Most attractive route | 1 | 2 | 3 | x
20 | Quality of the environment: graffiti, litter, dogs, waste terrain, etc. | Dark/dirty alleys | 1 | 0 | 0 |
21 | Traffic mixture: cars, public transportation, bicycles, pedestrians | - | 1 | - | 0 |

Mass-event factors
22 | Attractions along the route, stimulation of the environment | a) In-between destination | 1 | 9 | 4 | x
 | | b) Stages/locations | | 0 | |
 | | c) Sounds attract people | | 0 | |
 | | d) Activity program/planning | | 2 | |
23 | Crowdedness or LOS | Avoid busy areas | 1 | 7 | 4 | x
24 | Road & traffic information (f.e. (matrix) signs, social media, etc.) | a) Signs | 1 | 4 | 4 | x
 | | b) Mobile Phone | | 3 | |
 | | c) Application (f.e. as they have for the Efteling) | | 0 | |
25 | Trip purpose | a) If aimlessly: then choose the most nice route/spot | 0 | 0 | 4 |
 | | b) Time pressure / hurry? | | 17 | |
26 | Group size and composition | a) Speed, faster if you are on your own | 0 | 2 | 3 |
 | | b) With whom are you? | | 1 | |
27 | Herding | a) Go with the crowd | 0 | 1 | 3 |
 | | b) Take another route than the crowd | | 0 | |
 | | c) Follow someone (when you do not have knowledge) | | 0 | |
28 | Crowd management who give certain directions to the crowd | - | 0 | - | 4 |

Socio-economic factors
29 | Age, gender | - | 1 | - | 2 |
30 | Income level | - | 0 | - | 0 |
31 | Education level | - | 0 | - | 0 |
32 | Household structure, family | - | 0 | - | 0 |
33 | Nationality | - | 0 | - | 0 |
34 | Profession | - | 0 | - | 0 |
35 | Decision style or habit | Short-term choices: go left/right by what you see | 0 | - | 2 |
36 | Familiarity with the environment | a) Familiarity with the area | 0 | 6 | 2 |
 | | b) Follow someone (when you do not have knowledge) | | 0 | |
37 | - | Weather (do not mind to get wet?) | 0 | 0 | 1 |
38 | - | Clothing (f.e. heels) | 0 | 0 | 1 |

Natural environment factors
39 | Noise- and air pollution | Sounds | 0 | 0 | 2 |
40 | Day/night | - | 1 | - | 1 |
41 | Time of the day or week | Drunk? | 1 | 0 | 3 | x
42 | Weather conditions | Weather | 0 | 1 | 3 |

Other factors
43 | Travel time | Time pressure / hurry? | 0 | 17 | 2 |
44 | Travel distance | - | 1 | - | 2 |

C SP PILOT SURVEY

This appendix gives an overview of both the design and the results of the pilot survey. First, Ngene is used to make a design for the SP survey (C.1). Secondly, the survey itself is designed in Photoshop. The survey was then distributed among 14 participants. The choices of these participants are used to estimate choice models with Biogeme (C.2), for which a .mod file and a .dat file need to be selected. The pilot survey is shown in Section C.3. The results are then analysed and shown in Section C.4.

C.1. NGENE FILE: ORTHOGONAL DESIGN

For the pilot survey an orthogonal design is used. The Ngene file used to design the choice sets of the survey is:

design
;alts = alt1, alt2
;rows = 12
;orth = seq
;model:
U(alt1) = b0 + b1*X1[0,1] + b2*X2[0,1] + b3*X3[0,1] + b4*X4[0,1] + b5*X5[0,1] + b6*X6[0,1] /
U(alt2) = b1*X1 + b2*X2 + b3*X3 + b4*X4 + b5*X5 + b6*X6 $

C.2. BIOGEME MODEL: PILOT SURVEY

First, each attribute was estimated separately in Biogeme. Figure C.2 gives an example of the .mod file which is loaded in Biogeme. The .dat file is (almost) similar to Table 3.3. Then all attributes were estimated together, followed by a model with only the significant attributes.

Attribute 1: Attractions (.mod file)

[Choice]
CHOICE

[Beta]
// Name Value LowerBound UpperBound status (0=variable, 1=fixed)
ASC1 0 -10 10 0
BETA1 0 -10 10 0

[Utilities]
// Id Name Avail linear-in-parameter expression
1 Alt1 one ASC1 * one + BETA1 * alt1ATTRACTIONS
2 Alt2 one BETA1 * alt2ATTRACTIONS

[Expressions]
// Define here arithmetic expressions for name that are not
// directly available from the data
one = 1

[LaTeX]
ASC1 "Constant for alt. 1"
BETA1 "β1"

[Model]
// $MNL stands for "multinomial logit model",
$MNL

All Attributes (.mod file)

[Choice]
CHOICE

[Beta]
// Name Value LowerBound UpperBound status (0=variable, 1=fixed)
ASC1 0 -10 10 0
BETA1 0 -10 10 0
BETA2 0 -10 10 0
BETA3 0 -10 10 0
BETA4 0 -10 10 0
BETA5 0 -10 10 0
BETA6 0 -10 10 0

[Utilities]
// Id Name Avail linear-in-parameter expression
1 Alt1 one ASC1 * one + BETA1 * alt1ATTRACTIONS + BETA2 * alt1CROWDEDNESS + BETA3 * alt1SIGNS + BETA4 * alt1ROADSIZE + BETA5 * alt1TREES + BETA6 * alt1WATER
2 Alt2 one BETA1 * alt2ATTRACTIONS + BETA2 * alt2CROWDEDNESS + BETA3 * alt2SIGNS + BETA4 * alt2ROADSIZE + BETA5 * alt2TREES + BETA6 * alt2WATER

[Expressions]
// Define here arithmetic expressions for name that are not
// directly available from the data
one = 1

[LaTeX]
ASC1 "Constant for alt. 1"
BETA1 "β1"
BETA2 "β2"
BETA3 "β3"
BETA4 "β4"
BETA5 "β5"
BETA6 "β6"

[Model]
// $MNL stands for "multinomial logit model",
$MNL
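For reference, the model specified in the .mod file with all attributes corresponds to the standard binary multinomial logit form below. The notation $x_{ik}$ for the level of attribute $k$ in alternative $i$ is introduced here for illustration and does not appear in the original files.

```latex
\begin{align}
V_1 &= \mathrm{ASC}_1 + \sum_{k=1}^{6} \beta_k \, x_{1k}, \\
V_2 &= \sum_{k=1}^{6} \beta_k \, x_{2k}, \\
P(i) &= \frac{e^{V_i}}{e^{V_1} + e^{V_2}}, \qquad i \in \{1, 2\}.
\end{align}
```

Because the same $\beta_k$ enters both utilities, only the difference in attribute levels between the two alternatives affects the choice probability.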

C.3. ONLINE SP SURVEY

Figure C.1 gives an overview of the pilot online SP survey which was sent to a group of respondents.

Figure C.1: Overview of the pilot online SP survey made in GoogleForms. The survey consists of three parts: choice moments; personal experience at mass events; and personal details.

Results Online Survey (.dat file)

Table C.1 shows the Biogeme data file of the pilot survey. The answers given to the twelve questions (Q) by the fourteen respondents (ID) are shown in the last column (Choice).

Table C.1: Results of the pilot survey.

Choice 1 Choice 2 ID Q Attrac- Crowd- Signs Road- Trees Water Attrac- Crowd- Signs Road- Trees Water Choice tions edness Size tion edness Size 1 1 1 1 1 1 1 1 0 0 1 0 0 1 1 1 2 0 0 0 1 1 0 1 0 0 0 0 0 2 1 3 0 1 0 1 0 1 0 0 0 1 1 0 2 1 4 0 1 1 1 0 0 1 0 0 1 0 1 2 1 5 1 1 1 0 0 0 0 1 0 0 1 0 1 1 6 1 0 1 1 1 0 1 1 0 0 1 1 1 1 7 1 1 0 0 1 1 0 0 1 0 1 1 2 1 8 0 0 1 0 1 1 0 1 0 1 0 1 1 1 9 1 0 0 1 0 1 1 1 1 1 1 1 3 1 10 0 1 0 0 1 0 0 1 1 1 0 0 2 1 11 0 0 1 0 0 1 1 1 1 0 0 0 2 1 12 1 0 0 0 0 0 1 0 1 1 1 0 2 2 1 1 1 1 1 1 1 0 0 1 0 0 1 1 2 2 0 0 0 1 1 0 1 0 0 0 0 0 1 2 3 0 1 0 1 0 1 0 0 0 1 1 0 2 2 4 0 1 1 1 0 0 1 0 0 1 0 1 2 2 5 1 1 1 0 0 0 0 1 0 0 1 0 2 2 6 1 0 1 1 1 0 1 1 0 0 1 1 1 2 7 1 1 0 0 1 1 0 0 1 0 1 1 2 2 8 0 0 1 0 1 1 0 1 0 1 0 1 1 2 9 1 0 0 1 0 1 1 1 1 1 1 1 2 2 10 0 1 0 0 1 0 0 1 1 1 0 0 2 2 11 0 0 1 0 0 1 1 1 1 0 0 0 2 2 12 1 0 0 0 0 0 1 0 1 1 1 0 2 3 1 1 1 1 1 1 1 0 0 1 0 0 1 2 3 2 0 0 0 1 1 0 1 0 0 0 0 0 2 3 3 0 1 0 1 0 1 0 0 0 1 1 0 2 3 4 0 1 1 1 0 0 1 0 0 1 0 1 2 3 5 1 1 1 0 0 0 0 1 0 0 1 0 1 3 6 1 0 1 1 1 0 1 1 0 0 1 1 1 3 7 1 1 0 0 1 1 0 0 1 0 1 1 2 3 8 0 0 1 0 1 1 0 1 0 1 0 1 1 3 9 1 0 0 1 0 1 1 1 1 1 1 1 1 3 10 0 1 0 0 1 0 0 1 1 1 0 0 2 3 11 0 0 1 0 0 1 1 1 1 0 0 0 3 3 12 1 0 0 0 0 0 1 0 1 1 1 0 1 4 1 1 1 1 1 1 1 0 0 1 0 0 1 2 4 2 0 0 0 1 1 0 1 0 0 0 0 0 2 4 3 0 1 0 1 0 1 0 0 0 1 1 0 1 4 4 0 1 1 1 0 0 1 0 0 1 0 1 1 4 5 1 1 1 0 0 0 0 1 0 0 1 0 1 4 6 1 0 1 1 1 0 1 1 0 0 1 1 1 4 7 1 1 0 0 1 1 0 0 1 0 1 1 2 4 8 0 0 1 0 1 1 0 1 0 1 0 1 1 4 9 1 0 0 1 0 1 1 1 1 1 1 1 2 4 10 0 1 0 0 1 0 0 1 1 1 0 0 2 4 11 0 0 1 0 0 1 1 1 1 0 0 0 2 4 12 1 0 0 0 0 0 1 0 1 1 1 0 2 5 1 1 1 1 1 1 1 0 0 1 0 0 1 2 5 2 0 0 0 1 1 0 1 0 0 0 0 0 2 5 3 0 1 0 1 0 1 0 0 0 1 1 0 2 5 4 0 1 1 1 0 0 1 0 0 1 0 1 2 5 5 1 1 1 0 0 0 0 1 0 0 1 0 1 5 6 1 0 1 1 1 0 1 1 0 0 1 1 1 5 7 1 1 0 0 1 1 0 0 1 0 1 1 2 5 8 0 0 1 0 1 1 0 1 0 1 0 1 1 5 9 1 0 0 1 0 1 1 1 1 1 1 1 1 5 10 0 1 0 0 1 0 0 1 1 1 0 0 2 5 11 0 0 1 0 0 1 1 1 1 0 0 0 1 5 12 1 0 0 0 0 0 1 0 1 1 1 0 1 6 1 1 1 1 1 1 1 0 0 1 0 0 1 2 C.3.O NLINE 
SPSURVEY 73

6 2 0 0 0 1 1 0 1 0 0 0 0 0 2 6 3 0 1 0 1 0 1 0 0 0 1 1 0 2 6 4 0 1 1 1 0 0 1 0 0 1 0 1 2 6 5 1 1 1 0 0 0 0 1 0 0 1 0 1 6 6 1 0 1 1 1 0 1 1 0 0 1 1 1 6 7 1 1 0 0 1 1 0 0 1 0 1 1 2 6 8 0 0 1 0 1 1 0 1 0 1 0 1 1 6 9 1 0 0 1 0 1 1 1 1 1 1 1 2 6 10 0 1 0 0 1 0 0 1 1 1 0 0 2 6 11 0 0 1 0 0 1 1 1 1 0 0 0 2 6 12 1 0 0 0 0 0 1 0 1 1 1 0 2 7 1 1 1 1 1 1 1 0 0 1 0 0 1 2 7 2 0 0 0 1 1 0 1 0 0 0 0 0 2 7 3 0 1 0 1 0 1 0 0 0 1 1 0 3 7 4 0 1 1 1 0 0 1 0 0 1 0 1 2 7 5 1 1 1 0 0 0 0 1 0 0 1 0 2 7 6 1 0 1 1 1 0 1 1 0 0 1 1 1 7 7 1 1 0 0 1 1 0 0 1 0 1 1 2 7 8 0 0 1 0 1 1 0 1 0 1 0 1 1 7 9 1 0 0 1 0 1 1 1 1 1 1 1 2 7 10 0 1 0 0 1 0 0 1 1 1 0 0 1 7 11 0 0 1 0 0 1 1 1 1 0 0 0 2 7 12 1 0 0 0 0 0 1 0 1 1 1 0 2 8 1 1 1 1 1 1 1 0 0 1 0 0 1 2 8 2 0 0 0 1 1 0 1 0 0 0 0 0 1 8 3 0 1 0 1 0 1 0 0 0 1 1 0 2 8 4 0 1 1 1 0 0 1 0 0 1 0 1 2 8 5 1 1 1 0 0 0 0 1 0 0 1 0 3 8 6 1 0 1 1 1 0 1 1 0 0 1 1 1 8 7 1 1 0 0 1 1 0 0 1 0 1 1 2 8 8 0 0 1 0 1 1 0 1 0 1 0 1 1 8 9 1 0 0 1 0 1 1 1 1 1 1 1 1 8 10 0 1 0 0 1 0 0 1 1 1 0 0 2 8 11 0 0 1 0 0 1 1 1 1 0 0 0 1 8 12 1 0 0 0 0 0 1 0 1 1 1 0 2 9 1 1 1 1 1 1 1 0 0 1 0 0 1 1 9 2 0 0 0 1 1 0 1 0 0 0 0 0 1 9 3 0 1 0 1 0 1 0 0 0 1 1 0 2 9 4 0 1 1 1 0 0 1 0 0 1 0 1 2 9 5 1 1 1 0 0 0 0 1 0 0 1 0 2 9 6 1 0 1 1 1 0 1 1 0 0 1 1 1 9 7 1 1 0 0 1 1 0 0 1 0 1 1 2 9 8 0 0 1 0 1 1 0 1 0 1 0 1 2 9 9 1 0 0 1 0 1 1 1 1 1 1 1 1 9 10 0 1 0 0 1 0 0 1 1 1 0 0 1 9 11 0 0 1 0 0 1 1 1 1 0 0 0 2 9 12 1 0 0 0 0 0 1 0 1 1 1 0 1 10 1 1 1 1 1 1 1 0 0 1 0 0 1 1 10 2 0 0 0 1 1 0 1 0 0 0 0 0 2 10 3 0 1 0 1 0 1 0 0 0 1 1 0 2 10 4 0 1 1 1 0 0 1 0 0 1 0 1 2 10 5 1 1 1 0 0 0 0 1 0 0 1 0 1 10 6 1 0 1 1 1 0 1 1 0 0 1 1 1 10 7 1 1 0 0 1 1 0 0 1 0 1 1 2 10 8 0 0 1 0 1 1 0 1 0 1 0 1 1 10 9 1 0 0 1 0 1 1 1 1 1 1 1 3 10 10 0 1 0 0 1 0 0 1 1 1 0 0 2 10 11 0 0 1 0 0 1 1 1 1 0 0 0 1 10 12 1 0 0 0 0 0 1 0 1 1 1 0 2 11 1 1 1 1 1 1 1 0 0 1 0 0 1 2 11 2 0 0 0 1 1 0 1 0 0 0 0 0 2 11 3 0 1 0 1 0 1 0 0 0 1 1 0 2 11 4 0 1 1 1 0 0 1 0 0 1 0 1 2 11 5 1 1 1 0 0 0 0 1 0 0 1 0 1 11 6 1 0 1 1 1 0 1 1 0 0 1 1 1 11 7 1 1 0 0 1 1 
0 0 1 0 1 1 2 11 8 0 0 1 0 1 1 0 1 0 1 0 1 1 11 9 1 0 0 1 0 1 1 1 1 1 1 1 1 11 10 0 1 0 0 1 0 0 1 1 1 0 0 2 11 11 0 0 1 0 0 1 1 1 1 0 0 0 1 11 12 1 0 0 0 0 0 1 0 1 1 1 0 2 74 C.SPP ILOT SURVEY

12 1 1 1 1 1 1 1 0 0 1 0 0 1 1 12 2 0 0 0 1 1 0 1 0 0 0 0 0 2 12 3 0 1 0 1 0 1 0 0 0 1 1 0 1 12 4 0 1 1 1 0 0 1 0 0 1 0 1 2 12 5 1 1 1 0 0 0 0 1 0 0 1 0 1 12 6 1 0 1 1 1 0 1 1 0 0 1 1 1 12 7 1 1 0 0 1 1 0 0 1 0 1 1 1 12 8 0 0 1 0 1 1 0 1 0 1 0 1 1 12 9 1 0 0 1 0 1 1 1 1 1 1 1 1 12 10 0 1 0 0 1 0 0 1 1 1 0 0 1 12 11 0 0 1 0 0 1 1 1 1 0 0 0 2 12 12 1 0 0 0 0 0 1 0 1 1 1 0 2 13 1 1 1 1 1 1 1 0 0 1 0 0 1 2 13 2 0 0 0 1 1 0 1 0 0 0 0 0 2 13 3 0 1 0 1 0 1 0 0 0 1 1 0 2 13 4 0 1 1 1 0 0 1 0 0 1 0 1 2 13 5 1 1 1 0 0 0 0 1 0 0 1 0 1 13 6 1 0 1 1 1 0 1 1 0 0 1 1 1 13 7 1 1 0 0 1 1 0 0 1 0 1 1 1 13 8 0 0 1 0 1 1 0 1 0 1 0 1 1 13 9 1 0 0 1 0 1 1 1 1 1 1 1 1 13 10 0 1 0 0 1 0 0 1 1 1 0 0 2 13 11 0 0 1 0 0 1 1 1 1 0 0 0 2 13 12 1 0 0 0 0 0 1 0 1 1 1 0 2 14 1 1 1 1 1 1 1 0 0 1 0 0 1 1 14 2 0 0 0 1 1 0 1 0 0 0 0 0 2 14 3 0 1 0 1 0 1 0 0 0 1 1 0 2 14 4 0 1 1 1 0 0 1 0 0 1 0 1 1 14 5 1 1 1 0 0 0 0 1 0 0 1 0 1 14 6 1 0 1 1 1 0 1 1 0 0 1 1 1 14 7 1 1 0 0 1 1 0 0 1 0 1 1 1 14 8 0 0 1 0 1 1 0 1 0 1 0 1 1 14 9 1 0 0 1 0 1 1 1 1 1 1 1 2 14 10 0 1 0 0 1 0 0 1 1 1 0 0 2 14 11 0 0 1 0 0 1 1 1 1 0 0 0 2 14 12 1 0 0 0 0 0 1 0 1 1 1 0 1 C.4.R ESULTS BIOGEME 75

C.4. RESULTS BIOGEME

Table C.2 shows the results of the eleven MNL models which were estimated in Biogeme for the pilot survey. In the Significance columns, a * means that the parameter is not significant.

Table C.2: Results of the pilot survey which were estimated in Biogeme.

Run # | Attribute | Parameter | Value | Std err | t-test | p-val | Significance | Rob. std err | Rob. t-test | Rob. p-val | Rob. significance | Final log-likelihood | Adj. ρ²
1 | | ASC1 | -0.282 | 0.161 | -1.75 | 0.08 | * | 0.161 | -1.75 | 0.08 | * | -108.28 | 0.018
1 | Attractions | BETA1 | 0.508 | 0.23 | 2.21 | 0.03 | | 0.223 | 2.27 | 0.02 | | |
2 | | ASC1 | -0.31 | 0.174 | -1.79 | 0.07 | * | 0.174 | -1.78 | 0.08 | * | -97.613 | 0.113
2 | Crowdedness | BETA2 | -1.07 | 0.223 | -4.79 | 0 | | 0.227 | -4.7 | 0 | | |
3 | | ASC1 | -0.321 | 0.17 | -1.89 | 0.06 | * | 0.172 | -1.87 | 0.06 | * | -101.242 | 0.081
3 | Signs | BETA3 | 0.892 | 0.214 | 4.16 | 0 | | 0.222 | 4.01 | 0 | | |
4 | | ASC1 | -0.275 | 0.159 | -1.73 | 0.08 | * | 0.159 | -1.73 | 0.08 | * | -110.478 | -0.002
4 | RoadSize | BETA4 | 0.176 | 0.222 | 0.79 | 0.43 | * | 0.227 | 0.77 | 0.44 | * | |
5 | | ASC1 | -0.274 | 0.159 | -1.73 | 0.08 | * | 0.159 | -1.73 | 0.08 | * | -110.778 | -0.004
5 | Trees | BETA5 | 0.032 | 0.195 | 0.16 | 0.87 | * | 0.196 | 0.16 | 0.87 | * | |
6 | | ASC1 | -0.293 | 0.162 | -1.81 | 0.07 | * | 0.162 | -1.81 | 0.07 | * | -107.577 | 0.024
6 | Water | BETA6 | -0.719 | 0.292 | -2.46 | 0.01 | | 0.289 | -2.49 | 0.01 | | |
7 | | ASC1 | -0.246 | 0.206 | -1.2 | 0.23 | * | 0.206 | -1.19 | 0.23 | * | -78.329 | 0.24
7 | Attractions | BETA1 | 1.26 | 0.298 | 4.23 | 0 | | 0.3 | 4.2 | 0 | | |
7 | Crowdedness | BETA2 | -1.53 | 0.304 | -5.04 | 0 | | 0.301 | -5.08 | 0 | | |
7 | Signs | BETA3 | 0.827 | 0.288 | 2.87 | 0 | | 0.298 | 2.78 | 0.01 | | |
7 | RoadSize | BETA4 | 0.167 | 0.313 | 0.53 | 0.59 | * | 0.289 | 0.58 | 0.56 | * | |
7 | Trees | BETA5 | 0.0856 | 0.257 | 0.33 | 0.74 | * | 0.267 | 0.32 | 0.75 | * | |
7 | Water | BETA6 | -0.616 | 0.43 | -1.43 | 0.15 | * | 0.436 | -1.41 | 0.16 | * | |
8 | | ASC1 | -0.231 | 0.204 | -1.13 | 0.26 | * | 0.202 | -1.14 | 0.25 | * | -78.566 | 0.256
8 | Attractions | BETA1 | 1.26 | 0.3 | 4.19 | 0 | | 0.313 | 4.02 | 0 | | |
8 | Crowdedness | BETA2 | -1.47 | 0.281 | -5.22 | 0 | | 0.269 | -5.44 | 0 | | |
8 | Signs | BETA3 | 0.89 | 0.275 | 3.24 | 0 | | 0.283 | 3.15 | 0 | | |
8 | Water | BETA6 | -0.594 | 0.413 | -1.44 | 0.15 | * | 0.407 | -1.46 | 0.14 | * | |
9 | | ASC1 | -0.266 | 0.203 | -1.31 | 0.19 | * | 0.204 | -1.3 | 0.19 | * | -79.521 | 0.247
9 | Attractions | BETA1 | 1.29 | 0.308 | 4.2 | 0 | | 0.329 | 3.93 | 0 | | |
9 | Crowdedness | BETA2 | -1.43 | 0.278 | -5.14 | 0 | | 0.269 | -5.32 | 0 | | |
9 | Signs | BETA3 | 0.946 | 0.282 | 3.36 | 0 | | 0.287 | 3.3 | 0 | | |
9 | RoadSize | BETA4 | 0.168 | 0.307 | 0.55 | 0.58 | * | 0.293 | 0.57 | 0.57 | * | |
10 | | ASC1 | -0.251 | 0.202 | -1.24 | 0.21 | * | 0.202 | -1.24 | 0.21 | * | -79.671 | 0.255
10 | Attractions | BETA1 | 1.3 | 0.311 | 4.18 | 0 | | 0.335 | 3.88 | 0 | | |
10 | Crowdedness | BETA2 | -1.4 | 0.269 | -5.19 | 0 | | 0.26 | -5.37 | 0 | | |
10 | Signs | BETA3 | 1 | 0.266 | 3.76 | 0 | | 0.267 | 3.75 | 0 | | |
11 | | ASC1 | -0.291 | 0.162 | -1.79 | 0.07 | * | 0.162 | -1.8 | 0.07 | * | -107.508 | 0.007
11 | RoadSize | BETA4 | 0.0639 | 0.23 | 0.28 | 0.78 | * | 0.229 | 0.28 | 0.78 | * | |
11 | Trees | BETA5 | -0.056 | 0.202 | -0.28 | 0.78 | * | 0.205 | -0.27 | 0.79 | * | |
11 | Water | BETA6 | -0.712 | 0.301 | -2.37 | 0.02 | | 0.3 | -2.37 | 0.02 | | |
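To show how such estimates translate into choice probabilities, the sketch below applies the binary logit formula to the run 10 estimates of Table C.2 and the first choice set of the pilot design in Table C.1. It is an illustration only, not output from Biogeme; the function and variable names are hypothetical.

```python
import math

# Run 10 estimates from Table C.2 (attractions, crowdedness, signs).
ASC1 = -0.251
BETAS = {"attractions": 1.3, "crowdedness": -1.4, "signs": 1.0}


def utility(levels, asc=0.0):
    """Linear-in-parameters utility for one alternative (levels are 0/1)."""
    return asc + sum(BETAS[name] * level for name, level in levels.items())


def probability_alt1(alt1, alt2):
    """Binary logit probability that alternative 1 is chosen."""
    v1 = utility(alt1, asc=ASC1)  # the constant only enters alternative 1
    v2 = utility(alt2)
    return 1.0 / (1.0 + math.exp(v2 - v1))


# Pilot choice set 1 (Table C.1), restricted to the three attributes of run 10.
alt1 = {"attractions": 1, "crowdedness": 1, "signs": 1}
alt2 = {"attractions": 0, "crowdedness": 0, "signs": 1}
p1 = probability_alt1(alt1, alt2)
```

With these numbers the utilities are 0.649 for alternative 1 and 1.0 for alternative 2, so the model gives alternative 1 a probability of roughly 0.41.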

D SP FINAL SURVEY

This appendix gives an overview of the design and results of the final survey. First, Ngene is used to make a new, efficient design for the SP survey (D.1). Secondly, the choice sets of the survey are shown in Section D.2. The other questions, which ask for mass-event-related and socio-demographic characteristics, are the same as in the pilot study.

D.1. FINAL SURVEY: EFFICIENT DESIGN

design
;alts = alt1, alt2
;rows = 8
;eff = (mnl,d)
;model:
U(alt1) = b0 + b1[1.3]*X1[0,1] + b2[-1.5]*X2[0,1] + b3[0.8]*X3[0,1] + b4*X4[0,1] + b5*X5[0,1] + b6*X6[0,1] /
U(alt2) = b1*X1 + b2*X2 + b3*X3 + b4*X4 + b5*X5 + b6*X6 $

D.2. EIGHT CHOICE SETS

This section shows the eight choice sets (Figures D.1 to D.8) which are designed for the final SP survey. The designs of these eight choice sets can be found in Table D.1, which is elaborated on in Section 3.2.

Table D.1: Ngene final survey, efficient design. The meaning of 1 and 0 at each alternative can be found in Table 3.1.

Choice moment | alt1: Attractions Crowdedness Signs RoadSize Trees Water | alt2: Attractions Crowdedness Signs RoadSize Trees Water
1 | 1 1 0 0 0 0 | 0 0 1 1 1 1
2 | 0 0 1 0 1 0 | 1 1 0 1 0 1
3 | 0 1 1 1 1 1 | 0 0 0 0 0 0
4 | 0 0 0 1 1 0 | 1 1 1 0 0 1
5 | 1 0 0 1 0 1 | 1 1 1 0 1 0
6 | 1 1 1 1 0 0 | 0 0 0 0 1 1
7 | 1 0 0 0 1 1 | 0 0 1 1 0 0
8 | 0 1 1 0 0 1 | 1 1 0 1 1 0


Figure D.1: Choice set 1

Figure D.2: Choice set 2

Figure D.3: Choice set 3

Figure D.4: Choice set 4

Figure D.5: Choice set 5

Figure D.6: Choice set 6

Figure D.7: Choice set 7

Figure D.8: Choice set 8

E ADDITION TO: QUANTITATIVE ANALYSIS STATED PREFERENCE SURVEY

This appendix explains, by means of cross tables, the significant correlations between the socio-demographic and mass-event-related questions. It elaborates further on Section 4.1.

It is found that the female respondents of the survey have generally a lower education level (Table E.1). Furthermore, it is shown that females visit mass-events more frequently compared to men (Table E.2). It is seen that the higher the age of the respondent, the higher the household income (Table E.3), which seems logical.

Table E.1: Cross table of gender and highest education level. It is shown that the female participants of the survey have generally a lower education level.

Gender | High school | Vocational education | Bachelor | Master | PhD | Total
Female | 7 | 2 | 27 | 57 | 3 | 96
Male | 1 | 0 | 13 | 52 | 15 | 81
Total | 8 | 2 | 40 | 109 | 18 | 177

Table E.2: Cross table of gender and frequency of event visits per year. It is shown that females visit mass-events more frequently compared to men.

Gender | 0 | 0-1 | 1-2 | >2 | Total
Female | 3 | 35 | 42 | 16 | 96
Male | 6 | 42 | 25 | 8 | 81
Total | 9 | 77 | 67 | 24 | 177
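The statement about Table E.2 can be checked by converting the counts to row shares. The following Python sketch does this for the two gender rows; it only re-expresses the counts in the table and is not the correlation test used in the thesis. The helper names are hypothetical.

```python
# Counts from Table E.2, in the column order 0, 0-1, 1-2, >2 visits per year.
CATEGORIES = ["0", "0-1", "1-2", ">2"]
TABLE_E2 = {
    "Female": [3, 35, 42, 16],
    "Male": [6, 42, 25, 8],
}


def row_shares(counts):
    """Share of respondents in each frequency category within one row."""
    total = sum(counts)
    return [count / total for count in counts]


def share_more_than_once(counts):
    """Share of respondents in the two highest categories (1-2 and >2)."""
    return sum(counts[2:]) / sum(counts)


female_share = share_more_than_once(TABLE_E2["Female"])  # 58 of 96
male_share = share_more_than_once(TABLE_E2["Male"])      # 33 of 81
```

About 60% of the female respondents fall in the two highest frequency categories, against about 41% of the male respondents, which is the pattern described in the caption of Table E.2.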

Table E.3: Cross table of household income and age of the participants. It is seen that the higher the age of the participant, the higher the household income.

Household income | <18 | 18-24 | 25-34 | 35-44 | 45-54 | 55-64 | ≥65 | Total
Unknown | 0 | 1 | 2 | 0 | 1 | 0 | 0 | 4
< €10000 | 1 | 36 | 35 | 1 | 0 | 0 | 0 | 73
€10000 - €24999 | 0 | 6 | 14 | 1 | 0 | 0 | 0 | 21
€25000 - €34999 | 0 | 3 | 15 | 0 | 1 | 3 | 0 | 22
€35000 - €44999 | 0 | 3 | 12 | 1 | 1 | 0 | 0 | 17
€45000 - €55000 | 0 | 1 | 5 | 1 | 2 | 3 | 0 | 12
> €55000 | 0 | 3 | 4 | 4 | 4 | 10 | 3 | 28
Total | 1 | 53 | 87 | 8 | 9 | 16 | 3 | 177


Older respondents (or respondents with a higher income) mostly visited with family (Table E.4). Table E.5 shows that the higher the frequency of visits, the younger the respondent (and the more likely it is that the group consists of friends instead of family).

Table E.4: Cross table of household income and group composition of the participants. It is seen that the lower the household income, the higher the chance that the participant visited the mass-event with a group of friends.

Household income | Unknown | Family | Friends | Alone | Total
Unknown | 0 | 0 | 4 | 0 | 4
< €10000 | 5 | 2 | 61 | 5 | 73
€10000 - €24999 | 1 | 2 | 17 | 1 | 21
€25000 - €34999 | 2 | 3 | 15 | 2 | 22
€35000 - €44999 | 2 | 6 | 8 | 1 | 17
€45000 - €55000 | 2 | 5 | 5 | 0 | 12
> €55000 | 3 | 12 | 12 | 1 | 28
Total | 15 | 30 | 122 | 10 | 177

Table E.5: Cross table of frequency of mass-event visits per year and age of the participants. The higher the frequency of visits, the younger the participant.

Frequency event visits | <18 | 18-24 | 25-34 | 35-44 | 45-54 | 55-64 | ≥65 | Total
0 | 0 | 3 | 5 | 0 | 0 | 0 | 1 | 9
0-1 | 0 | 22 | 28 | 6 | 6 | 13 | 2 | 77
1-2 | 0 | 20 | 41 | 1 | 2 | 3 | 0 | 67
>2 | 1 | 8 | 13 | 1 | 1 | 0 | 0 | 24
Total | 1 | 53 | 87 | 8 | 9 | 16 | 3 | 177

F DATA PROCESSING: MATLAB

This appendix gives a more detailed explanation of the data processing in Matlab. The processing phase consists of five different steps (the grey boxes in Figure 6.1), which are described below after the input files.

Input

For the analysis in Matlab four different input files are used, which are enumerated below:

• Raw data: This is the data which was retrieved from the server of MyGPStracker, consisting of one large matrix (size: 399496 x 4). The four columns indicated the GPS-tracker ID, longitude, latitude and time. The 399496 rows represented all the different data points.

• Trip data: This is the Google Forms table which was made at SAIL. It contains the characteristics of each participant of the experiment. The characteristics which were used for this thesis are: Trip ID, GPS ID (survey), IMEI-number, GPS ID (server), departure time, arrival time, gender, age and groupsize.

• Points of interest list (POI-list): Multiple Points Of Interest (POI) are defined in the SAIL-area to find out which routes the participants took. The locations of these POI are shown in Figure F.1.

Figure F.1: Points Of Interest (POI) at the SAIL-area

• Manually remove errors: A manual input is given to check the GPS-data. Routes are added or deleted when the GPS-data consisted of errors.

Processing

• Make data per trip: The Raw data from the server gave a full list of all trips combined. This means that one GPS-tracker could have been taken by more than one participant, so one GPS-tracker could include more than one trip (One trip means in this thesis: from the time one participant took the GPS-tracker, until the time this participant returned the GPS-tracker at the stand). The start- and end times of the trips are written down in the Trip data file (from Google Forms). These start- and end times are used to separate the full list of all trips, to a smaller list, where each trip is specified individually.

• Smooth the data: Some outliers are found in the data, which might be caused by reflection off high buildings, measurement errors or other artefacts of GPS measurements. To remove these faulty data points, the data were smoothed and filtered. First, the data are smoothed with the Matlab command 'smooth'. This command offers multiple smoothing methods. For this thesis all methods were tried, and the method that fitted the physical street network of Amsterdam best was the moving average method. According to Matlab this moving average method uses "a lowpass filter with filter coefficients equal to the reciprocal of the span." The span is the number of points over which the average is taken. In addition, outliers were removed by evaluating the distance of each measured point to the moving average of the route. A threshold value determines which points are deleted. Different values were tried for the span and the threshold, and the combination that intuitively gave the best results (best fit on the map of Amsterdam) was chosen (span: 7 and threshold value: 0.1 × 10^3). Figures F.2 and F.3 show how the outliers are found and what the resulting trip looks like when plotted on the map of Amsterdam.

Figure F.2: Visualization of the removal of outliers in Matlab. The upper graph shows the outliers of trip 300 (the points above the threshold value, indicated by the horizontal coloured lines). The lower graph shows in binary values which points are deleted.

Figure F.3: Visualization of multiple smoothing methods for trip 300. 1) the raw data (blue-line); 2) the data smoothed with moving average (red-line); 3) the data smoothed with both moving average and removal of outliers (green-line).

• Trip plotter: The trip plotter is a tool for manually adding POI to trips that are missing some data along the route that was taken. Note that the visual data are not changed; POI are only added to make sure these trips are also taken into account in the next step, in which the routes are counted. As an example, see the route of Trip 300, which was shown in Figure F.3. This trip is missing some data points across the 'Verbindingsdam' (the GPS-tracker did not work during the entire trip), but it is visible which route was taken. In the trip plotter (see Figure F.4) POI number 2 is added manually at OD-pairs 1-2 and 2-3.

Figure F.4: Trip Plotter layout: which enables the user to add or remove POI values from a certain trip

• Find routes: Based on the data from the smoothened-data-box and the trip plotter, the routes between each Origin- and Destination-pair (OD-pair) were counted.

• Make structure per trip: Once all the data has followed all previous steps the data is ordered in a clear Matlab structure. In Figure F.5 an overview is given of this structure.

Trip(i)
Information: ID, Start time, End time, GPS ID, Gender, Age, Group size
Measurement: Longitude, Latitude, Time, Time string
Processed: Longitude, Latitude, Time, Time string, dDistance, dTime, dSpeed
OD information: OD data, OD string, OD idx, Start idx, End idx, Start POI, End POI, Longitude, Latitude, Time
Route12, Route21, Route23, Route32: each with Sequence, IDs, Number

Figure F.5: Overview of the Matlab structure per Trip i
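
The smoothing and outlier-removal step described in this appendix can be sketched as follows. This is a minimal Python sketch, not the Matlab code used for the thesis. It assumes coordinates in a local metric frame, a centred moving average whose window shrinks near the ends of the trip, and illustrative function names, threshold and sample data. Only the span of 7 is taken from the text.

```python
import math

def moving_average(values, span=7):
    """Centred moving average; the window shrinks symmetrically near the ends."""
    half = span // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        k = min(half, i, n - 1 - i)          # largest symmetric half-window that fits
        window = values[i - k:i + k + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

def keep_mask(xs, ys, span=7, threshold=100.0):
    """True for points whose distance to the moving average is within the threshold."""
    sx = moving_average(xs, span)
    sy = moving_average(ys, span)
    return [math.hypot(x - a, y - b) <= threshold
            for x, y, a, b in zip(xs, ys, sx, sy)]

# Hypothetical trip: a straight walk with one faulty GPS point at index 5.
xs = [10.0 * i for i in range(11)]
ys = [0.0] * 11
ys[5] = 500.0

mask = keep_mask(xs, ys)
clean = [(x, y) for x, y, keep in zip(xs, ys, mask) if keep]
```

In this example only the faulty point exceeds the threshold. Its neighbours are pulled towards it by the moving average but stay within the threshold, so they are kept.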

G REVEALED PREFERENCE ADDITIONAL INFORMATION

This appendix gives additional information on the revealed preference choice model estimation and extended information on the results of the MNL models which are estimated in Biogeme.

OD12: FULL TRIPS VS HALF TRIPS

It was shown in Chapter 6 that not all trips reached the Verbindingsdam. This section gives an overview of the trips which walked the full length to the Verbindingsdam, versus the trips which turned around somewhere in between the station and the Verbindingsdam. The purpose of this analysis is to find out whether route choices differ between these two groups. The total sample of this analysis consists of 78 trips: the number of trips which took the main route at OD12 (Figures G.1a and G.1b).

(a) Shares at main route OD12 (b) Turning points at main route OD12

Figure G.1: 78 Trips at main route OD12 divided by their turning points.

Table G.1: Correlations of Full trips at OD12 (station all the way to the Verbindingsdam)

                              OD21      OD23      OD32
OD21   Pearson Correlation    1         .479**    .331*
       Sig. (2-tailed)                  .000      .016
       N                      52        52        52
OD23   Pearson Correlation    .479**    1         .400**
       Sig. (2-tailed)        .000                .003
       N                      52        52        52
OD32   Pearson Correlation    .331*     .400**    1
       Sig. (2-tailed)        .016      .003
       N                      52        52        52
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).

In Table G.1 it is seen that, for the trips that walked the full main route at OD12 (52 trips in total), the route choices at the other OD-pairs are significantly correlated (at the 1% and 5% level). Table G.2 shows the frequency of the routes chosen by these 52 trips at the other three OD-pairs. At OD21 more than fifty percent of these trips do not walk on this OD-pair, which could have caused the significant correlation. At OD23 over 65% also chooses the main route, while over 40% does not take the return route at the Java-eiland (OD32).

Table G.2: Full trips: frequencies of chosen routes

OD12                                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid  1. Main route                 52          100.0     100.0           100.0

OD21                                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid  0. Not at OD                  28          53.8      53.8            53.8
       1. Alternative (Veemkade)     4           7.7       7.7             61.5
       2. Main route (Piet Heijnkade) 3          5.8       5.8             67.3
       3. Short cut to main          1           1.9       1.9             69.2
       4. Short cut                  12          23.1      23.1            92.3
       5. PT                         4           7.7       7.7             100.0
       Total                         52          100.0     100.0

OD23                                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid  0. Not at OD                  12          23.1      23.1            23.1
       1. Main route                 34          65.4      65.4            88.5
       2. Alternative route          4           7.7       7.7             96.2
       3. Short cut                  2           3.8       3.8             100.0
       Total                         52          100.0     100.0

OD32                                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid  0. Not at OD                  23          44.2      44.2            44.2
       1. Alternative route          10          19.2      19.2            63.5
       2. Main route                 9           17.3      17.3            80.8
       3. Short cut                  10          19.2      19.2            100.0
       Total                         52          100.0     100.0
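
As a check on how the percentage columns of these frequency tables follow from the counts, the OD21 frequencies of the 52 full trips can be recomputed. The following is a minimal Python sketch; the function name is illustrative and not part of the original analysis.

```python
def frequency_table(counts):
    """Return (label, frequency, percent, cumulative percent) rows from raw counts."""
    total = sum(counts.values())
    rows = []
    running = 0
    for label, n in counts.items():
        running += n
        rows.append((label, n,
                     round(100 * n / total, 1),
                     round(100 * running / total, 1)))
    return rows

# Frequencies of the full trips at OD21, as listed in Table G.2.
od21_full = {
    "0. Not at OD": 28,
    "1. Alternative (Veemkade)": 4,
    "2. Main route (Piet Heijnkade)": 3,
    "3. Short cut to main": 1,
    "4. Short cut": 12,
    "5. PT": 4,
}

rows = frequency_table(od21_full)
for label, n, pct, cum in rows:
    print(f"{label:32s} {n:3d} {pct:6.1f} {cum:6.1f}")
```

The cumulative column is computed from the running count and not by summing the rounded percentages, which avoids accumulating rounding errors.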

Table G.3: Correlations half trips OD12, which turned around somewhere in between the station and the Verbindingsdam.

                              OD21      OD23      OD32
OD21   Pearson Correlation    1         .282      .282
       Sig. (2-tailed)                  .163      .163
       N                      26        26        26
OD23   Pearson Correlation    .282      1         1.000**
       Sig. (2-tailed)        .163                0.000
       N                      26        26        26
OD32   Pearson Correlation    .282      1.000**   1
       Sig. (2-tailed)        .163      0.000
       N                      26        26        26
**. Correlation is significant at the 0.01 level (2-tailed).

In Table G.3 one significant correlation is shown for the trips (26 in total) which return halfway at the Veemkade. This correlation is very high because it rests on only one trip (as is seen in Table G.4), so the table is biased by this one trip out of the 26 trips in total. For OD21, the percentage of trips that took the alternative route (Veemkade), i.e. the same route back, is much higher than for the full trips: 30.8% instead of 7.7%. This means that the trips which return halfway tend to take the same route back more frequently than the trips that completed the full route. In addition, the percentage of these ’halfway’ trips that takes a short cut back at OD21 is relatively high.

Table G.4: Half trips: frequencies of chosen routes

OD12                                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid  1. Main route                 26          100.0     100.0           100.0

OD21                                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid  0. Not at OD                  1           3.8       3.8             3.8
       1. Alternative (Veemkade)     8           30.8      30.8            34.6
       3. Short cut to main          2           7.7       7.7             42.3
       4. Short cut                  14          53.8      53.8            96.2
       5. PT                         1           3.8       3.8             100.0
       Total                         26          100.0     100.0

OD23                                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid  0. Not at OD                  25          96.2      96.2            96.2
       3. Short cut                  1           3.8       3.8             100.0
       Total                         26          100.0     100.0

OD32                                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid  0. Not at OD                  25          96.2      96.2            96.2
       1. Alternative                1           3.8       3.8             100.0
       Total                         26          100.0     100.0
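
The perfect correlation between OD23 and OD32 for the half trips can be illustrated with a small calculation. The sketch below is written in Python under two assumptions: the route choices enter the correlation as the integer codes used in the frequency tables (0 = not at OD), and the single trip present at OD23 is the same trip that is present at OD32. The data vectors are constructed for illustration and are not the original trip records.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# 26 half trips: 25 are absent at both OD-pairs (code 0).
od23 = [0] * 25 + [3]          # one trip takes the short cut (code 3)
od32 = [0] * 25 + [1]          # the same trip takes the alternative (code 1)
r_same_trip = pearson(od23, od32)

# Hypothetical contrast: the non-zero choices belong to two different trips.
od32_other = [1] + [0] * 25
r_other_trip = pearson(od23, od32_other)

print(round(r_same_trip, 3), round(r_other_trip, 3))
```

With only one non-zero observation per variable, the coefficient is determined entirely by whether the two observations belong to the same trip, which is why this correlation says little about route choice behaviour.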

START AND END TIMES OF TRIPS

In addition to the figures shown in Chapter 6, Figures G.2 to G.6 show the trips differentiated by start and end time. It is visible that trips which start and end in the afternoon or evening do not take a boat trip on the IJ, and that longer trips, which start in the morning and end in the evening, more frequently visit the city centre of Amsterdam.

Figure G.2: Start trip in the morning, end trip in the afternoon (61 trips).

Figure G.3: Start trip in the morning, end trip in the evening (22 trips).

Figure G.4: Start trip in the afternoon, end trip in the afternoon (16 trips).

Figure G.5: Start trip in the afternoon, end trip in the evening (46 trips).

Figure G.6: Start trip in the evening, end trip in the evening (10 trips).

QUANTIFICATION SCALAR ATTRIBUTES

In the main text the scalar values of the attributes were given. It was explained briefly how these values were derived from the environment. This appendix shows in Table G.5 what lengths of the routes were used to calculate the shares for the attributes attractions, signs, trees and water.

Table G.5: Explanation of the scalar attributes. [Rotated table; the cell values are not recoverable from the extracted layout. Per OD-pair (OD12, OD21, OD23, OD32) and route alternative, the columns are: total route length; route length and ratio for attractions, signs, trees and water; street name; road size (width). Streets listed: Piet Heinkade, J.F. Van Hengelstr, Majangracht, Groenhoedenveem, Vriesseveem, Veemkade, Sumatrakade/Azartplein, Sumatrakade, Javakade/Azartplein, Javakade.]

Table G.6: The splits which are used for the crowdedness split estimations. For example, for OD12 route 1 (main route) the split from sensor 1299 to sensor 1295 is used (Figure G.7 shows the sensor numbering).

OD-pair   Choice   WiFi-sensor splits used to calculate the share
OD12      1        1299>1295
          2        1299>1293
          3        (1299>1326) × 0.5
          4        (1299>1326) × 0.5
OD21      1        (1326>1295) / (1326>1295 + 1326>1299) × (27>21)
          2        27>1293
          3        (1326>1299) / (1326>1295 + 1326>1299) × (27>21) × 0.5
          4        (1326>1299) / (1326>1295 + 1326>1299) × (27>21) × 0.5
OD23      1        (1329>34) / (1329>34 + 1329>1308) × (27>1329)
          2        27>1294
          3        (1329>1308) / (1329>34 + 1329>1308) × (27>1329)
OD32      1        1324>34
          2        (1308>1294) / (1308>1294 + 1308>1329) × (1324>26)
          3        (1308>1329) / (1308>1294 + 1308>1329) × (1324>26)

The crowdedness attribute values are shown per trip and per OD-pair in Table G.7. The values are dynamic in time and derived from the WiFi-data. The splits in Table G.6 are used to estimate these attribute values. The WiFi-sensors which are used are shown in Figure G.7.
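
How such split combinations translate into attribute values can be sketched for OD21. The sketch below is written in Python and rests on an interpretation of Table G.6 that is an assumption here: the share entering at sensor 27 towards sensor 21 is divided over the two branches at sensor 1326 in proportion to their splits, and the branch towards sensor 1299 is divided equally over alternatives 3 and 4. The function name, the handling of empty branches and all input values are hypothetical.

```python
def od21_crowdedness(split_27_21, split_27_1293, split_1326_1295, split_1326_1299):
    """Return the crowdedness shares of the four OD21 alternatives.

    All arguments are split shares between two WiFi sensors (A>B).
    """
    branch_total = split_1326_1295 + split_1326_1299
    if branch_total == 0:
        # Assumption: without observations at sensor 1326 the branch shares are zero.
        to_1295 = to_1299 = 0.0
    else:
        to_1295 = split_1326_1295 / branch_total
        to_1299 = split_1326_1299 / branch_total

    alt1 = to_1295 * split_27_21            # alternative 1: branch towards sensor 1295
    alt2 = split_27_1293                    # alternative 2: direct split 27>1293
    alt3 = to_1299 * split_27_21 * 0.5      # alternatives 3 and 4 share the other branch
    alt4 = to_1299 * split_27_21 * 0.5
    return [alt1, alt2, alt3, alt4]

# Hypothetical split values for one time interval.
shares = od21_crowdedness(split_27_21=0.4, split_27_1293=0.1,
                          split_1326_1295=0.3, split_1326_1299=0.5)
print([round(s, 3) for s in shares])
```

Under this interpretation alternatives 3 and 4 always receive identical values, which agrees with the equal values for these two alternatives in the OD21 rows of Table G.7.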

Figure G.7: Position of the WiFi-sensors. Circled are the sensors which are used to calculate the splits of the different route choices per OD-pair.

Table G.7: Crowdedness attribute values, per trip and per OD-pair. The values are dynamic in time and derived from the WiFi-data. Of the sample of 155 trips, only the first and last rows are shown.

                              Alternative
Trip ID   OD-pair   Choice    1       2       3       4
6         12        1         0.519   0.078   0.159   0.159
7         12        1         0.519   0.078   0.159   0.159
7         21        4         0.172   0.084   0.131   0.131
7         23        1         0.185   0.174   0.185
7         32        1         0.822   0.150   0.028
10        12        3         0.000   0.000   0.000   0.000
10        21        2         0.187   0.121   0.122   0.122
10        23        3         0.000   0.000   0.000
10        32        1         0.746   0.225   0.029
11        21        1         0.260   0.043   0.136   0.136
12        12        1         0.000   0.000   0.000   0.000
12        21        4         0.171   0.141   0.106   0.106
12        23        2         0.000   0.000   0.000
12        32        1         0.739   0.220   0.041
13        21        2         0.260   0.043   0.136   0.136
13        23        2         0.155   0.322   0.111
13        32        1         0.713   0.251   0.036
18        23        1         0.155   0.322   0.111
18        32        2         0.768   0.200   0.032
19        32        1         0.793   0.184   0.023
26        12        1         0.273   0.051   0.299   0.299
26        21        4         0.290   0.050   0.123   0.123
26        23        1         0.158   0.307   0.121
26        32        2         0.841   0.133   0.025
28        12        4         0.273   0.051   0.299   0.299
28        21        2         0.171   0.141   0.106   0.106
28        23        1         0.158   0.307   0.121
28        32        2         0.739   0.220   0.041
30        12        1         0.273   0.051   0.299   0.299
30        21        1         0.290   0.050   0.123   0.123
30        23        1         0.158   0.307   0.121
30        32        3         0.841   0.133   0.025
32        32        1         0.793   0.184   0.023
33        21        1         0.290   0.050   0.123   0.123
33        32        1         0.841   0.133   0.025
34        12        1         0.238   0.061   0.242   0.242
36        21        4         0.290   0.050   0.123   0.123
36        32        1         0.841   0.133   0.025
38        12        2         0.267   0.051   0.193   0.193
39        21        3         0.290   0.050   0.123   0.123
39        23        2         0.202   0.267   0.100
39        32        1         0.841   0.133   0.025
42        12        1         0.409   0.052   0.164   0.164
42        21        3         0.281   0.266   0.161   0.161
42        23        1         0.152   0.204   0.093
42        32        2         0.846   0.120   0.034
46        12        3         0.333   0.065   0.165   0.165
48        12        1         0.333   0.065   0.165   0.165
48        21        1         0.281   0.266   0.161   0.161
50        12        1         0.382   0.079   0.148   0.148
52        12        4         0.382   0.079   0.148   0.148
52        23        2         0.170   0.121   0.077
52        32        3         0.841   0.133   0.025
54        12        1         0.382   0.079   0.148   0.148
54        21        4         0.281   0.266   0.161   0.161
54        23        1         0.170   0.121   0.077
54        32        2         0.846   0.120   0.034
...
319       12        1         0.331   0.100   0.184   0.184
319       21        4         0.268   0.098   0.194   0.194
321       12        1         0.364   0.089   0.131   0.131
321       21        3         0.268   0.098   0.194   0.194