UNIVERSITY OF CALGARY

Development and Application of an Urban Geographic

Information System for Traffic Safety

by

Dae-Won Kwon

A THESIS

SUBMITTED TO THE FACULTY OF GRADUATE STUDIES

IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE

DEGREE OF DOCTOR OF PHILOSOPHY

INTERDISCIPLINARY GRADUATE PROGRAM

CALGARY, ALBERTA

JULY, 2010

© Dae-Won Kwon 2010

Library and Archives Canada

Published Heritage Branch

395 Wellington Street, Ottawa ON K1A 0N4, Canada

ISBN: 978-0-494-69486-2

NOTICE:

The author has granted a non-exclusive license allowing Library and Archives Canada to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell theses worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats.

The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author’s permission.

In compliance with the Canadian Privacy Act some supporting forms may have been removed from this thesis.

While these forms may be included in the document page count, their removal does not represent any loss of content from the thesis.

Abstract

The objective of this research is to develop an urban Geographic Information System (GIS) database and to apply spatial analysis techniques to traffic safety using the GIS database. This study begins with a literature review of the geocoding method for collision locations and of GIS applications to traffic safety analysis.

In order to examine the currently available geocoding tools in GIS software packages, i.e. ArcGIS® 9.2 and GeoMedia® 6.1, an intersection-based geocoding method is applied to 281 manually selected sample collision records in the City of Edmonton’s Motor Vehicle Collision Information System (MVCIS). In order to overcome the deficiencies found during the geocoding test, a new approach for geocoding collision locations utilizing the Location Code is proposed. Using this approach, all collision records from 2006 to 2008 are successfully geocoded in the GIS environment.

Using the collision GIS database, applications of spatial analysis in traffic safety are performed in the following areas: query analysis, raster-based analysis, and display analysis. These demonstrate that GIS can provide useful spatial analytical tools as well as display analysis results on maps. In particular, change detection using the kernel density estimation method and map algebra provides a tool with which to identify collision density changes around the study area. Finally, recommendations are offered for continued research in this area.

Acknowledgements

First and foremost, I thank my supervisor, Dr. Nigel Waters, for his continuous support throughout my graduate studies. Nigel was always there to listen and to give advice. Without his encouragement and constant guidance, I could not have finished this dissertation. I am also very fortunate to have Dr. Richard Tay as my co-supervisor, and offer thanks for his guidance on the study of traffic safety.

In addition to my supervisors, I would like to thank the rest of my supervisory committee: Dr. Mike Boyes and Dr. Richard Levy. Special thanks to Dr. Lina Kattan and Dr. Shih-Lung Shaw for acting as my external examiners. I would also like to mention and thank Dr. Tom Keenan, Interdisciplinary Graduate Program Director, and Ms. Pauline Fisk, Interdisciplinary Graduate Program Administrator, for their support during my Ph.D. program.

Thanks to Mr. Gerry Shimko and Dr. Laura Thue for employing me as a GIS traffic safety analyst in the City of Edmonton Office of Traffic Safety and for their support during my dissertation project. Also, thanks to all of the members of the Office of Traffic Safety and friends in Canada and Korea.

My family deserves more mention than space allows. Without them, I would not have been able to complete this project. To my sisters, Yookyung and Yoohee, brothers-in-law, and nephews and niece, my greatest thanks for your continued support.

Last, but not least, I thank my mother, Young-Geon Shin, for her patience and encouragement, and for inspiring me to pursue higher education. My mother gave me everything I have. Without her sacrifice and love, I could not have reached my present place.

Dedication

To my parents…

Table of Contents

Approval Page
Abstract
Acknowledgements
Dedication
Table of Contents
List of Tables
List of Figures
List of Symbols and Abbreviations

CHAPTER ONE: INTRODUCTION
1.1 Background
1.2 Objectives
1.3 Outline of Methodology and Scope
1.4 Organization

CHAPTER TWO: LITERATURE REVIEW
2.1 Introduction
2.2 GIS database development for traffic safety analysis
2.2.1 Geocoding Procedure
2.2.2 Accuracy and Precision of Geocoding
2.2.3 Linear Reference System and Use of GPS
2.3 Types of Analyses for Traffic Safety using GIS
2.3.1 Display / Query Analysis
2.3.2 Spatial Analysis
2.3.3 Network Analysis
2.3.4 Raster-based Modelling

CHAPTER THREE: A NEW APPROACH TO GEOCODING COLLISION LOCATIONS
3.1 Introduction
3.2 Study Area and Datasets
3.3 Evaluation of Geocoding Method
3.3.1 Location Information in MVCIS
3.3.2 Geocoding Test
3.3.3 Test Results and Discussion
3.4 A New Approach for Geocoding Collision Locations
3.4.1 Location Code and Reference Layer
3.4.2 Geocoding results and discussion of the new approach
3.5 Chapter Summary

CHAPTER FOUR: SPATIAL ANALYSIS FOR TRAFFIC SAFETY
4.1 Introduction
4.2 Attribute and Spatial Query
4.3 Hot Spot Detection and Density Analysis
4.3.1 Kernel Density Estimation
4.3.2 Applications for Hot Spot Detection and Density Analysis
4.4 Display Analysis
4.5 Chapter Summary

CHAPTER FIVE: CONCLUSIONS
5.1 Summary of Chapters
5.2 Concluding Remarks
5.3 Recommendations and Future Study Direction
5.3.1 Data Acquisition and Data Integration
5.3.2 Other Spatial Analysis and the Implementation of Other Analytical Methods

REFERENCES

APPENDIX A: Alberta Collision Report Form

APPENDIX B: Example Of Actual Collision Report

APPENDIX C: Geocoding Test Sample Address

List of Tables

Table 3.1 Portion Code Description
Table 3.2 2006 Collision Count by Location Type and Portion
Table 3.3 Test Sample Statistics
Table 3.4 Geocoding Test Match Rate Results
Table 3.5 Geocoding Test Match Accuracy Results
Table 3.6 Geocoding Match Analysis
Table 3.7 Location Code Structure and Examples
Table 3.8 Results of Geocoding Using Location Code
Table 3.9 Analysis of Unidentified Location

List of Figures

Figure 2.1 Geocoding Stage (Levine and Kim, 1998)
Figure 2.2 K-function local indicators of network-constrained clusters result (Yamada and Thill, 2007)
Figure 2.3 Illustrative travel corridors of shortest paths between origin-destination pairs (Kam, 2003)
Figure 2.4 Illustration of different overall patterns produced by the network KDE and Planar KDE (Xie and Yan, 2008)
Figure 3.1 Illustrations of Portion Code Locations within Intersection
Figure 3.2 Accident Zone Map: -Calgary Trail/Gateway Boulevard (Example of Mapped Location)
Figure 3.3 Example of Tied Match Score Locations
Figure 3.4 Example of One Intersection Point (Node) Street Network
Figure 3.5 Example of Tied Match Score Locations in One Intersection Point Street Network
Figure 3.6 Example of Multiple Intersections Caused from Street Naming
Figure 3.7 Example of Multiple Intersections Caused by the Nature of the Traffic Circle
Figure 3.8 Example of Incorrect Information in the Address Table
Figure 3.9 Example of Intersections that Do Not Intersect
Figure 3.10 Example of Phantom Intersection
Figure 3.11 Roadway Portion Layer Design by Feature Type
Figure 3.12 Linking Collision Data Table to Reference Data Layer
Figure 3.13 Assigning Coordinate Values to Each Record in the Collision Data Table
Figure 3.14 Example of Geocoding Results of Numbered or Named Locations
Figure 3.15 Example of Geocoding Results of Mapped Locations
Figure 3.16 Geocoding of Unknown and Unidentified Locations
Figure 4.1 2008 Collision Locations and Counts in Parkallen (Source: Author)
Figure 4.2 Example of Spatial Query on a Corridor
Figure 4.3 Example of Spatial Query on grade separated interchange
Figure 4.4 Ran-Off-Road Collision Location Map (a) and Density Map (b) (Source: Author)
Figure 4.5 Collision Density Maps of 118 Ave & 101 St (period of Feb 6 – May 31)
Figure 4.6 Collision Density Maps of High Collision Intersections in South Edmonton (period of Feb 6 – May 31)
Figure 4.7 Collision Density Changes from 2006 to 2008 for the period of Feb 6 to May 31 (blue-decrease (high), green-decrease (low), yellow-increase (low), red-increase (high))
Figure 4.8 Speed Zone Bylaw 2008 Map (The City of Edmonton; Source: Author)
Figure 4.9 Map of Speed Violation and Offenders’ Residence Location by Postal Code (Source: Author)
Figure 4.10 Curb The Danger Call Density Change Map from 2007 to 2008 (Source: Author)
Figure 4.11 Difference of CTD Call Density (density of geocoded points using the Location Code minus density of address geocoded points by EPS, Source: Author) (1-Valid address, but unmatched by address geocoding, 2-Mismatched location, and 3-Matched to one of tied match score locations by address geocoding)
Figure 4.12 Comparison between Geocoded using the Location Code and Address Geocoding
Figure 5.1 Configuration of Integrated Traffic Safety Database (Source: Author)
Figure A.1 Alberta Collision Report Form (Front)
Figure A.2 Alberta Collision Report Form (Back)
Figure A.3 Example of Actual Collision Report (Front)
Figure A.4 Example of Actual Collision Report (Back)
Figure A.5 Intersection of and 111 Street

List of Symbols and Abbreviations

®             Registered Trademark
AADT          Annual Average Daily Traffic
CAD           Computer-Aided Design
CTD           Curb The Danger
DBMS          Database Management System
DOT           Department of Transportation
EPS           Edmonton Police Service
GIS           Geographic Information System(s)
GPS           Global Positioning System
HSIS          Highway Safety Information System
ITS           Intelligent Transportation System
KDE           Kernel Density Estimation
LLRS or LRS   Linear (Location) Referencing System
LRM           Linear Referencing Method
MVCIS         Motor Vehicle Collision Information System
SLIM          Spatial Land Inventory Management
SWITRS        California Statewide Integrated Traffic Records System


Chapter One: Introduction

1.1 Background

According to Transport Canada (2007), in 2006, 2,889 road users were killed and over 199,000 were injured due to traffic collisions. More than 15,000 of the injured victims were seriously injured and subsequently hospitalized for treatment or observation. The cost of traffic collisions is a burden not only for those individuals who are involved in them, but also for society as a whole. Additionally, the statistics state that 69.96% of the fatal and injury-inducing collisions occurred in urban areas.

In 1996, the Canadian government announced the Road Safety Vision 2001 plan, the goal of which is “to have the safest roads in the world” (Transport Canada, 2001). In 2000, the Council of Ministers for Transportation and Highway Safety approved the Road Safety Vision 2010 plan and thereby agreed to carry out the necessary work involved in implementing it. This national target calls for a 30% reduction in the average number of road users killed and seriously injured. In 2006, the Alberta Traffic Safety Plan was released, its aim being to meet those targets set out in the national strategy. Also in 2006, the City of Edmonton Office of Traffic Safety introduced the Traffic Safety Strategy for the City of Edmonton 2006-2010. In support of the national and provincial targets noted above, the vision of the Edmonton Traffic Safety Strategy is to achieve a 30% reduction in traffic collisions in Edmonton by the year 2010. This includes targets relating to reductions in intersection- and speed-related collisions.

In the previously mentioned plans, GIS were introduced as a tool for infrastructure database development and management. Geographic Information Systems (GIS) are a system of hardware, software and procedures designed to support the capture, management, manipulation, analysis, modeling and display of spatially-referenced data for solving complex planning and management problems. The power of graphic displays coupled with the flexible data handling capabilities of a database management system (DBMS) offers the GIS user an extremely powerful analytical tool (Arthur, 2002).

In the early stage of GIS usage in transportation applications, GIS were simply used to display the model results, with other GIS datasets as a backdrop. In more sophisticated applications, GIS were used to update and manage the network as well as to display the model results. Recently GIS have become the most popular methodology used in the transportation field. The acronym GIS-T is often employed to refer to the application and adaptation of GIS to research, planning, and management in transportation (Thill, 2000; Waters, 1999). Various specialized transportation-related packages incorporated within GIS software packages have been developed, and currently more sophisticated forms of analysis are available in the GIS-T area. Additionally, there is a need to integrate Geographic Information Science for Transportation (GISci-T) with GIS-T (Waters, 1999). GISci-T refers to the theory and methods that underlie GIS-T (Miller and Shaw, 2001).

GIS methodologies have been applied to traffic safety research and analysis, and have been proven to improve the traffic safety research process (Smith et al., 2001). As a result, many government agencies at all levels including municipal, provincial, and federal have adopted GIS in traffic safety projects. GIS have been used for creating, managing, and updating collision databases (Liang et al., 2005); representing collision locations on maps (e.g. road network or transportation analysis zone) (Pulugurtha et al., 2007); producing crash statistics and diagrams (Noland and Quddus, 2004); and identifying and analysing high crash locations (Erdogan et al., 2008). Even though the role of GIS in the traffic safety process to date has been limited and partial, GIS can be implemented in all aspects of the traffic safety process, and thus improve the process effectively and efficiently. For example, a police officer at the scene of an accident can input an accident record to the main collision database wirelessly using a portable device that is equipped with a GPS and GIS. The traffic engineers can then identify and analyse crash prone locations using GIS. Furthermore, other traffic engineers, law enforcement officers, and planners can easily access, analyse, and share the information.

Much of the previous research into GIS as a spatial data management and analytical tool for traffic safety has been confined to primary roadways such as rural highways. However, in urban areas, many different functional and physical types of roadways exist and interact with one another, which can make it hard to develop a GIS database for an urban area. Furthermore, migration from a conventional database (data that are not spatially referenced) to a GIS database raises many technical problems, and is usually a time-consuming and costly process.

1.2 Objectives

The primary objective of this dissertation is to develop an urban GIS database and to apply spatial analysis techniques to traffic safety using that database. This dissertation will focus on traffic collisions occurring in an urban area, the City of Edmonton. It will discuss technical problems which might arise when the geocoding technique is applied to geo-referencing traffic collision data stored in a conventional collision database. The study will attempt to migrate traffic collision data from a non-geographically referenced conventional collision information system to a GIS database for traffic safety. A new approach for the geocoding technique will be used for assigning coordinate information to each collision location.

Also, this study will demonstrate spatial relationships using GIS with urban traffic collision data. The technique of raster-based density function will be applied mainly to identify problematic locations within an urban area. In addition, this dissertation will illustrate the utility of GIS in urban traffic safety analysis as a framework within agencies.

1.3 Outline of Methodology and Scope

As stated in the objectives section, the geographically referenced collision database is a key component of any urban GIS traffic safety database. Currently, conventional collision database development and management involves many error-prone steps. For example, once the paper forms of collision reports are collected either by a police officer or by motorists, data entry persons input the records based on these forms. During the reporting stage missing or incomplete records are often found, and these lead to the loss of valid collision records.

In order to minimize the data loss during the database migration process, the suitability of address geocoding tools provided in GIS software packages is tested on address/intersection samples from a non-geographically referenced collision database in terms of geocoding match rate and geocoding match accuracy. In addition to discussion of the results from the address geocoding test, error sources in the collision database, reference data, and the geocoding process are scrutinized. Then, a new approach for the geocoding technique is proposed and applied to the collision database. The results from the new geocoding approach and discussion are reported next. This study focuses on converting a non-geographically referenced collision database into a geographically referenced collision database using a new geocoding technique; however, discussion of how to geographically reference other datasets, such as CAD and text files, is included in this study.

Once an urban GIS database for traffic safety is developed, various spatial analysis techniques are applied and examined in order to highlight traffic safety issues in an urban area, the City of Edmonton. This study utilizes spatial analysis techniques available in most GIS software packages. Maps and results from spatial analysis are presented and discussed.

1.4 Organization

This dissertation is organized into five chapters. An introduction to the study and its scope is presented in Chapter 1.

Chapter 2 provides a review of the literature on GIS database development for traffic safety and the types of traffic safety analyses that have been performed using GIS. GIS database development for traffic safety focuses on the geocoding process as well as its match rate and accuracy issues. Also, a discussion of previous studies on GIS applications for traffic safety is given.

Chapter 3 describes the GIS database development for this study. This includes the test of the address geocoding method and an evaluation of the results. In doing so, a detailed discussion about the various error sources that may reduce the match rate and accuracy of address geocoding is presented. Then, a new approach to geocoding collision locations is introduced. A discussion about the integration or fusion of other transportation data into the GIS database is also presented in this chapter.

Chapter 4 describes the GIS applications that use the database discussed in Chapter 3. The GIS analyses include spatial and attribute query, hot spot detection and density analysis, display analysis, and proximity analysis.

Finally, Chapter 5 provides a general discussion, conclusions, and recommendations for future research resulting from this study.


Chapter Two: Literature Review

2.1 Introduction

Traffic safety analysts and researchers use historical collision data to perform statistical and spatial analyses for deriving useful information that directly leads to the improvement of traffic safety in a specific region or location. The findings and results from these studies have been used to analyse particular problems such as prioritizing high collision prone locations in the area, and evaluating the effectiveness of an intervention that has been applied to solve an identified traffic safety problem. Additionally, these findings and results allow professionals to select appropriate approaches among engineering, enforcement, and education countermeasures in order to improve safety.

GIS provide spatial analysis abilities to the user in addition to mapping and tabulating capabilities. The use of GIS in traffic safety analysis has been increasing not only as a graphical representation of analysis results on a map but also as a powerful spatial analysis tool. This chapter discusses some of the research and published literature regarding this topic. Studies that were conducted in the area of GIS database development for traffic safety and spatial analysis for collisions using GIS are described.


2.2 GIS database development for traffic safety analysis

Traffic operational data as well as other related information, such as traffic volume and road inventory, can be integrated into a GIS database and utilized for traffic safety analyses; however, collision data are the most crucial part of the safety analyses. This section focuses on developing a collision database for GIS, and discusses the issue of location information in collision data when it is converted to geographically referenced information.

2.2.1 Geocoding Procedure

In order to use GIS for traffic safety, data migration from a conventional collision database to a GIS database, which is a geographically referenced database, is usually required. The location information of incident data (e.g. traffic collision, crime, disease) is normally stored and maintained in a database in the form of an address. In order to utilize the address-based incident data in a GIS environment, geographic or XY coordinates must first be assigned to the records. The records in an incident database are not directly referenced by a geographic base map but have fields that can be converted into a geographical reference. This process is often called geocoding (Zhan, 2005).

The logic of geocoding is relatively straightforward. It starts with finding a match between an address database (non-geographical) and a graphical feature database (geographical) with a coordinate system. A geocoding program then assigns coordinates to the address database.

Initially, the geocoding process requires reference data. Reference data refers to a GIS feature class or geo-referenced dataset containing the address or location attributes. The most common type of geocoding reference data is a street network with a linear referencing feature. However, other feature types can be used for the reference data. For example, postal code information can be geocoded to a postal zone map (polygon), or buildings and schools can be geocoded to a point of interest file (point). Once the reference data are available, a geocoding index and geocoding match data should be built. The index and match data include a pointer to the reference data and guidelines for an address style, which specifies the rules for matching addresses to the reference data. In this step of building matchable reference data, the address style should be determined based on the reference data and the address dataset. The next step is locating addresses. This process includes finding candidate locations in the reference data, assigning match scores to the locations, and narrowing down the best candidates.
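The locating step described above (find candidates, score them, keep the best) can be sketched in a few lines of Python. This is a minimal illustration and not the matching logic of any GIS package: the `ReferenceIntersection` record, the 50-points-per-street scoring rule, the `min_score` threshold, and the coordinates are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceIntersection:
    """One record of hypothetical reference data: two street names and a coordinate."""
    street_a: str
    street_b: str
    x: float
    y: float


def normalize(name: str) -> str:
    """Apply a toy 'address style': upper-case and collapse whitespace."""
    return " ".join(name.upper().split())


def score(query: tuple[str, str], ref: ReferenceIntersection) -> int:
    """Return 100 when both street names match, 50 when one matches, else 0."""
    wanted = {normalize(query[0]), normalize(query[1])}
    found = {normalize(ref.street_a), normalize(ref.street_b)}
    return 50 * len(wanted & found)


def locate(query, reference, min_score=100):
    """Find candidates, score them, and keep the best ones at or above min_score."""
    scored = [(score(query, ref), ref) for ref in reference]
    candidates = [(s, ref) for s, ref in scored if s >= min_score]
    if not candidates:
        return []
    best = max(s for s, _ in candidates)
    return [ref for s, ref in candidates if s == best]


# Made-up coordinates; only the street names come from the examples in this section.
reference = [
    ReferenceIntersection("102A AVENUE NW", "98 STREET NW", 100.0, 200.0),
    ReferenceIntersection("102A AVENUE NW", "99 STREET NW", 50.0, 200.0),
]
matches = locate(("102a Avenue NW", "98 Street NW"), reference)
```

Because `locate` returns every candidate that shares the top score, a result with more than one entry corresponds to a tied match that still has to be resolved.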

The most common forms of address to express the location of collisions in an urban area generally fall into one of four categories:

• House address type: ordinary house address style where the house number appears first followed by road name, road type, and quadrant suffix, e.g. ‘9803 102A Avenue NW’.

• Intersection type: represented by two intersecting roads demarcated by an identifier such as ‘&’, e.g. ‘102A Avenue & 98 Street NW’.

• Road segment type: can be represented by a street segment with two intersecting roads or by the distance from an intersection, e.g. ‘On 102A Avenue NW between 98 Street NW and 99 Street NW’ or ‘50m west from 97 Street on 102A Avenue NW’.

• Named location type: uses building name or known place name, e.g. ‘In front of Century Place’ or ‘Northbound at Mayfair Interchanges’.
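As a rough illustration of how these four categories could be told apart before geocoding, the following Python sketch classifies a location string with a few regular expressions. The function name `classify_location` and the patterns are assumptions chosen to fit the examples above; real collision records would need far more robust rules.

```python
import re


def classify_location(text: str) -> str:
    """Guess which of the four address categories a location string belongs to."""
    t = text.strip()
    # Road segment: 'between X and Y' or '<distance>m <direction> from X'.
    if re.search(r"\bbetween\b.+\band\b", t, re.IGNORECASE):
        return "road segment"
    if re.search(r"^\d+\s*m\b.*\bfrom\b", t, re.IGNORECASE):
        return "road segment"
    # Intersection: two road names separated by '&'.
    if "&" in t:
        return "intersection"
    # House address: leading house number followed by a road name.
    if re.match(r"^\d+\s+\S+", t):
        return "house address"
    # Anything else is treated as a named location.
    return "named location"
```

The order of the checks matters: the road segment patterns are tested first because a segment description may also begin with a number.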


The Federal Highway Administration (Smith et al., 2001) recommends the use of the Linear Referencing System (LRS) to develop and maintain a GIS database for traffic safety, and many US state DOTs have implemented LRS in their GIS-based highway safety data analysis systems. LRS is the set of procedures for determining and tracking locations, either as a point or as a line, along a linear network. However, the geocoding process is necessary for migration from a conventional collision database because the majority of agencies use text description as the method for locating collisions (Ogle, 2007), and this means a procedure for geographically referencing this text-based collision location information is required.
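The core idea of linear referencing, locating a point by its distance along a route, can be sketched as follows. This is a simplified illustration that assumes a route stored as a list of planar vertices and a measure counted from the first vertex; the function name `locate_along` and the sample route are hypothetical.

```python
import math


def locate_along(route, measure):
    """Return the (x, y) point at a given distance along a polyline route.

    route   -- list of (x, y) vertices in travel order
    measure -- distance from the first vertex, in the units of the coordinates
    """
    if measure < 0:
        raise ValueError("measure must be non-negative")
    remaining = measure
    for (x0, y0), (x1, y1) in zip(route, route[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        if length > 0 and remaining <= length:
            t = remaining / length  # fraction of the way along this segment
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        remaining -= length
    raise ValueError("measure exceeds the length of the route")


# Hypothetical route: 100 units east, then 50 units north.
route = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]
point = locate_along(route, 120.0)
```

A collision recorded at measure 120 on this route falls 20 units along the second segment.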

2.2.2 Accuracy and Precision of Geocoding

Hauer (1997) stated that traffic safety professionals and analysts should recognize the uncertainty surrounding collision data and, if possible, account for this problem. The uncertainty may include problems with under-reporting, incomplete reporting, and errors in reporting. These shortcomings of collision databases can limit full access to the phenomenon caused by a traffic accident, and make it difficult for analysts to choose the methods that match the nature of the data. Moreover, the loss of data quality as well as quantity is expected when a conventional collision database is migrated to a geographically referenced collision database. The loss is usually caused by the geocoding process.

In epidemiology and health research, address geocoding is widely applied to identify the location of study objects in order to estimate the influence of incident locations. In some studies, address geocoding tools were compared and the quality of geocoding was evaluated in terms of match rate and the positional accuracy of the geocoded location (Kravets and Hadden, 2007; Zhan et al., 2006; Zandbergen, 2008). Kravets and Hadden (2007) found that the geocoding match rate and accuracy are higher in urban areas, while the problem of missing information is more prevalent in rural areas. Zhan et al. (2006) compared the match rate and positional accuracy of two commercial geocoding tools, the geocoding tool in ArcGIS® 9.1 and the Centrus® GeoCoder for ArcGIS, and found that the Centrus GeoCoder was 10% better in match rate but less accurate in geocoded location. Zandbergen (2008) evaluated and compared three address data geocoding models (address point, parcels and street network), and found that address point geocoding produced match rates similar to those observed for street network geocoding. These studies show that most test cases could not attain a 100% match rate, and positional accuracy varied depending on the quality of the source data (address data and reference data) and geocoding tool. Moreover, the above two studies examined geocoding methods using house address type data (See section 2.2.1). However, it is evident that all types of addresses are found in the urban area collision report form regardless of whether or not a police officer visited the scene and completed the form which includes location information to depict collision location (See Appendix A for Alberta Collision Report Form). Therefore, a different geocoding strategy may be required for locating collision locations in urban areas.

There are a number of studies on utilizing address geocoding techniques to locate collision locations and improving geocoding match accuracy within GIS environments (Karimi et al., 2004). Levine and Kim (1998) describe locating motor vehicle accidents by using the intersection-based geocoding method and the nature of errors in intersection matching. In their study of Honolulu, Hawaii, missing and incorrect location components in the accident database were identified as a major source of errors. They also presented other sources of error which include intersections that do not intersect; streets referred to by place locations; multiple intersections; street names that no longer exist; slang; and coded terms. In order to improve intersection matching and coding accuracy, they proposed a series of procedures as shown in Figure 2.1. However, they did not recommend relaxing exact street name matching in the geocoding procedure. They argued that the relaxing procedure has the potential for introducing a large amount of error into matched locations although it may increase the match rate. They also mentioned that the use of a GPS might provide a more accurate methodology for locating collisions; this would require a well established and consistent procedure for producing a high degree of spatial accuracy.

Figure 2.1 Geocoding Stage (Levine and Kim, 1998)
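The trade-off between exact and relaxed street name matching can be illustrated with Python's standard `difflib` module. This sketch is not the procedure of Levine and Kim (1998); the street names, the 0.6 similarity cutoff, and the function names are assumptions made for the example.

```python
import difflib

# Hypothetical street names standing in for a reference street network.
STREET_NAMES = ["MAPLE STREET", "MAPLE AVENUE", "MAIN STREET"]


def exact_match(name, names=STREET_NAMES):
    """Return the name only when it matches a reference name exactly."""
    key = name.upper().strip()
    return key if key in names else None


def relaxed_match(name, names=STREET_NAMES, cutoff=0.6):
    """Return the closest reference name whose similarity ratio meets the cutoff."""
    hits = difflib.get_close_matches(name.upper().strip(), names, n=1, cutoff=cutoff)
    return hits[0] if hits else None
```

With these inputs, a misspelling such as 'MAPEL STREET' is rejected by `exact_match` but recovered by `relaxed_match`, which raises the match rate. The same relaxation also returns a reference street for 'MAPLE ROAD', a name that does not exist in the reference list, which is the kind of silent mismatch the authors warn about.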


Kam (2003) used several alternatives for geocoding collisions depending on the level of information available. The first alternative was exact address geocoding which was conducted if the location of the collision was identifiable. Approximate geocoding, such as randomly selecting a point along a given road, was used when only the name of the road was given. The third was map grid geocoding which was resorted to if the only information available was a map grid. A point, then, would be randomly selected within the grid to represent the location of the accident. This method depended on an internal program to locate the site of a collision within permissible locations, such as along the road network. Kam’s (2003) paper focused on crash rate analysis, and no further discussion of geocoding methods and results was provided.
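A tiered fallback of this kind might be organised as in the Python sketch below. It is only an outline under stated assumptions: the lookup tables (`address_index`, `road_geometry`, `grid_cells`) and the record fields are hypothetical, each road is simplified to a single straight segment, and the grid tier picks a point anywhere in the cell rather than restricting it to permissible locations along the road network as the paper describes.

```python
import random


def geocode_tiered(record, address_index, road_geometry, grid_cells, rng=None):
    """Try exact, then approximate (road), then map-grid geocoding.

    Returns (x, y, method), or None when no tier applies.
    All lookup tables are hypothetical stand-ins for real reference layers.
    """
    rng = rng or random.Random()

    # Tier 1: exact address geocoding when the location is identifiable.
    address = record.get("address")
    if address in address_index:
        x, y = address_index[address]
        return (x, y, "exact")

    # Tier 2: only the road name is known, so pick a random point along the road.
    road = record.get("road")
    if road in road_geometry:
        (x0, y0), (x1, y1) = road_geometry[road]  # road simplified to one segment
        t = rng.random()
        return (x0 + t * (x1 - x0), y0 + t * (y1 - y0), "approximate")

    # Tier 3: only a map grid reference is known, so pick a random point in the cell.
    cell = record.get("grid")
    if cell in grid_cells:
        xmin, ymin, xmax, ymax = grid_cells[cell]
        return (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax), "grid")

    return None
```

Returning the method alongside the coordinates keeps track of which records were placed approximately, so that later analyses can weight or exclude them.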

Loo (2006) developed an algorithm within the GIS environment to validate crash locations. The GIS-based spatial data validation system compares spatial variables, grid references, road names, and other textual descriptive variables in Hong Kong’s Traffic Accident Data System with the road network database and the district board database. The study found that the police crash database contained about 12.7% mistakes for road names and 9.7% mistakes for district boards. The author recommended validating the spatial data in the crash database by conducting spatial analyses. This paper focused on validating spatial information in the collision database; however, there was no detailed discussion of how the grid information was acquired, or whether it was obtained by geocoding or by other positioning methods.

Bigham et al. (2009) created a geocoded database of police-reported fatal and severe injury collisions in the California Statewide Integrated Traffic Records System (SWITRS) between 1997 and 2006 using a multi-step process of linear referencing and address/intersection geocoding tools in ArcGIS® 9.2 and Google Earth® Pro. They were able to geocode 91% of 142,007 fatal and severe injury collisions. Postmile-based highway collisions were matched with a 99.8% match rate; on the other hand, collisions which occurred on local roads were geocoded using the address/intersection geocoding method with an 86% match rate and 98% accuracy. In order to measure and verify the accuracy of geocoded locations, the authors examined a random sample of 500 local road collisions, and a geocoded location was considered correct if it was within 50 feet of the manually identified true location. The authors indicated that the geocoding of postmile-coded collisions is dependent on the quality of the location information obtained from the database, mainly due to the collection and processing of postmile information. The authors also discussed the tied match score problem (this topic is discussed in section 3.3.3) and the reasons for low match score records during geocoding of local road collisions. These reasons included the following: the intersection did not exist, the entry contained misspellings, or the street network did not cover new or renamed streets.

As I carried out my review, I discovered that there is little literature that focuses on the accuracy and precision of collision location geocoding. Although advanced geocoding software has been introduced that provides more flexibility to meet each user’s requirements and geocoding characteristics, it is assumed that there is no perfect geocoding software available in the GIS market; however, this deficiency is mainly due to the errors which occur in the collision database during the data entry process. It is necessary to discuss how to deal with un-geocoded collision records, especially when a GIS-based collision database is replacing a conventional collision database. It is clear that a detailed study of the possible sources of error which affect the geocoding match rate and accuracy is needed.

2.2.3 Linear Reference System and Use of GPS

In a report that discussed the efforts to expand the analytical features of the Highway Safety Information System (HSIS) in the United States by integrating GIS capabilities, Smith et al. (2001) recommended that a linear location referencing system (LLRS or LRS) be developed and made available for integration. Therefore, a good understanding of how the spatial reference method can be implemented is necessary to plan for the development of an appropriate GIS that avoids linkage-related issues.

The Linear Location Referencing System (LLRS or LRS; Smith et al., 2001) is the total set of procedures for determining and retaining a record of specific points along a roadway. The system includes the location referencing methods, together with the procedures for storing, maintaining and retrieving location information about points and segments on the roadways. An LRS typically consists of the following components (Miller and Shaw, 2001): a transportation network, a location referencing method (LRM), and a datum. The transportation network consists of the traditional node-arc topological network. The LRM determines an unknown location within the transportation network using a defined path and an offset distance along that path from some known location. This provides the basis for maintaining event data within the network. The datum is the set of objects with known geographically referenced locations. The datum ties the LRS to the real world.
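The path-and-offset idea behind an LRM can be illustrated with a minimal sketch. It assumes the defined path is a list of projected (x, y) vertices whose first vertex is the known location; the function name and the sample route are hypothetical and are not taken from any of the systems cited above.

```python
import math

def locate_along_path(path, offset):
    """Return the (x, y) point lying `offset` units along `path`.

    `path` is a list of (x, y) vertices; the first vertex stands in for the
    known reference location (an assumption made for this illustration).
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    remaining = offset
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        if seg_len > 0 and remaining <= seg_len:
            t = remaining / seg_len  # fraction of this segment to travel
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        remaining -= seg_len
    raise ValueError("offset exceeds path length")

# Hypothetical route: 100 m east, then 50 m north.
route = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]
print(locate_along_path(route, 120.0))  # 20 m up the second segment
```

An event stored as (route, offset) can then be mapped without storing its coordinates, which is how event data can be maintained within the network.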


The use of GPS for locating collisions has been growing more popular in many agencies since relatively inexpensive receivers became available on the market and positioning accuracy was improved by removing Selective Availability (The White House, 2000). Based on the availability of in-vehicle GPS systems in police vehicles, the collision location information can be automatically captured and recorded to a connected laptop computer by a software program. Another option for capturing collision location is through a handheld GPS receiver. The portable GPS receiver can be used to obtain coordinate values (longitude and latitude) and then to record the position information on the written report form (Ogle, 2007).

Sarasua et al. (2008) evaluated the use of GPS units by South Carolina enforcement agencies to record collision locations. The research indicates that approximately 80% of the collision records can be located within reasonable levels of accuracy. The authors identify the various types of errors that may occur: the use of multiple coordinate representations by officers to log data (decimal degrees, degrees-minutes-seconds, and state plane), the reversal of the latitude and longitude fields, and the truncation of the number of decimals. They recommend targeted training, which should have a significant impact on error reduction.
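Two of the error types listed above, mixed coordinate representations and reversed latitude/longitude fields, lend themselves to simple automated checks. The following is only a sketch: the bounding box is a rough approximation of South Carolina assumed for illustration, and the function names are hypothetical rather than part of the procedure used by Sarasua et al. (2008).

```python
def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees-minutes-seconds to signed decimal degrees."""
    sign = -1.0 if degrees < 0 else 1.0
    return sign * (abs(degrees) + minutes / 60.0 + seconds / 3600.0)

# Approximate bounding box, assumed for illustration only.
LAT_RANGE = (32.0, 35.3)
LON_RANGE = (-83.4, -78.5)

def in_box(lat, lon):
    """True when the coordinate pair falls inside the assumed bounding box."""
    return (LAT_RANGE[0] <= lat <= LAT_RANGE[1]
            and LON_RANGE[0] <= lon <= LON_RANGE[1])

def fix_reversed(lat, lon):
    """Return (lat, lon, status); swap the fields when only the swap fits the box."""
    if in_box(lat, lon):
        return lat, lon, "ok"
    if in_box(lon, lat):
        return lon, lat, "swapped"
    return lat, lon, "suspect"

print(dms_to_decimal(-81, 30, 0))   # -81.5
print(fix_reversed(-81.5, 34.0))    # (34.0, -81.5, 'swapped')
```

Truncated decimals cannot be repaired this way, since the lost precision is not recoverable from the record itself.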

Khan et al. (2004) stated that GPS devices have been used to record the coordinates of accidents in Abu Dhabi, United Arab Emirates, since 2003. However, the authors argued that the use of GPS has not been entirely helpful for a number of reasons: first, because it has been demonstrated that in the urban high-rise environment of Abu Dhabi, these GPS coordinates can have errors of up to 100m; second, GPS coordinates are in longitude and latitude and not a geographically referenced easting and northing; and third, it was found that there were inaccuracies and inconsistencies in the transcription of GPS coordinates from the device onto the paper forms.


2.3 Types of Analyses for Traffic Safety using GIS

The benefits of GIS are well established. GIS provide the capability for storing and maintaining large volumes of spatial and tabular data. Once a spatial database is complete, it can be displayed and viewed in the GIS environment, and GIS can also provide a hard copy of the screen view. The user can then easily query the information for viewing and exporting to other applications. GIS provide the capability for sophisticated spatial analysis methods as well as for simple analysis. In addition, GIS offer a programming or scripting environment that allows the user to develop specific analysis programs or to customize existing programs. More importantly, GIS can be integrated into more mainstream enterprise applications, as well as web-based applications. Spatially enabling a website to include maps of high collision locations would be one example of the latter.

The following section discusses the capabilities of using GIS in Traffic Safety in more detail as discussed in the report by Smith et al. (2001). Examples from other studies are also discussed where appropriate.

2.3.1 Display / Query Analysis

GIS permit users to replace paper maps and to display a geographically referenced digital map background and layers of additional attribute information, which can be viewed in any desired combination and at any scale. Nowadays, most transportation departments at all levels of government develop and maintain digitized road network drawings and transportation related datasets (Waters, 1999). With GIS, these datasets can be integrated into a spatial database, and can be easily accessed by users. For instance, using the database capabilities of GIS, the safety analyst can query the database and display the results graphically. This query analysis can handle a simple question, such as ‘Can you show me all of the locations of pedestrian involved collisions that resulted in a fatality?’, as well as more complex queries, for example, ‘Query and count the number of collisions that occurred on urban highways with a speed limit ≥ 80 km/h and AADT ≥ 10000’.
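The two example queries can be expressed as simple attribute filters. The sketch below uses a small in-memory table; all field names and records are hypothetical and serve only to show the shape of such queries, not the schema of any actual collision database.

```python
# Hypothetical attribute table; field names and values are illustrative only.
collisions = [
    {"id": 1, "road_class": "urban highway", "speed_limit_kmh": 80,
     "aadt": 25000, "pedestrian": False, "severity": "injury"},
    {"id": 2, "road_class": "urban highway", "speed_limit_kmh": 100,
     "aadt": 8000, "pedestrian": False, "severity": "property damage"},
    {"id": 3, "road_class": "local", "speed_limit_kmh": 50,
     "aadt": 1200, "pedestrian": True, "severity": "fatal"},
    {"id": 4, "road_class": "urban highway", "speed_limit_kmh": 90,
     "aadt": 10000, "pedestrian": True, "severity": "fatal"},
]

def fatal_pedestrian_collisions(records):
    """Simple query: pedestrian-involved collisions that resulted in a fatality."""
    return [r for r in records if r["pedestrian"] and r["severity"] == "fatal"]

def count_busy_highway_collisions(records, min_speed_kmh=80, min_aadt=10000):
    """Complex query: urban highway collisions above speed and AADT thresholds."""
    return sum(
        1 for r in records
        if r["road_class"] == "urban highway"
        and r["speed_limit_kmh"] >= min_speed_kmh
        and r["aadt"] >= min_aadt
    )

print([r["id"] for r in fatal_pedestrian_collisions(collisions)])  # [3, 4]
print(count_busy_highway_collisions(collisions))                   # 2
```

In a GIS, the selected records would then be drawn on the map rather than printed.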

The use of imagery in GIS in conjunction with the digital elevation model can provide virtual reality displays for road safety analysis, giving a realistic view of the landscape. Satellite imagery and digital aerial photographs are two sources that can be used for this application.

Data integration provides the ability to spatially integrate and merge the data into a single view. For instance, demographic data, environmental data, socio-economic data, and terrain data can be integrated using GIS, thus expanding the data sources available to the safety analyst. With the integrated database and GIS, a safety analyst can produce various thematic maps by type or class. These simple mapping capabilities are most commonly used to quickly process large amounts of information, for example by showing high collision locations with circle symbols that vary in size and color.

2.3.2 Spatial Analysis

GIS provide tools to combine data, to identify overlaps across data, and to join the attributes of datasets together using feature location and feature extent as the selection criteria. An overlay function combines spatial data: features can be combined simply to add one spatial dataset to another, or to update or replace portions of one dataset with another dataset. Overlay analysis can be used to merge spatial data by combining two or more spatial datasets to produce a new spatial dataset where the feature attributes are a union of the input datasets.

Proximity analysis is a type of GIS query capability and a category of spatial analysis that represents the fundamental difference between GIS and all other information systems. Buffering is a means of performing this practical spatial query to determine the proximity of neighbouring features. In GIS, buffering will locate all features within a prescribed distance from a point, line, or area, such as determining the number of crashes that occurred within 50m of an intersection.
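The 50m buffer example amounts to a distance test around a point. A minimal sketch follows, assuming all coordinates are in a projected system measured in metres; the function name and sample points are hypothetical.

```python
import math

def crashes_within(center, crashes, radius_m=50.0):
    """Return the crash points lying within `radius_m` of `center`.

    Coordinates are assumed to be projected (x, y) values in metres.
    """
    cx, cy = center
    return [(x, y) for x, y in crashes
            if math.hypot(x - cx, y - cy) <= radius_m]

intersection = (1000.0, 2000.0)
crash_points = [(1010.0, 2005.0), (1030.0, 2040.0), (1200.0, 2000.0)]
nearby = crashes_within(intersection, crash_points)
print(len(nearby))  # 2: the third point is 200 m away
```

Buffers around lines or areas require a point-to-segment or point-to-polygon distance instead, which GIS packages supply as built-in functions.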

Intersection analysis can be a tool for evaluating collisions on a user-selected intersection or point for a given search radius. The intersection can be chosen by clicking on the map or by querying attributes, such as two intersecting road names. Once an intersection or a point of interest is chosen, the process would automatically generate a report that shows the number of collisions, fatalities, injuries, etc. and a summary of the related information, such as the cause breakdown list and temporal analysis summary. The report would also include a printable map that depicts the site.

Segment analysis looks into collisions which occurred along a road segment while intersection analysis focuses on a finite location or intersection. The user can select the road segment to be used for the analysis, and then the result of this analysis will produce a report that lists statistics such as collision counts by severity. The result would be similar to the result of an intersection analysis.

Cluster or hot spot analysis is used to study clustering patterns of collisions around specific roadway features or general trends in the city. Figure 2.2 shows an example of cluster analysis. Yamada and Thill (2007) used a local K-function to identify the cluster locations of traffic collisions along road networks.

Figure 2.2 K-function local indicators of network-constrained clusters result (Yamada and Thill, 2007)

2.3.3 Network Analysis

Network analysis can be used to define or identify route corridors and determine travel path, travel distance, and response time. For example, network analysis may be used to assess the traffic volume impact of a road closure on adjacent roadways. GIS networking capabilities can also be used for the selection of optimal paths or routes. For instance, the safest truck routes for dangerous goods can be determined based on the level of hazard associated with the various roadway and traffic elements (Dyck, 2006).

Corridor analysis provides a visual means to locate high collision concentrations within a corridor. This analysis differs from segment analysis because corridor analysis considers the whole corridor. Kam (2003) used the shortest paths between origin-destination pairs as travel corridors. Then a buffer zone was placed around a travel corridor to extract the relevant collision records for each derived shortest path (see Figure 2.3).

Figure 2.3 Illustrative travel corridors of shortest paths between origin-destination pairs (Kam, 2003)

2.3.4 Raster-based Modelling

Raster-based modelling, also referred to as “cell-based” or “grid-based” analysis, uses a grid or cells to aggregate spatial data for discrete distribution. In raster-based modelling, the spatial data are developed as tiles of a given dimension or points of a uniform distribution, as defined by the user, for display and analysis. Raster-based modelling is effective in displaying patterns over larger areas, such as representing the sum total of crashes that are located within a cell. This capability provides a quick means to view spatial clustering of crash data. Since raster-based modelling aggregates data at a specified grid resolution, it would not be appropriate for site-specific spatial analysis.

In raster-based modelling, special tools are available to merge grid data for overlay analysis. Raster-based overlay analysis is similar to the GIS overlay analysis previously discussed; however, the techniques and functions available in raster-based modelling are somewhat different. When the cells of different data sets have been developed using the same spatial dimensions, they can be merged on a cell-by-cell basis to produce a resulting data set. The functions and processes used in cell-based modelling to merge grid data are referred to as ‘map algebra’ (Tomlin, 1990) because the grid data sets in raster-based modelling are merged using arithmetic and Boolean operators called ‘spatial operators.’
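The two ideas in this section, aggregating crashes to cells and merging grids cell by cell, can be sketched in a few lines. The grids are plain nested lists, the origin is assumed to be at (0, 0), and the function names are hypothetical; the sketch is not tied to any particular GIS package.

```python
def rasterize_counts(points, cell_size, n_cols, n_rows):
    """Count the points falling in each cell of a grid anchored at (0, 0)."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col, row = int(x // cell_size), int(y // cell_size)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            grid[row][col] += 1
    return grid

def map_algebra(grid_a, grid_b, op):
    """Merge two grids of identical dimensions cell by cell using `op`."""
    return [[op(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(grid_a, grid_b)]

# Hypothetical crash points for two years on a 2 x 2 grid of 10 m cells.
year_1 = rasterize_counts([(5, 5), (15, 5), (7, 8)], 10, 2, 2)
year_2 = rasterize_counts([(6, 6), (15, 15)], 10, 2, 2)

total = map_algebra(year_1, year_2, lambda a, b: a + b)           # arithmetic
both = map_algebra(year_1, year_2, lambda a, b: a > 0 and b > 0)  # Boolean
print(total)  # [[3, 1], [0, 1]]
print(both)   # [[True, False], [False, False]]
```

The cell-by-cell merge only makes sense because both grids share the same cell size and extent, which is the condition stated above.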

Density analysis makes use of the raster-based modelling technique. The analysis uses a discrete point file to calculate the density of selected collisions, and generates a density map or a contour map identifying areas of high collision occurrence. Xie and Yan (2008) presented a new network Kernel Density Estimation (KDE) approach to estimating the density of spatial point events, in this case traffic collisions, and compared it with a standard planar KDE (see Figure 2.4).
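For orientation, a standard planar KDE can be sketched as follows. This is not the network KDE of Xie and Yan (2008); the quartic kernel, the bandwidth, and the grid layout are assumptions chosen for illustration.

```python
import math

def quartic_kernel(d, h):
    """Two-dimensional quartic kernel with bandwidth h (zero beyond h)."""
    if d >= h:
        return 0.0
    return 3.0 / (math.pi * h * h) * (1.0 - (d / h) ** 2) ** 2

def planar_kde(points, cell_size, n_cols, n_rows, bandwidth):
    """Estimate point density at the centre of every grid cell."""
    grid = []
    for row in range(n_rows):
        cy = (row + 0.5) * cell_size
        densities = []
        for col in range(n_cols):
            cx = (col + 0.5) * cell_size
            densities.append(sum(
                quartic_kernel(math.hypot(px - cx, py - cy), bandwidth)
                for px, py in points))
        grid.append(densities)
    return grid

# One hypothetical collision in the middle of a 3 x 3 grid of 10 m cells.
surface = planar_kde([(15.0, 15.0)], 10.0, 3, 3, bandwidth=20.0)
print(surface[1][1] > surface[0][0] > 0.0)  # density peaks at the centre cell
```

A network KDE replaces the straight-line distance with distance measured along the road network, which is the distinction illustrated in Figure 2.4.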


Figure 2.4 Illustration of different overall patterns produced by the network KDE and Planar KDE (Xie and Yan, 2008)


Chapter Three: A New Approach to Geocoding Collision Locations

3.1 Introduction

Address geocoding is the most widely used geo-referencing method when collision data are migrated from a conventional collision database to a geographically referenced database. However, it should be kept in mind that there are no perfect geocoding tools available in the GIS market, and this can lead to losses in both the quality of the collision data and the number of collision records.

The highest quality of location information (accuracy, precision, and reliability) is essential for any spatial analysis, which varies from simple visualization in maps and black spot identification to more sophisticated analyses, such as spatial modelling and spatial autocorrelation. The quality of geographically referenced data is particularly important in urban areas, more so than in rural areas, because many different functional and physical types of roadways exist close together and interact with one another; thus, the quality of the data in urban areas more readily influences the spatial analysis of nearby locations. The quality of location information is affected by the whole geocoding process, including data entry and manipulation. The most common errors are blunders. For example, many spelling mistakes and inappropriate forms of entries can be found in collision report databases due to human error. Another source of error is systematic. For instance, in the geocoding process, certain addresses can be referenced to the wrong location because of the logic of geocoding and the geocoding option settings. Systematic errors are sometimes hard to find and fix.


This chapter first examines the adequacy of the geocoding method for collision locations which are stored in intersection-based text format, and then proposes and discusses a new approach to geocoding collision locations.

3.2 Study Area and Datasets

The research area encompasses the City of Edmonton, Alberta. This municipality is one of the fastest growing cities in North America. According to the Transportation Master Plan (City of Edmonton, 2008), the population of Edmonton has reached almost 750,000, and it is expected to grow by 400,000 people by 2040. The city has a well-designed road network that consists of 1,526 km of arterial roads, 762 km of collector roads, and 2,359 km of local roads. Currently, a number of transportation projects are underway in the city in an attempt to improve the transportation system.

According to the Edmonton Police Service Citizen Survey (EPS, 2007), ‘traffic’ was the number one issue that respondents felt should be addressed by the Edmonton Police Service. In 2007, a total of 28,521 motor vehicle collisions were reported and 5,513 of these accidents were fatal or involved injury (City of Edmonton, 2008).

This study utilizes the Motor Vehicle Collision Information System (MVCIS) maintained by the Office of Traffic Safety for the City of Edmonton. The MVCIS stores information about motor vehicle collisions that occur on public roads in the City of Edmonton. The information contained in the system comes directly from the provincial Collision Report Form (see Appendix A for more detail) completed by members of the Edmonton Police Service, and reflects all reported collisions on public roadways that result in property damage in excess of $1,000, as well as any collision resulting in a minor, major, or fatal injury.

In addition to the MVCIS, this study utilizes Spatial Land Inventory Management (SLIM), a GIS data warehouse of the City of Edmonton. The SLIM data warehouse acts as a foundation for the City’s spatial data infrastructure, and includes road networks of the City as well as other spatial datasets.

However, some transportation operational data, such as the speed bylaw and traffic control devices, were stored and maintained in written form and/or the CAD file format. Therefore, these datasets were converted to geographically referenced datasets. This data conversion process involved importing the CAD files into the GIS environment, adjusting the coordinate system, modifying geometry, adding attribute information, and checking for data quality.

3.3 Evaluation of Geocoding Method

In order to analyse collisions spatially in the GIS environment, collision location information in the conventional database has to be geographically referenced unless other spatial information, such as coordinate data, is available in the database. The most common way to assign geographically referenced information to an address-based incident dataset is the address geocoding method. Most GIS software provides its own geocoding tool within the package, and there are many stand-alone geocoding utilities available in the GIS market. This section reviews the geocoding tools provided by ArcGIS® 9.2 and GeoMedia® 6.1, and examines the adequacy of these tools for geocoding location information in MVCIS. It is believed that these geocoding tools are the latest versions available in their respective GIS software packages. The geocoding result is also discussed in this section.

3.3.1 Location Information in MVCIS

Basically, collision locations in MVCIS are represented as intersections. Two intersecting roads identify the intersection closest to a collision, and portion information (see below) is used to describe the detailed location around the intersection. The location information is coded and stored in the database. The numbered roads are coded by their number, for example 137 Avenue is coded as 137 and stored in the ‘AVENUE’ column in the database; however, the named roads are coded by a specific number from 300 to 899. In addition, bridges and large or complex interchanges are coded by a 900s number, and these locations are mapped to describe detailed portion locations within an interchange.

The portion codes provide specific location information around the intersection. Table 3.1 shows the description of the location around an intersection using portion codes, and Figure 3.1 illustrates the intersection and midblock locations depending on the portion codes. These portion codes are only applicable to intersections with numbered or named roads, and are not applicable to mapped (900s numbered code) locations. In the latter case, the portion codes 01 to 49 represent conflict points, e.g. intersection, merging point, and diverging point, within an interchange, and the portion codes 50 to 80 represent the midblock within an interchange (see Figure 3.2 for an example).
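The coding rules described above can be summarized as a small decoder. This is a sketch only: the ranges for named roads (300 to 899), mapped locations (900s), and mapped portion codes (01 to 49, 50 to 80) come from the text, whereas treating every code below 300 as a numbered road is an assumption, and the function names are hypothetical.

```python
def road_code_type(code):
    """Classify an MVCIS road code as numbered, named, or mapped."""
    if 900 <= code <= 999:
        return "mapped"      # bridges and large or complex interchanges
    if 300 <= code <= 899:
        return "named"       # named roads use an assigned number
    if 0 < code < 300:
        return "numbered"    # assumption: numbered roads fall below 300
    return "unknown"

def mapped_portion_type(portion):
    """Interpret a portion code for a mapped (900s) location."""
    if 1 <= portion <= 49:
        return "conflict point"  # intersection, merging or diverging point
    if 50 <= portion <= 80:
        return "midblock"
    return "unknown"

print(road_code_type(137))      # numbered (137 Avenue)
print(road_code_type(450))      # named
print(mapped_portion_type(55))  # midblock
```

Portion codes for numbered and named road intersections follow Table 3.1 instead and would need a separate lookup.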

The location information in MVCIS is represented by two codes in the database: the intersection code and the portion code. However, the coded information in MVCIS does not provide pinpoint accuracy for the collision location because of the ambiguity of the portion code. Moreover, locational ambiguity is much greater where adjacent intersections are far apart from each other. In spite of the deficiency of location information in MVCIS, collision locations can be positioned relatively accurately when the location information is combined with other attributes of a collision in the database, such as traveling direction, driving lane, and traffic control.


Table 3.1 Portion Code Description

Code  Description
00    Intersection
01    Midblock on AV (immediately west of street)
02    Midblock on AV (at lane)
03    Midblock on AV (btw the lane and the next intersection within AV)
04    Midblock on AV (only by house #, e.g. XXX XX Ave.)
05    Midblock on AV (more than 1 block btw intersections to west)
10    Midblock on ST (immediately north of avenue)
20    Midblock on ST (at lane)
30    Midblock on ST (btw the lane and the next intersection within ST)
40    Midblock on ST (only by house #, e.g. XXX XX St.)
50    Midblock on ST (more than 1 block btw intersections to north)
66    Lane or Alleyways
73    East of Intersection
75    South of Intersection
99    Unknown
+1    West of service road: Midblock on AV (immediately west of street)
+2    West of service road: Midblock on AV (at lane)
+4    West of service road: Midblock on AV (only by house #)
+5    West of service road: Midblock on AV (more than 1 block)
-1    East of service road: Midblock on AV (immediately east of street)
-2    East of service road: Midblock on AV (at lane)
-4    East of service road: Midblock on AV (only by house #)
-5    East of service road: Midblock on AV (more than 1 block)
1+    North of service road: Midblock on ST (immediately north of avenue)
4+    North of service road: Midblock on ST (only by house #)
5+    North of service road: Midblock on ST (more than 1 block)
1-    South of service road: Midblock on ST (immediately south of street)
4-    South of service road: Midblock on ST (only by house #)
5-    South of service road: Midblock on ST (more than 1 block)


Figure 3.1 Illustrations of Portion Code Locations within Intersection


Figure 3.2 Accident Zone Map: Anthony Henday Drive-Calgary Trail/Gateway Boulevard (Example of Mapped Location)


3.3.2 Geocoding Test

In 2006 there were 26,066 collisions in the City of Edmonton, and approximately half of them occurred at intersections (see Table 3.2 for more detail). Table 3.2 shows that 48.0% of the collisions occurred at numbered and named road intersection locations. Thus, geocoding tests were performed focusing on the intersection locations. Initially, 295 intersection type addresses were selected from the 2006 collision data in MVCIS. The criteria for sample selection were that the addresses contained no spelling or grammar errors, were manually identifiable on the road network with human perception, and included complex named roads. In this study, human perception means the ability to find and figure out a location on the map with adequate knowledge of the locale. For example, the bridge section of Rowland Road is known as the Dawson Bridge and local people can figure out where the Dawson Bridge is; however, a geocoding tool cannot identify where the bridge is unless information about it is provided in the reference data, and this could result in an error from the geocoding process. Complex named road addresses were chosen in order to assess the effects of spelling complexity on the geocoding process. Additionally, manually selected sample addresses were used for the geocoding test rather than a random sample in order to highlight existing and potential error sources arising from the address matching process and deficiencies in the reference data, rather than to find possible errors in the sample address dataset.

This sample of intersection type addresses can be categorized into four groups as shown in Table 3.3. The reason that more addresses with named road combinations were selected than numbered road combinations is that it is assumed to be more difficult to geocode named road combinations than numbered road combinations.


Table 3.2 2006 Collision Count by Location Type and Portion

Location Type and Portion Description    Count (Percentage)
Numbered & Named                         23,685 (90.9%)
    Intersection                         12,503 (48.0%)
    Midblock                              9,738 (37.4%)
    Lane or Alley                           361 (1.4%)
    Service Road                             91 (0.3%)
    Unknown                                 992 (3.8%)
Mapped                                    1,980 (7.6%)
    Intersection                          1,004 (3.9%)
    Midblock                                779 (3.0%)
    Unknown                                 197 (0.8%)
Unknown                                     401 (1.5%)
Total                                    26,066 (100.0%)

Table 3.3 Test Sample Statistics

Intersection Type      Total Count in 2006 Collision    Initial Selection    Final Selection
Numbered & Numbered    1,991                            45                   43
Named & Numbered         256                            80                   73
Numbered & Named         202                            80                   78
Named & Named             90                            90                   87
Total                  2,539                            295                  281


The 295 initially selected sample addresses were first checked against the street network layer in order to find out whether they could be identified on the GIS layers, and then true XY coordinate values were assigned to each address location in order to calculate the distance between the true location and the geocoded location. The distance between the true location and the geocoded location determines the level of geocoding accuracy; the shorter the distance, the greater the geocoding accuracy. During the sample verification process, several problems were found which could be serious error sources during the actual geocoding. These problems will be discussed in a later section. As a result of this process, 281 intersection type addresses were selected for the final sample, excluding unidentifiable addresses and addresses with suspected errors, which mainly resulted from data entry errors (see Table 3.3 and Appendix B).

Because reference data are required for geocoding, a street network layer was exported from SLIM and imported to ArcGIS. There are three centerline road networks available in SLIM; however, the V_LRS_DATUM_OND layer was selected as the most suitable road network for geocoding because it contains the full road name of each road segment as well as the house number range information in separate columns. In order to increase match rate and match accuracy, the STREET_NAME_FULL column was parsed into STREETNAME, STREETTYPE, and SUFFIX columns as recommended in the ArcGIS Online Geocoding Guide. For example, ‘Ellerslie Road SW’ in the STREET_NAME_FULL column was parsed into three parts as ‘Ellerslie/Road/SW’, and each part was stored in the STREETNAME, STREETTYPE, and SUFFIX columns, respectively. No modification of the street network was made in GeoMedia.
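The parsing step can be sketched as follows. The street type and suffix lists are hypothetical and deliberately incomplete, and the function is an illustration of the idea rather than the procedure actually applied to the V_LRS_DATUM_OND layer.

```python
# Illustrative vocabularies only; a real reference layer needs complete lists.
STREET_TYPES = {"Road", "Avenue", "Street", "Drive", "Trail", "Boulevard"}
SUFFIXES = {"NW", "SW", "NE", "SE"}

def parse_street_name(full_name):
    """Split a full street name into STREETNAME, STREETTYPE, and SUFFIX."""
    tokens = full_name.split()
    suffix = ""
    if tokens and tokens[-1].upper() in SUFFIXES:
        suffix = tokens.pop().upper()
    street_type = ""
    # Keep at least one token for the name itself.
    if len(tokens) > 1 and tokens[-1].title() in STREET_TYPES:
        street_type = tokens.pop().title()
    return {"STREETNAME": " ".join(tokens),
            "STREETTYPE": street_type,
            "SUFFIX": suffix}

print(parse_street_name("Ellerslie Road SW"))
# {'STREETNAME': 'Ellerslie', 'STREETTYPE': 'Road', 'SUFFIX': 'SW'}
```

Names without a recognized type or suffix simply leave those columns empty, so they can be reviewed manually.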


The ArcGIS geocoding tool allows the user to specify option values such as spelling sensitivity and minimum match score. However, the option values were left at their defaults because Levine and Kim (1998) do not recommend relaxing the spelling sensitivity for street names in the geocoding procedure. They argue that relaxed matching has the potential to introduce a large amount of error into matched locations although it may increase the match rate. Therefore, all user settings were set to default in both test trials. The first test trial used the geocoding tool in ArcGIS, and GeoMedia’s Geocode Address function was used for the second test trial. In order to compare the positional accuracy of geocoded locations from each method, the distance between each geocoded intersection address and its corresponding true intersection location was computed using the following distance equation.

\[ \mathrm{Dist.} = \sqrt{(X_G - X_T)^2 + (Y_G - Y_T)^2} \tag{3.1} \]

where $X_T, Y_T$ = X, Y coordinates of true location

$X_G, Y_G$ = X, Y coordinates of geocoded location
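Equation 3.1 is the planar Euclidean distance and can be computed directly. The sketch below assumes both coordinate pairs are in the same projected coordinate system with units of metres; the function name is hypothetical.

```python
import math

def geocoding_error(true_xy, geocoded_xy):
    """Distance between the true and geocoded locations (Equation 3.1)."""
    x_t, y_t = true_xy
    x_g, y_g = geocoded_xy
    return math.hypot(x_g - x_t, y_g - y_t)

# Hypothetical coordinates: geocoded point is 3 m east and 4 m north.
print(geocoding_error((1000.0, 2000.0), (1003.0, 2004.0)))  # 5.0
```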

3.3.3 Test Results and Discussion

This section presents and discusses the results of the geocoding tests. As mentioned in the previous section, the difference between the two tests is the geocoding tool used: ArcGIS or GeoMedia. The following tables (Tables 3.4, 3.5, and 3.6) show the geocoding test results for match rate, match accuracy, and match analysis.

Table 3.4 shows that 209 out of 281 test addresses were geocoded with ArcGIS (74.4%), and 176 out of 281 test addresses were geocoded with GeoMedia (62.6%). The results show that the match rate is better with ArcGIS than with GeoMedia. It is hard to conclude from these results that ArcGIS is better than GeoMedia at geocoding intersection-based addresses, because postal code information was not available and GeoMedia recommends using additional postal code information for geocoding. However, based on the information that was available, ArcGIS shows a better geocoding match rate.

Table 3.4 Geocoding Test Match Rate Results

Match Score    ArcGIS Count    ArcGIS %    GeoMedia Count    GeoMedia %
100~81         206             73.3%       0                 0.0%
80~61          3               1.1%        139               49.5%
60~41          0               0.0%        33                11.7%
40~21          0               0.0%        4                 1.4%
20~0           72              25.6%       105               37.4%
Matched        209             74.4%       176               62.6%
Unmatched      72              25.6%       105               37.4%
Total          281             100.0%      281               100.0%

Table 3.5 shows the match accuracy for matched addresses. Of the 209 test addresses geocoded by ArcGIS, 160 were located within 5m of the true locations. Of the 176 test addresses geocoded by GeoMedia, 113 were located within 5m of the true locations. The results show that ArcGIS has better match accuracy as well as a better match rate than GeoMedia. However, this does not mean that the geocoding tool in ArcGIS is superior to the geocoding tool in GeoMedia because of the previously mentioned reason. It should be noted that both geocoding tools produced errors in which the geocoded location was more than 100m from the true location.

Table 3.5 Geocoding Test Match Accuracy Results

Distance between True      ArcGIS Count    ArcGIS %    GeoMedia Count    GeoMedia %
and Geocoded Location
0~5m                       160             76.6%       113               64.2%
5~10m                      0               0.0%        0                 0.0%
10~20m                     0               0.0%        1                 0.6%
20~40m                     12              5.7%        7                 4.0%
40~100m                    27              12.9%       23                13.1%
>100m                      10              4.8%        32                18.2%
Total                      209             100.0%      176               100.0%

Table 3.6 shows the geocoding match analysis of the tests. Urban areas contain various intersection types, such as three-legged and four-legged intersections, and the size of an intersection varies with its geometry, such as the number of lanes and the presence of a median. Urban areas also contain different types of roads, for example arterial, collector, and local roads, with various speed limits, lane widths, and numbers of lanes. Moreover, there are large urban interchanges between major highways and other urban streets. Thus, it is difficult to define levels of accuracy for urban intersections from the distance between the true location and the geocoded location alone. However, in this geocoding test analysis, a distance difference between the true location and the geocoded location of less than 1m is considered to be a good match, and a distance difference of less than 10m is considered to be an acceptable geocoding match.
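The two distance thresholds adopted for this test can be written as a simple classification rule. The following Python sketch is illustrative only: the function name `classify_match` and its argument are hypothetical and do not belong to either geocoding tool. The "Multiple Intersections" category of the match analysis depends on tied candidates and not on distance, so it is not covered here.

```python
from typing import Optional

GOOD_MATCH_LIMIT_M = 1.0         # "good match" threshold used in this test
ACCEPTABLE_MATCH_LIMIT_M = 10.0  # "acceptable match" threshold used in this test


def classify_match(distance_m: Optional[float]) -> str:
    """Classify a geocoded address by its distance from the true location.

    distance_m is None when the tool returned no location at all.
    """
    if distance_m is None:
        return "Unmatched"
    if distance_m < GOOD_MATCH_LIMIT_M:
        return "Good Match"
    if distance_m < ACCEPTABLE_MATCH_LIMIT_M:
        return "Acceptable"
    return "Mismatch"
```

Keeping the thresholds as named constants makes it easy to test how sensitive the analysis is to the chosen limits.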

Table 3.6 Geocoding Match Analysis

Match Code               ArcGIS             GeoMedia           Remark
Good Match               160    56.9%       109    38.8%       Difference < 1m
Acceptable               0      0.0%        4      1.4%        Difference < 10m
Multiple Intersections   43     15.3%       33     11.7%       -
Mismatch                 6      2.1%        30     10.7%       Matched to wrong location
Unmatched                72     25.6%       105    37.4%       -
Total                    281    100.0%      281    100.0%      -

As seen in Table 3.6, 160 out of 281 addresses were matched within 1m accuracy using ArcGIS, and 109 out of 281 addresses were matched within 1m accuracy using GeoMedia. ArcGIS shows better performance on the overall geocoding process than GeoMedia. However, it should be noted that both geocoding tools produced some mismatched addresses. Mismatched addresses usually introduce large positional errors, and these errors affect the reliability of the spatial analysis. A further problem is that mismatched addresses are hard to distinguish from correctly matched addresses. Therefore, a validation procedure should be performed after the geocoding in order to take account of the positional error effect in the spatial database.

It is noted that many matched addresses are located at one of several tied match score locations (43 out of 281 using ArcGIS, and 33 out of 281 using GeoMedia). In urban areas, multiple intersections often share exactly the same two intersecting street names. Moreover, there is a spatial data model problem when the real world is represented in the GIS environment. This multiple intersection problem is discussed in the next section along with a discussion of the existing and potential problems found during the geocoding tests.

Case 1: The problem with tied match score locations

There is a problem with tied geocoding match score locations when real world features are represented in the GIS environment. As seen in Figure 3.3 (a), streets are usually represented as a single line network in the GIS environment regardless of lane width and the number of lanes. In this case there is only one intersection point between two streets. However, when a street is divided by a median or barrier, two parallel lines are often used to represent the street segment, one for each travel direction. As seen in Figure 3.3 (b), for example, there can be four intersection points with the same geocoding match score when a geocoding tool tries to find the intersection of 90 Avenue NW and 75 Street NW. If the exact location is not specified among the four possible locations with the same geocoding match score, a point among the candidates will be arbitrarily selected, and this might introduce a positional error of up to tens or hundreds of meters depending on the size of the intersection. For example, there is a distance of 244m across the Anthony Henday Drive NW and Terwilliger Drive NW intersection.
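The error that an arbitrary selection can introduce is bounded by the spread of the tied candidates. The following Python sketch computes that spread for a set of candidate points. The coordinates are made-up planar values in meters, not the coordinates of any real intersection, and `max_candidate_spread` is a hypothetical helper.

```python
from itertools import combinations
from math import hypot
from typing import Sequence, Tuple

Point = Tuple[float, float]  # planar (x, y) in meters


def max_candidate_spread(candidates: Sequence[Point]) -> float:
    """Return the largest distance between any two tied candidates.

    This is the worst-case positional error if one candidate is picked
    arbitrarily while the collision actually occurred at another.
    """
    return max(
        (hypot(a[0] - b[0], a[1] - b[1]) for a, b in combinations(candidates, 2)),
        default=0.0,
    )


# Four tied candidates where two divided streets cross (illustrative values).
tied = [(0.0, 0.0), (30.0, 0.0), (0.0, 40.0), (30.0, 40.0)]
worst_case = max_candidate_spread(tied)  # diagonal of a 30 m x 40 m box
```

A spread of this kind could be stored with each tied match so that records with a large potential error are flagged for validation.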

(a) (b) Figure 3.3 Example of Tied Match Score Locations

In order to avoid multiple intersection locations with the same match score at an intersection between double centre line streets, four lines are used to represent the road segments in each direction, and these meet at one intersection point as seen in Figure 3.4. Although this network data model reduces the four possible tied match score locations to one intersection location, the problem may still exist where separated right turn cut-offs are represented in the street network (see Figure 3.5). As shown in Figure 3.5, there are five possible locations with the same match score for the intersection at 23 Avenue NW and Parsons Road NW because of the right turn cut-offs. This problem can be fixed by differentiating each street segment by travelling direction (e.g. east bound (EB)) and segment type (e.g. right turn cut-off), and then redefining the rule base in the geocoding tool. However, this process might be very complicated because it involves modifying the data format of the current collision database (MVCIS) and standardizing its contents, and MVCIS does not contain detailed and precise location information such as coordinates. Furthermore, the additional location information, such as the travelling direction and segment type, is not always available in the collision database, so the change might affect collision records without the additional information that were geocoded fairly well using the ordinary geocoding method. Therefore, modification of the geocoding tool, reference data, and collision database to resolve the problem with tied match score locations should be considered carefully, and this requires an extensive and detailed examination to verify the appropriateness of the approach.

[Figure 3.4: street segments labelled by travel direction, e.g. 111 AVE [WB] NW and 111 AVE [EB] NW, with [NB] and [SB] segments on the crossing road]

Figure 3.4 Example of One Intersection Point (Node) Street Network


Figure 3.5 Example of Tied Match Score Locations in One Intersection Point Street Network

Case 2: Multiple intersections

The problem with multiple intersections may produce similar consequences to those seen in the tied match score locations. One difference, however, is that the previous case was mainly caused by representing real world features in the GIS environment, while the problem with multiple intersections involves more fundamental issues such as street naming and specialized road structures, e.g. the traffic circle.

Figure 3.6 shows an example of the multiple intersections problem caused by street naming. As seen in Figure 3.6, there are three intersections which share the same intersection name of Dunluce Road NW and Warwick Road NW. This kind of problem is often found in residential areas. Adding another identifier in both the reference data and the collision address table is a possible solution for this problem.

Figure 3.6 Example of Multiple Intersections Caused from Street Naming

Figure 3.7 shows one of the traffic circle locations in the City of Edmonton. As seen in Figure 3.7, there are at least four candidates for the intersection of 107 Avenue NW and 142 Street NW. The geocoding process would pick these four intersection locations as tied match score candidates unless other identifiers were available.

[Figure 3.7 labels: 142 STREET NW, 107 AVENUE NW]

Figure 3.7 Example of Multiple Intersections Caused by the Nature of the Traffic Circle

These multiple intersection problems are mostly due to road naming and the transportation system; thus, additional identifiers are needed to distinguish each intersection location with the same road name. This involves the use and standardization of additional identifiers with the manual manipulation of the collision database as well as the reference data.


Case 3: Incorrect information in reference data and address table

Discrepancies between the reference data and the address table are occasionally found. This problem is likely to arise in relation to standardization and updating of both the reference data and the collision location information.

Because of the complexity of the road network, it is hard for a data entry person to identify the correct road name and place. Additionally, unclear information in the initial collision reports may lead to incorrect information in MVCIS. For example, as seen in Figure 3.8, University Avenue NW is discontinued at 105 Street NW and then reconnected at 106 Street NW. Thus, it is possible to misinterpret University Avenue NW as 75 Avenue NW. The location is recorded as 75 Avenue NW and Gateway Boulevard NW in MVCIS, but this should be corrected to University Avenue NW and Gateway Boulevard NW.

There are many transportation projects in progress within the City of Edmonton, and these cause rapid changes in the street network system. Furthermore, especially in newly developed areas, it is hard to update information, such as street names and geometry in the reference data in a timely manner. Because of the update issue with the reference data, it is often found that there are inconsistencies between the reference data and address table. More frequent updates are needed of the reference data as well as the address table to accommodate the rapid changes in the transportation system. It should also be considered necessary to keep records of the historical changes due to transportation system upgrades or development.


Figure 3.8 Example of Incorrect Information in the Address Table

Case 4: Street with two names

Highway 16, which is part of the Yellowhead branch of the Trans-Canada highway system, forms one of the main east-west routes and is known as Yellowhead Trail within the City of Edmonton. In addition, 82 Avenue NW runs through the Old Strathcona neighbourhood, one of the major shopping and entertainment districts in the City of Edmonton, and is also called Whyte Avenue within the area. Both examples show typical cases of two named streets, and similar cases can be found in many other cities.


In the above examples, the official names for these streets are Yellowhead Trail NW and 82 Avenue NW in the reference data. However, these road segments are often called Highway 16 and Whyte Avenue, especially when a driver attends a police station and files a Collision Report Form (Appendix A). Even though the boundaries of these road segments are usually ambiguous to the public, most local people understand where they are, and the alternative names are widely accepted. However, a geocoding tool cannot resolve road segments recorded under alternative names unless additional information is available in the reference data. Therefore, standardization of both the reference data and the collision address table is required, and in particular a validation process for the address table should be performed before the collision data release in order to minimize possible errors and inconsistency with the reference data.
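One way to supply that additional information is an alias table that maps alternative names to the official names before matching. The following Python sketch is a minimal illustration: the table holds only the two examples discussed above, and the function name and the normalization rule (upper-casing and collapsing repeated spaces) are assumptions.

```python
# Alternative street names mapped to the official names in the reference data.
# Only the two examples discussed in the text are listed.
STREET_ALIASES = {
    "HIGHWAY 16": "YELLOWHEAD TRAIL NW",
    "WHYTE AVENUE": "82 AVENUE NW",
}


def standardize_street_name(name: str) -> str:
    """Return the official street name for a reported name.

    Names are compared case-insensitively with repeated spaces collapsed.
    Names without a known alias are returned in normalized form.
    """
    normalized = " ".join(name.upper().split())
    return STREET_ALIASES.get(normalized, normalized)
```

A name-level alias ignores the extent of the road segment to which the alternative name applies, so it would not by itself resolve the boundary ambiguity described above.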

Case 5: Intersection that does not intersect

Conceptually, a geocoding program will look through the reference database for all instances of the first and second street and put them into each array. It will then compare combinations of “from” and “to” nodes in each array to find a combination where the end node of one segment in the first array is identical to the end node of a segment in the second array (Levine and Kim, 1998).
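The matching logic described above can be sketched as follows. This Python example is a simplified illustration and not the implementation of any particular geocoding tool: the `Segment` structure, the node identifiers, and the sample reference data are hypothetical, and a set intersection is used in place of a pairwise comparison of the two arrays.

```python
from typing import Dict, List, NamedTuple, Set


class Segment(NamedTuple):
    street: str     # street name as stored in the reference data
    from_node: int  # node id at one end of the segment
    to_node: int    # node id at the other end


def find_intersection_nodes(
    segments: List[Segment], first: str, second: str
) -> Set[int]:
    """Return node ids shared by segments of the two named streets."""
    nodes_by_street: Dict[str, Set[int]] = {first: set(), second: set()}
    for seg in segments:
        if seg.street in nodes_by_street:
            nodes_by_street[seg.street].update((seg.from_node, seg.to_node))
    return nodes_by_street[first] & nodes_by_street[second]


reference = [
    Segment("107 AVENUE NW", 1, 2),
    Segment("107 AVENUE NW", 2, 3),
    Segment("103 STREET NW", 2, 7),
    Segment("104 STREET NW", 8, 9),
]
```

An empty result corresponds to the situation discussed in this case, where the two streets share no node, while a result with more than one node corresponds to the tied and multiple intersection problems of Cases 1 and 2.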

As seen in Figure 3.9, the intersection between Sherwood Park Freeway NW and 34 Street NW is separated by the overpass and there is no actual intersection point between these main roads. When a collision occurs on Sherwood Park Freeway NW near the overpass, a police officer logically records Sherwood Park Freeway NW as the primary street and the nearest intersecting street (34 Street NW) as the secondary street.

However, the geocoding tool cannot pick up the overpass location because there is no node between the two streets. It is more likely to select the locations where the ramps from Sherwood Park Freeway NW and 34 Street NW meet. Levine and Kim (1998) used an additional GIS layer that contained pseudo-intersections for intersections which did not actually intersect in the reference data or for locations of places. For instance, in the reference data, there is no node between Sherwood Park Freeway NW and the 34 Street overpass because these two roads are grade separated. If a collision record indicates the location of Sherwood Park Freeway NW and the 34 Street overpass, it would be more reasonable to locate the collision at the crossing point of the two roads, where no node exists. For this reason, a pseudo-intersection (point or node) is placed at the crossing point of the two roads in order to match the intersection of Sherwood Park Freeway NW and the 34 Street overpass.
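The pseudo-intersection idea can be sketched as a second lookup that is consulted when the street network has no node for a street pair. The Python example below is a hedged illustration: the layer is modelled as a dictionary, the coordinates are placeholders rather than surveyed values, and `locate_intersection` is a hypothetical function, not the procedure of Levine and Kim (1998).

```python
from typing import Dict, FrozenSet, Optional, Tuple

Point = Tuple[float, float]

# Hypothetical pseudo-intersection layer: street pair -> placed point.
# The coordinates are placeholders, not surveyed values.
PSEUDO_INTERSECTIONS: Dict[FrozenSet[str], Point] = {
    frozenset({"SHERWOOD PARK FREEWAY NW", "34 STREET NW"}): (1000.0, 2000.0),
}


def locate_intersection(
    first: str,
    second: str,
    network_nodes: Dict[FrozenSet[str], Point],
) -> Optional[Point]:
    """Look up a street pair in the network, then in the pseudo layer."""
    key = frozenset({first, second})
    if key in network_nodes:
        return network_nodes[key]
    return PSEUDO_INTERSECTIONS.get(key)
```

Using an unordered pair as the key means the result does not depend on which street was recorded as primary and which as secondary.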

[Figure 3.9 labels: SHERWOOD PARK FREEWAY NW, 34 STREET NW, Overpass, Entrance Ramp, Exit Ramp]

Figure 3.9 Example of Intersections that Do Not Intersect

Case 6: Complex and large interchanges or intersections

There are many types of intersections and interchange structures in urban areas. Depending on its function and features, an intersection or interchange can extend over several hundred meters. As seen in Figure 3.9, the interchange between Sherwood Park Freeway NW and 34 Street NW forms a typical diamond interchange, and consists of a grade separation with an overpass and two signalized intersections between the non-freeway and the ramps from/to the freeway. Unless each portion of the interchange is specified by direction and road segment type, any location within it may be denoted simply as the intersection of Sherwood Park Freeway NW and 34 Street NW. This representation could produce a large positional error during the geocoding process. For instance, there are two merging points from 34 Street NW to Sherwood Park Freeway NW, one in each direction (east and west bound), and the distance between the two merging points is 670m.

However, in MVCIS, these kinds of complex or large interchanges are mapped to provide specific locations within the area as seen in Figure 3.2. Though mapped locations provide relatively accurate location information, it is difficult to geocode mapped locations unless the reference data contains mapped location information. Therefore, a strategy for geocoding mapped locations from the MVCIS database is needed to develop a GIS database for MVCIS.


Case 7: Phantom intersection

During the sample generation process, it was noted that a number of collisions are referenced to intersections that do not exist on the street network. For example, there may be an intersection between a standard roadway and an access road from/to a shopping centre. In this case, the access road is outside of the city's property, and this kind of roadway does not appear on the street network, so there appears to be no official intersection at that location. However, there is a traffic signal to accommodate traffic from/to the shopping centre, and this location should be considered as an official intersection. Typically, this kind of location uses an imaginary street number in order to represent the location. For example, as seen in Figure 3.10, there is a signalized intersection between 87 Avenue NW and an exit from the West Edmonton Mall, and this intersection is named 87 Avenue NW & 177 Street NW in the collision database (MVCIS). However, this location is not an official intersection because the exit from the West Edmonton Mall is not city property, and there is no physical intersecting point between 87 Avenue NW and 177 Street NW. This location uses an imaginary street number, 177 Street NW, in order to represent this intersection.

Additionally, phantom intersections are occasionally used in the MVCIS database to represent the locations of collisions. Usually, phantom intersections are applied to locations where official street numbers or names have not yet been determined, such as in recently developed areas, and to locations where the adjacent intersection is far away and the numbers of the two streets are not consecutive. Basically, phantom intersections should not be used to represent the location of collisions because these locations cannot be geocoded using the ordinary geocoding method and have to be geocoded manually.

Figure 3.10 Example of Phantom Intersection


3.4 A New Approach for Geocoding Collision Locations

As reviewed in the previous section, migration from the conventional collision database to a geographically referenced database using currently available geocoding methods will lose a large amount of the data and introduce positional errors into the spatial database. Although the geocoding test in the previous section focused on intersection addresses and used modified reference data in order to increase the match rate, approximately 28% of the sample addresses were either not geocoded or not located accurately using the ArcGIS geocoding tool. Furthermore, the potential sources of mismatch and error in the geocoding process were discussed, and these have a negative influence on the match rate and accuracy of the geocoding. Therefore, either a more sophisticated geocoding tool or a different approach that takes account of the problems stated in the previous section is necessary.

Additionally, the geocoding tests focused only on the collisions at intersection locations; therefore, a method or approach that is capable of geocoding other location type collisions in the MVCIS, such as at mid-block, at service roads, and at mapped location collisions, should be considered.

3.4.1 Location Code and Reference Layer

The key logic of currently available geocoding methods is comparing and finding the best match between an address and the reference data. During the geocoding process, the spelling sensitivity greatly affects the match rate and geocoding accuracy, and Levine and Kim (1998) did not recommend relaxing the spelling sensitivity because, even though relaxing it may increase the match rate, it has the potential for introducing a large amount of error into matched locations. However, more unmatched or mismatched addresses due to spelling sensitivity were found in the named streets than in the numbered streets. It is observed that all of the variables of each collision from the collision report form are coded alphanumerically and stored in the database during the data entry procedure. Thus, this study implements a new code that represents the collision location in a simple alphanumeric form, called the Location Code.

The Location Code consists of two 4 digit Avenue and Street codes which describe the location of the intersection, followed by a 2 digit quadrant code and a 2 digit portion code which depicts the locations within an intersection as explained in Section 3.3.1. In order to take account of the multiple intersection problem, an additional 1 digit identifier is added to the end. See Table 3.7 for more details.

Table 3.7 Location Code Structure and Examples

0107          0103          -   NW              00             -   0
Avenue code   Street code       Quadrant code   Portion code       Detail code

Location Code      Address                        Details
01070103-NW00-0    107 AV & 103 ST                Named/Numbered Intersection
102A0097-NW01-0    102A AV & 97 ST                Named/Numbered Intersection
0009526S-SW10-0    9 AV & Calgary Trail           Named/Numbered street portion
421E170N-NW00-0    Whitemud DR EB & 170 ST NB     Named/Numbered Intersection
09460946-NW16-0    Whitemud DR-106 TO 111 ST      Mapped Intersection
05550482-NW00-E    Dunluce Road & Warwick Road    Named/Numbered Intersection on the east side


In the case of the City of Edmonton, usually, avenues are east/west directional roads, and streets are north/south directional roads, so that the first 4 digit code indicates a road that stretches in the east/west direction (horizontal), and the next 4 digit code indicates a road that stretches in the north/south direction (vertical). The combination of these two 4 digit codes indicates the location of an intersection. The alphabetic 2 digit quadrant code shows the quadrant of the collision location within the city, because the city is divided into 4 quadrants based on Quadrant Avenue (1 Avenue) and Meridian Street (1 Street). The 2 digit portion code that appears after the quadrant code depicts the portion of the collision location within the intersection that is represented by the avenue code and the street code. For regular numbered or named roads, Table 3.1 and Figure 3.1 are used to refer to the portion locations within an intersection. In the case of mapped locations with 900s avenue and street codes, each location is assigned a unique portion code within the mapped area (see Section 3.3.1 and Figure 3.2 for more detail and an example). The last 1 digit code is used when an additional identifier is needed to represent the location of the collision. For instance, this code is used to identify multiple intersections as stated in the previous section. This code can also be used as an identifier which keeps track of historical change due to transportation system development or upgrade. For example, if a temporary detour is built due to the construction of a grade separated interchange, the temporary detour route can be coded using the last 1 digit code by assigning a sequence number from 1 to 9 in order to reflect the temporal changes, while 0 is reserved to indicate a permanent location or completion of construction.
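The structure of the Location Code can be illustrated with a small parser. The following Python sketch is an assumption-laden example: the pattern is inferred from the sample codes in Table 3.7 (for instance, it allows letters in the avenue, street, portion, and detail positions), and the names `LocationCode`, `parse_location_code`, and `intersection_key` are hypothetical.

```python
import re
from typing import NamedTuple


class LocationCode(NamedTuple):
    avenue: str    # 4-character avenue (east/west road) code
    street: str    # 4-character street (north/south road) code
    quadrant: str  # 2-letter city quadrant, e.g. "NW"
    portion: str   # 2-character portion code within the intersection
    detail: str    # 1-character additional identifier


# Pattern inferred from the examples in Table 3.7 (an assumption).
_PATTERN = re.compile(
    r"([0-9A-Z]{4})([0-9A-Z]{4})-([A-Z]{2})([0-9A-Z]{2})-([0-9A-Z])"
)


def parse_location_code(code: str) -> LocationCode:
    """Split a Location Code string into its five components."""
    match = _PATTERN.fullmatch(code.strip().upper())
    if match is None:
        raise ValueError(f"not a valid Location Code: {code!r}")
    return LocationCode(*match.groups())


def intersection_key(code: str) -> str:
    """Return the avenue and street codes, which identify the intersection."""
    parsed = parse_location_code(code)
    return parsed.avenue + parsed.street
```

Because every component has a fixed width, a malformed code can be rejected at data entry, before any matching against the reference data is attempted.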

A higher match rate and accuracy are expected when using the Location Code instead of the full intersection address. Furthermore, a geocoding tool can be replaced by the join or link functions which are available in most GIS software packages, because the Location Code contains simple alphanumeric codes which can be directly related to the reference data. However, the success of this approach relies on the completeness of the reference data. The following section explains the creation of reference data (geocode-able layer) utilizing the Location Code.


(a) Point feature type

(b) Point-line combined feature type

(c) Polygon feature type

Figure 3.11 Roadway Portion Layer Design by Feature Type


As seen in Figure 3.11, three kinds of GIS layers are initially considered as the reference data for geocoding utilizing the Location Code; point feature type layer (a), point-line combined feature type layer (b), and polygon feature type layer (c). The point feature type layer is more commonly used for representing incident data on a GIS layer, but the other types are suitable for the uniqueness of the Location Code. Because the portion code in MVCIS and the Location Code do not provide a pin point location and describe only relative locations within an intersection, a line feature or polygon feature might be more appropriate than point feature. However, the code contains ambiguity when expressing the precise portions; for instance, depending on the size and geometry of the intersection it is sometimes hard to determine portion locations within an intersection.

Moreover, these two feature types (line or polygon feature) are not the proper form for some spatial analysis. For example, density analysis is performed to produce high or low density areas of events, and event data are usually stored as point features in GIS. Extra effort on data feature type conversion is required if the event data is stored in line or polygon features. Thus, this study implemented a point feature type layer as the Location Code reference data.

Two geocoding strategies are considered for geocoding using the point feature type reference layer. The first strategy involves establishing a link between the attribute table in the reference data layer and the collision data table using a common field, the Location Code, in both tables (see Figure 3.12). Because the collision data table is linked to the reference data layer and stores no coordinates of its own, this strategy can save disk storage. This strategy is also flexible when the reference data layer is updated or changed: because the collision data table does not contain coordinate information or geographically referenced information, it does not have to be changed when the reference data layer changes. However, this strategy requires a one-to-many relationship for linking the two tables. A large number of intersections have more than one collision each year, so a one-to-many relationship must be supported to establish the link.
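The one-to-many relationship required by this strategy can be illustrated with a simple grouping of collision records by Location Code. The Python sketch below is illustrative only; the collision identifiers and the function name `link_collisions` are hypothetical, and a GIS package would normally provide this as a built-in relate or link function.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def link_collisions(
    collisions: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Group collision ids by Location Code (one location, many collisions).

    Each input item is (collision_id, location_code). Coordinates stay in
    the reference layer only, so the collision table never stores them.
    """
    linked: Dict[str, List[str]] = defaultdict(list)
    for collision_id, location_code in collisions:
        linked[location_code].append(collision_id)
    return dict(linked)


collision_table = [
    ("C-001", "01070103-NW00-0"),
    ("C-002", "102A0097-NW01-0"),
    ("C-003", "01070103-NW00-0"),
]
links = link_collisions(collision_table)
```

In this example one reference point is linked to two collision records, which is the relationship that the software must support for the first strategy to work.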

Figure 3.12 Linking Collision Data Table to Reference Data Layer

An alternative strategy is to assign coordinate values from the reference data layer to each collision record using the common field information, the Location Code. As seen in Figure 3.13, the XY coordinate values are assigned to the collision data table according to the Location Code, and then each collision is located on a layer based on assigned coordinate values. This strategy does not require one-to-many functionality in the database connection, and was chosen to be the geocoding strategy for this study because the GIS software package, Intergraph's GeoMedia, used in this study does not support a one-to-many relationship.

Figure 3.13 Assigning Coordinate Values to Each Record in the Collision Data Table


The following describes the step by step procedures for this method:

1. Generate the Location Code field in the collision data table exported from MVCIS.

2. Create a geocode-able reference layer with the Location Code.

   • Choose point feature type as a master feature type for the layer.
   • Generate intersection points where two or more roads are intersecting.
   • Place mapped locations based on the Accident Zone Maps as shown in Figure 3.2.
   • Place the most appropriate mid-block portion locations within an intersection manually.
   • Generate fields of the Location Code and XY coordinate values for each location point.

3. Assign XY coordinate values to each collision record in the collision data table exported from MVCIS using the common Location Code field in both the geocode-able reference layer and the collision data table.

4. Convert the collision data table to a GIS layer using the assigned XY coordinate values.
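Step 3 of the procedure above amounts to a lookup on the Location Code. The following Python sketch shows one possible form of it under stated assumptions: records are plain dictionaries with a `location_code` field, the reference layer is a dictionary from Location Code to an XY pair, and `assign_coordinates` is a hypothetical name. It is not the GeoMedia workflow used in this study.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def assign_coordinates(
    collisions: List[Dict[str, str]],
    reference: Dict[str, Point],
) -> Tuple[List[Dict[str, object]], List[Dict[str, str]]]:
    """Copy X and Y from the reference layer onto each collision record.

    Returns (geocoded, unmatched). Unmatched records are those whose
    Location Code is not yet present in the reference layer.
    """
    geocoded: List[Dict[str, object]] = []
    unmatched: List[Dict[str, str]] = []
    for record in collisions:
        point = reference.get(record["location_code"])
        if point is None:
            unmatched.append(record)
        else:
            geocoded.append({**record, "x": point[0], "y": point[1]})
    return geocoded, unmatched
```

Returning the unmatched records separately identifies the locations that still have to be placed in the reference layer by hand.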

A certain amount of manual work on placing mapped locations and mid-block locations is to be expected when creating the geocode-able reference layer, because the geometry and size of each intersection and the relationship between two adjacent intersections are irregular, and an automated process for placing the points would require very sophisticated and complicated rule-based script programming. In this study, after intersection points and mapped location points were placed in the geocode-able reference layer, a matching process between the reference layer and the collision data table was performed, and then only collision locations that did not match, mostly mid-block locations, were placed in the reference layer manually. It is to be expected that after a few years of annual updates on the geocode-able reference layer, the amount of manual placing work would be reduced as the reference layer became populated with more of the locations at which collisions had occurred. The annual update of the geocode-able reference layer would include placing locations that had been changed or added through development of the city's transportation system, as well as adding collision locations that had not previously been placed in the reference layer.

A latitude and longitude coordinate system is widely used to represent geographically referenced locations. For example, GPS and online mapping services use a latitude and longitude coordinate system. However, this study used XY coordinates based on the city's standard coordinate system, City of Edmonton 3° Transverse Mercator. The transformation between the two coordinate systems can be easily performed in the GIS environment.

3.4.2 Geocoding results and discussion of the new approach

This section examines the geocoding results from the approach proposed and used in this study, and discusses the advantages and limitations of this approach.


Three years of collision datasets from 2006 to 2008 were geocoded using the Location Code and the proposed approach. Table 3.8 shows the results of the geocoding. All collision records were successfully geocoded using the Location Code and the proposed approach. A large number of collision locations are classified as regular numbered or named road intersection locations and portion locations. For some collision records, the intersection or mapped location is known but the portion location is unknown. These collision records were placed at the north-west corner of the intersection or mapped location, if applicable; otherwise, they were placed at the most appropriate location within the intersection or mapped location. These were classified as 'Numbered/Named-Unknown' or 'Mapped-Unknown', which are different from 'Unknown-Unknown'. The 'Unknown-Unknown' locations indicate that neither the intersection location nor the portion location was available in the MVCIS database, and these are located in the north-west corner of the map area (see Figure 3.16).

It is hard to compare the results from this approach with those from other studies (e.g. Bigham et al., 2009; Dutta et al., 2007) on the geocoding of traffic incidents because the approach used in this study is different from that used in ordinary address/intersection based geocoding methods. However, using this approach, among a total of 83,657 collisions from 2006 to 2008, 90.6% were properly geocoded, 8.8% were unknowns (missing location information, but valid collision records), and 0.6% were unidentifiable locations (location does not exist or is unidentifiable). The studies by Bigham et al. (2009) and Dutta et al. (2007) showed match rates of 86% and 79%, respectively. Again, the results from this study are difficult to compare directly with the results from the other studies. However, this study shows a 99.3% geocoding match rate if the unknowns are excluded from the collision database used for geocoding. Excluding them is reasonable because no geocoding tool is able to geocode unknown locations, and the other studies used cleaned data (without unknowns).
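The percentages quoted above can be reproduced from the counts in Table 3.8. The short Python calculation below is a worked check, not part of the geocoding method. The grouping of the 'Numbered/Named-Unknown', 'Mapped-Unknown', and 'Unknown-Unknown' rows into the "unknowns" is an inference that happens to reproduce the reported figures.

```python
# Counts taken from Table 3.8, summed over 2006 to 2008.
total = 26_066 + 28_519 + 29_072

# Records whose portion or whole location is unknown.
unknown = (
    (952 + 1_858 + 2_154)   # Numbered/Named-Unknown
    + (197 + 273 + 351)     # Mapped-Unknown
    + (401 + 532 + 603)     # Unknown-Unknown
)
unidentified = 67 + 248 + 217
geocoded = total - unknown - unidentified

share_geocoded = 100 * geocoded / total
share_unknown = 100 * unknown / total
share_unidentified = 100 * unidentified / total

# Match rate once the unknowns are removed from the denominator.
rate_excluding_unknown = 100 * geocoded / (total - unknown)
```

Rounded to one decimal place, the four values agree with the 90.6%, 8.8%, 0.6%, and 99.3% reported in the text.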

Table 3.8 Results of Geocoding Using Location Code

                                       Year
Geocoded Location Classification       2006              2007              2008
Numbered/Named Road Location
  Intersection                         12,488 (47.9%)    13,252 (46.5%)    12,802 (44.0%)
  Midblock                             9,752 (37.4%)     9,745 (34.2%)     10,200 (35.1%)
  Alley / Lane                         338 (1.3%)        306 (1.1%)        305 (1.0%)
  Service Road                         90 (0.3%)         92 (0.3%)         78 (0.3%)
  Unknown                              952 (3.7%)        1,858 (6.5%)      2,154 (7.4%)
Mapped Location
  Intersection                         783 (3.0%)        918 (3.2%)        885 (3.0%)
  Midblock                             777 (3.0%)        1,095 (3.8%)      1,188 (4.1%)
  Unknown                              197 (0.8%)        273 (1.0%)        351 (1.2%)
Bridge                                 221 (0.8%)        200 (0.7%)        289 (1.0%)
Unidentified                           67 (0.3%)         248 (0.9%)        217 (0.7%)
Unknown-Unknown                        401 (1.5%)        532 (1.9%)        603 (2.1%)
Total                                  26,066 (100%)     28,519 (100%)     29,072 (100%)

[Figure 3.14 labels: 112 STREET NW, JASPER AVENUE NW; a Location Code label at each geocoded point]

Figure 3.14 Example of Geocoding Results of Numbered or Named Locations

[Figure 3.15 labels: GROAT ROAD NW; a Location Code label at each mapped location]

Figure 3.15 Example of Geocoding Results of Mapped Locations


Figure 3.16 Geocoding of Unknown and Unidentified Locations


During the geocoding procedures, it was found that the collision records contained some unidentifiable locations. For example, the intersection of Calgary Trail NW and 99 Street NW cannot exist because both are north-south roads that never meet. It is assumed that this is either a collision report error or a data entry mistake. These unidentifiable location collisions are placed in the north-west corner of the map area (outside the city boundary; see Figure 3.16). Unknown and unidentifiable location collisions are placed on the map because the GIS collision database could replace the existing collision database; therefore, the GIS database should include all of the collisions that were reported and stored in the MVCIS database. Even though unknown and unidentifiable collision records contain no location information or incorrect location information, they are valid collision records and are used for generating overall collision statistics. However, they should not be used in spatial analysis because they might distort the spatial relationships.

The procedure for excluding unknown and unidentifiable location collisions can be accomplished by using a simple query function available in most GIS software packages.
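The exclusion step described above can be sketched as a simple filter. This is only an illustration in Python rather than a GIS query builder; the field name `location_class` and the class labels are assumptions loosely modelled on the classes in Table 3.8, not the actual MVCIS schema.

```python
# Hypothetical records: the field name and class labels are illustrative only.
collisions = [
    {"id": 1, "location_class": "Intersection"},
    {"id": 2, "location_class": "Midblock"},
    {"id": 3, "location_class": "Unknown-Unknown"},
    {"id": 4, "location_class": "Unidentified"},
    {"id": 5, "location_class": "Bridge"},
]

# Classes that carry no usable position and should be kept out of spatial analysis.
EXCLUDED_CLASSES = {"Unknown-Unknown", "Unidentified"}


def spatially_usable(records):
    """Return only the records whose location class is suitable for spatial analysis."""
    return [r for r in records if r["location_class"] not in EXCLUDED_CLASSES]


usable = spatially_usable(collisions)
# All records still count toward overall statistics; only `usable` feeds spatial analysis.
print(len(collisions), len(usable))
```

Keeping the full list and deriving a filtered view mirrors the point made in the text: the excluded records remain valid for overall collision statistics.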

Table 3.9 compares the difference in the frequency of unidentifiable locations depending on whether the police officer visited the scene of the collision or not. The information on police visits to the scene was not available before 2008 in the collision data, so the table shows 2008 data analysis only. The results show that the numbers of unidentified locations were significantly reduced if a police officer visited the collision scene. Therefore, a validating process is highly recommended when a collision is reported by a walk-in driver at a police station.

Table 3.9 Analysis of Unidentified Location

Unidentified Collision Location     Police visited the collision scene
in 2008 Data                        Yes           No             Unknown     SUM
Unidentified-Intersection           35 (23.5%)    114 (76.5%)    0 (0.0%)    149 (100%)
Unidentified-Portion location       15 (22.1%)     52 (76.5%)    1 (1.5%)     68 (100%)
Total                               50 (23.0%)    166 (76.5%)    1 (0.5%)    217 (100%)

Though all collision records were successfully placed on the GIS layers, there are accuracy and precision issues with the geocoded locations because of deficiencies in how MVCIS expresses collision locations. This problem is not unique to MVCIS; it is found in any collision database that does not contain accurate and precise location information, such as coordinates or distance from a milepost. GPS has been introduced as an emerging technology to provide accurate and precise location information for collision field data collection. Some studies (Green and Agent, 2004; Khan et al., 2004; Sarasua et al., 2008) have shown the successful implementation of GPS as a traffic collision location tool; however, these studies recommended proper training and use of GPS units to improve accuracy and precision. Moreover, a direct connection between a GPS unit and a recording device (e.g. a handheld computer) would reduce errors such as those that occur when decimals of coordinates are omitted. In addition, the degradation of GPS ranging accuracy and the increase in geometric dilution of precision caused by multipath and masking effects in dense urban areas should be evaluated and resolved before GPS is implemented as a location tool for field data collection in urban areas.

As mentioned previously, a large number of location errors were found to have occurred during the walk-in reporting of collisions. The Incident Location Tool designed by CTRE (2008) is a map-based utility that can be used to locate where an incident occurred, either while collecting collision data at the collision scene or in a police station. This kind of tool could be used to reduce the errors that occur during data collection and manual data entry, and could be integrated into an automated reporting system.

3.5 Chapter Summary

This chapter began by introducing the collision data as they were collected and described how they were organized in the database (MVCIS). In order to examine the adequacy of the geocoding method for collision locations which were stored in intersection-based text format, the geocoding tests were performed with two geocoding tools available in the ArcGIS® 9.2 and GeoMedia® 6.1 GIS software packages. The geocoding test results showed that the geocoding tool in ArcGIS performed better than that in GeoMedia for intersection location geocoding in terms of match rate and accuracy; however, neither of the tools provided satisfactory levels of geocoding match rate and accuracy. Moreover, the existing and potential geocoding problems were discussed.

In order to overcome the shortcomings which were found during the geocoding test, a new approach to geocoding collision locations was proposed. This new approach utilized alphanumeric codes for representing collision locations, called the Location Code. Also, matching methods between the reference layer and collision data table were discussed. Last, the geocoding results using the new approach were presented and discussed.

Chapter Four: Spatial Analysis for Traffic Safety

4.1 Introduction

A traffic safety analysis often involves examinations of particular locations that have been identified as collision prone locations. Many techniques have been applied to identify such collision hazardous locations (Elvik et al., 2009). Recently, spatial analysis techniques have been widely used in traffic safety studies (Anderson, 2009; Pulugurtha et al., 2007; Xie and Yan, 2008). Spatial analysis using GIS is useful, not only for visualizing results from other statistical techniques on maps, but also for analyzing data spatially within the GIS environment.

This chapter deals with the utilization of the geographically referenced collision database mentioned in the previous chapter, to perform spatial analyses within the GIS environment. Various spatial analysis techniques were applied to traffic safety analysis, and the results from this spatial analysis are presented in maps, as well as in tables with statistics. Additionally, each section includes a discussion of the analysis results.

The GIS software package used for these spatial analyses was Intergraph’s GeoMedia® Professional, version 6.1. GeoMedia is Intergraph’s base GIS application, which enables the user to bring data from various sources into a GIS database environment for viewing, analysis, and presentation. For raster-based modelling, GeoMedia requires an extension for grid data manipulation; thus, GeoMedia® GRID was used for raster-based data viewing and analysis as well as for integration with vector format data.

4.2 Attribute and Spatial Query

In general, a query is a request for information. A query function is an information retrieval technique that searches for information or data within databases or multiple datasets. Users are able to build a set of queries that selects only the required information or data. A set of queries can include one criterion or multiple criteria. For example, a set of queries can select weekend collisions that were caused by ‘run off road’ and resulted in injuries or fatalities. This set of queries consists of multiple criteria that apply to three fields (day of week, cause, and severity). Once a query retrieves data, the data can be displayed in the form of maps or data tables in the GIS environment, or can be used for other analyses. The above query is applied to attribute information in the GIS collision database and is therefore an attribute query: it searches the database for a specific value or a range of values for one attribute or for a combination of attributes that apply to one feature class (Intergraph Corporation, 2007).
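The multi-criteria attribute query in the example above can be sketched in a few lines of Python. The field names (`day_of_week`, `cause`, `severity`) and the sample values are assumptions made for illustration; they are not the field names of the GIS collision database.

```python
# Hypothetical attribute table; field names and values are illustrative only.
collisions = [
    {"id": 101, "day_of_week": "Saturday", "cause": "Ran off road", "severity": "Injury"},
    {"id": 102, "day_of_week": "Sunday", "cause": "Ran off road", "severity": "Property damage only"},
    {"id": 103, "day_of_week": "Tuesday", "cause": "Ran off road", "severity": "Fatal"},
    {"id": 104, "day_of_week": "Sunday", "cause": "Followed too closely", "severity": "Injury"},
    {"id": 105, "day_of_week": "Sunday", "cause": "Ran off road", "severity": "Fatal"},
]

WEEKEND = {"Saturday", "Sunday"}
SEVERE = {"Injury", "Fatal"}


def attribute_query(records):
    """Return the records that satisfy all three criteria at once."""
    return [
        r
        for r in records
        if r["day_of_week"] in WEEKEND
        and r["cause"] == "Ran off road"
        and r["severity"] in SEVERE
    ]


selected_ids = [r["id"] for r in attribute_query(collisions)]
print(selected_ids)
```

Each criterion applies to one field, and a record is retrieved only when all three hold, which is the behaviour the paragraph describes.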

Most spreadsheet or database programs provide functions which are similar to the attribute query mentioned previously. However, retrieving information based on spatial criteria is difficult with such programs. For instance, when retrieving collision data for a specific area or along a corridor, locations have to be manually identified and selected by consulting a map unless spatial identifiers, such as zone identifiers or coordinates, are present in the database. Queries based on spatial features are important for analysing collision trends within a certain area or corridor, and they can be easily built if the collision data are geographically referenced. A spatial query requests information from the database where the criteria output is based on the spatial operator that forms a relationship between two geographic record sets (Intergraph Corporation, 2007). The following two examples show how spatial queries can be built in the GIS environment.

Figure 4.1 shows a map produced by a spatial query on the collision locations, with the collision frequency of each location, in the Parkallen neighbourhood in the City of Edmonton. This map can be used effectively not only for educational purposes, such as showing residents where collisions occurred within the neighbourhood (Tang and Waters, 2005), but also for engineering and enforcement purposes, such as providing engineers with additional information for transportation rehabilitation programs or helping police officers establish enforcement plans within the neighbourhood. This map was generated using the collision GIS point data layer and the neighbourhood polygon data layer. First, a 20 meter buffer was placed around the Parkallen neighbourhood, and then collision data within the buffered boundary were retrieved using the spatial query function in GeoMedia. The buffer was needed because neighbourhood boundaries are usually formed by roads that run between neighbourhoods; the boundaries therefore do not make clear whether a road between two neighbourhoods belongs to one or the other, and the buffer ensures that roads adjacent to the neighbourhood are included. Once the collision data were retrieved, the geographically referenced collision locations were displayed on a map with other useful features, such as the road network and neighbourhood boundary. Moreover, the retrieved data could be used for further analysis in the form of spatial or tabular data. The numbers on the map (Figure 4.1) indicate the frequency of collisions at each location. The numbers were generated using an analytical merge function, which allows the user to group records in a single table and summarize information from each of the group(s) (Intergraph Corporation, 2007).
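The buffer-then-query-then-merge sequence above can be sketched in plain Python. This is a minimal illustration, not GeoMedia's implementation: the neighbourhood is a hypothetical rectangle in metre coordinates, the location labels are invented, and "within the buffer" is tested as "inside the polygon or within 20 m of one of its edges".

```python
import math
from collections import Counter


def point_in_polygon(pt, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_segment(pt, a, b):
    """Shortest distance from pt to the line segment a-b."""
    px, py = pt
    ax, ay = a
    dx, dy = b[0] - ax, b[1] - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0
    else:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffer(pt, polygon, buffer_m):
    """True if pt lies inside the polygon or within buffer_m of its boundary."""
    if point_in_polygon(pt, polygon):
        return True
    n = len(polygon)
    return any(
        distance_to_segment(pt, polygon[i], polygon[(i + 1) % n]) <= buffer_m
        for i in range(n)
    )


# Hypothetical neighbourhood boundary and collision points (location label, coordinates).
neighbourhood = [(0, 0), (1000, 0), (1000, 800), (0, 800)]
collisions = [
    ("A", (500, 400)),   # interior location
    ("B", (1010, 400)),  # on a boundary road, 10 m outside the polygon
    ("B", (1010, 400)),  # second collision at the same location
    ("C", (1050, 400)),  # 50 m outside, beyond the buffer
    ("D", (-15, -10)),   # about 18 m from the south-west corner
]

selected = [loc for loc, xy in collisions if within_buffer(xy, neighbourhood, 20.0)]
counts = Counter(selected)  # plays the role of the analytical merge: one count per location
print(dict(counts))
```

Location B shows why the buffer matters: it lies on a boundary road just outside the polygon and would be missed by a query on the unbuffered boundary.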

Figure 4.2 shows an example of a spatial query on a corridor (137 Avenue NW). First, polygons were placed along the corridor on intersections and midblock portions, and then a spatial query function was used to retrieve the collision data or a summary for each portion. In particular, an aggregation function was used to generate a collision summary for each portion of the corridor. The aggregation function in GeoMedia allows the researcher to aggregate detailed feature class attributes (collision point feature, red) into the summary feature class (corridor portion polygon feature, light green) (Intergraph Corporation, 2007). Without the aid of GIS techniques, this process would require intensive manual work to identify the portion of the corridor to which each collision belongs. Once collision data on each portion of a corridor are retrieved, the data can be used for further analysis to find out the collision trends of the corridor.
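A rough sketch of this kind of aggregation follows. It is not the GeoMedia aggregation function: as a simplifying assumption the corridor portions are axis-aligned rectangles rather than general polygons, and the portion names, coordinates, and severity labels are all hypothetical.

```python
from collections import defaultdict

# Hypothetical corridor portions as rectangles (xmin, ymin, xmax, ymax), in metres.
portions = {
    "Intersection 1": (0, -15, 30, 15),
    "Midblock 1-2": (30, -15, 230, 15),
    "Intersection 2": (230, -15, 260, 15),
}

# Hypothetical collision points with one attribute to summarize.
collisions = [
    {"xy": (10, 2), "severity": "Injury"},
    {"xy": (25, -5), "severity": "Property damage only"},
    {"xy": (120, 0), "severity": "Property damage only"},
    {"xy": (240, 3), "severity": "Fatal"},
    {"xy": (400, 0), "severity": "Injury"},  # outside every portion
]


def containing_portion(xy):
    """Name of the portion whose rectangle contains xy, or None if there is none."""
    x, y = xy
    for name, (xmin, ymin, xmax, ymax) in portions.items():
        # Half-open in x so that a point on a shared edge is counted once.
        if xmin <= x < xmax and ymin <= y <= ymax:
            return name
    return None


summary = defaultdict(lambda: {"total": 0, "injury_or_fatal": 0})
for c in collisions:
    name = containing_portion(c["xy"])
    if name is None:
        continue
    summary[name]["total"] += 1
    if c["severity"] in ("Injury", "Fatal"):
        summary[name]["injury_or_fatal"] += 1

for name in portions:
    print(name, summary[name])
```

The point attributes are rolled up into one summary row per portion, which is the detail-to-summary relationship described in the text.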

Figure 4.1 2008 Collision Locations and Counts in Parkallen (Source: Author)

Figure 4.2 Example of Spatial Query on a Corridor

Both attribute query and spatial query functions are important and fundamental functions for retrieving required information from databases in the GIS environment. However, the spatial query was a potential source of error. In Figure 4.3, red dots indicate collision locations along 97 Street NW, and light green polygons are placed along 97 Street NW to identify intersection or midblock portions. When collision data were queried for 97 Street NW, the query successfully selected collisions along 97 Street NW based on the polygons; however, some locations on 97 Street NW might also be selected when a spatial query was performed on Yellowhead Trail NW. This location has a grade separated interchange, and thus it is hard to distinguish whether a collision occurred on the upper or on the lower section of the interchange unless height information is available in the database. Thus, additional information, such as a text description of the location, should be used to verify the exact location of a collision. In this study the verification was done with a minimal amount of manual work, but it could be done automatically if spatial and attribute queries were combined.
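One way the combined query suggested above might look is sketched below. It is an assumption-laden illustration: the query polygon is a hypothetical rectangle, and `on_street` stands in for whatever text description of the location the database actually holds.

```python
# Hypothetical query polygon on one road of a grade separated interchange,
# simplified to a rectangle (xmin, ymin, xmax, ymax) in metres.
yellowhead_portion = (0, -20, 200, 20)

# Hypothetical collision points; `on_street` is an assumed text attribute.
collisions = [
    {"id": 1, "xy": (100, 5), "on_street": "YELLOWHEAD TRAIL NW"},
    {"id": 2, "xy": (102, -3), "on_street": "97 STREET NW"},  # crossing road at the interchange
    {"id": 3, "xy": (500, 0), "on_street": "YELLOWHEAD TRAIL NW"},  # outside the polygon
]


def in_rect(xy, rect):
    """True if the point lies inside the rectangle (edges included)."""
    x, y = xy
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax


def spatial_query(records, rect):
    """Spatial criterion only: every point that falls inside the polygon."""
    return [r for r in records if in_rect(r["xy"], rect)]


def combined_query(records, rect, street):
    """Spatial criterion followed by an attribute check on the street name."""
    return [r for r in spatial_query(records, rect) if r["on_street"] == street]


spatial_ids = [r["id"] for r in spatial_query(collisions, yellowhead_portion)]
verified_ids = [r["id"] for r in combined_query(collisions, yellowhead_portion, "YELLOWHEAD TRAIL NW")]
print(spatial_ids, verified_ids)
```

The spatial query alone returns the collision on the crossing road as well; the attribute check removes it without manual inspection.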

Figure 4.3 Example of Spatial Query on Grade Separated Interchange

4.3 Hot Spot Detection and Density Analysis

While many GIS users are familiar with using and analysing vector data in a GIS environment, raster-based analysis can provide more powerful analysis tools which are not available using vector data. In traffic safety analysis, most data are vector data types (e.g. incident-point, road network-line, and neighbourhood-polygon); thus, vector-based analysis is more commonly applied. However, raster-based analysis can be more effectively utilized in some types of traffic safety analysis, for example hot spot detection and density analysis.

Raster is a grid-based data structure that breaks an image or map into square grid cells of equal size. Each grid cell is a square unit representing a specific real world area and is assigned a value that represents a condition that exists in that area. The cell is the fundamental unit of raster-based GIS analysis. Raster GIS layers are composed of a grid of cells that are referenced by their row and column positions within the grid (Intergraph Corporation, 2006).
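The row and column referencing described above can be illustrated with a short sketch that converts point coordinates into cell indices and counts points per cell. The grid origin, cell size, and sample points are assumptions chosen for the example; row 0 is taken to be the top row, which is a common but not universal convention.

```python
def cell_index(x, y, x_min, y_max, cell_size):
    """Row and column of the cell containing (x, y); row 0 is the top row."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col


def rasterize_counts(points, x_min, y_max, cell_size, n_rows, n_cols):
    """Build a grid whose cell values are the number of points falling in each cell."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        row, col = cell_index(x, y, x_min, y_max, cell_size)
        if 0 <= row < n_rows and 0 <= col < n_cols:  # ignore points outside the grid
            grid[row][col] += 1
    return grid


# Hypothetical 100 m x 100 m study area divided into 10 m cells.
points = [(5, 95), (15, 95), (17, 91), (95, 5), (250, 50)]
grid = rasterize_counts(points, x_min=0, y_max=100, cell_size=10, n_rows=10, n_cols=10)
print(grid[0][:3], grid[9][9])
```

Here the cell value is a simple count; in a density surface the value would instead be the estimated density at that cell.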

This section reviews the kernel density estimation method used to perform raster-based GIS analysis, and discusses the analysis results from this method.

4.3.1 Kernel Density Estimation

Kernel density estimation is a way of estimating the probability density function of a random variable. Specifically, it can extract areas where the concentration of incidents is high and identify clusters within the geographic space, and in this way present complex information in a simple map or picture (de Smith et al., 2007; Intergraph Corporation, 2006).

Kernel density estimation starts by placing a symmetric search surface on each point, and then evaluating a mathematical function (the kernel) of the distance between a reference point and each incident point. The resulting values are summed at the reference point. This procedure is repeated for other reference points. This allows a “hump” (kernel) to be placed over each incident point, and this provides a density estimate for the distribution of incident points by summing the hump functions (Anderson, 2009; Fotheringham et al., 2000).

The hump function can be referred to as the kernel density estimation, and is given by:

$$\lambda(x) = \sum_{i=1}^{n} \frac{1}{h^{2}}\, k\!\left(\frac{d_i}{h}\right) \tag{4.1}$$

where λ(x) is the density estimate at the location x, n is the number of observations, h is the bandwidth or kernel size, k is the kernel function, and d_i is the distance between the location x and the location of the i-th observation. The kernel function k can be chosen from Gaussian, quartic, exponential, triangular, and uniform functions; however, the specific kernel function chosen does not tend to have a major impact on the density estimates (Xie and Yan, 2008). Of much greater impact is the choice of bandwidth. Choosing the bandwidth is quite a complicated matter, but, typically, too large a bandwidth tends to smooth out interesting effects while too small a bandwidth tends to produce ‘spikey’ intensity functions, with the spikes centred on the observed data points (Fotheringham et al., 2000). A poorly chosen bandwidth may therefore result in misleading density values and maps that are either too smooth or too spiky. GeoMedia GRID employs an adaptive bandwidth technique to overcome these limitations. The adaptive bandwidth technique analyses each point within the input layer to determine an appropriate bandwidth value for each point in the map layer (i.e. the bandwidth adapts to match the local conditions) (Intergraph Corporation, 2006). For instance, the algorithm searches for a user-defined number of nearest points for every point location, sets the local bandwidth to the average distance to these closest points, and then applies the kernel density estimation using the calculated bandwidth value. This approach is more suitable for identifying local spatial relationships, thereby effectively reflecting the local clustering within the data.
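Equation 4.1 and the adaptive bandwidth idea can be sketched as follows. This is an illustrative reading of the description above, not the GeoMedia GRID algorithm: the quartic kernel is one of the listed choices, the normalizing constant 3/π is the standard one for a two-dimensional quartic kernel, and the floor `min_h` and the sample points are assumptions added for the example.

```python
import math


def quartic_kernel(u):
    """2-D quartic kernel; zero beyond the bandwidth, integrates to 1 over the unit disk."""
    return (3.0 / math.pi) * (1.0 - u * u) ** 2 if u < 1.0 else 0.0


def kde(x, points, h):
    """Fixed bandwidth estimate: sum over i of k(d_i / h) / h**2, as in Equation 4.1."""
    return sum(quartic_kernel(math.dist(x, p) / h) / h ** 2 for p in points)


def local_bandwidths(points, k, min_h=1.0):
    """Average distance from each point to its k nearest neighbours.

    min_h is a hypothetical floor that avoids a zero bandwidth for coincident points.
    """
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(max(sum(dists[:k]) / k, min_h))
    return result


def adaptive_kde(x, points, k):
    """Each incident point contributes a hump scaled by its own local bandwidth."""
    hs = local_bandwidths(points, k)
    return sum(
        quartic_kernel(math.dist(x, p) / h) / h ** 2 for p, h in zip(points, hs)
    )


# Hypothetical incident points: a tight cluster of three and one isolated point.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (500.0, 500.0)]
bandwidths = local_bandwidths(points, k=2)
density_near_cluster = adaptive_kde((3.0, 3.0), points, k=2)
density_far_away = adaptive_kde((250.0, 250.0), points, k=2)
print(bandwidths)
print(density_near_cluster > density_far_away)
```

Clustered points receive small bandwidths and therefore narrow, tall humps, while the isolated point receives a wide, flat one, which is how the bandwidth adapts to local conditions.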

4.3.2 Applications for Hot Spot Detection and Density Analysis

Figures 4.4 (a) and 4.4 (b) show a ‘Ran-Off-Road’ collision location map and a density map. There were a total of 7,358 ‘Ran-Off-Road’ collisions (9% of the 81,589 total collisions) during the period from 2006 to 2008, and ‘Ran-Off-Road’ has generally been ranked as one of the top causes of collisions that occurred on the midblock of roads. The location map (Figure 4.4 (a)) shows the exact location of the collisions; however, it does not reflect any spatial relationship between ‘Ran-Off-Road’ collisions and roadway traits. On the other hand, the density map (Figure 4.4 (b)) shows the concentrations of ‘Ran-Off-Road’ collisions, known as hot spots. On the density map, it is not difficult to find a few trends of ‘Ran-Off-Road’ collisions. First, the bridge locations can be identified as hot spots, as well as those road segments where the geometry is winding. The other trend that can be found from the density map is that there are hot spots along the major inner city highways, where relatively sufficient median and shoulder widths are present. Based on the identified trends of ‘Ran-Off-Road’ collisions, transportation engineers can consider carrying out detailed road safety audits on the hot spots or applying safety interventions to these locations, such as installing guard rails, if needed.

Figure 4.4 Ran-Off-Road Collision Location Map (a) and Density Map (b) (Source: Author)

In 2007 the Office of Traffic Safety in the City of Edmonton identified the top 20 high collision intersections in Edmonton based on the 2006 collision data, using the crash score method which considered the frequency, rate, and severity of the collisions (Office of Traffic Safety, 2007). The report of the top 20 high collision intersections was given to the Edmonton Police Service for targeted enforcement. The goal set for this initiative was to reduce the number of collisions at each respective intersection by 20% by the end of 2008 through enforcement, education, evaluation, and engineering improvements. The targeted enforcement was implemented during the time period between February 6 and May 31, 2008. Figure 4.5 and Figure 4.6 show the collision density maps of some high collision intersection locations in 2006 and 2008, for the time period between February 6 and May 31. The maps show the collision density of a 15 km radius buffer zone around some of the high collision intersections.

On the maps it is easy to identify some levels of density concentration on selected high collision intersections (high collision intersections are presented as small red circles on the maps), but there are also other locations with higher concentrations of collisions. Because the density maps take into account collision frequency by location only, the density results could be different from the list of the top 20 high collision intersections which used a scoring method that considers not only collision frequency but also rate and severity.

As the U.S. Department of Transportation (2002) noted, there is a “lack of a simple analysis system to identify unsafe intersections,” even though there are many methods for selecting high collision locations, each of which has advantages and disadvantages (City of Edmonton, 2007). Thus, it is hard to conclude which method should be used to identify hazardous locations. Currently, the Safety Performance Function and Network Screening Process based on the empirical Bayes method is the most recognized method for identifying hazardous locations (Elvik et al., 2009; Persaud, 2001). However, this procedure requires additional datasets, such as traffic volume, and sophisticated statistical modelling tools. Because of the limited availability of geographically referenced data for this study (e.g. traffic volume, traffic operation, and roadway inventory data were not available in geographically referenced format) and the technical issue of implementing or integrating statistical modelling tools with GIS environments, this part of the study is listed in the recommendations for future studies (discussed in section 5.3).

There are many ways to evaluate the effectiveness of a countermeasure. For example, simply comparing the before and after outcomes of an incident is one way to evaluate the countermeasure. However, other factors might affect the outcome, such as the regression to the mean effect. The regression to the mean effect refers to the statistical phenomenon that locations with a high number of collisions in a particular period are likely to have fewer collisions during the following period, even if no interventions are taken; this is just because of random fluctuations in collision numbers (Hauer, 1997).

Figure 4.7 suggests another way to examine the effect of a countermeasure using spatial analysis.

(a)-2006 (b)-2008

Figure 4.5 Collision Density Maps of 118 Ave & 101 St (period of Feb 6 – May 31)

(a)-2006 (b)-2008

Figure 4.6 Collision Density Maps of High Collision Intersections in South Edmonton (period of Feb 6 – May 31)

Figure 4.7 depicts the change in collision density between the periods before and after enforcement in selected locations. Figure 4.7 (a) shows the collision density change on 118 Avenue NW and 101 Street NW, and Figure 4.7 (b) shows the collision density change on high collision intersections in south Edmonton. In order to generate the change maps, first, collision density maps for the same period of time in 2006 and 2008 were produced using a probability normalization option, which generates the probability, in percentage, of an incident occurring within each cell. The normalization scheme defines the overall, uniform scaling of the density surface, such that the sum of all non-void grid cells within the input layer (volume under surface) equals N/A, where N is the number of incidents within the selected study area and A is the area of a grid cell (Intergraph Corporation, 2006). However, if the probability normalization option is selected, then the volume under the surface equals 100%. Therefore, the two maps have the same scale of density value even though the total number of collisions used for estimating the density differs between 2006 and 2008. Then, the 2006 density map was subtracted from the 2008 density map to detect the change in collision density using the map algebra function in GeoMedia GRID. On the map, the yellow to red colour scheme indicates an increase in the collision density, while the green to blue colour scheme indicates a decrease in the collision density between 2006 and 2008.
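The normalize-then-subtract procedure can be sketched with two tiny grids. This is a plain Python illustration of the idea, not GeoMedia GRID's map algebra: the grid values are invented, void cells are represented as `None`, and each grid is assumed to have a non-zero total.

```python
def normalize_to_percent(grid):
    """Rescale the non-void cells so that they sum to 100 (void cells are None).

    Assumes the grid has a non-zero total.
    """
    total = sum(v for row in grid for v in row if v is not None)
    return [[None if v is None else 100.0 * v / total for v in row] for row in grid]


def subtract(later, earlier):
    """Cell-by-cell difference; a cell that is void in either grid stays void."""
    return [
        [None if a is None or b is None else a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(later, earlier)
    ]


# Hypothetical density grids for the same area in two years (last column is void).
density_2006 = [[2.0, 1.0, None], [1.0, 0.0, None]]
density_2008 = [[1.0, 1.0, None], [2.0, 4.0, None]]

change = subtract(normalize_to_percent(density_2008), normalize_to_percent(density_2006))
print(change)
```

Because both grids are rescaled to the same total before subtraction, positive and negative cells show where the share of collisions rose or fell, independently of the change in the overall number of collisions.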

As seen in Figure 4.7 (a), the area of high increase in collision density has shifted slightly to the northeast of the previous high collision location. Furthermore, low density increases are noted to the north and south of the high collision intersection. The collision frequency at the high collision intersection (118 Avenue NW and 101 Street NW) increased slightly from 10 to 15 during the time period being studied; however, the collision frequency at the adjacent intersections and midblock to the northeast increased from 12 to 33. Although it is hard to say whether the shift is due to targeted enforcement, it could be suspected to have resulted from spillover effects. In the context of traffic safety, a spillover effect is a phenomenon whereby the collision rates increase or decrease at sites that are untreated but that are neighbours to treated sites (McGee and Eccles, 2003; Miaou and Song, 2005). Another possibility is the regression to the mean effect. The high increase might be a random fluctuation in collision frequency.

On the other hand, as seen in Figure 4.7 (b), the collision density on the high collision intersections in the south part of Edmonton decreased or did not change between 2006 and 2008 for the period of February 6 to May 31. The total collision count in the 15 km buffered area around the 7 high collision intersections increased slightly from 719 to 732, but the sum of collision counts on the 7 high collision intersections decreased slightly from 130 to 111. In this case, it could be assumed that the decrease is due to the positive effects of targeted enforcement at these locations. However, other factors might affect these outcomes. For instance, the construction of the 23 Avenue and Calgary Trail/Gateway Boulevard Interchange started in October 2007, and thus this construction could be a factor in the decrease of collisions on the intersection and surrounding areas (ISL Engineering and Land Services, n.d.). Further study is required to verify the effects of the increases and decreases; however, the change maps can provide a method for verifying the effects of the interventions.
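For reference, the before and after counts quoted in this section can be expressed as percentage changes. The counts are taken from the text; the dictionary labels and the helper function are only illustrative, and the calculation says nothing about statistical significance or regression to the mean.

```python
def percent_change(before, after):
    """Relative change from the 2006 count to the 2008 count, in percent."""
    return 100.0 * (after - before) / before


# (2006 count, 2008 count) for the period of February 6 to May 31, as quoted in the text.
counts = {
    "118 Ave & 101 St intersection": (10, 15),
    "Adjacent intersections and midblock to the northeast": (12, 33),
    "Seven south Edmonton intersections (sum)": (130, 111),
    "Buffered area around the seven intersections": (719, 732),
}

changes = {name: round(percent_change(b, a), 1) for name, (b, a) in counts.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

Expressed this way, the combined count at the seven south Edmonton intersections fell by roughly 15%, which can be read against the 20% reduction goal stated for each intersection.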

As seen in the examples of hot spot detection and density analysis, raster-based analysis offers a method for verifying the spatial aspects of selected outcomes as well as for visualizing the results on a map more efficiently. Though density analysis has an advantage over qualitative visualization, it seems that density analysis needs to be enhanced by other types of quantitative analysis because the estimated density values can vary greatly by bandwidth size and cell resolution leading to some ambiguity. Thus further statistical analysis should be coupled with the original density analysis to confirm those aspects that were found in the raster-based analysis.

Using kernel density estimation to identify the spatial hot spots of traffic collisions is not new. Previous research by Anderson (2009), Pulugurtha et al. (2007), and Xie and Yan (2008) has shown that kernel density estimation is a successful spatial hot spot detection method. However, this study suggests a way of visualizing temporal changes between two density maps using a simple map algebra function available in most GIS environments. By showing positive/negative temporal changes on a map, analysts can easily identify locations where a big change in the number of collisions has occurred and can verify the effects of an intervention on the surrounding areas. This GIS technique can also be applied to detect the differences between two density maps created from different data sources (see section 4.4). Though this GIS technique provides a good visual way of identifying locations that have undergone big changes and differences, it should be combined with statistical analysis in order to confirm those changes and differences.

(a) High Collision Intersection 2006 (118 Avenue NW and 101 Street NW)

(b) High Collision Intersections 2006 in the south part of Edmonton

Figure 4.7 Collision Density Changes from 2006 to 2008 for the period of Feb 6 to May 31 (blue: decrease (high), green: decrease (low), yellow: increase (low), red: increase (high))

4.4 Display Analysis

Traffic safety analysis results are usually presented as statistical values such as the frequency or probability of collisions. However, the numbers are seldom easy to understand, and it may not be possible to grasp the general trends or specific aspects of a given analysis from them. One of the most appealing advantages of GIS is its capability for visualization. As has been said many times, “a picture is worth a thousand words.” Maps are used as pictures in GIS to convey the results of complex analyses of spatial relationships to the users. Thus maps help one’s understanding of the spatial relationships.

When the analysis results are presented in maps, the maps show specific aspects of the spatial relationship as well as an overview of the general spatial trends. Though current computer systems and GIS software make it possible, the user determines what data or spatial relationships are analyzed and depicted, and how to present thematically the data or relationships to the viewers. The following figures show some thematic maps that were designed to permit the visualization of traffic safety data and analysis results, and thus these examples explain how maps can be used for display analysis.

Figure 4.8 shows the 2008 speed zones bylaw map for the City of Edmonton. The speed bylaw enacted by the municipal council of the City of Edmonton is provided in written form, and gives the speed limit and descriptions of the location or road segment. By converting the written form of the speed bylaw to geographically referenced data with speed zone attribute information, the speed zone bylaw map not only provides a visual representation of the speed limit information on a map, but also acts as a data resource for other forms of spatial analysis. The map can be published in hard copy format or on a web-based application (Tang and Waters, 2005).

Figure 4.8 Speed Zone Bylaw 2008 Map (The City of Edmonton; Source: Author)

Figure 4.9 shows the map of speed violations and offenders’ residence locations by postal code. This map makes use of the first 3 digits of the postal code in the vehicle registration information of speed offenders at the east bound NW and 133 Street NW intersection, and shows the 3 digit postal code zones with more than 200 speed violations. This map clearly illustrates where the majority of the offenders’ vehicles were registered, and indicates the route that is being used for travel and/or to commute. For example, it can be assumed that Stony Plain Road NW is a major corridor to/from Downtown for residents of the west and central areas. This kind of map provides useful information for targeting sites when an educational campaign is planned.

Figure 4.9 Map of Speed Violation and Offenders’ Residence Location by Postal Code (Source: Author)

Figure 4.10 shows the map of the ‘Curb The Danger’ program and the associated call density change from 2007 to 2008. The ‘Curb The Danger’ program was introduced in 2006, and is a simple program. If a citizen spots someone they suspect is driving while impaired they call 9-1-1 and report the last direction of travel of the suspect vehicle, the make of vehicle and the license plate information (EPS, n.d.; see also Waters (2009) for a discussion of the strengths and weaknesses of these so-called “reputation systems”).

There have been 26,062 calls since the implementation of the program, of which 2,498 resulted in charges of impaired driving. Using the same method that was mentioned in section 4.3, the map showing the call density changes that took place between 2007 and 2008 as a result of the ‘Curb The Danger’ program is displayed in Figure 4.10, and it shows clearly those areas where the calls have either increased in number or decreased. Based on the spatial information that is displayed on the map, the most effective locations for the Checkstop program can be planned, and additionally, trends can be analysed.
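A density change surface of this kind can be sketched as the cell-by-cell difference between two kernel density surfaces. The sketch below is not the thesis's implementation: the quartic kernel, the bandwidth, the grid and the call coordinates are all assumptions chosen for illustration, and the settings described in section 4.3 are not reproduced here.

```python
import math

def quartic_density(points, cell_centres, bandwidth):
    """Kernel density at each cell centre using a quartic (biweight) kernel."""
    scale = 3.0 / (math.pi * bandwidth ** 2)
    result = []
    for cx, cy in cell_centres:
        total = 0.0
        for px, py in points:
            u2 = ((cx - px) ** 2 + (cy - py) ** 2) / bandwidth ** 2
            if u2 < 1.0:  # points beyond the bandwidth contribute nothing
                total += scale * (1.0 - u2) ** 2
        result.append(total)
    return result

def density_change(points_before, points_after, cell_centres, bandwidth):
    """Later density minus earlier density; positive values mean an increase."""
    before = quartic_density(points_before, cell_centres, bandwidth)
    after = quartic_density(points_after, cell_centres, bandwidth)
    return [a - b for a, b in zip(after, before)]

# Hypothetical cell centres and call locations in metres.
cells = [(0.0, 0.0), (500.0, 0.0), (1000.0, 0.0)]
calls_2007 = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 0.0)]
calls_2008 = [(0.0, 0.0), (0.0, 0.0), (1000.0, 0.0)]
change = density_change(calls_2007, calls_2008, cells, bandwidth=400.0)
print(change)
```

In this toy case the first cell gains one call, the last cell loses one, and the middle cell lies outside the bandwidth of every call, so its change is zero.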

Additionally, the ‘Curb The Danger’ call data provided by the Edmonton Police Service (EPS) were geographically referenced; however, approximately 8% of the call locations did not contain geographically referenced information (e.g. 662 unmatched records out of the total 8,414 call records in the 2007 data), and it was assumed that mismatched records might also exist in the data. Therefore, the geocoding approach proposed in this study was applied to the data. After the geocoding, all of the call records were properly geocoded, except for approximately 2% of the total records (e.g. 158 records out of 8,414 records in the 2007 data), which were unknown or unidentifiable locations due to incorrect address information. The new geocoding approach improved not only the geocoding match rate but also the reliability of the geocoding results, and thus of the analysis results, because it takes into account the need to eliminate the potential error sources which were discussed in the previous chapter.
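The percentages quoted for the 2007 data can be checked with a few lines of arithmetic. The record counts are taken from the text; the variable and function names are illustrative only.

```python
def percentage(part, total):
    """Share of `part` in `total`, expressed as a percentage."""
    return 100.0 * part / total

total_2007 = 8414        # CTD call records in the 2007 data
unmatched_before = 662   # records without a geographic reference in the EPS data
unmatched_after = 158    # records still unlocated after the new geocoding approach

print(round(percentage(unmatched_before, total_2007), 1))              # 7.9
print(round(percentage(unmatched_after, total_2007), 1))               # 1.9
print(round(percentage(total_2007 - unmatched_after, total_2007), 1))  # 98.1
```

These values agree with the "approximately 8%" and "approximately 2%" stated above and correspond to a match rate of about 98% for the 2007 records.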

Figure 4.11 shows the difference in CTD call density between the density values obtained from points geocoded by the Location Code method and those obtained from address geocoding by EPS. As seen in Figure 4.11, the density maps created from these two data sources differ. Examining the density difference map revealed the following (marked with green circles on the map):

Example 1: The density map built from the points geocoded using the Location Code picked out these locations as small hot spots. There are 11 geocoded points identified by the Location Code method; however, there was only one point in the EPS geocoded data. The address geocoding method used by EPS could not geocode 10 CTD records, which were classified as ‘Unmatched’ in the EPS data. The records contain proper naming for the intersection location (i.e. Walterdale Hill NW/Queen Elizabeth Park Road NW); however, it was not possible to verify why those records were unmatched. It is assumed that this was due to a systematic error related to the geocoding algorithm used in the address geocoding method by EPS.

Example 2: A hot spot generated from the EPS geocoded point data was identified as an error because it was generated from mismatched locations (matched to the wrong location). The intersection address for this location in the CTD records is ‘116 st nw/116 st nw’, which is not an identifiable location because there is no such intersection in the reference data. It is assumed that this is a data entry error; however, the address geocoding tool used by EPS placed these records at an incorrect location. These records should be classified as unmatched points. In addition, it is unclear why the address geocoding tool placed the records at an incorrect location. It can only be assumed that the parsing mechanism and the matching algorithm might be related to this problem.
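The check implied by Example 2 can be sketched as a simple validation step that refuses to match an intersection whose two street names are identical or whose street pair is absent from the reference data. This is an illustrative sketch, not the EPS tool or the method used in this study; the function names and the one-entry reference set are hypothetical, and only the two intersection strings come from the text.

```python
def normalise(street):
    """Upper-case a street name and collapse repeated whitespace."""
    return " ".join(street.upper().split())

def classify_intersection(address, reference):
    """Return 'matched' or 'unmatched' for an intersection written as 'A/B'."""
    parts = [normalise(p) for p in address.split("/")]
    if len(parts) != 2 or not all(parts):
        return "unmatched"          # not a two-street intersection
    first, second = parts
    if first == second:
        return "unmatched"          # a street cannot intersect itself
    if frozenset((first, second)) not in reference:
        return "unmatched"          # no such intersection in the reference data
    return "matched"

# Hypothetical reference data holding a single known intersection.
reference = {frozenset(("WALTERDALE HILL NW", "QUEEN ELIZABETH PARK ROAD NW"))}

print(classify_intersection("116 st nw/116 st nw", reference))
print(classify_intersection("Walterdale Hill NW/Queen Elizabeth Park Road NW", reference))
```

Storing each intersection as an unordered pair means the order in which the two streets are written does not affect the result.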

Example 3: Examples of this kind were found in many locations, especially at large interchanges or intersections. As seen in Figure 4.12, a hot spot generated from the points geocoded using the Location Code is located in the centre of the interchange between Anthony Henday Drive NW and Whitemud Drive NW; however, a hot spot generated from the points address geocoded by EPS is placed on the left, where a ramp to Anthony Henday Drive NW starts. The same 40 records were associated with each hot spot; however, the EPS geocoded points were located on the left side rather than in the centre. The distance between the two points is approximately 440 m. It seems that the address geocoding tool used by EPS placed the records at one of the locations with tied match scores, as discussed in section 3.3.3. In cases of this kind, it is more reasonable to place the point at the centre of the intersection or interchange, as geocoding using the Location Code does, unless more specific location information is provided in the CTD records.
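One way to avoid an arbitrary choice among tied candidates is sketched below: when several candidate locations share the best match score, the point is placed at their centroid instead of at any single candidate. This is an assumption-laden illustration and not the Location Code method itself, which relies on a reference layer; the scores and coordinates are invented, with the spacing chosen so that the offset happens to equal 440 m.

```python
import math

def resolve_tie(scored_candidates):
    """Return the centroid of all candidates sharing the highest match score."""
    best = max(score for score, _ in scored_candidates)
    tied = [xy for score, xy in scored_candidates if score == best]
    x = sum(px for px, _ in tied) / len(tied)
    y = sum(py for _, py in tied) / len(tied)
    return (x, y), len(tied)

# Hypothetical candidates (match score, (x, y) in metres) around an interchange.
candidates = [
    (100, (0.0, 0.0)),        # start of a ramp on the left side
    (100, (880.0, 0.0)),
    (100, (440.0, 300.0)),
    (100, (440.0, -300.0)),
    (80, (5000.0, 5000.0)),   # lower score, ignored
]

centre, n_tied = resolve_tie(candidates)
offset = math.dist(candidates[0][1], centre)
print(centre, n_tied, offset)
```

The centroid of tied candidates is only a stand-in for the true centre of an interchange; a dedicated reference point per interchange, as in the Location Code approach, avoids that approximation.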

A comparison of the two density maps, one generated from points address geocoded by EPS and the other from points geocoded using the Location Code, showed that the density map created from the points geocoded using the Location Code provided more accurate, precise, and reliable results.


Figure 4.10 Curb The Danger Call Density Change Map from 2007 to 2008 (Source: Author)


Figure 4.11 Difference of CTD Call Density (density of geocoded points using the Location Code minus density of address geocoded points by EPS, Source: Author) (1-Valid address, but unmatched by address geocoding, 2-Mismatched location, and 3-Matched to one of tied match score locations by address geocoding)


Figure 4.12 Comparison between Geocoded using the Location Code and Address Geocoding


4.5 Chapter Summary

This chapter introduced spatial analyses within the GIS environment using the geographically referenced collision datasets which were developed using the new geocoding approach. The spatial analyses focused on the following topics: query analysis, raster-based analysis, and display analysis. In particular, the kernel density estimation method was applied to generate density maps. The results from these analyses were presented in the form of maps and tables, together with a discussion of the results of each analysis.


Chapter Five: Conclusions

5.1 Summary of Chapters

Chapter 1 presented the background information about traffic safety and GIS. The objectives of this research and the dissertation organization were also presented.

Chapter 2 reviewed the literature on GIS database development and GIS applications to the study of traffic safety. The literature review of GIS database development focused on the geocoding technique and the associated accuracy and precision issues of address geocoding. Previous studies on traffic safety using GIS were also reviewed.

Chapter 3 examined the adequacy of geocoding methods for collision data, and proposed a new approach to geocoding collision data. The geocoding tests were performed using the geocoding tools that were available in ArcGIS® 9.2 and GeoMedia® 6.1. Next, the geocoding errors and potential problems during the geocoding tests were discussed. A new approach for geocoding collision data was proposed, and the geocoding results achieved by using this new approach were presented.

Chapter 4 discussed spatial analysis for traffic safety in the following three areas: query analysis, raster-based analysis, and display analysis. In particular, the raster-based analysis focused on the kernel density estimation method as a tool to identify hot spots and to analyse density change. Various examples were presented to demonstrate the efficacy of GIS in traffic safety applications.


5.2 Concluding Remarks

The GIS database development and spatial analysis for traffic safety carried out in this study led to the following findings:

• The current geocoding tools that are included in the GIS software packages did not provide a suitable level of accuracy and precision for spatial analysis when a conventional collision database was converted to a geographically referenced database. Many potential sources of error were found during the geocoding tests for intersection-based collision data. This was especially the case for the MVCIS database: because of its method of expressing location and the complexity of its coding structure, the current geocoding tools were not appropriate for geocoding the collision locations.

• In order to overcome the shortcomings of the address-based geocoding method, a new approach to geocoding was proposed for the migration from the conventional collision database to a geographically referenced database. The new method was based on an alphanumeric Location Code, and utilized the Location Code based reference layer to assign spatial information to the collision data.

• In this study the Location Code was proposed and used as the primary link between location information in the MVCIS and the reference data in order to reduce geocoding errors, which are mainly due to spelling sensitivity in the address-based geocoding method. The proposed Location Code structure is optimized for the MVCIS; however, the coding scheme used in this study could be easily applied to other address-based databases. Furthermore, reference data which employ the Location Code approach could be used as secondary reference information that supports address-based geocoding methods.

• Using the new approach all collision records were successfully placed on a GIS layer; however, there were some unknown and unidentifiable locations in the collision datasets. These were usually due to a deficiency in the initial reporting process, especially when the collision reports were completed by walk-ins.

• Various spatial analysis techniques were applied to the study of traffic safety. These showed that GIS can provide useful spatial analytical tools as well as visualize analysis results on maps.

• Query analysis enables GIS users to extract the required data from a database based on attribute information and/or spatial features. Density analysis provides a useful tool for detecting hot spots and trends, and thus analysts can easily analyze spatial relationships in the study of traffic safety. Displaying data or results on a map is not only an effective way of presenting geographically referenced information, but also provides a visual analytical method.
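The Location Code linkage and the attribute query described in the findings above can be sketched as a key lookup followed by a filter. This is a minimal illustration only: the code values such as "LC-0001", the coordinates, the field names and the function name are hypothetical placeholders and do not reproduce the Location Code structure or the MVCIS schema used in this study.

```python
def geocode_by_location_code(records, reference_layer):
    """Split records into located (with x, y attached) and unlocated lists."""
    located, unlocated = [], []
    for record in records:
        xy = reference_layer.get(record.get("location_code"))
        if xy is None:
            unlocated.append(record)   # unknown or unidentifiable location
        else:
            located.append({**record, "x": xy[0], "y": xy[1]})
    return located, unlocated

# Hypothetical reference layer: Location Code -> projected coordinates in metres.
reference_layer = {
    "LC-0001": (32500.0, 5934000.0),
    "LC-0002": (33120.0, 5934410.0),
}

# Hypothetical collision records; the third has no usable location.
records = [
    {"id": 1, "location_code": "LC-0001", "severity": "injury"},
    {"id": 2, "location_code": "LC-0002", "severity": "property damage"},
    {"id": 3, "location_code": None, "severity": "injury"},
]

located, unlocated = geocode_by_location_code(records, reference_layer)

# Simple attribute query on the located records.
injuries = [r["id"] for r in located if r["severity"] == "injury"]
print(len(located), len(unlocated), injuries)
```

Because the link is an exact code rather than a street name string, spelling variations in addresses do not affect the match; records without a valid code are kept aside for manual review instead of being placed at a guessed location.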


5.3 Recommendations and Future Study Direction

Similar to other research, this study leads to more questions and sets the next stage for further projects. Specifically, the database development and spatial analysis in this study encountered problems which suggest new avenues for research. The following sections discuss recommendations and future study directions.

5.3.1 Data Acquisition and Data Integration

Despite the successful migration from the conventional collision database to the geographically referenced database, a few drawbacks were found. During the verification process for collision data, many incomplete and incorrect records were found in the conventional collision database. It is assumed that most of these occurred during the reporting stage of the collisions, especially when the collision scene was not visited and the report was not completed by a police officer. Additionally, unidentifiable locations in the conventional collision database caused the loss of collision records on the map.

In order to reduce data entry errors in the reporting stage, an electronic collision report system can be used as an alternative solution to the problem. The system uses an electronic form to fill in collision information, and this completed information is automatically connected to a database for further use. The use of an electronic collision report system can reduce data entry errors and improve the overall quality of collision reports. Usually, the system utilizes a GPS unit installed in police vehicles to acquire the location information of an incident. However, the level of accuracy and precision that is needed for the database has to be tested and determined. For example, it should be determined just how many decimal places are needed to represent the location of a collision on a map when coordinate values are used for recording the collision in the electronic collision report system. Additionally, a method for representing locations (e.g. an address in text format or a pointer location on an electronic map form) should be considered when a GPS unit is not available at the time of reporting, especially when a collision report is completed by a walk-in.
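The question of how many decimal places are needed can be approached by converting one unit in the last recorded decimal place of a latitude/longitude value into a ground distance. The sketch below assumes coordinates in decimal degrees, a spherical Earth with a mean radius of 6,371 km, and a latitude of about 53.5° N for Edmonton; none of these choices comes from the thesis, and the function name is illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation

def ground_resolution(decimal_places, latitude_deg):
    """Ground distance (north-south, east-west) in metres of one unit in the
    last recorded decimal place of a coordinate in decimal degrees."""
    step_deg = 10.0 ** -decimal_places
    metres_per_degree = math.pi * EARTH_RADIUS_M / 180.0
    north_south = step_deg * metres_per_degree
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

# Approximate latitude of Edmonton (assumption for illustration).
for places in range(3, 7):
    ns, ew = ground_resolution(places, 53.5)
    print(places, round(ns, 3), round(ew, 3))
```

Under these assumptions, five decimal places correspond to roughly one metre in the north-south direction and less than that east-west, which gives a starting point for deciding the precision an electronic collision report system should store.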

During this study, obtaining the required datasets and converting them to geographically referenced datasets were among the biggest challenges. For example, though SLIM maintains and provides an extensive collection of spatial datasets, some useful attribute information, such as demographic and socio-economic data, is not included in them; therefore, an additional data entry process was required for an appropriate traffic safety analysis. There were no traffic safety specific datasets available in the agency, and most of the datasets which were available within the agency were not geographically referenced. For instance, as mentioned in chapter 4, speed limit information can be utilized to analyze speed aspects in a traffic safety study with respect to collision occurrence and severity; however, only the written form of the speed bylaw and a CAD format map were available within the agency, so the geographically referenced speed bylaw map had to be developed in this study in order to support spatial analysis as well as visualization of the data. Additionally, map data in CAD file format, such as the map of traffic control devices, had to be converted to GIS format; however, the map in CAD format included inaccurate location information and insufficient attribute data, and these problems required much manual work in order to correct errors and add the necessary attribute data. Therefore, a great deal of effort was spent on the database development (see section 3.2); however, this study shows the feasibility of GIS as a hub for datasets for traffic safety analysis. Furthermore, the need for a timely, accurate, and integrated traffic safety data system has increased within agencies such as the Office of Traffic Safety in the City of Edmonton.

Figure 5.1 shows a proposed configuration of an integrated traffic safety database. The proposed data system includes geographically referenced collision data as well as roadway inventory data, traffic operations data, and speed and enforcement data. The system supplies the required data to various stakeholders for various analysis platforms. The construction of this database system would provide essential information for reducing collisions, for reducing the severity of any collisions that do occur, and for improving traffic safety.


[Figure 5.1 diagram. Recoverable labels: Data Source (Roadway Inventory Data; Enforcement & Adjudication Data; Traffic Operations Data; Collision Data; Medical & Quality of Life Outcomes; Vehicle Information; Driver Information; Other Miscellaneous Information; Spatial Information, e.g. GPS). Integrated Traffic Safety Database (Quality Control / Quality Assurance; SQL (Structured Query Language)). Stakeholders (Transportation Dept.; Police; EMS; Fire Dept.; Other Dept.; Researchers; Government Agencies; Public). Application / Technology (GIS; Automated Collision Record System; Business Intelligence; Safety Analyst; ITS; Artificial Intelligence).]

Figure 5.1 Configuration of Integrated Traffic Safety Database (Source: Author)


5.3.2 Other Spatial Analysis and the Implementation of Other Analytical Methods

This study utilized vector-based and raster-based GIS techniques to provide useful information for traffic safety analysis. In addition, network-based analysis could provide useful information for traffic safety because of the nature of traffic collisions which occur on the road network. For instance, a network-based technique could be applied to the planning of an optimal patrol route for high collision locations and collision prone locations or for allocating optimal stand-by locations for emergency crews, based on response time. Once roadway inventory data, traffic operations data, and traffic safety enforcement data were established on a network-based spatial database, network-based GIS techniques could be utilized more effectively for traffic safety analysis.
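As an illustration of the network-based analysis suggested here, the sketch below uses Dijkstra's shortest path algorithm on a small road graph to choose a stand-by location that minimizes the worst-case travel time to a set of collision prone sites. The graph, the travel times, the node names and the selection criterion are all hypothetical; a real application would use the road network and response time standards of the agency concerned.

```python
import heapq

def travel_times(graph, source):
    """Dijkstra's algorithm: shortest travel time from source to each reachable node."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        elapsed, node = heapq.heappop(queue)
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, minutes in graph.get(node, []):
            candidate = elapsed + minutes
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return best

def best_standby(graph, candidate_sites, collision_sites):
    """Pick the candidate with the smallest worst-case travel time to the sites."""
    def worst_case(site):
        times = travel_times(graph, site)
        return max(times.get(c, float("inf")) for c in collision_sites)
    return min(candidate_sites, key=worst_case)

# Hypothetical road network: node -> list of (neighbour, travel time in minutes).
graph = {
    "A": [("B", 2.0), ("C", 5.0)],
    "B": [("A", 2.0), ("C", 1.0), ("D", 4.0)],
    "C": [("A", 5.0), ("B", 1.0), ("D", 1.0)],
    "D": [("B", 4.0), ("C", 1.0)],
}

print(travel_times(graph, "A"))
print(best_standby(graph, ["A", "C"], ["A", "D"]))
```

Minimizing the worst-case time is only one possible criterion; minimizing the average time weighted by collision frequency would be an equally reasonable choice.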

Current GIS technologies enable the user to perform various kinds of spatial and non-spatial statistical modelling within the GIS environment. However, the most popular statistical modelling techniques in traffic safety research, such as the generalized linear model, are not available in vendor provided GIS software packages. A statistical modelling module could be implemented in a GIS package as an extension, using an embedded programming language such as Visual Basic for Applications. Alternatively, a linkage could be established between a GIS package and a statistical software program. Thus, an effective and efficient way of integrating or implementing statistical analysis methods should be studied. The same is recommended for implementing or integrating other analytical methods within GIS environments. For example, reasoning techniques in artificial intelligence could be useful in prioritizing collision prone locations or in providing possible countermeasures for problematic traffic safety locations.


References

Anderson, T. K. (2009). Kernel density estimation and K-means clustering to profile road accident hotspots. Accident Analysis and Prevention, 41, 359–364.

Arthur, R. M. (2002). Modelling hazardous locations with geographic information systems. In P. J. Rothe (Ed.), Driving lessons. Alberta, Canada: The University of Alberta Press.

Austin, K. (1995). The identification of mistakes in road crash records: part 1, locational variables. Accident Analysis and Prevention, 27, 261–276.

Bigham, J. M., Rice, T. M., Pande, S., Lee, J., Park, S. H., Gutierrez, N., and Ragland, D. R. (2009). Geocoding police collision report data from California: a comprehensive approach. International Journal of Health Geographics, 8:72.

Center for Transportation Research and Education (CTRE). (2008). Incident Location Tool. Retrieved May 13, 2008, from http://www.ctre.iastate.edu/research/locationtool/

City of Edmonton. (2008). Motor Vehicle Collisions 2007. Retrieved from http://www.edmonton.ca/transportation/RoadsTraffic/2007_Annual_Collision_Report_FINAL.pdf

City of Edmonton. (2008). Transportation Master Plan. Retrieved from http://www.edmonton.ca/city_government/documents/RoadsTraffic/08859COE_TMP-WEB.pdf


City of Edmonton, Office of Traffic Safety. (2007). Top 20 High Collision Intersections in Edmonton. Alberta, Canada: The City of Edmonton.

City of Edmonton, Office of Traffic Safety. (2007). Traffic Safety Strategy for the City of Edmonton 2006-2010. Alberta, Canada: The City of Edmonton.

de Smith, M. J., Goodchild, M. F., and Longley, P. A. (2007). Geospatial analysis: a comprehensive guide to principles, techniques and software tools, 2nd Edition. Leicester, UK: Matador.

Dutta, A., Parker, S., Qin, X., Qiu, Z., and Noyce, D. A. (2007). System for digitizing information on Wisconsin's crash locations. Transportation Research Record, Statistical Methods, Safety Data, Analysis, and Evaluation, 2019, 256–264.

Dyck, M. (2006). Analyzing Dangerous Goods Truck Routes in Calgary, AB, Canada, Using GIS-Based Methods. Master of GIS Paper, University of Calgary, Alberta, Canada.

Edmonton Police Service. (2007). Citizen Survey. Retrieved from http://www.edmontonpolice.ca/sitecore/media%20library/~/media/EPS%20External/Files/Reports/2007CitizenSurveySummary.ashx

Edmonton Police Service. (n.d.). Curb The Danger. Retrieved September 19, 2009, from http://www.edmontonpolice.ca/EPS%20External/Home/TrafficVehicles/AlcoholBreathTesting/CurbTheDanger.aspx

Elvik, R., Hoye, A., Vaa, T., and Sorensen, M. (2009). The Handbook of Road Safety Measures (2nd ed.). Bingley, UK: Emerald.


Erdogan, S., Yilmaz, I., Baybura, T., and Gullu, M. (2008). Geographical information systems aided traffic accident analysis system case study: city of Afyonkarahisar. Accident Analysis and Prevention, 40, 174–181.

Fotheringham, A. S., Brunsdon, C., and Charlton, M. (2000). Quantitative Geography: Perspectives on Spatial Data Analysis. London, UK: Sage.

Green, E. R. and Agent, K. R. (2004). Evaluation of the Accuracy of GPS Coordinates Used on Traffic Collision Reporting Forms. Kentucky Transportation Center, Lexington.

Government of Alberta. (2006). Alberta Traffic Safety Plan: Saving Lives on Alberta’s Roads. Alberta, Canada: Government of Alberta.

Hauer, E. (1997). Observational Before-After Studies in Road Safety. Oxford, England: Pergamon Press, Elsevier Science Ltd.

Intergraph Corporation. (2007). GeoMedia Analysis Overview. Huntsville, AL.

Intergraph Corporation. (2006). GeoMedia Grid Help. Huntsville, AL.

ISL Engineering and Land Services. (n.d.). 23 Avenue Interchange Project. Retrieved September 19, 2009, from http://23avenue.com/

Kam, B. H. (2003). A disaggregate approach to crash rate analysis. Accident Analysis and Prevention, 35, 693–709.

Karimi, H. A., Durcik, M., and Rasdorf, W. (2004). Evaluation of uncertainties associated with geocoding techniques. Computer-Aided Civil & Infrastructure Engineering, 19 (3), 170–185.


Khan, M. A., Kathairi, A. S., and Garib, A. M. (2004). A GIS based traffic accident data collection, referencing and analysis framework for Abu Dhabi. Codatu (Cooperation for urban mobility in the developing world) XI. Bucharest, Romania.

Kravets, N. and Hadden, W. C. (2007). The accuracy of address coding and the effects of coding errors. Health & Place, 13, 293–298.

Levine, N. and Kim, K. E. (1998). The location of motor vehicle crashes in Honolulu: a methodology for geocoding intersections. Computers, Environment and Urban Systems, 22, 557–576.

Levine, N., Kim, K. E., and Nitz, L. H. (1995). Spatial analysis of Honolulu motor vehicle crashes: I. spatial patterns. Accident Analysis and Prevention, 27 (5), 663–674.

Liang, L. Y., Ma’some, D. M., and Hua, L. T. (2005). Traffic accident application using geographic information system. Journal of the Eastern Asia Society for Transportation Studies, 6, 3574–3589.

Loo, B. P. Y. (2006). Validating crash locations for quantitative spatial analysis: A GIS-based approach. Accident Analysis and Prevention, 38, 879–886.

McGee, H. W. and Eccles, K. A. (2003). Impact of Red Light Camera Enforcement on Crash Experience. NCHRP Synthesis 310, Transportation Research Board, Washington, D.C.


Miaou, S-P. and Song, J. J. (2005). Bayesian ranking of sites for engineering safety improvements: Decision parameter, treatability concept, statistical criterion, and spatial dependence. Accident Analysis and Prevention, 37, 699–720.

Miller, H. J. and Shaw, S.-L. (2001). Geographic Information Systems for Transportation: Principles and Applications. Oxford, UK: Oxford University Press.

Noland, R. B. and Quddus, M. A. (2004). A spatially disaggregate analysis of road casualties in England. Accident Analysis and Prevention, 36, 973–984.

Ogle, J. H. (2007). Technologies for Improving Safety Data. NCHRP Synthesis 367, Transportation Research Board, Washington, D.C.

Persaud, B. N. (2001). Statistical Methods in Highway Safety Analysis - A Synthesis of Highway Practice. NCHRP Synthesis 295, Transportation Research Board, Washington, D.C.

Pulugurtha, S. S., Krishnakumar, V. K., and Nambisan, S. S. (2007). New methods to identify and rank high pedestrian crash zones: An illustration. Accident Analysis and Prevention, 39, 800–811.

Sarasua, W., Ogle, J. H., and Geoghegan, K. (2008). Location, location, location - using GPS to identify crash location: the South Carolina experience. Transportation Research Board Annual Meeting, January 2008, Washington, D.C.

Smith, R. A., Harkey, D. A., and Harris, B. (2001). Implementation of GIS-Based Highway Safety Analyses: Bridging The Gap. Report FHWA-RD-01-039, Federal Highway Administration, Washington, D.C. Retrieved from http://www.tfhrc.gov/safety/1039.htm

Tang, K. X. and Waters, N. M. (2005). The internet, GIS and public participation in transportation planning. Progress in Planning, 64 (1), 1–62.

Tegge, R. and Ouyang, Y. (2009). Correcting erroneous crash locations in transportation safety analysis. Accident Analysis and Prevention, 41, 202–209.

The White House. (2000, May 1). Statement By The President Regarding The United States’ Decision To Stop Degrading Global Positioning System Accuracy. Office of Science and Technology Policy, Washington, D.C. Retrieved from http://clinton3.nara.gov/WH/EOP/OSTP/html/0053_2.html

Thill, J-C. (2001). Geographic Information Systems in Transportation Research. Oxford, England: Pergamon Press, Elsevier Science Ltd.

Tomlin, C. D. (1990). Geographic Information Systems and Cartographic Modeling. New Jersey: Prentice Hall.

Transport Canada. (2007). 2006 Canadian Motor Vehicle Traffic Collision Statistics. Ottawa, Ontario: Transport Canada.

Transport Canada. (2001). Canada’s Road Safety Targets To 2010. Ottawa, Ontario: Transport Canada.

U.S. Department of Transportation, Federal Highway Administration. (2002). National Agenda for Intersection Safety. Washington, D.C.


Waters, N. M. (2009). Reputation systems: the dark side of VGI. Geoworld, 22 (8), 18–19.

Waters, N. M. (1999). Transportation GIS: GIS-T. In P. A. Longley, M. F. Goodchild, D. J. Maguire, and D. W. Rhind (Eds.), Geographic Information Systems, Vol. 2: Management Issues and Applications. New York, NY: Wiley.

Xie, Z. and Yan, J. (2008). Kernel density estimation of traffic accidents in a network space. Computers, Environment and Urban Systems, 32, 396–406.

Yamada, I. and Thill, J-C. (2007). Local indicators of network-constrained clusters in spatial point patterns. Geographical Analysis, 39, 268–292.

Zandbergen, P. A. (2008). A comparison of address point, parcel and street geocoding techniques. Computers, Environment and Urban Systems, 32, 214–232.

Zhan, C. (2005). Daniel B. Fambro student paper award: Geocoding and analysis of freeway service patrol data. ITE (Institute of Transportation Engineers) Journal, 75 (11), 16–21.

Zhan, F. B., Brender, J. D., Lima, I. D., Suarez, L., and Langlois, P. H. (2006). Match rate and positional accuracy of two geocoding methods for epidemiologic research. Annals of Epidemiology, 16 (11), 842–849.


APPENDIX A: Alberta Collision Report Form

Figure A.1 Alberta Collision Report Form (Front)


Figure A.2 Alberta Collision Report Form (Back)


APPENDIX B: Example of Actual Collision Report

Figure A.3 Example of Actual Collision Report (Front)


Figure A.4 Example of Actual Collision Report (Back)


As seen in Figures A.3 and A.4, the collision report gives the collision location as the intersection of 111 Street Southbound and Whitemud Drive. However, the location given in the report is ambiguous because there are two southbound intersections between 111 Street and Whitemud Drive (see Figure A.5). In this case, the collision location is recorded in the MVCIS as the Whitemud Dr-106 Street-111 Street (946) zone with an unknown portion (99) location.

Figure A.5 Intersection of Whitemud Drive and 111 Street

(Note: Personal information, e.g. name, address, license number, in Figures A.3 and A.4 has been intentionally removed for privacy reasons)


APPENDIX C: Geocoding Test Sample Address

ID   Category  Address
1    1         101 AVENUE NW & 50 STREET NW
2    1         102A AVENUE NW & 100 STREET NW
3    1         103 AVENUE NW & 170 STREET NW
4    1         105 AVENUE NW & 105 STREET NW
5    1         106A AVENUE NW & 85 STREET NW
6    1         107 AVENUE NW & 153 STREET NW
7    1         109A AVENUE NW & 97 STREET NW
8    1         112 AVENUE NW & 75A STREET NW
9    1         114 AVENUE NW & 104 STREET NW
10   1         116 AVENUE NW & 158 STREET NW
11   1         118 AVENUE NW & 82 STREET NW
12   1         118A AVENUE NW & 184 STREET NW
13   1         121 AVENUE NE & 17 STREET NE
14   1         121 AVENUE NW & 56 STREET NW
15   1         123 AVENUE NW & 142 STREET NW
16   1         128 AVENUE NW & 94 STREET NW
17   1         130 AVENUE NW & 117 STREET NW
18   1         133 AVENUE NW & 132 STREET NW
19   1         135 AVENUE NW & 113A STREET NW
20   1         137 AVENUE NW & 123A STREET NW
21   1         144 AVENUE NW & 58 STREET NW
22   1         152A AVENUE NW & 95 STREET NW
23   1         161A AVENUE NW & 74 STREET NW
24   1         167 AVENUE NE & 3 STREET NE
25   1         168 AVENUE NW & 100 STREET NW
26   1         23 AVENUE NW & 94 STREET NW
27   1         30 AVENUE SW & 111 STREET SW
28   1         32A AVENUE NW & 119 STREET NW
29   1         38 AVENUE NW & 62 STREET NW
30   1         41 AVENUE SW & 101 STREET SW
31   1         45 AVENUE NW & 151 STREET NW
32   1         58 AVENUE NW & 91 STREET NW
33   1         76 AVENUE NW & 18 STREET NW
34   1         79 AVENUE NW & 104 STREET NW
35   1         82 AVENUE NW & 95 STREET NW
36   1         84 AVENUE NW & 108 STREET NW
37   1         87 AVENUE NW & 153 STREET NW
38   1         9 AVENUE NW & 113B STREET NW
39   1         9 AVENUE SW & 119 STREET SW
40   1         90 AVENUE NW & 90 STREET NW


ID   Category  Address
41   1         92A AVENUE NW & 142 STREET NW
42   1         97 AVENUE NW & 104 STREET NW
43   1         99 AVENUE NW & 110 STREET NW
44   2         ARGYLL ROAD NW & 79 STREET NW
45   2         ARGYLL ROAD NW & 83 STREET NW
46   2         BEAUMARIS ROAD NW & 109 STREET NW
47   2         BELGRAVIA ROAD NW & 116 STREET NW
48   2         CALLINGWOOD ROAD NW & 184 STREET NW
49   2         CAPILANO STREET NW & 58 STREET NW
50   2         CASTLEDOWNS ROAD NW & 112 STREET NW
51   2         CONNORS ROAD NW & 92 STREET NW
52   2         CORONET ROAD NW & 86 STREET NW
53   2         CUMBERLAND ROAD NW & 129 STREET NW
54   2         DAVIES ROAD NW & 83 STREET NW
55   2         DELWOOD ROAD NW & 74 STREET NW
56   2         ELENIAK ROAD NW & 50 STREET NW
57   2         FAIRWAY DRIVE NW & 119 STREET NW
58   2         FORT ROAD NW & 63 STREET NW
59   2         GRIERSON HILL NW & 95A STREET NW
60   2         GRIESBACH ROAD NW & 97 STREET NW
61   2         HARVEST ROAD NW & 40 STREET NW
62   2         HERMITAGE ROAD NW & 50 STREET NW
63   2         JAMHA ROAD NW & 50 STREET NW
64   2         NW & 124 STREET NW
65   2         JASPER AVENUE NW & 95A STREET NW
66   2         AVENUE NW & 106 STREET NW
67   2         KINGSWAY AVENUE NW & 121 STREET NW
68   2         KIRKNESS ROAD NW & 32 STREET NW
69   2         KIRKWOOD AVENUE NW & 38 STREET NW
70   2         KNOTTWOOD ROAD SOUTH NW & 80 STREET NW
71   2         LAKEWOOD ROAD NORTH NW & 79 STREET NW
72   2         LESSARD ROAD NW & 199 STREET NW
73   2         MANNING FREEWAY EBD NW & 50 STREET NW
74   2         MANNING FREEWAY NW & 18 STREET NW
75   2         MANNING FREEWAY WBD NW & 50 STREET NW
76   2         MAPLE RIDGE DRIVE NW & 17 STREET NW
77   2         MCINTYRE ROAD NW & 75 STREET NW
78   2         MCLEOD ROAD NW & 50 STREET NW
79   2         MEADOWLARK ROAD NW & 156 STREET NW
80   2         ROAD NW & 87 STREET NW
81   2         MILL WOODS ROAD SOUTH NW & 49A STREET NW
82   2         MILLBOURNE ROAD EAST NW & 74 STREET NW
83   2         MILLBOURNE ROAD WEST NW & 76 STREET NW
84   2         NORWOOD BOULEVARD NW & 95 STREET NW
85   2         PARK DRIVE NW & 146 STREET NW


ID   Category  Address
86   2         PRINCESS ELIZABETH AV NW & 109 STREET NW
87   2         RAVINE DRIVE NW & 136 STREET NW
88   2         RIVER VALLEY ROAD NW & 105 STREET NW
89   2         ROPER ROAD NW & 54 STREET NW
90   2         ROPER ROAD NW & 84 STREET NW
91   2         ROWLAND ROAD NW & 95 STREET NW
92   2         ROYAL ROAD NW & 119 STREET NW
93   2         SADDLEBACK ROAD NW & 112 STREET NW
94   2         NW & 116 STREET NW
95   2         SHERWOOD PARK FREEWAY NW & 71 STREET NW
96   2         SHERWOOD PARK FWY.EBD NW & 50 STREET NW
97   2         SHERWOOD PARK FWY.WBD NW & 50 STREET NW
98   2         ST.ALBERT TRAIL NW & 156 STREET NW
99   2         STEELE CRESCENT NW & 53 STREET NW
100  2         STONY PLAIN ROAD NW & 127 STREET NW
101  2         STONY PLAIN ROAD NW & 186 STREET NW
102  2         SUMMIT DRIVE NW & 148 STREET NW
103  2         UNIVERSITY AVENUE NW & 112 STREET NW
104  2         UNIVERSITY AVENUE NW & 119 STREET NW
105  2         WAGNER ROAD NW & 83 STREET NW
106  2         WEDGEWOOD BOULEVARD NW & 184 STREET NW
107  2         WESTBROOKE DRIVE NW & 119 STREET NW
108  2         WHITEMUD DRIVE EBD NW & 178 STREET NW
109  2         WHITEMUD DRIVE EBD NW & 50 STREET NW
110  2         WHITEMUD DRIVE NW & 34 STREET NW
111  2         WHITEMUD DRIVE WBD NW & 178 STREET NW
112  2         YELLOWHEAD TRAIL EBD NW & 170 STREET NW
113  2         YELLOWHEAD TRAIL NW & 215 STREET NW
114  2         YELLOWHEAD TRAIL NW & 66 STREET NW
115  2         YELLOWHEAD TRAIL WBD NW & 170 STREET NW
116  2         YOUVILLE DRIVE WEST NW & 58 STREET NW
117  3         100 AVENUE NW & 170 STREET NORTHBOUND NW
118  3         100 AVENUE NW & 170 STREET SOUTH BOUND NW
119  3         100 AVENUE NW & ANTHONY HENDAY DRIVE NORTH NW
120  3         100 AVENUE NW & ANTHONY HENDAY DRIVE SOUTH NW
121  3         102 AVENUE NW & CLIFTON PLACE NW
122  3         102 AVENUE NW & CONNAUGHT DRIVE NW
123  3         102 AVENUE NW & STONY PLAIN ROAD NW
124  3         102 AVENUE NW & WELLINGTON CRESCENT NW
125  3         106 AVENUE NW & CAPILANO STREET NW
126  3         106 AVENUE NW & ST.GABRIEL SCHOOL ROAD NW
127  3         107 AVENUE NW & NBD NW
128  3         107 AVENUE NW & GROAT ROAD SBD NW
129  3         109 AVENUE NW & MAYFIELD ROAD NW
130  3         109A AVENUE NW & GROAT ROAD NW

131  3         111 AVENUE NW & KINGSWAY AVENUE NW
132  3         116 AVENUE NW & NW
133  3         118 AVENUE NW & ABBOTTSFIELD ROAD NW
134  3         119 AVENUE NW & WAYNE GRETZKY DRIVE NBD NW
135  3         119 AVENUE NW & WAYNE GRETZKY DRIVE SBD NW
136  3         12 AVENUE NW & KNOTTWOOD ROAD SOUTH NW
137  3         122 AVENUE NW & DOVERCOURT CRESCENT NW
138  3         124 AVENUE NW & ST.ALBERT TRAIL NW
139  3         130 AVENUE NW & DUNVEGAN ROAD NW
140  3         132 AVENUE NW & DELWOOD ROAD NW
141  3         132A AVENUE NW & NW
142  3         137 AVENUE NW & ST.ALBERT TRAIL NW
143  3         144 AVENUE NW & FORT ROAD NW
144  3         144 AVENUE NW & MANNING FREEWAY NBD NW
145  3         144 AVENUE NW & MANNING FREEWAY SBD NW
146  3         151 AVENUE NW & KIRKNESS ROAD NW
147  3         151 AVENUE NW & VICTORIA TRAIL NW
148  3         153 AVENUE NW & MILLER BOULEVARD NW
149  3         155 AVENUE NW & BEAUMARIS ROAD NW
150  3         158A AVENUE NW & CASTLE KEEP ROAD NW
151  3         162 AVENUE NW & CASTLEDOWNS ROAD NW
152  3         164 AVENUE NW & OZERNA ROAD NW
153  3         167 AVENUE NW & MANNING FREEWAY NW
154  3         16A AVENUE NW & MILL WOODS ROAD SOUTH NW
155  3         172 AVENUE NW & KLARVATTEN LAKE WYND NW
156  3         23 AVENUE NW & HODGSON WAY NW
157  3         23 AVENUE NW & MILL WOODS ROAD EAST NW
158  3         23 AVENUE NW & MILL WOODS ROAD NW
159  3         23 AVENUE NW & PARSONS ROAD NW
160  3         23 AVENUE NW & RABBIT HILL ROAD NW
161  3         23 AVENUE NW & TEGLER GATE NW
162  3         23 AVENUE NW & TERWILLEGER DRIVE NW
163  3         23 AVENUE NW & TOWNE CENTRE BOULEVARD NW
164  3         27 AVENUE NW & SADDLEBACK ROAD NW
165  3         28 AVENUE NW & HEWES WAY NW
166  3         28 AVENUE NW & LAKEWOOD ROAD EAST NW
167  3         29A AVENUE NW & PARSONS ROAD NW
168  3         31 AVENUE NW & LAKEWOOD ROAD WEST NW
169  3         33 AVENUE NW & SILVER BERRY ROAD NW
170  3         35A AVENUE NW & WOODVALE ROAD WEST NW
171  3         38 AVENUE NW & MILL WOODS ROAD EAST NW
172  3         38 AVENUE NW & MILL WOODS ROAD NW
173  3         38 AVENUE NW & MILLBOURNE ROAD EAST NW
174  3         39 AVENUE NW & ASPEN DRIVE WEST NW
175  3         39A AVENUE NW & CALGARY TRAIL SBD NW
176  3         39A AVENUE NW & GATEWAY BOULEVARD NW
177  3         40 AVENUE NW & RHATIGAN ROAD EAST NW
178  3         42A AVENUE NW & MILLBOURNE ROAD WEST NW
179  3         51 AVENUE NW & MALMO ROAD NW
180  3         53 AVENUE NW & RIVERBEND ROAD NW
181  3         53 AVENUE NW & WHITEMUD DRIVE NBD NW
182  3         53 AVENUE NW & WHITEMUD DRIVE SBD NW
183  3         56 AVENUE NW & RIO TERRACE DRIVE NW
184  3         57 AVENUE NW & LESSARD ROAD NW
185  3         62 AVENUE NW & ALLENDALE ROAD NW
186  3         62 AVENUE NW & GLASTONBURY BOULAVARD NW
187  3         76 AVENUE NW & ARGYLL ROAD NW
188  3         76 AVENUE NW & GIRARD ROAD NW
189  3         81 AVENUE NW & BUENA VISTA ROAD NW
190  3         91 AVENUE NW & SASKATCHEWAN DRIVE NW
191  3         93 AVENUE NW & SCONA ROAD NW
192  3         95 AVENUE NW & CONNORS ROAD NW
193  3         97 AVENUE NW & ROSSDALE ROAD NW
194  3         98 AVENUE NW & OTTEWELL ROAD NW
195  4         BLACKMUD CREEK DRIVE SW & BLACKMUD CREEK CRESCENT SW
196  4         BRECKENRIDGE DRIVE NW & BEVINGTON PLACE NW
197  4         BRECKENRIDGE DRIVE NW & LEWIS ESTATES BOULEVARD NW
198  4         BULYEA ROAD NW & BURTON ROAD NW
199  4         BURLEY CLOSE NW & BURLEY DRIVE NW
200  4         CARTER CREST ROAD NW & CARSE LANE NW
201  4         CONNORS ROAD NW & CLOVERDALE HILL NW
202  4         CONNORS ROAD NW & STRATHEARN DRIVE NW
203  4         CUMBERLAND ROAD NW & HUDSON WAY NW
204  4         DOVERCOURT AVENUE NW & ST.ALBERT TRAIL NW
205  4         DUNLUCE ROAD NW & WARWICK ROAD NW
206  4         GUARDIAN ROAD NW & GRANTHAM DRIVE NW
207  4         HADDOW DRIVE NW & TERWILLEGER DRIVE NW
208  4         HEACOCK ROAD NW & HERRIG-COOPER WAY NW
209  4         HEATH ROAD NW & HEALY ROAD NW
210  4         HEATH ROAD NW & RIVERBEND ROAD NW
211  4         HENDERSON STREET NW & RIVERBEND ROAD NW
212  4         HERMITAGE ROAD NW & HABITAT CRESCENT NW
213  4         HERMITAGE ROAD NW & HENRY AVENUE NW
214  4         HERMITAGE ROAD NW & HOOKE ROAD NW
215  4         HERMITAGE ROAD NW & HYNDMAN CRESCENT NW
216  4         HERMITAGE ROAD NW & VICTORIA TRAIL NW
217  4         HERMITAGE ROAD NW & WILLIAM HUSTLER CR. NW
218  4         HODGSON BOULEVARD NW & RABBIT HILL ROAD NW
219  4         HODGSON ROAD NW & HODGSON WAY NW
220  4         HODGSON ROAD NW & HOLLINGSWORTH GREEN NW
221  4         HOOKE ROAD NW & VICTORIA TRAIL NW
222  4         HUDSON WAY NW & CUMBERLAND ROAD NW
223  4         JACKSON ROAD NW & JAMHA ROAD NW
224  4         JAMHA ROAD NW & JEFFERYS CRESCENT NW
225  4         JASPER AVENUE NW & ALEX TAYLOR ROAD NW
226  4         KARL CLARKE ROAD NW & PARSONS ROAD NW
227  4         KINGSWAY AVENUE NW & AIRPORT ROAD NW
228  4         KINGSWAY AVENUE NW & PRINCESS ELIZABETH AV NW
229  4         KINGSWAY AVENUE NW & TOWER ROAD NW
230  4         KNOTTWOOD ROAD NORTH NW & MILL WOODS ROAD SOUTH NW
231  4         KRAMER PLACE NW & KAASA ROAD EAST NW
232  4         LAKEWOOD ROAD NORTH NW & MILL WOODS ROAD NW
233  4         LAKEWOOD ROAD SOUTH NW & MILL WOODS ROAD NW
234  4         LEE RIDGE ROAD NW & MILLBOURNE ROAD EAST NW
235  4         LEGER BOULAVARD NW & RABBIT HILL ROAD NW
236  4         LESSARD ROAD NW & ANTHONY HENDAY DRIVE NW
237  4         MACEWAN ROAD SW & MCALLISTER LOOP SW
238  4         MAYFIELD ROAD NW & 170 STREET NORTHBOUND NW
239  4         MAYFIELD ROAD NW & 170 STREET SOUTH BOUND NW
240  4         MILL WOODS ROAD SOUTH NW & KNOTTWOOD ROAD EAST NW
241  4         POTTER GREENS DRIVE NW & LEWIS ESTATES BOULEVARD NW
242  4         RABBIT HILL ROAD NW & BULYEA ROAD NW
243  4         RABBIT HILL ROAD NW & FALCONER ROAD NW
244  4         RABBIT HILL ROAD NW & RIVERBEND ROAD NW
245  4         RABBIT HILL ROAD NW & TERWILLEGER DRIVE NBD NW
246  4         RABBIT HILL ROAD NW & TERWILLEGER DRIVE SBD NW
247  4         RHATIGAN ROAD EAST NW & REHWINKEL CLOSE NW
248  4         RHATIGAN ROAD EAST NW & RIVERBEND ROAD NW
249  4         RHATIGAN ROAD WEST NW & RICHARDS CRESCENT NW
250  4         RHATIGAN ROAD WEST NW & RIVERBEND ROAD NW
251  4         RIVER VALLEY ROAD NW & FORTWAY DRIVE NW
252  4         RUTHERFORD ROAD SW & RUTHERFORD CLOSE SW
253  4         SASKATCHEWAN DRIVE NW & GATEWAY BOULEVARD NW
254  4         SASKATCHEWAN DRIVE NW & QUEEN ELIZABETH PRK RD NW
255  4         SHERWOOD PARK FREEWAY NW & ARGYLL ROAD NW
256  4         STIRLING ROAD NW & BEAUMARIS ROAD NW
257  4         STONY PLAIN ROAD NW & 170 STREET NORTHBOUND NW
258  4         STONY PLAIN ROAD NW & 170 STREET SOUTH BOUND NW
259  4         STONY PLAIN ROAD NW & ANTHONY HENDAY DRIVE NORTH NW
260  4         STONY PLAIN ROAD NW & ANTHONY HENDAY DRIVE SOUTH NW
261  4         STONY PLAIN ROAD NW & CONNAUGHT DRIVE NW
262  4         STONY PLAIN ROAD NW & MAYFIELD ROAD NW
263  4         TERWILLEGER DRIVE NBD NW & HADDOW DRIVE NW
264  4         TERWILLEGER DRIVE NW & ANTHONY HENDAY DRIVE NORTH NW
265  4         TERWILLEGER DRIVE NW & ANTHONY HENDAY DRIVE SOUTH NW
266  4         TOMLINSON COMMON NW & TERWILLEGER VISTA NW
267  4         TOMLINSON COMMON NW & TOMLINSON WAY NW
268  4         TOMLINSON CRESCENT NW & TOMLINSON WAY NW
269  4         TORY GATE NW & TORY ROAD NW
270  4         UNIVERSITY AVENUE NW & SASKATCHEWAN DRIVE NW
271  4         WAGNER ROAD NW & DAVIES ROAD NW
272  4         WANYANDI ROAD NW & WHISTON ROAD NW
273  4         WELBOURN LANE NW & WERSHOF ROAD NW
274  4         WERSHOF ROAD NW & WELLWOOD WAY NW
275  4         WHITEMUD DRIVE EBD NW & 170 STREET NORTHBOUND NW
276  4         WHITEMUD DRIVE EBD NW & 170 STREET SOUTH BOUND NW
277  4         WHITEMUD DRIVE WBD NW & 170 STREET NORTHBOUND NW
278  4         WHITEMUD DRIVE WBD NW & 170 STREET SOUTH BOUND NW
279  4         WOLF WILLOW CRESCENT NW & WOLF WILLOW ROAD NW
280  4         WOLF WILLOW ROAD NW & WANYANDI ROAD NW
281  4         YELLOWHEAD TRAIL NW & HAYTER ROAD NW