Test of Complete Spatial Randomness on Networks

A PROJECT SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA
BY
Xinyue Chang
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED AND COMPUTATIONAL MATHEMATICS
Advisor: Yang Li
May, 2016

© Xinyue Chang 2016
ALL RIGHTS RESERVED

Acknowledgements

Firstly, I would like to thank my advisor, Professor Yang Li, for his incredible support, guidance, and encouragement on my project and graduate study. I would also like to thank Professor Barry James and Professor Haiyang Wang for serving as my committee members and for their time reading my project report. Last but not least, I am very grateful to Professor Kang James for her valuable suggestions and comments in the statistical seminar class.

Abstract

A test of complete spatial randomness (CSR) is an essential part of spatial analysis and is regarded as a minimal prerequisite to any serious attempt to model an observed point pattern. It has been investigated, discussed, and verified in planar regions by researchers for more than 40 years. Recently, more and more data on spatial point processes on networks have been collected. This project aimed to apply the CSR test method to any spatial point pattern on a network. The study started with the derivation of the cumulative distribution function (CDF) of inter-event distances between two locations randomly distributed on a grid network. We then carried out a test procedure based on Monte Carlo simulation. The procedure was proposed considering both inter-event distances and nearest-neighbor distances. It was found that this method worked well when the process was constrained to a network. Finally, the car accident pattern on the Minnesota major roads network was tested by both the inter-event distances method and the nearest-neighbor distances method.

Contents

Acknowledgements
Abstract
List of Tables
List of Figures
1 Introduction
  1.1 Background and Organization
  1.2 Definitions
    1.2.1 Complete Spatial Randomness
    1.2.2 Spatial Point Processes on Networks
    1.2.3 Inter-event Distances
    1.2.4 CSR Test Based on Inter-event Distances
    1.2.5 Nearest-neighbor Distances
    1.2.6 CSR Test Based on Nearest-neighbor Distances
2 CDF of Inter-event Distances on a Grid Network under CSR
  2.1 Preliminary
  2.2 CDF of Inter-event Distances if t < 1
    2.2.1 Cumulative Distribution Function
    2.2.2 Simulation
  2.3 CDF of Inter-event Distances if t > 1
    2.3.1 Cumulative Distribution Function
    2.3.2 Simulation
3 CSR Test Based on Inter-event Distances
  3.1 CSR Test Implementation
  3.2 Simulations
    3.2.1 Random Process
    3.2.2 Cluster Process
    3.2.3 Regular Process
    3.2.4 Conclusion
4 CSR Test Based on Nearest-neighbor Distances
  4.1 CSR Test Implementation
  4.2 Simulations
    4.2.1 Random Process
    4.2.2 Cluster Process
    4.2.3 Regular Process
    4.2.4 Conclusion
5 Car Crash Point Pattern on the Minnesota Major Roads
  5.1 Dataset
  5.2 Implementation
    5.2.1 CSR Test Based on Inter-event Distances
    5.2.2 CSR Test Based on Nearest-neighbor Distances
  5.3 Result and Analysis
6 Conclusion
References
Appendix A. Glossary and Acronyms
  A.1 Glossary
  A.2 Acronyms
Appendix B. Code
  B.1 R Code
    B.1.1 Random Process
    B.1.2 Cluster Process
    B.1.3 Regular Process
    B.1.4 Random Network
    B.1.5 Car Crash Pattern on the MN Roads
  B.2 Python Code

List of Tables

A.1 Acronyms

List of Figures

2.1 An example of a regular grid network with m = n = 11
2.2 Four possible locations of two arbitrary points on the 5 × 5 grid
2.3 The 11 × 11 grid network
2.4 Simulation result and plot for CDF when t < 1 (blue is the theoretical function)
2.5 The 11 × 11 grid network
2.6 Simulation result and plot for CDF when t > 1 (blue is the theoretical function)
3.1 Random point pattern on the grid network
3.2 Envelope plot for random process on the grid network
3.3 Random point pattern on a random network
3.4 Envelope plot for random process on a random network
3.5 Cluster point pattern on the grid network
3.6 Envelope plot for cluster process on the grid network
3.7 Cluster point pattern on a random network
3.8 Envelope plot for cluster process on a random network
3.9 Regular point pattern on the grid network
3.10 Envelope plot for regular process on the grid network
3.11 Regular point pattern on a random network
3.12 Envelope plot for regular process on a random network
4.1 Grid network and random point pattern
4.2 Envelope plot for random process on grid network
4.3 Random network and random point pattern
4.4 Envelope plot for random process on random network
4.5 Grid network and cluster point pattern
4.6 Envelope plot for cluster process on grid network
4.7 Random network and cluster point pattern
4.8 Envelope plot for cluster process on random network
4.9 Grid network and regular point pattern
4.10 Envelope plot for regular process on grid network
4.11 Random network and regular point pattern
4.12 Envelope plot for regular process on random network
5.1 Location of fatal crashes in Minnesota in 2013
5.2 R plot of Minnesota major roads network
5.3 R plot of the car crash pattern on the Minnesota major roads
5.4 Display of the car crash pattern on the Minnesota major roads in ArcGIS
5.5 R plot of a CSR point pattern on the Minnesota major roads
5.6 Display of a CSR point pattern on the Minnesota major roads in ArcGIS
5.7 Envelope plot for CSR test of car crash pattern by inter-event method
5.8 Envelope plot for CSR test of car crash pattern by nearest-neighbor method
Chapter 1

Introduction

1.1 Background and Organization

Practical investigations in ecology, epidemiology, and transportation often involve observing and studying the spatial distribution of events. Researchers are interested in classifying a spatial point pattern and, as a first step, need to know whether it exhibits complete spatial randomness (CSR). Methods for testing CSR in spatial point processes have therefore drawn increasing attention.

The techniques proposed for detecting non-randomness may be divided broadly into two groups, described respectively as quadrat methods and distance methods [1]. The power of tests of randomness, particularly tests based on nearest-neighbor distances, inter-point distances, and estimators of moment measures, has been investigated in [2]. Other papers have developed methods beyond distances and quadrats. In [3], the author introduced a test of spatial randomness based on the angles between the vectors joining each sample point to its nearest neighbors. A method of qualifying spatial pattern, in which sample points move toward a regular arrangement resembling a hexagonal lattice, was discussed in [4]. To explore the performance of the CSR test further, [5] presents results confirmed by ecological data and illustrates that the tests without edge-effect correction proposed by Diggle have higher power for small sample sizes.

All of these works assume that spatial point events can be located anywhere in a planar region. In some practical scenarios, however, points can lie only on the edges of a specific network. For example, car crash locations lie on roads, which form a road network. The CSR test then becomes different and more complicated, in the sense that inter-event distances are no longer Euclidean distances and must follow the geometry of the network.
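To make the difference between Euclidean and network distance concrete, the following is a minimal Python sketch, not the thesis's own code. It assumes a network given as node coordinates plus straight-line edges, and a point on the network written as a hypothetical triple `(u, v, frac)`, meaning the point at fraction `frac` along the edge from `u` to `v`. The function names `build_adjacency`, `dijkstra`, and `network_distance` are illustrative only.

```python
import heapq
import math


def build_adjacency(coords, edges):
    """Adjacency list with Euclidean edge lengths as weights."""
    adj = {node: [] for node in coords}
    for u, v in edges:
        w = math.dist(coords[u], coords[v])
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


def dijkstra(adj, source):
    """Shortest-path length from source to every node."""
    dist = {node: math.inf for node in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def network_distance(coords, edges, p, q):
    """Shortest-path distance between two points lying on edges.

    p and q are (u, v, frac) triples (an assumed representation).
    """
    adj = build_adjacency(coords, edges)
    (pu, pv, pf), (qu, qv, qf) = p, q
    p_len = math.dist(coords[pu], coords[pv])
    q_len = math.dist(coords[qu], coords[qv])
    # Distance from each point to the two endpoints of its own edge.
    p_off = {pu: pf * p_len, pv: (1 - pf) * p_len}
    q_off = {qu: qf * q_len, qv: (1 - qf) * q_len}
    best = math.inf
    if {pu, pv} == {qu, qv}:
        # Same edge: the points can be joined directly along it.
        qf_aligned = qf if (qu, qv) == (pu, pv) else 1 - qf
        best = abs(pf - qf_aligned) * p_len
    # Otherwise leave through one endpoint and arrive through another.
    for a, da in p_off.items():
        node_dist = dijkstra(adj, a)
        for b, db in q_off.items():
            best = min(best, da + node_dist[b] + db)
    return best


# A unit square used as a tiny network: four nodes, four edges.
coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
p = (0, 1, 0.5)  # midpoint of the bottom edge
q = (2, 3, 0.5)  # midpoint of the top edge
d_network = network_distance(coords, edges, p, q)
d_euclid = math.dist((0.5, 0.0), (0.5, 1.0))
print(d_network, d_euclid)
```

In this toy case the two midpoints are one unit apart in the plane but two units apart along the network, which is the kind of discrepancy a network CSR test has to account for.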
Motivated by this concern, this thesis discusses CSR tests based on inter-event distances and nearest-neighbor distances [6] in the network setting and verifies that they are applicable to point patterns on networks. The result is confirmed with three point processes simulated on a grid network and on a random network.

The test based on inter-event distances would be precise and simple to implement if the theoretical cumulative distribution function (CDF) of inter-event distances under CSR were known. Closed-form results already exist for the most common planar cases, square and circular regions. For a square of unit side, the distribution function of inter-event distances is

\[
H(t) =
\begin{cases}
\pi t^{2} - \dfrac{8t^{3}}{3} + \dfrac{t^{4}}{2}, & 0 \le t \le 1, \\[2ex]
\dfrac{1}{3} - 2t^{2} - \dfrac{t^{4}}{2} + \dfrac{4\,(t^{2}-1)^{1/2}\,(2t^{2}+1)}{3} + 2t^{2}\arcsin\!\left(2t^{-2} - 1\right), & 1 < t \le \sqrt{2}.
\end{cases}
\]

For a circle of unit radius the corresponding expression is

\[
H(t) = 1 + \pi^{-1}\left\{ 2\,(t^{2}-1)\arccos(t/2) - t\left(1 + t^{2}/2\right)\sqrt{1 - t^{2}/4} \right\}
\]

for all 0 ≤ t ≤ 2 [6]. For a CSR point pattern on a network, the distances depend on the geometry of the network, so the distribution differs from the planar case.
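As a sanity check on the two planar formulas, the sketch below evaluates them in Python and compares them with an empirical CDF from simulated pairs of uniform points. It is an illustrative check under stated assumptions, not the thesis's code: the function names, the seed, the sample size of 20000 pairs, and the rejection sampler for the disc are all choices made here.

```python
import math
import random


def h_square(t):
    """H(t) for two uniform points in the unit square."""
    if t <= 0:
        return 0.0
    if t <= 1:
        return math.pi * t**2 - 8 * t**3 / 3 + t**4 / 2
    if t < math.sqrt(2):
        s = t * t
        return (1 / 3 - 2 * s - s**2 / 2
                + 4 * math.sqrt(s - 1) * (2 * s + 1) / 3
                + 2 * s * math.asin(2 / s - 1))
    return 1.0


def h_circle(t):
    """H(t) for two uniform points in the disc of unit radius."""
    if t <= 0:
        return 0.0
    if t >= 2:
        return 1.0
    inner = (2 * (t * t - 1) * math.acos(t / 2)
             - t * (1 + t * t / 2) * math.sqrt(1 - t * t / 4))
    return 1 + inner / math.pi


def sample_square(rng):
    return rng.random(), rng.random()


def sample_disc(rng):
    # Rejection sampling from the bounding square (an assumed, simple sampler).
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            return x, y


def empirical_cdf(sampler, t, n_pairs, rng):
    """Fraction of simulated point pairs whose distance is at most t."""
    hits = 0
    for _ in range(n_pairs):
        (x1, y1), (x2, y2) = sampler(rng), sampler(rng)
        if math.hypot(x1 - x2, y1 - y2) <= t:
            hits += 1
    return hits / n_pairs


rng = random.Random(0)
for t in (0.3, 0.8, 1.2):
    print("square", t, h_square(t), empirical_cdf(sample_square, t, 20000, rng))
for t in (0.5, 1.0, 1.5):
    print("circle", t, h_circle(t), empirical_cdf(sample_disc, t, 20000, rng))
```

The two branches of the square formula agree at t = 1, where both give π − 13/6, and the second branch reaches 1 at t = √2, which is a quick way to confirm the reconstruction of the piecewise expression.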
