From Means and Medians to Machine Learning: Spatial Statistics Basics and Innovations
Lauren Bennett, Flora Vale, Alberto Nieto

esriurl.com/spatialstats

What are Spatial Statistics?
Spatial Statistics are a set of exploratory techniques for describing and modeling spatial distributions, patterns, processes, and relationships.

coincidence, area, connectivity, proximity, orientation, length, direction

Spreadsheets: Data or Information?
Maps: Data or Information?

When you look at a spreadsheet… You ask for more
• Standard Deviations
• Min and Max
• …

Same goes for maps! We can do more

Means and Medians
summarizing spatial distributions

Machine Learning
clustering methods

Central Feature
identifies the most centrally located feature in a point, line, or polygon feature class

[Figure: features plotted on X and Y axes, with the central feature highlighted]
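One way to read "most centrally located" is the feature with the smallest summed distance to every other feature. The sketch below illustrates that reading for points only; it is not the ArcGIS tool's implementation, and the coordinates are borrowed from the Mean Center slide purely as sample input.

```python
import math

def total_distance(candidate, points):
    """Summed Euclidean distance from one candidate point to every point."""
    return sum(math.dist(candidate, p) for p in points)

def central_feature(points):
    """Return the existing point with the smallest summed distance to all others."""
    return min(points, key=lambda p: total_distance(p, points))

# Sample coordinates (an assumption for illustration, not data from a real layer).
points = [(13, 12), (25, 24), (22, 23), (9, 18), (14, 14),
          (24, 16), (12, 12), (18, 12), (14, 8)]

center = central_feature(points)
print(center, round(total_distance(center, points), 2))
```

Unlike the mean or median center, the result is always one of the input features.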

Mean Center
identifies the geographic center (or the center of concentration) for a set of features

[Figure: nine points plotted on X and Y axes: (13,12), (25,24), (22,23), (9,18), (14,14), (24,16), (12,12), (18,12), (14,8)]

mean center = (17,15), rounded from (16.8,15.4)

Median Center
identifies the location that minimizes overall Euclidean distance to the features in a dataset

[Figure: the same nine points plotted on X and Y axes]

X: 25, 24, 22, 18, 14, 14, 13, 12, 9
Y: 24, 23, 18, 16, 14, 12, 12, 12, 8

median center = (14,14)
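The arithmetic on these two slides can be checked in a few lines. The sketch below recomputes the mean center and the sorted-coordinate median for the nine points. It then adds a simple iterative search (Weiszfeld-style averaging, an assumption here and not necessarily what the tool does) for a location that minimizes total Euclidean distance, which need not land exactly on the coordinate-wise median.

```python
import math
from statistics import mean, median

points = [(13, 12), (25, 24), (22, 23), (9, 18), (14, 14),
          (24, 16), (12, 12), (18, 12), (14, 8)]
xs = [x for x, _ in points]
ys = [y for _, y in points]

mean_center = (mean(xs), mean(ys))      # average X, average Y
median_xy = (median(xs), median(ys))    # middle value of the sorted X and Y lists

def total_distance(center, pts):
    """Summed Euclidean distance from a location to every point."""
    return sum(math.dist(center, p) for p in pts)

def distance_minimizing_center(pts, start, iterations=200):
    """Repeatedly re-average the points, weighting each by 1 / distance."""
    cx, cy = start
    for _ in range(iterations):
        wsum = wx = wy = 0.0
        for x, y in pts:
            d = math.dist((cx, cy), (x, y))
            if d == 0:
                continue  # skip a point the estimate sits exactly on
            wsum += 1.0 / d
            wx += x / d
            wy += y / d
        cx, cy = wx / wsum, wy / wsum
    return cx, cy

refined = distance_minimizing_center(points, mean_center)

print(round(mean_center[0]), round(mean_center[1]))  # 17 15
print(median_xy)                                     # (14, 14)
print(refined, total_distance(refined, points))
```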

Mean vs Median?

[Figure: the same nine points plus an additional point at (176,138)]
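A quick comparison shows why the choice matters. This sketch assumes (176,138) is a single outlying point added to the nine points used earlier, and compares how far the mean and the coordinate-wise median move once it is included.

```python
from statistics import mean, median

base = [(13, 12), (25, 24), (22, 23), (9, 18), (14, 14),
        (24, 16), (12, 12), (18, 12), (14, 8)]
with_outlier = base + [(176, 138)]  # assumed to be one extra, distant point

def centers(points):
    """Return (mean center, coordinate-wise median) for a list of (x, y) points."""
    xs, ys = zip(*points)
    return (mean(xs), mean(ys)), (median(xs), median(ys))

mean_before, median_before = centers(base)
mean_after, median_after = centers(with_outlier)

print(mean_after)    # (32.7, 27.7)
print(median_after)  # (16.0, 15.0)
```

The mean is pulled well outside the original cluster by the single distant point, while the median shifts only from (14,14) to (16,15).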

Linear Directional Mean
identifies the mean direction, length, and geographic center for a set of lines
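Directions cannot be averaged like ordinary numbers (the average of 350° and 10° should be 0°, not 180°), so a common approach sums the sine and cosine of each line's direction. The sketch below follows that approach. The angle convention (counterclockwise from east), the use of line midpoints for the center, and the two sample lines are assumptions made for illustration.

```python
import math

def linear_directional_mean(lines):
    """lines: list of ((x1, y1), (x2, y2)) start/end coordinate pairs."""
    sin_sum = cos_sum = length_sum = mid_x = mid_y = 0.0
    for (x1, y1), (x2, y2) in lines:
        dx, dy = x2 - x1, y2 - y1
        theta = math.atan2(dy, dx)          # direction, counterclockwise from east
        sin_sum += math.sin(theta)
        cos_sum += math.cos(theta)
        length_sum += math.hypot(dx, dy)
        mid_x += (x1 + x2) / 2
        mid_y += (y1 + y2) / 2
    n = len(lines)
    direction = math.degrees(math.atan2(sin_sum, cos_sum)) % 360
    return direction, length_sum / n, (mid_x / n, mid_y / n)

# Two hypothetical lines: one pointing east, one pointing north.
lines = [((0, 0), (2, 0)), ((0, 0), (0, 2))]
direction, mean_length, center = linear_directional_mean(lines)
print(direction, mean_length, center)
```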

Directional Distribution (Standard Deviational Ellipse)
creates standard deviational ellipses to summarize the spatial characteristics of geographic features: central tendency, dispersion, and directional trends

[Figure: standard deviational ellipse drawn around a set of points, centered on the mean center]
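The three characteristics map onto three numbers: the ellipse center (central tendency), the axis lengths (dispersion), and the rotation (directional trend). The sketch below derives them from the covariance of the coordinates. This is one common formulation; scaling conventions for the axes differ between tools, so the lengths here are an assumption rather than a match for any particular software.

```python
import math

def standard_deviational_ellipse(points):
    """Return center, one-standard-deviation semi-axes, and major-axis angle."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Eigenvalues of the 2x2 covariance matrix give the variance along each axis.
    half_sum = (sxx + syy) / 2
    half_gap = math.hypot((sxx - syy) / 2, sxy)
    major = math.sqrt(half_sum + half_gap)
    minor = math.sqrt(max(0.0, half_sum - half_gap))
    # Rotation of the major axis, counterclockwise from the X axis.
    angle = math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy)) % 180
    return {"center": (mx, my), "major": major, "minor": minor, "angle_deg": angle}

points = [(13, 12), (25, 24), (22, 23), (9, 18), (14, 14),
          (24, 16), (12, 12), (18, 12), (14, 8)]
ellipse = standard_deviational_ellipse(points)
print(ellipse)
```

A long, thin ellipse indicates a strong directional trend; a near-circular one indicates none.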

Demo

Machine Learning
clustering methods

Density-based Clustering
finds clusters based on feature locations

[Figure: points labeled Cluster 1, Cluster 2, Cluster 3, Cluster 4, Cluster 5, and Noise]

DBSCAN – defined distance
HDBSCAN – self adjusting
OPTICS – multi-scale

[Figure: OPTICS reachability plot, reachability distance by feature order]
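To make "defined distance" concrete, here is a minimal pure-Python sketch of the DBSCAN idea: a point with enough neighbors inside a fixed search distance seeds a cluster, the cluster grows through neighboring dense points, and whatever is left is noise. The parameter names `eps` and `min_pts` and the sample coordinates are assumptions; this is not the ArcGIS implementation.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    def neighbors(i):
        # Indices of all points within the search distance (includes i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE            # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point reached from a dense point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is dense too, so keep growing
    return labels

# Two tight groups and one isolated point (made-up coordinates).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Because the search distance is fixed, clusters of very different densities are hard to capture with a single setting, which is the gap the self-adjusting and multi-scale options address.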

DBSCAN
• Uses fixed search distance
• Clusters of similar densities
• Fast

HDBSCAN
• Uses a range of search distances to find clusters of varying densities
• Data driven, requires least user input

OPTICS
• Uses neighbor distances to create reachability plot
• Most flexibility for fine tuning
• Can be computationally intensive

Demo

Multivariate Clustering
finds clusters based on feature attributes

K Means
[Figure: K Means results with 2 groups, 3 groups, and 4 groups]
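The K Means idea behind those group counts can be sketched in plain Python: pick k starting centers, assign every feature to its nearest center in attribute space, move each center to the average of its members, and repeat until nothing changes. The attribute values below are made up, the random seeding is an assumption, and in practice attributes on different scales are usually standardized first.

```python
import math
import random

def k_means(rows, k, iterations=100, seed=0):
    """Group attribute rows (tuples of numbers) into k clusters."""
    rng = random.Random(seed)
    centroids = rng.sample(rows, k)          # start from k of the rows
    labels = None
    for _ in range(iterations):
        # Assign each row to its nearest centroid in attribute space.
        new_labels = [min(range(k), key=lambda c: math.dist(row, centroids[c]))
                      for row in rows]
        if new_labels == labels:
            break                            # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical two-attribute rows forming two obvious groups.
rows = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9),
        (5.0, 5.1), (5.1, 4.9), (4.9, 5.0)]
labels, centroids = k_means(rows, 2)
print(labels)
```

Location plays no role here: two features far apart on the map land in the same group if their attributes are similar.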

Eligible Uninsured Americans
[Maps: % Below 138 FPL, % Uninsured, % No High School, % Latino]

Spatially Constrained Multivariate Clustering
finds clusters based on feature attributes and proximity
https://xkcd.com/1364/

Minimum Spanning Tree

Crime in Chicago

[Maps: Median Income, HS Dropout Rate, Unemployment, Crime Count]

Demo

[Figure labels: uniform groups; dissimilar between groups]

Build Balanced Zones
creates spatially contiguous zones in your study area using a genetic growth algorithm based on criteria that you specify

Criteria: Zone building
• Attribute target
• Number of zones and attribute target
• Number of zones

Criteria: Zone selection
• Equal area
• Compactness
• Equal number of features
• Attribute to consider

[Figure: candidate zone solutions with fitness scores 9.14, 9.14, 10.22, …]

top 50% move on to next generation and crossover to create new offspring
then the next fittest 50% moves on and crosses over to create the next generation

Convergence
[Figure: fitness score by generation]
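The generation loop described on these slides (score every candidate, keep the top 50%, cross survivors over to produce offspring, repeat until the score levels off) can be sketched on a toy problem. Everything specific here is an assumption for illustration: the sample values, a fitness score where lower is better, single-point crossover, and an occasional mutation. The sketch also ignores spatial contiguity, which the actual tool enforces.

```python
import random

def fitness(assignment, values, n_zones):
    """Gap between the largest and smallest zone totals (lower is better here)."""
    totals = [0.0] * n_zones
    for zone, value in zip(assignment, values):
        totals[zone] += value
    return max(totals) - min(totals)

def evolve(values, n_zones, pop_size=40, generations=60, seed=0):
    rng = random.Random(seed)
    # Each candidate assigns every feature to a zone at random.
    population = [[rng.randrange(n_zones) for _ in values] for _ in range(pop_size)]
    history = []                                   # best score per generation
    for _ in range(generations):
        population.sort(key=lambda a: fitness(a, values, n_zones))
        history.append(fitness(population[0], values, n_zones))
        survivors = population[: pop_size // 2]    # top 50% move on
        offspring = []
        while len(survivors) + len(offspring) < pop_size:
            mom, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(values))
            child = mom[:cut] + dad[cut:]          # single-point crossover
            if rng.random() < 0.2:                 # occasional mutation (an addition)
                child[rng.randrange(len(child))] = rng.randrange(n_zones)
            offspring.append(child)
        population = survivors + offspring
    best = min(population, key=lambda a: fitness(a, values, n_zones))
    return best, history

# Hypothetical attribute values for eight features, split into two zones.
values = [4, 7, 1, 9, 3, 6, 2, 8]
best, history = evolve(values, n_zones=2)
print(best, fitness(best, values, 2))
```

Because the best candidate always survives, the best score in `history` never gets worse from one generation to the next, which gives the flattening curve shown in a convergence chart.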

Demo

Want to learn more???
esriurl.com/spatialstats
Please fill out a course survey!!!
[email protected]
[email protected]
[email protected]

TUESDAY
1:45p Data Visualization for Spatial Analysis 146C
3:00p Machine Learning in ArcGIS 146C
4:15p From Means and Medians to Machine Learning: Spatial Statistics Basics and Innovations 146C

WEDNESDAY
8:30a Machine Learning in ArcGIS 146C
11a Data Visualization for Spatial Analysis 146C
1:30p From Means and Medians to Machine Learning: Spatial Statistics Basics and Innovations 146C
2:45p Spatial Data Mining: Cluster Analysis and Space Time Analysis 146C
4:00p Beyond Where: Modeling Spatial Relationships and Making Predictions 146C
5:15p The Forest for the Trees: Making Predictions Using Forest-Based Classification and Regression 146C
