Spatial Tools for Econometric and Exploratory Analysis

Michael F. Goodchild University of California, Santa Barbara Luc Anselin University of Illinois at Urbana-Champaign http://csiss.org Outline

¾A Quick Tour of a GIS ¾Spatial Data Analysis ¾CSISS Tools Spatial Data Analysis Principles: 1. Integration

¾Linking data through common location ƒ the layer cake ¾Linking processes across disciplines ƒ spatially explicit processes ƒ e.g. economic and social processes interact at common locations

2. Spatial analysis

¾Social data collected in cross- section ƒ longitudinal data are difficult to construct ¾Cross-sectional perspectives are rich in context ƒ can never confirm process ƒ though they can perhaps falsify ƒ useful source of hypotheses, insights

3. Spatially explicit theory

¾Theory that is not invariant under relocation ¾Spatial concepts (location, distance, adjacency) appear explicitly ¾Can spatial concepts ever explain, or are they always surrogates for something else? 4. Place-based analysis

¾Nomothetic - search for general principles ¾Idiographic - description of unique properties of places ¾An old debate in Geography The Earth's surface

¾Uncontrolled variance ¾There is no average place ¾Results depend explicitly on bounds ¾Places as samples ¾Consider the model: y = a + bx

Tract Pop Location Shape 1 3786 x,y 2 2966 x,y 3 5001 x,y 4 4983 x,y 5 4130 x,y 6 3229 x,y 7 4086 x,y 8 3979 x,y

Iij = EiAjf (dij) / ΣkAkf (dik)

Aj d Ei ij

Types of Spatial Data Analysis

¾ Exploratory Spatial Data Analysis • exploring the structure of spatial data • determining the nature of spatial dependencies • spatial data as context ¾ Confirmatory Spatial Data Analysis • testing data against spatial models • spatial econometrics • econometric models that incorporate spatial effects Four Elements of Spatial Econometrics

¾ Specifying the Structure of Spatial Dependence • which locations/actors interact ¾ Testing for the Presence of Spatial Dependence • what type of dependence • what alternative ¾ Estimating Models with Spatial Dependence • spatial lag, spatial error, higher order ¾ Spatial Prediction • interpolation, missing values Abstraction of geographic space

¾Cartograms

¾Invariance under rotation, displacement, reflection Space as a matrix

¾W where wij is some measure of interaction ƒ adjacency ƒ decreasing function of distance ƒ invariant under rotation, displacement, reflection ƒ readily obtained from GIS ƒ a measure of spatial vs social interaction Specification = Spatial Weights

¾ Spatially Lagged Variables ƒ Wy, Wε, WX ¾ Data Structure ƒ W not as N by N matrix ƒ sparse structure • 3000+ counties 5/6 neighbors ¾ Computation ƒ distance based: easy • uses point coordinates • generalized distance ƒ contiguity: requires GIS interface • uses boundary file or tessellation Classification

¾ Mainstream Packages ƒ SAS, SPSS, Systat ¾ Mainstream Econometric Packages ƒ , TSP, Rats, Limdep, E-Views, Shazam ¾ Toolboxes ƒ Matlab, S-Plus, Bugs/WinBugs ¾ Open Source Toolboxes ƒ , XLispstat ¾ Self-Contained Specialized ƒ SpaceStat Spatial Regression Functionality

¾ Rare in ƒ spatial functionality tends to be geostatistics, point pattern analysis • variogram, kriging • e.g., SAS, Systat ¾ Specialized Scripts/Macros ƒ routines with specific functionality ƒ constrained by data format, size, speed ¾ “Comprehensive” ƒ SpaceStat, S+Spatialstats, Spdep (R), LeSage-Pace (Matlab), Pisati (Stata Ado) Review (not comprehensive)

¾ SpaceStat ƒ linear spatial regression • weights construction, diagnostics, ML, IV/GM • outdated architecture and interface ¾ S+Spatialstats (Splus) ƒ linear spatial regression • weights construction (ArcView bridge), ML • no diagnostics ¾ Spdep (Bivand) ƒ R package, open source ƒ linear spatial regression • weights construction (from boundary file), diagnostics • ML estimation only ƒ other R packages: Venables-Ripley, GWR, etc. Review (continued)

¾ Spatial Toolbox (LeSage, Pace) ƒ Matlab and routines ƒ linear spatial regression • some weights construction (Thiessen), ML estimation (sparse weights), Bayesian estimation (Gibbs sampler) ƒ spatial probit/tobit: Gibbs sampler ¾ Stata (Pisati, Conley) ƒ Spatreg • diagnostics, ML estimation, no weights from polygons ƒ GMM Conley estimator ¾ GeoBugs ƒ MCMC for conditional spatial model Miscellaneous

¾ SAS ƒ spatial regression estimation (Griffith) ¾ SPSS ƒ spatial autocorrelation (Tiefelsdorf) ¾ Xlispstat ƒ “Ord” ML estimation using eigenvalues ¾ TSP ƒ ML estimation spatial error, spatial lag (based on Upton and Fingleton) ¾ RATS ƒ Driscoll-Kraaij estimator Issues

¾Performance Problems ƒ generic optimization routines ƒ inefficient matrix data structure ƒ slow loops ¾Little/No Diagnostics ƒ focus tends to be on estimation ¾No Interoperability ¾No Standards CSISS Tools CSISS Tools Project Mission

¾ Goals: ƒ facilitate dissemination of spatial analysis software tools to social scientists ƒ develop a library/libraries of spatial data analysis modules ƒ develop prototypes implementing state of the art methods ƒ initiate and nurture a community of open source developers

GeoDa

¾ ESDA with Dynamically Linked Windows ƒ freestanding ƒ reads ESRI shape files • points and polygons ƒ MapObjects LT2 technology ƒ free ¾ Replaces ArcView Extensions ƒ SpaceStat Extension and DynESDA extension now obsolete ¾ Download ƒ http://sal.agecon.uiuc.edu/csiss/geoda.html Supporting Materials Requirements and Challenges

¾ Interoperable ƒ need for standards • data structures, model formulation •XML ƒ adhere to/contribute to OGC ¾ Open ƒ open source promotes quality control ƒ standards allow a modular approach ¾ Fast ƒ need for efficient (new) algorithms ƒ large data set latent variable models ƒ space-time models