Visual Bandwidth Selection for Kernel Density Maps
Photogrammetrie • Fernerkundung • Geoinformation 5/2009, pp. 445–454

Jukka M. Krisp, Stefan Peters, Christian E. Murphy & Hongchao Fan, München

Keywords: Kernel Density, Visualization, Visual Analytics, Geostatistics, Kernel Bandwidth, Cartography

Summary: Within this paper we investigate the challenge of finding an appropriate bandwidth in kernel density estimation. Kernel density estimation methods can be used in visualizing and analyzing spatial data, with the objective of understanding and potentially predicting event patterns. Our aim is to provide a computational tool to visually select the bandwidth parameter and to visually investigate the effect of different parameters within the kernel density calculations. With this tool the user is able to visually find the most appropriate bandwidth for the particular dataset and scale. The slider tool includes a graphical user interface and can read point datasets. A specific number of kernel density maps are generated by using a range of bandwidths. In the graphical interface these kernel density maps can be drawn in near real-time while the user changes the respective bandwidth using a slider tool. This helps to determine which bandwidth setting fits the needs and provides an appropriate visualization of a particular point dataset.

Zusammenfassung (translated from German): Visual radius selection for kernel density estimates. In this paper we investigate ways to obtain an appropriate radius in kernel density estimation. Kernel density estimation methods can serve the understanding and prediction of spatial patterns in the visualization and analysis of spatial data. Our aim is to develop a computer tool for visual parameter selection. The radii of different kernels are checked visually and can be examined experimentally. With this tool the user can determine the most suitable radius for the respective dataset. The slider tool includes a graphical user interface and can load point datasets. Based on statistical parameters of the point dataset used, a specific number of kernel density maps with different radii is precomputed. In the graphical user interface these kernel density maps can be displayed "in real-time". The user selects different radii with a slider tool, and the respective kernel density maps are displayed one after another. In this way the optimal radius for a purposeful visualization of a particular dataset can be found.

DOI: 10.1127/1432-8364/2009/0032
1432-8364/09/0032 $ 2.50
© 2009 DGPF / E. Schweizerbart'sche Verlagsbuchhandlung, D-70176 Stuttgart

1 Introduction

Geographic information systems use different methods to display spatial information, and within this paper we investigate new tools to enhance the visual clustering of data. On a general level the aim is to detect spatial patterns in data and formulate hypotheses based on the clusters of point data. Therefore we need to link numerical and graphical procedures with a map. We try to identify where specific clusters are located on the map and visualize alternatives to visually cluster spatial data.

From a theoretical perspective, this relates to the concept of Geovisualization, which provides theory as well as methods and tools for the visual exploration, analysis, synthesis and presentation of data that contain geographic information (MacEachren & Kraak 2001). From a broader perspective it includes the concept of exploratory spatial data analysis to detect spatial properties of data.
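The procedure outlined in the summary, generating a fixed number of kernel density maps over a range of bandwidths and letting a slider pick among them, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the Gaussian kernel, the function names `gaussian_kde_grid`, `precompute_density_maps` and `select_map`, the synthetic point data, and the bandwidth range (the paper derives the range from statistical parameters of the dataset, which is not reproduced here).

```python
import numpy as np

def gaussian_kde_grid(points, bandwidth, grid_x, grid_y):
    """Evaluate a 2D Gaussian kernel density estimate on a regular grid.

    Assumption: an isotropic Gaussian kernel; the kernel type is chosen
    here for illustration only.
    """
    gx, gy = np.meshgrid(grid_x, grid_y)          # both of shape (ny, nx)
    dx = gx[..., None] - points[:, 0]             # shape (ny, nx, n_points)
    dy = gy[..., None] - points[:, 1]
    sq_dist = dx ** 2 + dy ** 2
    norm = 2.0 * np.pi * bandwidth ** 2           # 2D Gaussian normalisation
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2)) / norm
    return kernel.sum(axis=-1) / len(points)      # average over all points

def precompute_density_maps(points, bandwidths, grid_x, grid_y):
    """Compute one density map per bandwidth, keyed by the bandwidth value."""
    return {float(h): gaussian_kde_grid(points, h, grid_x, grid_y)
            for h in bandwidths}

# Synthetic example data (hypothetical): two clusters of 50 points each.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal([2.0, 2.0], 0.3, size=(50, 2)),
    rng.normal([7.0, 6.0], 0.5, size=(50, 2)),
])
grid_x = np.linspace(0.0, 10.0, 60)
grid_y = np.linspace(0.0, 10.0, 60)
bandwidths = np.linspace(0.2, 2.0, 10)            # arbitrary illustrative range

density_maps = precompute_density_maps(points, bandwidths, grid_x, grid_y)

def select_map(slider_value):
    """Return the precomputed bandwidth closest to the slider value and its map."""
    nearest = min(density_maps, key=lambda h: abs(h - slider_value))
    return nearest, density_maps[nearest]
```

Because all maps are computed up front, moving the slider only requires a dictionary lookup, which is one plausible way to obtain the near real-time redraw described in the summary.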
There are different types of users who use geovisualization systems differently and bring different knowledge to the decision-making process. The use of these tools should support the interaction between different actors, e.g., decision-makers, the public and experts. The communication between these actors supports the development of models and the creation of task-specific maps. Visualization is a key point in the mediation of content and decisions. Therefore, the choice of visualization methods, the orientation of the visualization on the geographic area, as well as the selection of the geographic area are of high importance. The overall goal is adequate communication, for example to gain political support for a certain decision and to enhance communication between experts from various fields. It is essential to provide visualization techniques as a means of exploring spatial data in order to detect features of interest contained in them, as suggested by Hearnshaw & Unwin (1994). As Virrantaus et al. (2009) state, the impact of the visualization on knowledge acquisition (does the map present unknown information, or is it used to display and confirm previously known information?), its role as an investigative tool (is the map for private study, or is it part of a more public decision-making process?), and its didactic capabilities (is the map being used interactively or is it being read passively?) can be researched through models of visualization. In this paper the density map is regarded as a tool to confirm and depict previously known information.

Point densities can be displayed as isarithmic maps. An isarithmic map is created by interpolating a set of isolines between sample points of known values (Slocum et al. 2001, Slocum et al. 2005). In the special case of handling the locations of events as true point data the term isometric map is used. Kernel density estimation (KDE) can supply point densities from a set of point locations. For that reason we choose to display point densities as isometric maps using the kernel density estimation (KDE) method. In these isometric maps each isoline is a curve along which the point density value is constant.

Within this paper we investigate kernel density estimation methods which are included in a variety of applications: point data smoothing; creation of continuous surfaces from point data in order to combine these with other datasets that are continuous/in raster form; probability distribution estimation; interpolation or hot spot detection. It is important to understand visualization techniques as a means to explore geographic data and to detect correlations in spatial data. Methods for kernel density estimates have a wide variety of applications. These include the estimation of probability distributions and the interpolation of point data for hot spot detection. Kernel density estimation can also be used in the visualization and analysis of temporal patterns, such as crime events at different times of day and/or over different time periods. The goal is an understanding and prediction of potential patterns in the available point data. Previous research includes work done by Levine (2004), who provides an excellent discussion of the various techniques and options, including alternative methods for bandwidth selection (Chiu 1991), together with examples from crime analysis, health research, urban development and ecology (Levine 2004, Smith et al. 2008). Research by Krisp & Špatenková (2009) investigated the possibility to visually review the significance of a bandwidth selection for the fire & rescue services resource planning. Bandwidth selection, as explained in the following, is often more of an art than a science, as suggested by Smith et al. (2008).

Historically, Point Pattern Analysis was introduced for scientific applications in the early 1920s. Gleason (1920) and Svedberg (1922) used Point Pattern Analysis (PPA) to statistically describe their botanical and ecological work. Since then many different fields of science such as epidemiology, criminology, archaeology or seismology have used PPA. For instance, one can conduct a "hot spot" analysis to gain a better understanding of the locations of crime. A hot spot is a concentration of point events within a limited area. This can help the police, for example, to decide where to intensify daily patrols. Point Pattern Analysis involves the ability to compare and describe point patterns and to test whether there is a significant difference to a random spatial point pattern. A point pattern is a set of locations of point events (O'Sullivan & Unwin 2003). There are three main types of point patterns which can be analyzed: clustered, evenly spaced and random spatial point patterns.

Fig. 1: Types of Point Patterns in practice (panels: Clustered Point Pattern, Evenly Spaced Point Pattern, Random Point Pattern).

In a clustered point pattern the events are grouped in regions of the study area. Because of the spatial dependency in a clustered pattern one can speak of a positive auto-correlation. An evenly spaced point pattern is a point pattern with approximately equal distances between neighboring points. A point's neighbor is the nearest other point to it. Approximately equal distances of all neighboring points imply that the pattern is neg-

tion function extends then in all directions to infinity. This has the advantage that the whole study area is covered and the intensity is shown at all locations. Generally it is difficult to compare point density values in different datasets, because the number of points and the size of the study area may differ greatly. This influences the results of the calculations. The definition of “high” and