Spatial Analysis and GIS: a Primer

Gilberto Câmara (1), Antônio Miguel Monteiro (1), Suzana Druck Fucks (2), Marília Sá Carvalho (3)

(1) Image Processing Division, National Institute for Space Research (INPE), Av dos Astronautas 1758, São José dos Campos, Brazil
(2) Brazilian Agricultural Research Agency (EMBRAPA), Rodovia Brasília-Fortaleza, BR 020, Km 18, Planaltina, Brazil
(3) National School for Public Health, Fundacao Oswaldo Cruz, R. Leopoldo Bulhoes, 1480/810, Rio de Janeiro, Brazil

Introduction

Understanding the spatial distribution of data from phenomena that occur in space is today a great challenge for the elucidation of central questions in many areas of knowledge, such as health, environment, geology, and agronomy. Such studies are becoming more and more common, owing to the availability of low-cost Geographic Information Systems (GIS) with user-friendly interfaces. These systems allow the spatial visualization on maps of variables such as individual populations, quality of life indexes or company sales in a region. To achieve that, it is enough to have a database and a geographic base (like a map of the municipalities), and the GIS is capable of presenting a colored map that allows the visualization of the spatial pattern of the phenomenon.

Besides the visual perception of the spatial distribution of the phenomenon, it is very useful to translate the existing patterns into objective and measurable considerations, as in the following cases:

• Epidemiologists collect data about the occurrence of diseases. Does the distribution of cases of a disease form a pattern in space? Is there any association with any source of pollution? Is there any evidence of contagion? Did it vary with time?
• We want to investigate whether there is any spatial concentration in the distribution of theft. Are thefts that occur in certain areas correlated with the socio-economic characteristics of these areas?
• Geologists want to estimate, from some samples, the extent of a mineral deposit in a region. Can those samples be used to estimate the mineral distribution in that region?
• We want to analyze a region for agricultural zoning purposes. How should we choose the independent variables (soil, vegetation or geomorphology) and determine the contribution of each one to defining where each type of crop is most suitable?

All of these problems are part of the spatial analysis of geographical data. The emphasis of spatial analysis is on measuring properties and relationships, taking the spatial location of the phenomenon under study into account in a direct way. That is, the central idea is to incorporate space into the analysis. This book presents a set of tools that try to address these issues. It is intended to help those interested in studying, exploring and modeling processes that express themselves through a distribution in space, here called geographic phenomena.

A pioneering example, in which the category of space was intuitively incorporated into the analysis, was carried out in the 19th century by John Snow. In 1854, one of the many cholera epidemics was taking place in London, brought from the Indies. At that time, nobody knew much about the causes of the disease. Two scientific schools tried to explain it: one related it to miasmas concentrated in the lower and swampy regions of the city, and the other to the ingestion of contaminated water. The map (Figure 1) presents the location of deaths due to cholera and the water pumps that supplied the city, allowing the clear identification of one of the locations, in Broad Street, as the epicenter of the epidemic. Later studies confirmed this hypothesis, corroborated by other information such as the location of the water pump downriver from the city, in a place where there was a maximum concentration of waste, including excrement from cholera patients.
This was one of the first examples of spatial analysis in which the spatial relationship of the data contributed significantly to advancing the understanding of a phenomenon.

Figure 1 – London map showing deaths from cholera identified by dots and water pumps represented by crosses.

Data types in spatial analysis

The most widely used taxonomy for characterizing the problems of spatial analysis considers three types of data:

• Events or point patterns – phenomena expressed through occurrences identified as points in space, called point processes. Some examples are crime spots, disease occurrences, and the locations of plant species.
• Continuous surfaces – estimated from a set of field samples that can be regularly or irregularly distributed. Usually, this type of data results from natural resource surveys, which include geological, topographical, ecological, phytogeographic, and pedological maps.
• Areas with counts and aggregated rates – data associated with population surveys, such as census and health statistics, that originally refer to individuals situated at specific points in space. For confidentiality reasons these data are aggregated into analysis units, usually delimited by closed polygons (census tracts, postal zones, municipalities).

From the data types above, it can be seen that the problems of spatial analysis deal with environmental and socioeconomic data. In both cases, spatial analysis is composed of a set of chained procedures that aim at choosing an inferential model that explicitly considers the spatial relationships present in the phenomenon. In general, the modeling process is preceded by a phase of exploratory analysis, associated with the visual presentation of the data in the form of graphs and maps and the identification of spatial dependency patterns in the phenomenon under study.

In the case of point pattern analysis, the object of interest is the very spatial location of the events under study.
As in the situation analyzed by Snow, the objective is to study the spatial distribution of these points, testing hypotheses about the observed pattern: whether it is random or, on the contrary, whether it is clustered or regularly distributed. This is also the subject of studies that aim to estimate the risk of disease around nuclear plants. Another case is to establish a relationship between the occurrence of events and the characteristics of the individual, incorporating possible environmental factors about which there is no data available. For example, would mortality from tuberculosis, even considering the known risk factors, vary with the address of the patient?

As an example, Figure 2 illustrates the application of point pattern analysis to mortality by external causes in the city of Porto Alegre, with 1996 data, carried out by Simone Santos and Christovam Barcellos, from FIOCRUZ. The locations of homicides (red), traffic accidents (yellow) and suicides (blue) are shown in Figure 2 (left). On the right, a surface of the estimated intensity is presented, which can be thought of as the “temperature of violence”. The interpolated surface shows a pattern of point distribution with a strong concentration in the downtown area of the city, decreasing toward the more remote districts.

Figure 2 – Distribution of cases of mortality by external causes in Porto Alegre in 1996 and the intensity estimator.

For surface analysis, the objective is to reconstruct the surface from which the samples were taken and measured. For example, consider the distribution of profiles and soil samples for the state of Santa Catarina and surrounding areas, and the spatial distribution map of the base saturation variable, produced by Simone Bönisch, from INPE, and presented in Figure 3.
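
To make the idea of reconstructing a surface from scattered samples more concrete, the following is a minimal sketch in Python. It uses inverse distance weighting, a simple deterministic interpolator chosen here only for illustration; it is not the geostatistical model used to produce the Santa Catarina map. The function names, the distance exponent of 2, and the sample values are all hypothetical.

```python
import math


def idw_estimate(samples, x, y, power=2.0):
    """Estimate the value at (x, y) by inverse distance weighting.

    samples: list of (sx, sy, value) tuples (hypothetical field samples).
    power:   distance-decay exponent (2.0 is an assumed, common default).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            # The target coincides with a sample: return its measured value.
            return float(value)
        w = 1.0 / d ** power  # closer samples receive larger weights
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


def idw_grid(samples, xs, ys, power=2.0):
    """Evaluate the interpolator on a regular grid; rows follow ys, columns follow xs."""
    return [[idw_estimate(samples, x, y, power) for x in xs] for y in ys]


# Made-up sample values at the corners of a 10 x 10 square (not survey data).
samples = [
    (0.0, 0.0, 10.0),
    (10.0, 0.0, 20.0),
    (0.0, 10.0, 30.0),
    (10.0, 10.0, 40.0),
]
surface = idw_grid(samples, xs=[0.0, 5.0, 10.0], ys=[0.0, 5.0, 10.0])
for row in surface:
    print([round(v, 2) for v in row])
```

Each grid value is a weighted average of the samples, so the estimated surface always stays within the range of the measured values. Unlike a geostatistical model, this sketch does not estimate the spatial dependence from the data; the decay of influence with distance is fixed in advance by the chosen exponent.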
Figure 3 – Profiles and soil samples distribution in Santa Catarina (left) and estimated continuous distribution of the base saturation variable (right). Legend: profiles (Perfis) and samples (Amostras); color scale from 8.250 to 55.437 (%).

How did we build this map? The highlighted crosses indicate the locations of the soil sampling points; from these measurements a spatial dependency model was estimated, allowing the interpolation of the surface presented in the map. The inferential model has the objective of quantifying the spatial dependence among the sample values. This model uses the techniques of geostatistics, whose central hypothesis is the concept of stationarity (discussed later in this chapter), which assumes homogeneous behavior of the spatial correlation structure in the region of study. Since environmental data are the result of natural phenomena of medium and long duration (like geological processes), the stationarity hypothesis is derived from the relative stability of these processes; in practice, this implies that stationarity is present in a great number of situations. It must be observed that stationarity is a non-restrictive working hypothesis in the approach to non-stationary problems. Methods like universal kriging, intrinsic random functions of order k (FAI-k), kriging with external drift, co-kriging, and disjunctive kriging are meant for the treatment of non-stationary phenomena.

In the case of areal analysis, most of the data are drawn from population surveys such as censuses, health statistics and real estate cadastres. These areas are usually delimited by closed polygons within which homogeneity is assumed, that is, important changes only occur at the boundaries. Clearly, this premise is not always true, given that the survey units are frequently defined by operational (census tracts) or political (municipalities) criteria and there is no guarantee that the distribution of
