Variograms and Kriging in the Analysis of Spatial Data
Total Page:16
File Type:pdf, Size:1020Kb
Edith Cowan University Research Online Theses: Doctorates and Masters Theses 1-1-1996 Variograms and kriging in the analysis of spatial data Suresh Tripathi Edith Cowan University Follow this and additional works at: https://ro.ecu.edu.au/theses Part of the Applied Mathematics Commons Recommended Citation Tripathi, S. (1996). Variograms and kriging in the analysis of spatial data. https://ro.ecu.edu.au/theses/ 970 This Thesis is posted at Research Online. https://ro.ecu.edu.au/theses/970 Edith Cowan University Copyright Warning You may print or download ONE copy of this document for the purpose of your own research or study. The University does not authorize you to copy, communicate or otherwise make available electronically to any other person any copyright material contained on this site. You are reminded of the following: Copyright owners are entitled to take legal action against persons who infringe their copyright. A reproduction of material that is protected by copyright may be a copyright infringement. Where the reproduction of such material is done without attribution of authorship, with false attribution of authorship or the authorship is treated in a derogatory manner, this may be a breach of the author’s moral rights contained in Part IX of the Copyright Act 1968 (Cth). Courts have the power to impose a wide range of civil and criminal sanctions for infringement of copyright, infringement of moral rights and other offences under the Copyright Act 1968 (Cth). Higher penalties may apply, and higher damages may be awarded, for offences and infringements involving the conversion of material into digital or electronic form. USE OF THESIS The Use of Thesis statement is not included in this version of the thesis. VARIOGRAMS AND KRIGING IN THE ANALYSIS OF SPATIAL DATA by Suresh Tripathi A thesis submitted in partial fulfilmentof the requirements for the award of Master of Science : Mathematics and Planning Department of Mathematics Edith Cowan University. June 1996 EDITH COWAN UNIVERSITY UBRARY TABLE OF CONTENTS DECLARATION VI ACKNOWLEDGMENTS VII ABSTRACT VIII 1. INTRODUCTION I 1.1. Background and Significance I 1.2. Regionalized Variables 2 1.3. Notation and Stationarity Assumptions 3 1.4. Probability Model 6 2. EXPLORATORY DATA ANALYSIS 8 2.1. Overview 8 2.2. Stem-and-Leaf Plots 10 2.3. Box Plots 12 2.4. Surface and Contour Plots 12 2.5. Post Plots 14 2.6. Bivariate Plots 15 2.7. Row and Column Means and Medians Plots 15 2.8. Median Polish 18 3. V ARIOGRAMS 20 3.1. Overview 20 3.2. V ariogram Models 22 3.3. Sample Variograms 25 iii 3.4. Fitting a Variogram Model 31 3.5. Anisotropy 36 3.6. Relative Variograms 49 4. KRIGING 50 4.1. Overview 50 4.2. Ordinary Kriging 52 4.3. Universal Kriging 58 4.4. Median Polish Kriging 60 4.5. Indicator Kriging 63 4.6. Cross Validation 67 5. MULTIVARIATE GEOSTATISTICS 69 5.1. Overview 69 5.2. Principal Component Analysis 70 5.3. Cross Covariograms and Cross Variograms 71 5.4. Analysis of the Correlation Structure 73 5.5. Ordinary Cokriging 75 5.6. Linear Model of Coregionalization 77 iv 80 6. APPLICATIONS 80 6.1. Overview 6.2. Available Phosphate Analysis 81 98 6.4. Potassium Analysis 119 6.5. Banksia Analysis 7. CONCLUSION 129 8. REFERENCES 132 9. APPENDICES 143 143 9.1. Appendix A 149 9.2. Appendix B 169 9 .3. Appendix C V Declaration I certify that this thesis does not incorporate without acknowledgment any material previously submitted for a degree or diploma in any institution of higher education; and that to the best of my knowledge and belief it does not contain any material previously published or written by another person except where due reference is made in the text. Signature vi Acknowledgments First and foremost I wish to express my profound gratitude to my supervisor Dr Lyn Bloom for her patience, guidance, enthusiasm, support and encouragement. I wish to thank also the Chairperson of the Mathematics Department Dr David McDougall for ,:r providing me with some financial support in the way of sessional teaching during my study. I am also grateful to Associate Professor Byron Lamont and Dr Lyn Bloom for providing me respectively with the Banksia data and the Soil data analysed in this thesis. I am deeply indebted to my friend Andrew Mehnert, my family and other colleagues for their support and words of encouragement. Finally I thank my father for providing me with continuous moral support, love and ceaseless encouragement during my study. vii Abstract This research is in the area of geostatistics and consists essentially of two parts. The first is an investigation of the variogram and cross variogram and the associated kriging and cokriging methods of spatial prediction and the second is an application of these in the analysis of two ( original) data sets. In the first part (chapter 1 to chapter 5), the focus is on summarising and illustrating the various techniques of Exploratory Data Analysis (EDA) and some methods used to estimate and model the experimental variograms and cross variograms fora given data set, together with some of the geostatistical methods of kriging and cokriging used for prediction purposes. The research also illustrates some of the many applications of this theory in the earth and environmental sciences. The second part of the thesis (chapter 6) is an application of these geostatistical techniques to the analysis of two (new) data sets. The first data set consists of the Available Phosphate (in ppm) and Potassium (in ppm) from two fields (one cropped and one uncropped) in the Jimperding Brook area of Western Australia. The second data set consists of the number of species of Banksia at various locations within a region of Southwestern Australia. viii 1. INTRODUCTION 1.1. Background and Significance The term Geostatistics was first introduced by Matheron ( 1962) in connection with the analysis of problems arising in the mining industry. It was subsequently extended to geology (Matheron 1963a, 1963b, 197 la) in general and to other earth sciences such as geography (Boots and Dufournaud 1994) and soil science (McBratney and Webster 1986 ; Yost et al 1982). However the theory involved is now being applied in a wide variety of other areas. These include hydrology (de Kwaadsteniet 1990), pollution control (Wartenberg, Uchrin, and Coogan 1991), and atmospheric science (Hass 1990). Some of the ideas introduced by Matheron had appeared earlier in other forms. For instance, what Matheron refersto as the variogram was called a structure function in physics by Kolmorogov (1941 ), in probability by Yaglom ( 1962), and in meteorology by Gandin ( 1963), and a mean-squared difference by Jowett ( 1952) in time series. However it is Matheron's terminology that has persisted and which will be used throughout this thesis. The name kriging was introduced by Matheron in honour of the South African mining engineer D. G. Krige who (Krige 1951, 1966) developed methods for determining the true ore-grade distributions from the distributions based on the sampled ore grades. However the formulation of spatial linear prediction did not come from Krige's work. The contributions of Weiner (1949), Wold (1938), and Kolmogorov (1941) all contain optimal linear prediction methods. In geostatistics one is dealing with spatial data, where the locations of the samples, as well as their values, are important. Such data frequently exhibit spatial autocorrelation, which, loosely speaking, means that nearby values will be more alike than values located far away from one another. One is interested in obtaining a profile of the distribution of the data values and in using this to predict the value at a particular location. The variogram is an important tool in modelling the spatial continuity (the spatial dependence) of the data while kriging is the method, or more accurately methods, of spatial interpolation used for prediction purposes. The type of kriging used depends on the nature of the data and the underlying assumptions of the mathematical model to be used. One advantage of this approach is that it is possible, via the kriging variance, to obtain an idea of the accuracy of such predictions. When multivariate data are analysed, methods of co-kriging are sometimes used as the prediction accuracy of one variable can often be improved by knowledge of values of another variable measuredat the same locations. 1.2. Regionalized Variables Matheron (1962, 1971b, 1973) introduced the term regionalized variable z(s) where the location index s is assumed to vary continuously over the domain D, a fixed subset of d-dimensional Euclidean Space Rd . Regionalized variables and their relevant applications in constructing models are very important in geostatistics. A regionalized variable is simply a random variable where the values vary with temporal or spatial location, defined in space-time to exhibit certain types of spatial or temporal 2 characteristics. A regionalized variable exhibits two main features, one of which is a structure aspect, and other is a random aspect. The first is based on theoretical grounds which express structural and geometrical properties in an adequate form, and the second is based on practical grounds which solves the problem of estimation of a regionalized variable from fragmentary sampling data. These two purposes are related to each other because the error of estimation depends on the structural characteristics; for example, the error of estimation becomes greater when the regionalized variable is more discontinuous. Regionalized variable theory focusses attention on means and variances of the variables under consideration. In paiticular, under the intrinsic stationarity assumptions, increments between all pairs of points, situated at a given distance h, have a mean zero and a finite variance that remains the same in the various parts of the area under study.