Kriging Models for Linear Networks and Non‐Euclidean Distances

Received: 13 November 2017 | Accepted: 30 December 2017 DOI: 10.1111/2041-210X.12979 RESEARCH ARTICLE Kriging models for linear networks and non-Euclidean distances: Cautions and solutions Jay M. Ver Hoef Marine Mammal Laboratory, NOAA Fisheries Alaska Fisheries Science Center, Abstract Seattle, WA, USA 1. There are now many examples where ecological researchers used non-Euclidean Correspondence distance metrics in geostatistical models that were designed for Euclidean dis- Jay M. Ver Hoef tance, such as those used for kriging. This can lead to problems where predictions Email: [email protected] have negative variance estimates. Technically, this occurs because the spatial co- Handling Editor: Robert B. O’Hara variance matrix, which depends on the geostatistical models, is not guaranteed to be positive definite when non-Euclidean distance metrics are used. These are not permissible models, and should be avoided. 2. I give a quick review of kriging and illustrate the problem with several simulated examples, including locations on a circle, locations on a linear dichotomous network (such as might be used for streams), and locations on a linear trail or road network. I re-examine the linear network distance models from Ladle, Avgar, Wheatley, and Boyce (2017b, Methods in Ecology and Evolution, 8, 329) and show that they are not guaranteed to have a positive definite covariance matrix. 3. I introduce the reduced-rank method, also called a predictive-process model, for creating valid spatial covariance matrices with non-Euclidean distance metrics. It has an additional advantage of fast computation for large datasets. 4. I re-analysed the data of Ladle et al. (2017b, Methods in Ecology and Evolution, 8, 329), showing that fitted models that used linear network distance in geostatistical models, both with and without a nugget effect, had negative variances, poor predictive performance compared with reduced-rank methods, and had improper coverage for the prediction intervals. The reduced-rank approach using linear network distances provided a class of permissible models that had better predictive performance and proper coverage for the prediction intervals, and could be combined with Euclidean distance models to provide the best overall predictive performance. KEYWORDS geostatistics, prediction, predictive process models, reduced-rank methods, spatial statistics 1 | INTRODUCTION variances may be negative, and generally leads to unreliable standard errors for prediction. My objective is to help ecologists un- There are now several examples in the ecological literature where, derstand the problem and avoid this mistake. I introduce a class of for spatial prediction like kriging, non-Euclidean distances were models that are easy to construct, based on linear mixed models, used in autocorrelation models developed under a Euclidean that perform well and guarantee that prediction standard errors distance assumption. This leads to a problem where prediction will be positive. Published 2018. This article is a U.S. Government work and is in the public domain in the USA 1600 | wileyonlinelibrary.com/journal/mee3 Methods Ecol Evol. 2018;9:1600–1613. VER HOEF Methods in Ecology and Evoluǎo n | 1601 variance, σ2. In constructing kriging models, practitioners often include 1.1 | A quick review of kriging p a “nugget” effect, which is an independent (uncorrelated) random ef- Kriging is a method for spatial interpolation, beginning as a disci- σ2 fect, ɛi, in Equation 1 with variance 0. The nugget effect is often as- pline of atmospheric sciences in Russia, of geostatistics in France cribed to measurement error, or microscale variation, at a scale finer and appearing in English in the early 1960s (Cressie, 1990; Gandin, than the closest measurements in the dataset. Constructing a full co- 1963; Matheron, 1963). Kriging is attractive because it produces variance matrix for a kriging model generally yields both predictions and prediction standard errors, providing uncer- = C+σ2I =σ2R+σ2I, (2) tainty estimates for the predictions. Predictions and their standard 0 p 0 errors are obtained after first estimating parameters of the kriging σ2 ⩾ 0 σ2 ⩾ 0 where p is called the partial sill, 0 is the nugget effect, model. The ordinary kriging model, which we will feature here, is, and I is the identity matrix (a diagonal matrix of all ones). The total σ2 +σ2 Yi =μ+Zi +εi, (1) variance is p 0. The off-diagonal elements of R are obtained from models that generally decrease as distance increases, with a few where Y is a spatial random variable at location i, i = 1,2,…,n, with i that also oscillate. Several autocorrelation models (Chiles & Delfiner, constant mean μ (the fixed effect), a zero-mean spatially auto- 1999, pp. 80–93), based on Euclidean distance, di,j, between sites i correlated error Z and independent random error ɛ . For the set i i and j, are {Zi;i = 1,…,n}, the spatial distance among locations is used to model ρ (d ) = exp (−d ∕α), autocorrelation among the random errors. Spatial autocorrelation e i,j i,j 3 < is the tendency for spatial variables to co-vary, either in a similar ρs(di,j) = [1−1.5(di,j∕α)+0.5(di,j∕α) ] (di,j α), 2 fashion, or opposite from each other. The most commonly observed ρg(di,j) = exp (−(di,j∕α) ), (3) 2 type of spatial autocorrelation manifests as higher positive correla- ρc(di,j) = 1∕(1+(di,j∕α) ), tion among variables at sites closer together than among those at > ρh(di,j) = (α∕di,j) sin (di,j∕α) (di,j 0)+ (di,j = 0), sites farther apart. These tendencies are captured in autocorrelation and covariance matrices. where distances are scaled by α ⩾ 0, called the range parameter. (a) Let R be an autocorrelation matrix among spatial locations. All of is an indicator function, equal to one if the argument a is true, oth- the diagonal elements of R are ones. The off-diagonal element in the erwise it is zero. ith row and jth column of R is the correlation, which lies between −1 Examples of the autocorrelation models in Equation 3, scaled and 1, between variables at site i and site j. Then a covariance matrix σ2 = 2 σ2 = 1 with a partial sill, p , and a nugget effect, 0 , are shown in C =σ2R p is just a scaled autocorrelation matrix that includes an overall Figure 1a. The exponential model, ρe(di,j), is commonly used, and a (a) (b) G Exponential Gaussian Spherical e Cauchy iance anc r Hole Effect ri 123 234 va va toco Semi 0 Au 01 G –1 0246810 0246810 Distance Distance (c) (d) 0 .0 1. FIGURE 1 Autocorrelation models. α=0.5 (a) Autocovariance functions for various 81 8 0. α=2 0. models, with a partial sill of 2 and a nugget n α=8 n effect of 1. (b) The same models as in (a), .6 .6 except represented as semivariogram 40 40 models. Note that the black dots indicate 0. 0. tocorrelatio tocorrelatio a discontinuity of the fitted model at the Au Au origin due to the nugget effect, where .2 .2 the model “jumps” to the black dots when 00 00 distance is exactly 0. Effect of the range 0. 0. parameter α on the (c) exponential model, 0246810 0246810 and (d) spherical model Distance Distance 1602 | Methods in Ecology and Evoluǎon VER HOEF special case of the Matern model that approaches zero autocor- are given in Chiles and Delfiner (1999, pp. 80–93). Autocorrelation relation asymptotically (Figure 1c). The spherical model, ρs(di,j), also is generally controlled by α, which must be estimated from real is common, attaining exactly zero autocorrelation at α (Figure 1d). data. However, it is useful to vary α through simulated data, and Both the exponential and spherical models decrease rapidly near even for real distance data, to understand its effect on covariance the origin, for short distances, whereas the Gaussian model, ρg(di,j), models, which I do in Figures 1c,d, and also in Figures 2 and 3. decreases more slowly near the origin. The Gaussian model occurs Kriging is often expressed in terms of semivariograms rather as a limiting case for the smoothness parameter of the Matern than autocorrelation models. Semivariograms model the variance model, and creates very smooth spatial surfaces. The Cauchy of the difference among variables. If Yi and Yj are random variables model, ρc(di,j), is similar to the Gaussian, but approaches zero auto- at spatial locations i and j, respectively, a semivariogram is defined 2 correlation very slowly. Finally, the hole effect model, ρh(di,j), allows as γ(di,j) ≡ E(Yi−Yj) /2, where E is expectation. All of the models in for negative autocorrelation in a dampened oscillating manner. Equation 3 can be written as semivariograms, These models highlight different features of autocorrelation mod- γ (d ) =σ2(1−ρ (d )), (4) els, and they will be used throughout this paper. Many more models m i,j p m i,j (a) (b) 0 G 1. G Exponential G Gaussian G e Spherical alu .5 Cauchy v G Hole Effect G G 00 0. G Minimum Eigen G G G −0.5 0.00.5 1.01.5 2.02.5 3.0 Alpha (c) (d) 0 G G G G G G G GG G G G G G G G G G G G G G G GG G G G G G G G 1. G G G G G G G G G G G G G G G G e 5 0. G G G G G G G G alu v G G G G 0 0. G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G Minimum Eigen −0.5 G G G G G G G G G G G G G G G G G G G G G G −1.0 FIGURE 2 Cautionary examples.

Kriging Models for Linear Networks and Non‐Euclidean Distances

Lecture 13: Simple Linear Regression in Matrix Format

Introducing the Game Design Matrix: a Step-By-Step Process for Creating Serious Games

An Introduction to Spatial Autocorrelation and Kriging

Stat 5102 Notes: Regression

Spatial Autocorrelation: Covariance and Semivariance Semivariance

Uncertainty of the Design and Covariance Matrices in Linear Statistical Model*

The Concept of a Generalized Inverse for Matrices Was Introduced by Moore(1920)

Augmenting Geostatistics with Matrix Factorization: a Case Study for House Price Estimation

Geostatistical Approach for Spatial Interpolation of Meteorological Data

Comparison of Spatial Interpolation Methods for the Estimation of Air Quality Data

Linear Regression with Shuffled Data: Statistical and Computational

Week 7: Multiple Regression