Matérn Cross-Covariance Functions for Multivariate Random Fields
Tilmann GNEITING, William KLEIBER, and Martin SCHLATHER
We introduce a flexible parametric family of matrix-valued covariance functions for multivariate spatial random fields, where each con- stituent component is a Matérn process. The model parameters are interpretable in terms of process variance, smoothness, correlation length, and colocated correlation coefficients, which can be positive or negative. Both the marginal and the cross-covariance functions are of the Matérn type. In a data example on error fields for numerical predictions of surface pressure and temperature over the North American Pacific Northwest, we compare the bivariate Matérn model to the traditional linear model of coregionalization. KEY WORDS: Co-kriging; Convolution; Cross-correlation; Gaussian spatial random field; Matérn class; Maximum likelihood; Multi- variate geostatistics; Numerical weather prediction; Positive definite.
1. INTRODUCTION which depends on the spatial separation vector, h ∈ Rd, only. The covariance function is isotropic if C(h ) = C(h ) when- Spatial stochastic process models are of ever-increasing im- 1 2 ever h =h , where ·is the Euclidean norm. Clearly, portance in a wealth of applications, ranging from the geo- 1 2 sciences and atmospheric sciences to environmental monitor- stationarity along with isotropy are not always realistic as- ing, economics, and other areas. The goals are as varied as sumptions; however, these models are the basic building blocks the fields in which spatial models are applied. The modeler of more sophisticated, anisotropic, and nonstationary models. might want to gain scientific insight in the processes studied, in A class of isotropic covariance functions that has received a which case the interpretability of the parameters is of great im- great deal of attention recently is the Matérn family (Matérn portance. Environmental risk analysis may require spatial sam- 1986; Guttorp and Gneiting 2006), which specifies the covari- 2 2 pling, which depends on the use of a model that allows for fast ance function as σ M(h|ν,a) where σ > 0 is the marginal simulation. Shared amongst very many applications is the need variance and for spatial interpolation to sites where no observations are avail- 1−ν 2 ν able. Whatever the goal of the study, it is frequently essential M(h|ν,a) = (ah) Kν(ah) (1) (ν) that several spatial variables be modeled simultaneously. The critical step then is to identify a suitable spatial dependence is the spatial correlation at distance h.HereKν is a mod- structure, not only within each variable, but between variables ified Bessel function of the second kind and a > 0 is a spatial as well. Here we focus on geostatistical formulations that rely scale parameter, whose inverse, 1/a, is sometimes referred to as on the fitting of covariance and cross-covariance structures for a correlation length. The smoothness parameter ν>0 defines Gaussian random fields on a Euclidean space, Rd. In this con- the Hausdorff dimension and the differentiability of the sample text, spatial interpolation is often referred to as co-kriging. Ex- 1 paths. If ν equals an integer plus 2 , the Matérn function reduces cellent expositions of geostatistical approaches to the model- to the product of an exponential function and a polynomial, in ing of spatial data include the monographs by Cressie (1993), that Goovaerts (1997), Chilès and Delfiner (1999), Stein (1999), n 1 (n + k)! n − Wackernagel (2003), and Banerjee, Carlin, and Gelfand (2004). M h n + , a = exp(−ah) (2ah)n k Gaussian Markov random fields provide an alternative that is 2 (2n)! k k=0 particularly suited to lattice data (Besag 1974; Mardia 1988; Rue and Held 2005; Sain and Cressie 2007). for n = 0, 1,... by Equation 8.468 of Gradshteyn and Ryzhik Suppose, for the moment, that we have a single spatial vari- (2000). This nests the popular exponential model that arises as | 1 = − able, and our univariate data arise from a spatial random field M(h 2 , a) exp( a h ). {Y(s) : s ∈ Rd}, which we assume to be Gaussian and second- For a positive integer k, the sample paths of a Gaussian order stationary with mean zero. Hence, the process admits a Matérn field are k times differentiable if and only if ν> covariance function, k (Handcock and Stein 1993; Banerjee and Gelfand 2003). C(h) = E(Y(s + h)Y(s)), A complementary description of the smoothness of a Matérn field is via the Hausdorff or fractal dimension of a sample path in Rd, which equals the maximum of d and d + 1 − ν (Adler Tilmann Gneiting is Professor, Institut für Angewandte Mathematik, Univer- 1981; Goff and Jordan 1988). For a differentiable field with sität Heidelberg, Germany. William Kleiber is Graduate Student, Department of Statistics, University of Washington (E-mail: [email protected]). smoothness parameter ν>1, the Hausdorff dimension of a Martin Schlather is Professor, Institut für Mathematische Stochastik, Univer- sample path equals its topological dimension, d. Generally, the sität Göttingen, Germany. We are grateful to Jeff Baars, Clifford F. Mass, larger ν, the smoother the process. Adrian E. Raftery, the associate editor, and the referees for comments and discussions and/or providing data. This research was supported by the Alfried Krupp von Bohlen und Halbach Foundation, the National Science Foundation under awards ATM-0724721, DMS-0706745, and the VIGRE program, and by © 2010 American Statistical Association the Joint Ensemble Forecasting System (JEFS) under subcontract S06-47225 Journal of the American Statistical Association from the University Corporation for Atmospheric Research (UCAR). The re- September 2010, Vol. 105, No. 491, Theory and Methods search of Martin Schlather was supported by DFG grant FOR 916. DOI: 10.1198/jasa.2010.tm09420
1167 1168 Journal of the American Statistical Association, September 2010
Moving to a multivariate spatial process, let Y(s) = (Y1(s), on the scale and cross-covariance smoothness parameters. Sec- d ...,Yp(s)) , so that at each location, s ∈ R , there are p con- tion 3 turns to a data study on error fields for numerical pre- stituent components. We assume that the process is multivari- dictions of surface temperature and pressure over the Pacific ate Gaussian and second-order stationary with mean vector zero Northwest. For pressure, error fields are known to be smooth, and matrix-valued covariance function while for temperature they are rough, with a strongly negative ⎛ ⎞ colocated correlation coefficient, resulting in a situation that is C (h) ··· C (h) 11 1p difficult to accommodate with extant models. The paper closes ⎜ . . . ⎟ C(h) = ⎝ . .. . ⎠ , (2) in Section 4 with a discussion and hints at extensions. C (h) ··· C (h) p1 pp 2. THE MULTIVARIATE MATÉRN MODEL where each C (h) = E(Y (s + h)Y (s)) is a univariate covari- ii i i As noted, we discuss conditions on the multivariate Matérn ance function, while Cij(h) = E(Yi(s + h)Yj(s)) is the cross- ≤ = ≤ model that result in a valid specification for the cross-covariance covariance function between process components 1 i j p. functions in (4), ranging from a parsimonious model in a lower It is difficult to specify nontrivial, valid parametric models for dimensional parameter space to a full bivariate model with the cross-covariance functions, because of the notorious require- most flexibility. ment of nonnegative definiteness. Specifically, if is the co- variance matrix of the random vector (Y(s1) ,...,Y(sn) ) , then 2.1 Parsimonious Multivariate Matérn Model for any vector a ∈ Rnp, the fitted covariance model must sat- We first consider a parsimonious multivariate Matérn model isfy aa ≥ 0. The prevalent way of guaranteeing this is to with marginal and cross-covariance functions that are of the use the linear model of coregionalization (LMC), where each form (3) and (4), with natural restrictions on the parameter component is represented as a linear combination of latent, space. independent univariate spatial processes (Goulard and Voltz 1992; Wackernagel 2003). For the LMC, the smoothness of Theorem 1. Let d ≥ 1 and p ≥ 2. Suppose that any component defaults to that of the roughest latent process, 1 νij = (νi + νj) for 1 ≤ i = j ≤ p, and thus the standard approach does not admit individually dis- 2 tinct smoothness properties, unless structural zeros are imposed and suppose that there is a common scale parameter, in that on the latent process coefficients. More generally, the lack of there exists an a > 0 such that flexibility of the LMC and related models for multivariate spa- a1 =···=ap = a and aij = a for 1 ≤ i = j ≤ p. tial random fields has been noted by various authors, including Cressie (1993, p. 141), Goovaerts (1997, p. 123), and recently Then the multivariate Matérn model provides a valid second- Rd Jun, Knutti, and Nychka (2008, p. 945) who study the spatial order structure in if d 1/2 d 1/2 structure of climate model biases individually, using Matérn (νi + ) (νj + ) ρ = β 2 2 ij ij 1/2 1/2 covariances, rather than jointly, because “we are not aware of (νi) (νj) flexible cross-covariance models that would be suitable.” 1 ( (νi + νj)) These challenges motivate the constructions hereinafter, × 2 for 1 ≤ i = j ≤ p, 1 + + d which preserve the Matérn covariance function marginally, (2 (νi νj) 2 ) while allowing nontrivial, flexible cross-covariance structures. where the matrix (β )p with diagonal elements β = 1for Specifically, in our multivariate Matérn model, each marginal ij i,j=1 ii i = 1,...,p and off-diagonal elements β for 1 ≤ i = j ≤ p is covariance function, ij symmetric and nonnegative definite. = 2 | = Cii(h) σi M(h νi, ai) for i 1,...,p, (3) The requirement that the components share the same scale is of the Matérn type with variance parameter σ 2 > 0, smooth- parameter, a, may seem overly restrictive at first. However, i Zhang (2004) showed that the parameters σ 2 and a of a Matérn ness parameter νi > 0 and scale parameter ai > 0. Each cross- covariance function, covariance function (with the smoothness parameter, ν, fixed) in dimension d ≤ 3 cannot be consistently estimated under infill 2 2ν Cij(h) = Cji(h) = ρijσiσjM(h|νij, aij) for 1 ≤ i = j ≤ p, (4) asymptotics, while the composite quantity σ a can be con- sistently estimated. Du, Zhang, and Mandrekar (2009) study is also a Matérn function, with colocated correlation coefficient the corresponding asymptotic distribution, and extend results of ρij, smoothness parameter νij, and scale parameter aij.Thekey = 1 2 Ying (1991), who showed that if ν 2 then either a or σ can question then is as to the values of νij, aij, and ρij that result in be fixed arbitrarily and the composite quantity can still be es- a valid, that is, a nonnegative definite, second-order structure. timated consistently and efficiently. Hence, the assumption of We provide necessary and sufficient conditions in the bivariate a common scale parameter is not necessarily restrictive, simi- case, and introduce a parsimonious formulation for any number larly to the reasoning in Zhang and Wang (2010). Finally, the of variables. inclusion of the βij parameter in the colocated correlation coef- The remainder of the paper is organized as follows. In Sec- ficient ρij, where 1 ≤ i = j ≤ p, imparts a significant amount of tion 2 we discuss conditions on the parameter space that re- flexibility on this class, ranging from independent components sult in a valid multivariate Matérn model. In the bivariate case when the βij vanish to strongly correlated components when all (p = 2), we achieve a full characterization, whereas for p > 2 βij = 1, or anticorrelated components when some or all of the we suggest a parsimonious model which imposes restrictions βij are negative. Of course, it is possible to reduce the number Gneiting, Kleiber, and Schlather: Matérn Cross-Covariance Functions 1169 of parameters, by imposing structure on the βij’s, as directed by parsimonious bivariate Matérn model thus is given by substantive expertise and the application at hand. C (h) = σ 2M(h|ν , a), C (h) = σ 2M(h|ν , a), (7) The proof of Theorem 1 is constructive, and we present here 11 1 1 22 2 2 and the key ideas, with routine calculations deferred to the Appen- dix. The crux of the argument is contained in the following = = | 1 + C12(h) C21(h) ρ12σ1σ2M h 2 (ν1 ν2), a , (8) statement, which is a variant of a result of Gaspari et al. (2006, p. 1820) and theorem 1 of Majumdar and Gelfand (2007), and where d 1/2 d 1/2 1 is also proved in the Appendix. The respective multivariate (ν1 + ) (ν2 + ) ( (ν1 + ν2)) |ρ |≤ 2 2 2 . (9) Gaussian random field allows a representation as a process con- 12 (ν )1/2 (ν )1/2 1 + + d 1 2 (2 (ν1 ν2) 2 ) volution, with distinct kernel functions, c1,...,cp, relative to a common white noise process. By Theorem 3 below, this condition is not just sufficient, but necessary as well. In the case p = 2 and d = 2 of a bivariate Theorem 2. Suppose that c1,...,cp are real-valued functions spatial random field on the Euclidean plane, the inequality in on Rd which are both integrable and square-integrable. If (9) simplifies to C (h) = (c c )(h) for i, j = 1,...,p, (5) (ν ν )1/2 ij i j |ρ |≤ 1 2 . (10) 12 1 + where the asterisk denotes the convolution operator, the matrix- 2 (ν1 ν2) valued function defined by (2) is a multivariate covariance func- Thus, the upper limit on the colocated correlation coefficient tion on Rd. equals the ratio of the harmonic mean and the arithmetic mean The convolution approach to the construction of multivari- of the marginal smoothness parameters. ate covariance functions has also been studied by Ver Hoef and Figure 1 shows a simulated realization of a bivariate Matérn field, where the first component is smooth (ν = 3 ), while the Barry (1998), Gaspari and Cohn (1999), Oliver (2003), and Ver 2 = 1 Hoef, Cressie, and Barry (2004), among other authors. In case second component is rough (ν 2 ). Simulation code will be made available in an upcoming release of the R package of the ci’s being suitably normalized Matérn functions with common scale a > 0 and smoothness νi − d , where we require RANDOMFIELDS (Schlather 2005). Among other methods, 2 4 RANDOMFIELDS employs the circulant embedding technique νi > 0 to ensure integrability and square-integrability, we get of Chan and Wood (1999), which relies on the fast Fourier d 1/2 d 1/2 (νi + ) (νj + ) transform and is both fast and exact. C (h) = 2 2 ij 1/2 1/2 (νi) (νj) 2.2 Full Bivariate Matérn Model 1 + (2 (νi νj)) 1 We now take a closer look at the case p = 2 of a bivariate ran- × M h (νi + νj), a . (6) 1 + + d 2 dom field on the Euclidean domain Rd, where we expand the (2 (νi νj) 2 ) model to include distinct scale and cross-covariance smooth- In other words, the Matérn family is closed under convolution. ness parameters. The full bivariate Matérn model thus becomes This yields Theorem 1 in the special case in which βij = 1for = 2 | = 2 | i ≤ i, j ≤ p. The general result then is immediate from Schur’s C11(h) σ1 M(h ν1, a1), C22(h) σ2 M(h ν2, a2), (11) Theorem (Horn and Johnson 1985, p. 455). and Theorem 1 provides flexible cross-covariance structures for = = | multivariate random fields with any number, p, of process com- C12(h) C21(h) ρ12σ1σ2M(h ν12, a12). (12) ponents. In the case p = 2 of two components only, the para- By Cramér’s Theorem (Cramér 1940) in its spectral den- meter β12 = β21 can take any value between −1 and 1. The sity version (Yaglom 1987, p. 315; Chilès and Delfiner 1999,
2 = 2 = Figure 1. Simulated realization from the parsimonious bivariate Matérn model with variance parameters σ1 σ2 1, smoothness parameters ν1 = 1.5andν2 = 0.5, common scale parameter a = 1, and colocated correlation coefficient ρ12 = 0.5. 1170 Journal of the American Statistical Association, September 2010
= 1 + ≥ p. 326; Wackernagel 2003, p. 152), a bivariate covariance model (d) If ν12 2 (ν1 ν2) and a12 max(a1, a2) the full bivari- is valid if and only if its spectral density matrix is nonnega- ate Matérn model is valid if and only if tive definite at almost all spatial frequencies. Cramér’s char- ν ν d 1/2 d 1/2 a 1 a 2 (ν1 + ) (ν2 + ) acterization leads to Theorem 3, where parts (a) through (e) |ρ |≤ 1 2 2 2 12 1/2 1/2 provide detail in an exhaustive listing of special cases. In the a12 a12 (ν1) (ν2) case a = a = a = a of equal scale parameters, the bounds 1 1 2 12 ( (ν1 + ν2)) × 2 . (16) in (14) and (16) reduce to (9), as noted above. The special case 1 + + d = = = 1 (2 (ν1 ν2) 2 ) in which ν1 ν2 ν12 2 has been studied by Chilès and Delfiner (1999, p. 330), Vargas-Guzman, Warrick, and Myers 1 + (e) If ν12 > 2 (ν1 ν2) the infimum in (13) is attained either (2002), and Wackernagel (2003, exercise 21.2). if t = 0, or in the limit as t →∞,orift2 = u where u is Theorem 3. The full bivariate Matérn model is valid if and any positive root of only if ν + ν ν d ν − 1 2 u2 + ν − 2 + a2 + d + d 2 12 2 12 2 4 1 2 (ν1 2 ) (ν2 2 ) (ν12) ρ ≤ 12 (ν ) (ν ) + d 2 1 2 (ν12 2 ) + + − ν1 + d 2 − ν1 ν2 + d 2 2ν 2ν + ν12 a a u a 1 a 2 (a2 + t2)2ν12 d 2 4 2 2 2 12 × 1 2 12 inf + + . (13) a4ν12 t≥0 (a2 + t2)ν1 (d/2)(a2 + t2)ν2 (d/2) 12 1 2 d 2 2 ν2 d 2 2 + ν12 + a a − + a a In particular, this can be written as one of the following cases: 2 1 2 2 4 1 12 1 + ν d (a) If ν12 < 2 (ν1 ν2) the full bivariate Matérn model is − 1 + 2 2 = a2a12 0. (17) valid if and only if ρ12 = 0. 2 4 (b) If ν = 1 (ν + ν ) and a ≤ min(a , a ) the full bivari- 12 2 1 2 12 1 2 The proof of Theorem 3 is deferred to the Appendix.By ate Matérn model is valid if and only if part (a), the cross-covariance smoothness parameter, ν ,must 12 2 d/2 + d 1/2 + d 1/2 be greater than or equal to the arithmetic mean of the mar- a12 (ν1 2 ) (ν2 2 ) |ρ12|≤ ginal smoothness parameters, ν and ν , unless the components a a (ν )1/2 (ν )1/2 1 2 1 2 1 2 are uncorrelated. This is best explained in the spectral domain, 1 + in that Cramér’s Theorem forces the Fourier transform of the (2 (ν1 ν2)) × . (14) cross-covariance function to decay at least as quickly as the (1 (ν + ν ) + d ) 2 1 2 2 geometric average of the spectral densities of the marginal co- = 1 + variance functions. Abelian and Tauberian theorems serve to (c) If ν12 2 (ν1 ν2) and min(a1, a2)