Journal of Applied Statistics: Analysis of Local Influence in Geostatistics Using Student's t-Distribution

Publisher: Taylor & Francis

Journal of Applied Statistics
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/cjas20

Analysis of local influence in geostatistics using Student's t-distribution
R.A.B. Assumpção (a), M.A. Uribe-Opazo (b) & M. Galea (c)
(a) Colegiado de Matemática, Universidade Tecnológica Federal do Paraná, Rua Cristo Rei, 19, Vila Becker, 85902-490 Toledo, PR, Brazil
(b) Centro de Ciências Exatas e Tecnológicas, Universidade Estadual do Oeste do Paraná, Rua Universitária 119, Jardim Universitário, 85814-110 Cascavel, PR, Brazil
(c) Departamento de Estadística, Pontificia Universidad Católica de Chile, Santiago, Chile
Published online: 28 Apr 2014.

To cite this article: R.A.B. Assumpção, M.A. Uribe-Opazo & M. Galea (2014): Analysis of local influence in geostatistics using Student's t-distribution, Journal of Applied Statistics, DOI: 10.1080/02664763.2014.909793 To link to this article: http://dx.doi.org/10.1080/02664763.2014.909793


Analysis of local influence in geostatistics using Student's t-distribution

R.A.B. Assumpção (a,*), M.A. Uribe-Opazo (b) and M. Galea (c)

(a) Colegiado de Matemática, Universidade Tecnológica Federal do Paraná, Rua Cristo Rei, 19, Vila Becker, 85902-490 Toledo, PR, Brazil; (b) Centro de Ciências Exatas e Tecnológicas, Universidade Estadual do Oeste do Paraná, Rua Universitária 119, Jardim Universitário, 85814-110 Cascavel, PR, Brazil; (c) Departamento de Estadística, Pontificia Universidad Católica de Chile, Santiago, Chile

(Received 23 September 2013; accepted 26 March 2014)

This article aims to estimate parameters of spatial variability with Student’s t-distribution by the EM algorithm and present the study of local influence by means of two methods known as likelihood displacement and Q-displacement of likelihood, both using Student’s t-distribution with fixed degrees of freedom (ν). The results showed that both methods are effective in the identification of influential points.

Keywords: spatial variability; Q-displacement of likelihood; EM algorithm; diagnostics

1. Introduction

Geostatistics is a method of analysis that models the spatial variability of georeferenced variables by estimating the parameters that define the structure of spatial dependence. These parameters are then used to interpolate values at unsampled locations by kriging. For this interpolation to be reliable and to represent the real local variability, the modeling process has to be performed with caution, especially in the presence of outliers or influential points, because observations identified as influential in a data set produce disproportionate changes in the parameter estimates, in the covariance matrix and in the resulting thematic maps.

In the presence of influential points, Cysneiros et al. [7] suggested, as an alternative, models in the class of symmetric distributions such as Student's t-distribution, which accommodate such points better and therefore reduce their influence by incorporating additional parameters that adjust the kurtosis of the data. Fang et al. [11] presented a theoretical development of Student's t-distribution, motivated by the fact that it is a robust alternative to the normal distribution. Lange and Sinsheimer [13]

∗Corresponding author. Email: [email protected]

© 2014 Taylor & Francis

presented the family of normal/independent distributions, including the multivariate Student's t-distribution, and used maximum likelihood and the expectation-maximization (EM) algorithm to estimate the parameters in robust regression. Liu and Rubin [14,15] described the EM, Expectation/Conditional Maximization and Expectation/Conditional Maximization Either algorithms, showing their computational efficiency in the maximum likelihood estimation of the parameters of the multivariate Student's t-distribution with known and unknown degrees of freedom, with or without missing data, and with or without covariates.

Although Student's t-distribution has heavier tails than the normal distribution and better accommodates aberrant observations, it can still be affected by influential observations. Therefore, it is important to study its sensitivity through diagnostic analysis. Diagnostic analysis is a technique used to evaluate the quality of the fitted model by checking the assumptions made and by assessing the robustness of the estimates when perturbations are introduced in the model itself or in the data [12]. Within diagnostic analysis, the analysis of local influence studies the effect of introducing small perturbations in the model (or data) using a suitable measure of influence. This methodology was originally developed by Cook [9] and has become a popular diagnostic tool. Zhu and Lee [28] proposed a method to assess local influence with incomplete data using the EM algorithm. Wei et al. [25] presented a technique for the analysis of influence based on mixtures of distributions and the EM algorithm, considering the multivariate Student's t-distribution as a particular case of the Gaussian mixture.

The goal of this paper is to describe two methods for the diagnostic analysis of local influence in geostatistics: the first one, introduced by Cook [9], uses the log-likelihood function, and the second, described by Zhu and Lee [28], uses the expectation of the log-likelihood function. We consider spatial linear models with the n-variate Student's t-distribution with fixed degrees of freedom.

2. Student's t spatial linear model

Consider a stationary stochastic process $\{Y(s), s \in S\}$, where $S \subset \mathbb{R}^2$ and $\mathbb{R}^2$ is the two-dimensional Euclidean space. Let $Y = (Y(s_1), \ldots, Y(s_n))^T$ have an n-variate Student's t-distribution with fixed degrees of freedom $\nu$, $n \times 1$ location vector $\mu$ and $n \times n$ scale matrix $\Sigma$, that is, $Y \sim t_n(\mu, \Sigma, \nu)$. For every $Y = (Y(s_1), \ldots, Y(s_n))^T$, the corresponding spatial locations $s_i$ and $s_j$, $i, j = 1, \ldots, n$, are known, so that

\[ Y(s_i) = \mu(s_i) + \varepsilon(s_i), \quad i = 1, \ldots, n, \tag{1} \]

where $\mu(s_i)$ is a deterministic term and $\varepsilon(s_i)$ is a stochastic term, both dependent on the parameter space where $Y(s_i)$ operates. We assume that the stochastic error has $E[\varepsilon(s_i)] = 0$ for $i = 1, \ldots, n$, and that the variation between points in space is determined by some covariance function $C(s_i, s_j) = \mathrm{Cov}(\varepsilon(s_i), \varepsilon(s_j))$.

The covariance function is specified by a vector of parameters $\phi = (\phi_1, \phi_2, \phi_3)^T$, and these parameters define the structure of spatial dependence. Given known functions of $s_i$, $x_1(s_i), \ldots, x_p(s_i)$, the mean of the stochastic process is

\[ \mu(s_i) = \sum_{j=1}^{p} \beta_j x_j(s_i), \quad i = 1, \ldots, n, \tag{2} \]

where $\beta_1, \ldots, \beta_p$ are unknown parameters to be estimated.

The $n \times 1$ vector of random errors $\varepsilon$ has expectation $E(\varepsilon) = 0$, a vector of zeros, i.e. $0 = (0, \ldots, 0)^T$, and its scale matrix is $\Sigma = [(\sigma_{ij})]$, where $\sigma_{ij} = C(s_i, s_j)$ and $C(s_i, s_j) = \mathrm{Cov}(\varepsilon(s_i), \varepsilon(s_j))$. Assuming that $\Sigma$ is non-singular and that the $n \times p$ matrix $X$ has full rank ($\mathrm{rank}(X) = p$), the scale matrix takes the spatial structure

\[ \Sigma = [(\sigma_{ij})] = \phi_1 I_n + \phi_2 R, \tag{3} \]

where $I_n$ is the $n \times n$ identity matrix, $\phi_1 \ge 0$ is the nugget effect, $\phi_2 \ge 0$ is the contribution or sill, $R = R(\phi_3)$ is a symmetric $n \times n$ matrix, and the range ($a$) is a function of $\phi_3 > 0$, i.e. $a = g(\phi_3)$. Thus, the elements $i, j$ of the matrix $\Sigma$ are $C(s_i, s_j) = C(h_{ij}) = \phi_2 r_{ij}$, where $h_{ij} = \|s_i - s_j\|$, and the elements $r_{ij}$ of the matrix $R$ have the form $r_{ij} = (1/\phi_2)\, C(h_{ij})$ with $\phi_2 > 0$ for $i \ne j$, $r_{ij} = 1$ for $i = j = 1, \ldots, n$, and $r_{ij} = 0$ if $\phi_2 = 0$. For the Matérn family model, the covariance function is given by the following equation:

\[ C(h_{ij}) = \frac{\phi_2}{2^{\kappa-1}\Gamma(\kappa)} \left(\frac{h_{ij}}{\phi_3}\right)^{\kappa} K_\kappa\!\left(\frac{h_{ij}}{\phi_3}\right), \; h_{ij} > 0, \quad \text{and} \quad C(h_{ij}) = \phi_1 + \phi_2 \; \text{if } h_{ij} = 0, \tag{4} \]

where $K_\kappa(u) = \frac{1}{2}\int_0^\infty x^{\kappa-1} e^{-(1/2)u(x + x^{-1})}\,dx$ is the modified Bessel function of the third kind of order $\kappa$. This function is valid for $\phi_3, \kappa > 0$ fixed. In the Matérn family, the parameter $\kappa$, known as the order, is a shape parameter that determines the analytical smoothness of the underlying process $Y(s)$. The Gaussian covariance function is the special case $\kappa \to \infty$, and the exponential covariance function corresponds to $\kappa = 1/2$.

We assume Student's t-distribution with fixed degrees of freedom ($\nu$) for the residuals, i.e. $Y \sim t_n(X\beta, \Sigma, \nu)$. The log-likelihood function of $Y$ is given by

\[ l(\theta) = \log\left[\frac{\Gamma((\nu+n)/2)}{(\pi\nu)^{n/2}\,\Gamma(\nu/2)}\right] - \frac{1}{2}\log|\Sigma| + \log g, \tag{5} \]

where $\theta = (\beta^T, \phi^T)^T$, $g = (1 + \delta/\nu)^{-(\nu+n)/2}$ and $\delta = (Y - X\beta)^T \Sigma^{-1} (Y - X\beta)$.

Let $Y_c = (Y, U)$ be the vector of complete data, where $Y$ is the vector of observed data and $U$ the unobserved data. The conditional distribution of $Y$ given $U = u$ is n-variate normal with location vector $\mu$ and covariance matrix $u^{-1}\Sigma$, that is, $(Y \mid U = u) \sim N_n(X\beta, u^{-1}\Sigma)$. The joint density function of the complete data is

\[ f_{Y,U}(y, u) = f_{Y|U}(y \mid u) \cdot h(u), \]

with $h(u)$ denoting the density function of the random variable $U$. The complete log-likelihood function is

\[ \log f_{Y,U}(y, u) = \log f_{Y|U}(y \mid u) + \log h(u), \]

with $\theta = (\beta^T, \phi^T)^T$, where $\beta = (\beta_1, \ldots, \beta_p)^T$ and $\phi = (\phi_1, \phi_2, \phi_3)^T$. The conditional density function is

\[ f_{Y|U}(y \mid u) = \frac{1}{(2\pi)^{n/2}\, u^{-n/2}\, |\Sigma|^{1/2}}\, e^{-(1/2) u \delta}, \]

so that

\[ \log f_{Y|U}(y \mid u) = -\frac{n}{2}\log(2\pi) + \frac{n}{2}\log(u) - \frac{1}{2}\log|\Sigma| - \frac{u}{2}\delta. \]

Then

\[ E\{\log f_{Y|U}(y \mid u) \mid Y, \hat\theta\} = -\frac{n}{2}\log(2\pi) + \frac{n}{2} E\{\log(U) \mid Y, \hat\theta\} - \frac{1}{2}\log|\Sigma| - \frac{\delta}{2} E\{U \mid Y, \hat\theta\}. \]

The density of $U$ is

\[ h(u) = \frac{(\nu/2)^{\nu/2}\, u^{\nu/2 - 1}}{\Gamma(\nu/2)}\, e^{-u\nu/2}, \]

that is, $U$ has the gamma distribution $G(\nu/2, \nu/2)$. The distribution $t_n(\mu, \Sigma, \nu)$ is then obtained, with density function

\[ f_Y(y) = \frac{\Gamma((\nu+n)/2)\, |\Sigma|^{-1/2}}{\Gamma(\nu/2)\, (\nu\pi)^{n/2}} \left(1 + \frac{\delta}{\nu}\right)^{-(1/2)(\nu+n)}. \]

In this case, $(U \mid Y = y) \sim \chi^2_{\nu+n}/(\nu + \delta)$. Thus, $E\{U \mid Y = y\} = (\nu+n)/(\nu+\delta)$ and $E\{\log(U) \mid Y = y\} = \psi((\nu+n)/2) - \log((\delta+\nu)/2)$, where $\psi$ is the digamma function. The log-likelihood for the complete data is

\[ l_c(\theta \mid Y_c) = -\frac{n}{2}\log(2\pi) + \frac{n}{2}\log(u) - \frac{1}{2}\log|\Sigma| - \frac{u}{2}\delta + \frac{\nu}{2}\log\frac{\nu}{2} + \left(\frac{\nu}{2} - 1\right)\log(u) - \log\Gamma\!\left(\frac{\nu}{2}\right) - \frac{\nu}{2}u. \]

Thus, the conditional expectation of the complete-data log-likelihood, $Q(\theta \mid \hat\theta) = E\{l_c(\theta \mid Y_c) \mid Y, \hat\theta\}$, is

\[ Q(\theta \mid \hat\theta) = -\frac{n}{2}\log(2\pi) + \frac{n}{2}c - \frac{1}{2}\log|\Sigma| - \frac{\delta}{2}\vartheta + \frac{\nu}{2}\log\frac{\nu}{2} + \left(\frac{\nu}{2} - 1\right)c - \log\Gamma\!\left(\frac{\nu}{2}\right) - \frac{\nu}{2}\vartheta, \tag{6} \]

where

\[ \vartheta = E\{U \mid Y, \hat\theta\} = \frac{\nu+n}{\nu+\hat\delta} \quad \text{and} \quad c = E\{\log(U) \mid Y, \hat\theta\} = \psi\!\left(\frac{\nu+n}{2}\right) - \log\!\left(\frac{\hat\delta+\nu}{2}\right). \]
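The quantities of this section can be illustrated with a minimal sketch in plain Python. It assumes the exponential member of the Matérn family (κ = 1/2), so that the off-diagonal entries of Σ are φ2·exp(−h/φ3); the function names and the Cholesky-based helper are illustrative choices, not part of the original article.

```python
import math

def exp_scale_matrix(coords, phi1, phi2, phi3):
    """Sigma = phi1*I + phi2*R with exponential correlation r_ij = exp(-h_ij/phi3)."""
    n = len(coords)
    sigma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            h = math.dist(coords[i], coords[j])
            sigma[i][j] = phi2 * math.exp(-h / phi3) + (phi1 if i == j else 0.0)
    return sigma

def cholesky(a):
    """Lower-triangular factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def delta_and_logdet(sigma, r):
    """delta = r' Sigma^{-1} r and log|Sigma|, both from one Cholesky factor."""
    low = cholesky(sigma)
    z = []  # forward substitution: low z = r, hence delta = z'z
    for i in range(len(r)):
        z.append((r[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    delta = sum(v * v for v in z)
    logdet = 2.0 * sum(math.log(low[i][i]) for i in range(len(r)))
    return delta, logdet

def t_loglik(r, sigma, nu):
    """Log-likelihood of Equation (5) for residuals r = Y - X beta."""
    n = len(r)
    delta, logdet = delta_and_logdet(sigma, r)
    const = (math.lgamma((nu + n) / 2) - (n / 2) * math.log(math.pi * nu)
             - math.lgamma(nu / 2))
    return const - 0.5 * logdet - 0.5 * (nu + n) * math.log1p(delta / nu)

def e_step_weight(r, sigma, nu):
    """Conditional expectation E{U | Y, theta} = (nu + n) / (nu + delta)."""
    delta, _ = delta_and_logdet(sigma, r)
    return (nu + len(r)) / (nu + delta)
```

The weight returned by `e_step_weight` shrinks as the Mahalanobis-type distance δ grows, which is the mechanism by which the t model downweights aberrant data.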

3. EM algorithm

The EM algorithm [10] is an iterative computational method used, among other applications, to approximate maximum likelihood estimates in problems with incomplete data, where the application of other computational methods may be more complicated. Denote by $\hat\theta^{(k)} = (\hat\beta^{(k)T}, \hat\phi^{(k)T})^T$ the estimate of $\theta$ at the k-th iteration of the EM algorithm, which comprises the following steps:

E-step: given $\theta = \hat\theta^{(k)}$, calculate $Q(\theta \mid \hat\theta^{(k)})$.

M-step: update $\hat\theta^{(k+1)}$ by maximizing $Q(\theta \mid \hat\theta^{(k)})$ in $\theta$, which leads to the solutions of the following equations:

\[ \hat\beta = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} Y, \tag{7} \]

\[ \mathrm{tr}\left\{\Sigma^{-1}\frac{\partial\Sigma}{\partial\phi_j}\right\} = \vartheta\, r^T \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} \Sigma^{-1} r, \tag{8} \]

where $r = Y - X\beta$ and $j = 1, 2, 3$. The sequence obtained from the iterations of the EM algorithm converges to the maximum likelihood estimate $\hat\theta$.
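As a hedged illustration of the E- and M-steps, the sketch below treats a deliberately simplified case that is not the full model of the article: the scale matrix is assumed to be Σ = φ2·R with R known (no nugget, fixed range) and β held fixed, so the residual quadratic form s = rᵀR⁻¹r is a constant. Under these assumptions Equation (8) reduces to the closed-form update φ2 = ϑ·s/n. The function name `em_scale` is hypothetical.

```python
def em_scale(s, n, nu, phi2=1.0, tol=1e-12, max_iter=1000):
    """EM iterations for phi2 when Sigma = phi2 * R with R known.

    s is the fixed quadratic form r' R^{-1} r of the residuals r = Y - X beta.
    Returns the final phi2 and the number of iterations used.
    """
    for iteration in range(1, max_iter + 1):
        delta = s / phi2                   # distance delta at the current phi2
        weight = (nu + n) / (nu + delta)   # E-step: E{U | Y, theta_hat}
        new_phi2 = weight * s / n          # M-step: Equation (8) solved for phi2
        if abs(new_phi2 - phi2) < tol:
            return new_phi2, iteration
        phi2 = new_phi2
    return phi2, max_iter

phi2_hat, n_iter = em_scale(s=8.0, n=4, nu=5)
```

In this reduced setting the fixed point of the iteration is s/n, which can be checked by substituting φ2 = s/n into the update. In the full model φ1 and φ3 enter Σ nonlinearly, so Equation (8) generally has to be solved numerically at each M-step.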

4. Local influence

The local influence method, initially proposed by Cook [9], has become a popular diagnostic tool for jointly identifying influential observations in linear and nonlinear regression, with the great advantage that it can be applied to any parametric model. Regarding diagnostic analysis in linear spatial models, Christensen et al. [4] studied prediction in linear spatial models, applying diagnostic techniques to detect observations that influence the estimation of the covariance matrix under universal kriging, and Christensen et al. [6] studied diagnostic methods based on case deletion in linear models. Borssoi et al. [1,2] studied diagnostic techniques in linear Gaussian spatial models, presenting the plot of the elements $|l_{\max}|$ vs. $i$ (data order) as a local influence diagnostic for detecting influential points in the likelihood displacement. Uribe-Opazo et al. [24] used diagnostic techniques for the linear Gaussian spatial model to evaluate the sensitivity of the maximum likelihood estimators, the covariance function and the linear predictor to small perturbations.

To define the local influence, an additive perturbation of the response variable was considered, $Y_\omega = Y + \omega$, where $\omega = (\omega_1, \ldots, \omega_n)^T$ is an $n \times 1$ perturbation vector, such that $\omega$ can reflect any perturbation scheme. Let $\hat\theta$ and $\hat\theta_\omega$ be the maximum likelihood estimators of $\theta$ under the assumed and the perturbed models, respectively.

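4.1 Local influence on likelihood displacement

Cook [9] suggests studying the behavior of the log-likelihood displacement by comparing $\hat\theta$ and $\hat\theta_\omega$ when $\omega$ varies in $\Omega$, that is, $LD(\omega) = 2[l(\hat\theta) - l(\hat\theta_\omega)]$, $\omega \in \Omega$. The normal curvature in the direction of a vector $l$ (with $\|l\| = 1$) is $C_l = 2\,|l^T \Delta_\omega^T \ddot{L}^{-1} \Delta_\omega l|$, where $\ddot{L}$ is the observed information matrix evaluated at $\theta = \hat\theta$, given by

\[ \ddot{L} = \begin{pmatrix} L_{\beta\beta} & L_{\beta\phi} \\ L_{\phi\beta} & L_{\phi\phi} \end{pmatrix}, \tag{9} \]

such that

\[ L_{\beta\beta} = \frac{\partial^2 l(\theta)}{\partial\beta\,\partial\beta^T} = 2X^T \Sigma^{-1} (W_g \Sigma + 2W_g'\, r r^T)\, \Sigma^{-1} X, \]

$L_{\beta\phi} = \partial^2 l(\theta)/\partial\beta\,\partial\phi^T$, with elements

\[ \frac{\partial^2 l(\theta)}{\partial\beta\,\partial\phi_j} = 2X^T \Sigma^{-1} \left[ W_g'\, r r^T \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} + W_g \frac{\partial\Sigma}{\partial\phi_j} \right] \Sigma^{-1} r \quad \text{for } j = 1, 2, 3; \]

$L_{\phi\beta} = L_{\beta\phi}^T$ and $L_{\phi\phi} = \partial^2 l(\theta)/\partial\phi\,\partial\phi^T$, with elements

\[ \frac{\partial^2 l(\theta)}{\partial\phi_j\,\partial\phi_i} = -\frac{1}{2}\,\mathrm{tr}\left\{ \Sigma^{-1} \left[ \frac{\partial^2\Sigma}{\partial\phi_i\,\partial\phi_j} - \frac{\partial\Sigma}{\partial\phi_i} \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} \right] \right\} + W_g\, r^T \Sigma^{-1} \left[ \frac{\partial\Sigma}{\partial\phi_j} \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_i} + \frac{\partial\Sigma}{\partial\phi_i} \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} - \frac{\partial^2\Sigma}{\partial\phi_i\,\partial\phi_j} \right] \Sigma^{-1} r + W_g'\, r^T \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} \Sigma^{-1} r\; r^T \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_i} \Sigma^{-1} r \]

for $i, j = 1, 2, 3$, with $W_g = -\frac{1}{2}(\nu+n)/(\nu+\delta)$, $W_g' = \frac{1}{2}(\nu+n)/(\nu+\delta)^2$, $\delta = (Y - X\beta)^T \Sigma^{-1} (Y - X\beta)$ and $r = Y - X\beta$.

$\Delta_\omega$ is the $(p+3) \times n$ delta matrix given by $\Delta_\omega = (\Delta_\beta^T, \Delta_\phi^T)^T$, where

\[ \Delta_\beta = \frac{\partial^2 l(\theta \mid \omega)}{\partial\beta\,\partial\omega^T} = -2X^T \Sigma^{-1} \left[ W_g I_n + 2W_g'\, r_\omega r_\omega^T \Sigma^{-1} \right] \]

and $\Delta_\phi = \partial^2 l(\theta \mid \omega)/\partial\phi\,\partial\omega^T$, with elements

\[ \frac{\partial^2 l(\theta \mid \omega)}{\partial\phi_j\,\partial\omega^T} = -2\, r_\omega^T \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} \Sigma^{-1} \left[ W_g'(\delta_\omega)\, r_\omega r_\omega^T \Sigma^{-1} + W_g(\delta_\omega)\, I_n \right], \]

with $W_g(\delta_\omega) = -\frac{1}{2}(\nu+n)/(\nu+\delta_\omega)$, $W_g'(\delta_\omega) = \frac{1}{2}(\nu+n)/(\nu+\delta_\omega)^2$, $\delta_\omega = (Y_\omega - X\beta)^T \Sigma^{-1} (Y_\omega - X\beta)$ and $r_\omega = Y_\omega - X\beta$, evaluated at $\theta = \hat\theta$ and $\omega = \omega_0$.

Zhu et al. [29] introduced an alternative to Cook's procedure by proposing an expected likelihood displacement function, called the Q-displacement of likelihood, to replace $LD(\omega)$. This function is defined by $f_Q(\omega) = 2[Q(\hat\theta \mid \hat\theta) - Q(\hat\theta_\omega \mid \hat\theta)]$, where $Q(\theta \mid \hat\theta)$ is the conditional expectation of the complete-data log-likelihood function given in Equation (6). Zhu et al. [29] also adapted Cook's normal curvature, named $C_{f_Q,l}$, for $\alpha(\omega)$ at $\omega_0$, defined as $C_{f_Q,l} = 2\,|l^T \Delta_\omega^T \{-\ddot{Q}\}^{-1} \Delta_\omega l|$, where $\ddot{Q} = \partial^2 Q(\theta \mid \hat\theta)/\partial\theta\,\partial\theta^T |_{\theta = \hat\theta}$ and $\Delta_\omega = \partial^2 Q(\theta, \omega \mid \hat\theta)/\partial\theta\,\partial\omega^T |_{\theta = \hat\theta_\omega}$. The matrix $\ddot{Q}$ for the n-variate Student's t-distribution with $\nu$ (fixed) degrees of freedom is obtained from $Q(\theta \mid \hat\theta)$ as follows:

\[ \ddot{Q} = \begin{pmatrix} Q_{\beta\beta} & Q_{\beta\phi} \\ Q_{\phi\beta} & Q_{\phi\phi} \end{pmatrix}, \tag{10} \]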

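with $Q_{\beta\beta} = \partial^2 Q(\theta \mid \hat\theta)/\partial\beta\,\partial\beta^T = -\vartheta X^T \Sigma^{-1} X$; $Q_{\beta\phi} = \partial^2 Q(\theta \mid \hat\theta)/\partial\beta\,\partial\phi^T$, which has the following elements:

\[ \frac{\partial^2 Q(\theta \mid \hat\theta)}{\partial\beta\,\partial\phi_j} = -\vartheta X^T \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} \Sigma^{-1} r \quad \text{for } j = 1, 2, 3; \]

$Q_{\phi\beta} = Q_{\beta\phi}^T$ and $Q_{\phi\phi} = \partial^2 Q(\theta \mid \hat\theta)/\partial\phi\,\partial\phi^T$, with elements

\[ \frac{\partial^2 Q(\theta \mid \hat\theta)}{\partial\phi_j\,\partial\phi_i} = -\frac{1}{2}\,\mathrm{tr}\left\{ \Sigma^{-1} \left[ \frac{\partial^2\Sigma}{\partial\phi_j\,\partial\phi_i} - \frac{\partial\Sigma}{\partial\phi_i} \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} \right] \right\} + \frac{1}{2}\vartheta\, r^T \Sigma^{-1} \left[ \frac{\partial^2\Sigma}{\partial\phi_j\,\partial\phi_i} - \frac{\partial\Sigma}{\partial\phi_j} \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_i} - \frac{\partial\Sigma}{\partial\phi_i} \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} \right] \Sigma^{-1} r \]

for $i, j = 1, 2, 3$, such that $\vartheta = (\nu+n)/(\nu+\delta)$ and $\vartheta' = (\nu+n)/(\nu+\delta)^2$.

The matrix $\Delta_\omega$, obtained from $Q(\theta \mid \hat\theta)$ under the same linear perturbation scheme ($Y_\omega = Y + \omega$), is given by

\[ \Delta_\beta = \frac{\partial^2 Q(\theta, \omega \mid \hat\theta)}{\partial\beta\,\partial\omega^T} = \vartheta_\omega X^T \Sigma^{-1} - 2\vartheta_\omega' X^T \Sigma^{-1} r_\omega r_\omega^T \Sigma^{-1} \]

and $\Delta_\phi = \partial^2 Q(\theta, \omega \mid \hat\theta)/\partial\phi\,\partial\omega^T$, with elements

\[ \frac{\partial^2 Q(\theta, \omega \mid \hat\theta)}{\partial\phi_j\,\partial\omega^T} = r_\omega^T \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} \Sigma^{-1} \left\{ \vartheta_\omega I_n + \vartheta_\omega'\, r_\omega r_\omega^T \Sigma^{-1} \right\}, \]

where $\vartheta_\omega = (\nu+n)/(\nu+\delta_\omega)$, $\vartheta_\omega' = -(\nu+n)/(\nu+\delta_\omega)^2$, $\delta_\omega = r_\omega^T \Sigma^{-1} r_\omega$ and $r_\omega = Y_\omega - X\beta$, all evaluated at $\theta = \hat\theta$ and $\omega = \omega_0$.

According to Zhu et al. [29], when there are no incomplete data, the function $f_Q(\omega)$ reduces to $LD(\omega)$. Thus, the normal curvature based on $f_Q(\omega)$ can be regarded as a generalization of the normal curvature based on $LD(\omega)$.

In order to obtain a curvature that is invariant under a change of scale, Poon and Poon [20] introduced the conformal normal curvature $B_l$ for $\omega_0$ in the direction of a unit vector $l$ and defined it as

\[ B_l = -\frac{l^T \Delta_\omega^T \ddot{L}^{-1} \Delta_\omega l}{\sqrt{\mathrm{tr}\{(\Delta_\omega^T \ddot{L}^{-1} \Delta_\omega)^2\}}}. \]

Zhu et al. [29] also adapted the conformal normal curvature proposed by Poon and Poon [20] to the case of incomplete data:

\[ B_{f_Q,l} = -\frac{2\, l^T \Delta_\omega^T \{-\ddot{Q}\}^{-1} \Delta_\omega l}{\mathrm{tr}\{-2\Delta_\omega^T \{-\ddot{Q}\}^{-1} \Delta_\omega\}}. \]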

An interesting property of the conformal normal curvature, in both cases, is that for any unit direction $l$ its value lies in the range [0, 1]. This allows, for example, a comparison of the curvature across different models. From the elements of the main diagonal of the matrix $B_l$ or the matrix $B_{f_Q,l}$, we obtain the plots of $B_l$ or $B_{f_Q,l}$ vs. $i$ (data order). Considering the normalized eigenvector associated with the largest absolute eigenvalue of the matrix $B_l$ or $B_{f_Q,l}$, it is also possible to build the plot of $|l_{\max}|$ or $|l_{Q\max}|$ vs. $i$ (data order). The plots of $B_l$, $B_{f_Q,l}$, $|l_{\max}|$ and $|l_{Q\max}|$ aim to assess whether there are influential observations in the likelihood displacement, assuming the n-variate Student's t-distribution with fixed $\nu$.
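The diagnostic quantities described above can be sketched in plain Python once the curvature matrix is available. The sketch assumes a symmetric positive semidefinite matrix F standing for −Δᵀ L̈⁻¹ Δ (already evaluated at the estimates); the 3 × 3 matrix used here is made up purely for illustration, and the eigenvector is obtained by simple power iteration rather than by any routine used in the article.

```python
import math

def conformal_curvatures(f):
    """Conformal normal curvature in each coordinate direction e_i.

    f is assumed symmetric positive semidefinite, so every value lies in [0, 1].
    """
    n = len(f)
    tr_f2 = sum(f[i][j] * f[j][i] for i in range(n) for j in range(n))  # tr(F^2)
    norm = math.sqrt(tr_f2)
    return [f[i][i] / norm for i in range(n)]

def dominant_direction(f, iters=500):
    """Unit eigenvector for the largest eigenvalue of f, by power iteration."""
    n = len(f)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(f[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical curvature matrix for three observations (illustration only).
F = [[4.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 1.0]]
b_values = conformal_curvatures(F)
l_max = dominant_direction(F)
most_influential = max(range(len(l_max)), key=lambda i: abs(l_max[i]))
```

Plotting `b_values` or the absolute entries of `l_max` against the data order gives index plots of the kind described in the text; observations with markedly larger values are the candidates for influential points.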

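4.2 Local influence on the covariance function

The covariance function $C(h)$ also depends on the parameter vector $\theta$; for a fixed distance $h$, we denote it by $\mathrm{Cov}_h(\theta)$. It is important to study the sensitivity of its maximum likelihood estimate, $\mathrm{Cov}_h(\hat\theta)$. To measure this sensitivity, according to Cadigan and Farrell [3], one can use the direction of maximum slope.

The first-order local influence of the perturbation ($Y_\omega = Y + \omega$) can be measured by the slope in the direction $l$, denoted by $S(l)$, of the influence plot of $\mathrm{Cov}_h(\hat\theta_\omega)$ vs. $\omega$. In the usual case, we have [24]

\[ S(l) = l^T\, \dot{\mathrm{Cov}}_h, \]

where $\dot{\mathrm{Cov}}_h$ is an $n \times 1$ vector given by

\[ \dot{\mathrm{Cov}}_h = -\Delta_\omega^T \ddot{L}^{-1} \left. \frac{\partial\, \mathrm{Cov}_h(\theta)}{\partial\theta} \right|_{\theta = \hat\theta}, \tag{11} \]

with $\ddot{L}$ being the observed information matrix defined in Equation (9),

\[ \frac{\partial\, \mathrm{Cov}_h(\theta)}{\partial\theta} = \left( 0^T, \frac{\partial\, \mathrm{Cov}_h(\theta)}{\partial\phi^T} \right)^T \quad \text{and} \quad \frac{\partial\, \mathrm{Cov}_h(\theta)}{\partial\phi} = \left( \frac{\partial\, \mathrm{Cov}_h(\theta)}{\partial\phi_1}, \frac{\partial\, \mathrm{Cov}_h(\theta)}{\partial\phi_2}, \frac{\partial\, \mathrm{Cov}_h(\theta)}{\partial\phi_3} \right)^T. \]

In the case of the Q-displacement of likelihood, we have

\[ S_Q(l) = l^T\, \dot{Q\mathrm{Cov}}_h, \]

where $\dot{Q\mathrm{Cov}}_h$ is an $n \times 1$ vector given by

\[ \dot{Q\mathrm{Cov}}_h = -\Delta_\omega^T \ddot{Q}^{-1} \left. \frac{\partial\, Q\mathrm{Cov}_h(\theta)}{\partial\theta} \right|_{\theta = \hat\theta}, \tag{12} \]

where $\ddot{Q}$ is the matrix defined in Equation (10). Thus,

\[ \frac{\partial\, Q\mathrm{Cov}_h(\theta)}{\partial\theta} = \left( 0^T, \frac{\partial\, Q\mathrm{Cov}_h(\theta)}{\partial\phi^T} \right)^T \quad \text{and} \quad \frac{\partial\, Q\mathrm{Cov}_h(\theta)}{\partial\phi} = \left( \frac{\partial\, Q\mathrm{Cov}_h(\theta)}{\partial\phi_1}, \frac{\partial\, Q\mathrm{Cov}_h(\theta)}{\partial\phi_2}, \frac{\partial\, Q\mathrm{Cov}_h(\theta)}{\partial\phi_3} \right)^T. \]

The usual local direction of maximum slope is given by

\[ l_{\mathrm{Cov}} = \frac{\dot{\mathrm{Cov}}_h}{\|\dot{\mathrm{Cov}}_h\|}, \]

and the local direction of maximum slope for the Q-displacement of likelihood is

\[ Q_{\mathrm{Cov}} = \frac{\dot{Q\mathrm{Cov}}_h}{\|\dot{Q\mathrm{Cov}}_h\|}. \]

The plots of $l_{\mathrm{Cov}}$ and $Q_{\mathrm{Cov}}$ are used to assess the sensitivity of the covariance function.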
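The reason the normalized slope vector is the direction of maximum slope can be made explicit with a short worked step, using only the Cauchy-Schwarz inequality. The symbol $v$ below is shorthand introduced here for either slope vector, $\dot{\mathrm{Cov}}_h$ or $\dot{Q\mathrm{Cov}}_h$.

```latex
% For any unit vector l and slope vector v:
\[
  |S(l)| = |l^T v| \le \|l\|\,\|v\| = \|v\| ,
\]
% with equality if and only if l is proportional to v, that is
\[
  l = \pm \frac{v}{\|v\|}, \qquad \max_{\|l\| = 1} |S(l)| = \|v\| .
\]
```

The i-th entry of this direction therefore indicates how strongly a perturbation of the i-th response moves the estimated covariance, which is why entries that are large in absolute value flag influential observations.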

4.3 Local influence on linear predictor

In the study of spatial data, prediction is performed by universal kriging, which aims to obtain values of the regionalized variable at non-sampled points. Let $Y_0 = Y(s_0)$ be the variable to be predicted by universal kriging at location $s_0 \in D$. The mean of $Y_0$ is given by $x_0^T \beta$, where $x_0^T = (x_{01}, \ldots, x_{0p})$ and $x_{0j} = x_j(s_0)$ for $j = 1, \ldots, p$. The predictor with the lowest mean-squared error is

\[ p(s_0, \theta) = x_0^T \beta + C_0^T \Sigma^{-1} (Y - X\beta), \]

where $C_0 = (C(h_{10}), \ldots, C(h_{n0}))^T$, with $h_{i0} = \|s_i - s_0\|$ for $i = 1, \ldots, n$. Thus, a point estimator for $Y_0$ is

\[ \hat{Y}_0 = p(s_0, \hat\theta) = x_0^T \hat\beta + \hat{C}_0^T \hat\Sigma^{-1} (Y - X\hat\beta). \]

In the usual case, we have

\[ S(l) = l^T\, \dot{p}(s_0, \theta), \]

where $\dot{p}(s_0, \theta)$ is an $n \times 1$ vector given by

\[ \dot{p}(s_0, \theta) = -\Delta_\omega^T \ddot{L}^{-1} \left. \frac{\partial p(s_0, \theta)}{\partial\theta} \right|_{\theta = \hat\theta}. \tag{13} \]

In the case of the Q-displacement of likelihood, we have

\[ S_Q(l) = l^T\, \dot{p}_Q(s_0, \theta), \]

where $\dot{p}_Q(s_0, \theta)$ is also an $n \times 1$ vector, given by

\[ \dot{p}_Q(s_0, \theta) = -\Delta_\omega^T \ddot{Q}^{-1} \left. \frac{\partial p(s_0, \theta)}{\partial\theta} \right|_{\theta = \hat\theta}. \tag{14} \]

In Equations (13) and (14), we have

\[ \frac{\partial p(s_0, \theta)}{\partial\theta} = \left( \frac{\partial p(s_0, \theta)}{\partial\beta^T}, \frac{\partial p(s_0, \theta)}{\partial\phi^T} \right)^T, \]

where

\[ \frac{\partial p(s_0, \theta)}{\partial\beta} = x_0 - X^T \Sigma^{-1} C_0; \qquad \frac{\partial p(s_0, \theta)}{\partial\phi} = \left( \frac{\partial p(s_0, \theta)}{\partial\phi_j} \right), \]

with

\[ \frac{\partial p(s_0, \theta)}{\partial\phi_j} = \left[ \frac{\partial C_0^T}{\partial\phi_j} - C_0^T \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} \right] \Sigma^{-1} (Y - X\beta) \quad \text{and} \quad \frac{\partial C_0^T}{\partial\phi_j} = \left( \frac{\partial C(h_{10})}{\partial\phi_j}, \ldots, \frac{\partial C(h_{n0})}{\partial\phi_j} \right) \quad \text{for } j = 1, 2, 3. \]

The usual local direction of maximum slope is given by

\[ l_p = \frac{\dot{p}(s_0, \theta)}{\|\dot{p}(s_0, \theta)\|}, \]

and the local direction of maximum slope for the Q-displacement of likelihood is given by

\[ Q_p = \frac{\dot{p}_Q(s_0, \theta)}{\|\dot{p}_Q(s_0, \theta)\|}. \]

The plots of $l_p$ and $Q_p$ are used for assessing the influence on the linear predictor.
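To make the predictor $p(s_0, \theta)$ concrete, here is a minimal sketch in plain Python. It assumes the exponential covariance function, takes the parameter values as given rather than estimated, and solves the linear system by Gaussian elimination; the function names, coordinates and data values are invented for illustration.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[piv] = m[piv], m[col]
        for k in range(col + 1, n):
            factor = m[k][col] / m[col][col]
            for c in range(col, n + 1):
                m[k][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def kriging_predict(coords, y, x_rows, beta, s0, x0, phi1, phi2, phi3):
    """p(s0, theta) = x0' beta + C0' Sigma^{-1} (Y - X beta), exponential model."""
    n = len(y)
    cov = lambda h: phi2 * math.exp(-h / phi3)
    sigma = [[cov(math.dist(coords[i], coords[j])) + (phi1 if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    c0 = [cov(math.dist(coords[i], s0)) for i in range(n)]
    resid = [y[i] - sum(xv * bv for xv, bv in zip(x_rows[i], beta)) for i in range(n)]
    weights = solve(sigma, resid)            # Sigma^{-1} (Y - X beta)
    trend = sum(xv * bv for xv, bv in zip(x0, beta))
    return trend + sum(c * w for c, w in zip(c0, weights))

# Hypothetical data: three locations, constant mean beta = 2.0.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
y = [2.0, 3.5, 1.0]
x_rows = [[1.0], [1.0], [1.0]]
pred_mid = kriging_predict(coords, y, x_rows, [2.0], (0.5, 0.5), [1.0], 0.2, 1.0, 1.5)
```

Two sanity checks follow from the formula: with no nugget the predictor reproduces the observed value at a sampled location, and far from all data the correction term vanishes so the prediction falls back to the trend $x_0^T \beta$.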

4.4 Generalized leverage

The main objective of the generalized leverage is to measure the influence of the observed response on its own fitted value, i.e. to evaluate the local influence on the predicted values [25]. Laurent and Cook [18] generalized this method to more complex models, such as nonlinear models; Christensen et al. [5] suggested a measure of leverage for mixed linear models under normality; Paula [19] considered generalized leverage in linear regression models when the parameter vector μ is restricted by linear inequalities; and, more recently, Nobre and Singer [17] proposed incorporating the information of fitted random effects and considered the generalized leverage in linear mixed-effect models. Uribe-Opazo et al. [24] applied the method of generalized leverage to linear spatial models in the Gaussian case. The aim of the leverage method is to measure the influence of the observed response values on the prediction of its values at unsampled locations.

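Let $\mu = X\beta$ be the expected value of the variable $Y$ defined in Equation (1), $\hat{Y} = X\hat\beta$ its value estimated by maximum likelihood and $\theta = (\beta^T, \phi^T)^T$ the parameters of the model. Then, according to Xie et al. [26], the generalized leverage matrix, given by

\[ LG(\theta) = \frac{\partial\hat{Y}}{\partial Y^T} = \frac{\partial\hat{Y}}{\partial\hat\theta^T}\, \frac{\partial\hat\theta}{\partial Y^T}, \]

assumes the following form in the usual case:

\[ LG(\theta) = D_\theta\, (-\ddot{L})^{-1} L_{\theta Y}, \tag{15} \]

where

\[ D_\theta = \frac{\partial\mu}{\partial\theta^T} = (X, 0) \quad \text{and} \quad L_{\theta Y} = \frac{\partial^2 l(\theta)}{\partial\theta\,\partial Y^T} = (L_{\beta Y}^T, L_{\phi Y}^T)^T, \]

with

\[ L_{\beta Y} = -2X^T \Sigma^{-1} (W_g + 2W_g'\, r r^T \Sigma^{-1}) \]

and $L_{\phi Y} = \partial^2 l(\theta)/\partial\phi\,\partial Y^T$, with elements

\[ \frac{\partial^2 l(\theta)}{\partial\phi_j\,\partial Y^T} = -2\, r^T \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} \Sigma^{-1} [W_g + 2W_g'\, r r^T \Sigma^{-1}] \quad \text{for } j = 1, 2, 3. \]

Thus

\[ LG(\theta) = X (L_{\beta\beta} - L_{\beta\phi} L_{\phi\phi}^{-1} L_{\phi\beta})^{-1} (L_{\beta\phi} L_{\phi\phi}^{-1} L_{\phi Y} - L_{\beta Y}). \]

Based on the proposal of Wei et al. [25], as well as on the work of Zhu and Lee [28], the generalized leverage matrix for models with complete data takes the form

\[ LG_Q(\theta) = D_\theta\, [-\ddot{Q}]^{-1}\, \ddot{Q}_{\theta Y}, \tag{16} \]

such that

\[ \ddot{Q}_{\theta Y} = \left. \frac{\partial^2 Q(\theta \mid \hat\theta)}{\partial\theta\,\partial Y^T} \right|_{\theta = \hat\theta}, \]

with

\[ Q_{\beta Y} = \vartheta X^T \Sigma^{-1} \quad \text{and} \quad Q_{\phi Y} = \vartheta\, r^T \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi} \Sigma^{-1}, \]

and

\[ \ddot{Q} = \left. \frac{\partial^2 Q(\theta \mid \hat\theta)}{\partial\theta\,\partial\theta^T} \right|_{\theta = \hat\theta}, \]

with the function $Q(\theta \mid \hat\theta)$ resulting from the conditional expectation of the log-likelihood for complete data defined in Equation (6).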
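A simplified special case may help in reading Equation (16). Suppose, purely as an illustrative assumption not made in the article, that the covariance parameters $\phi$ are treated as known, so that $\theta = \beta$ and only the $\beta$ blocks of the matrices remain. Using $Q_{\beta\beta} = -\vartheta X^T \Sigma^{-1} X$ and $Q_{\beta Y} = \vartheta X^T \Sigma^{-1}$:

```latex
% Special case: phi known, theta = beta, D_theta = X.
\[
  LG_Q(\beta)
  = X \left[ \vartheta X^T \Sigma^{-1} X \right]^{-1} \vartheta X^T \Sigma^{-1}
  = X \left( X^T \Sigma^{-1} X \right)^{-1} X^T \Sigma^{-1} .
\]
% The weight vartheta cancels, leaving the generalized least squares
% projection matrix, whose trace equals the number of regression parameters:
\[
  \mathrm{tr}\{ LG_Q(\beta) \}
  = \mathrm{tr}\left\{ \left( X^T \Sigma^{-1} X \right)^{-1} X^T \Sigma^{-1} X \right\}
  = p .
\]
```

In this reduced setting the diagonal elements play the role of the usual hat values; in the full model the $\phi$ blocks of $\ddot{Q}$ and $\ddot{Q}_{\theta Y}$ also contribute, so the leverage depends on the data through the residuals as well.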

Figure 1. (a) Boxplot of productivity and (b) sketch of the experimental area.

Figure 2. Diagnostic plots (a) Bl, (b) BfQl, (c) |lmax| and (d) |lQmax| for soybean productivity.

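4.5 Cook's distance

Similarly to Cook's distance [8,27], a measure of global influence on $\hat\theta$ was defined for the usual case by the following equation:

\[ D_i = \alpha D_{i\beta} + (1 - \alpha) D_{i\phi}, \tag{17} \]

such that

\[ D_{i\beta} = \frac{(\hat\beta - \hat\beta_{(i)})\, K(\beta)\, (\hat\beta - \hat\beta_{(i)})^T}{p}, \]

with

\[ K(\beta) = \frac{4 d_g}{n} X^T \Sigma^{-1} X \quad \text{and} \quad d_g = \frac{n}{4}\, \frac{\nu+n}{\nu+n+2}; \]

\[ D_{i\phi} = \frac{(\hat\phi - \hat\phi_{(i)})\, K(\phi)\, (\hat\phi - \hat\phi_{(i)})^T}{q}, \]

where

\[ K(\phi) = [K(\phi_{ij})], \quad K(\phi_{ij}) = \frac{b_{ij}}{4} \left( \frac{4 f_g}{n(n+2)} - 1 \right) + \frac{2 f_g}{n(n+2)}\, \mathrm{tr}\left\{ \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_i} \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} \right\}, \]

\[ f_g = \frac{n(n+2)}{4}\, \frac{\nu+n}{\nu+n+2} \quad \text{and} \quad b_{ij} = \mathrm{tr}\left\{ \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_i} \right\} \mathrm{tr}\left\{ \Sigma^{-1} \frac{\partial\Sigma}{\partial\phi_j} \right\}. \]

The subscript $(i)$ denotes an estimate obtained without the i-th observation. We also have $\alpha = p/(p+q)$, where $p$ is the number of $\beta$ parameters and $q$ the number of $\phi$ parameters. For the Q-displacement, the corresponding measure is given by Equation (18):

\[ DQ_i = \alpha\, DQ_{i\beta} + (1 - \alpha)\, DQ_{i\phi}, \tag{18} \]

such that $DQ_{i\beta} = (\hat\beta - \hat\beta_{(i)})\, Q_{\beta\beta}\, (\hat\beta - \hat\beta_{(i)})^T / p$ and $DQ_{i\phi} = (\hat\phi - \hat\phi_{(i)})\, Q_{\phi\phi}\, (\hat\phi - \hat\phi_{(i)})^T / q$, with $Q_{\beta\beta}$ and $Q_{\phi\phi}$ obtained from the matrix $\ddot{Q}$ previously defined in Equation (10).

Figure 3. Diagnostic plots (a) lCov, (b) QCov, (c) lp and (d) Qp for soybean productivity.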
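The weighted combination in Equations (17) and (18) can be sketched as follows in plain Python. The weight matrices are passed in as arguments, so the same function serves for $K(\beta)$, $K(\phi)$ or the blocks of the Q matrix; the function names and the numbers in the example are hypothetical and are not taken from Table 1.

```python
def quad_form(diff, k):
    """Quadratic form diff' K diff for a list diff and a square matrix k."""
    n = len(diff)
    return sum(diff[i] * k[i][j] * diff[j] for i in range(n) for j in range(n))

def combined_distance(beta_full, beta_del, k_beta, phi_full, phi_del, k_phi):
    """D_i = alpha * D_i_beta + (1 - alpha) * D_i_phi with alpha = p / (p + q).

    *_full are estimates from all data, *_del those with observation i deleted.
    """
    p, q = len(beta_full), len(phi_full)
    d_beta = quad_form([a - b for a, b in zip(beta_full, beta_del)], k_beta) / p
    d_phi = quad_form([a - b for a, b in zip(phi_full, phi_del)], k_phi) / q
    alpha = p / (p + q)
    return alpha * d_beta + (1 - alpha) * d_phi

# Illustration with identity weight matrices and made-up estimates.
identity3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
d_example = combined_distance([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], identity3,
                              [0.0, 3.0, 0.0], [0.0, 0.0, 0.0], identity3)
```

Computing this value for each deleted observation and plotting it against the data order gives an index plot in which unusually large values point to globally influential observations.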

Figure 4. Diagnostic plots (a) LG, (b) LGQ, (c) Di and (d) DQi for soybean productivity.

5. Applications

Using the same experimental data presented by Uribe-Opazo et al. [24], we present the analysis of local influence, incorporating a linear trend in the model and considering Student's t-distribution with fixed degrees of freedom (ν > 0), by means of two methods. The first one, introduced by Cook [9], uses the logarithm of the

Table 1. Maximum likelihood estimates of the parameters using the exponential covariance function.

Data     ν    β̂1       β̂2       β̂3       φ̂1       φ̂2       φ̂3
Prod     5    0.9933   0.0211   0.0303   0.2235   0.1095   112.65
              (0.707)  (0.014)  (0.008)  (0.212)  (0.128)  (126.7)
Prod-6   10   1.8404   0.0037   0.0261   0.2069   0.0667   47.521
              (0.650)  (0.012)  (0.007)  (0.138)  (0.047)  (50.97)
Prod-p   10   1.9050   0.0092   0.0212   0.2194   0.1525   174.06
              (0.717)  (0.014)  (0.008)  (0.169)  (0.119)  (165.0)

Figure 5. Thematic maps (a) Prod, (b) Prod-6 and (c) Prod-p for soybean productivity designed from the parameters estimated by the EM algorithm.

likelihood function, and the other, presented by Zhu and Lee [28], uses the expectation of the logarithm of the complete-data likelihood function. For data analysis, we used the software R [21] with the packages geoR [22], matrixcalc [16] and splancs [23].

5.1 Analysis of experimental data

We analyzed data on soybean productivity (Prod) (t ha⁻¹), average plant height (Hgt) (cm) and average number of pods per plant (N), totaling 83 observations for each variable. Average plant height and average number of pods per plant were used as covariates of productivity. The boxplot of soybean productivity in Figure 1(a) shows one outlying point (5.53 t ha⁻¹), which corresponds to the maximum value of the data set. This outlying point is the sixth value of the data series, and its location is shown in the plot of the experimental area in Figure 1(b). We considered the Student's t linear spatial model, although the sample size was large (n = 83), because we have an experiment with 83 spatially georeferenced observations, featuring a multivariate analysis with spatial correlation structure.

Figure 6. Diagnostic plots (a) Bl, (b) BfQl, (c) |lmax| and (d) |lQmax| for perturbed data on soybean productivity.

To estimate the parameters of the linear spatial model of soybean productivity as a function of the covariates, we considered μ(si) = β1 + β2 Hgt + β3 N, where β1, β2 and β3 are the unknown parameters to be estimated. The Matérn family model was fitted to the data with κ = 0.5, 1.5, 5.0 and 10 and with degrees of freedom ν = 5, 7, 10, 15 and 20. Using cross-validation as the criterion for choosing the degrees of freedom (ν), it was concluded that the data follow Student's t-distribution with ν = 5. The cross-validation criteria and the maximum value of the log-likelihood were used to compare the Matérn family models with different values of the parameter κ. The best fit was obtained with κ = 0.5, which corresponds to the exponential model, assuming Student's t-distribution with ν = 5 degrees of freedom.

The influence on the likelihood displacement, assessed by the plots of Bl, |lmax|, BfQl and |lQmax| (Figure 2), identified element 6 as an influential observation, as also found by Uribe-Opazo et al. [24]. The influence on the covariance function and on the linear predictor (Figure 3) also identified element 6. However, by the Q-displacement methodology, element 69 (Figure 3(d)) is also influential on the linear predictor, which does not agree with the results found by Uribe-Opazo et al. [24].

Figure 7. Diagnostic plots (a) lCov, (b) QCov, (c) lp and (d) Qp for perturbed data on soybean productivity.

The two methodologies agree on the observation identified as influential on the predicted values (element 69). For Uribe-Opazo et al. [24], elements 6 and 80 are both influential on the predicted values. Under Cook's distance (Figure 4(c) and 4(d)), element 6 is influential in the two methodologies presented herein, whereas in Uribe-Opazo et al. [24] element 36 is influential.

Thus, after the analysis of these plots, we decided to remove observation 6 (5.53 t ha⁻¹). To differentiate the data sets, we denote by Prod the complete data and by Prod-6 the data excluding element 6. For the Prod-6 data set, the best fitted model had κ = 0.5 and degrees of freedom ν = 10. Comparing the estimates of the parameter vector (Table 1) obtained from the Prod and Prod-6 data sets shows changes in the estimates of the parameters of the linear spatial model, with a reduction in the contribution and the range. The maps in Figure 5(a) and 5(b) show that removing the element identified as the most influential makes the map more homogeneous. The values obtained by kriging the Prod data ranged from 2.46 to 3.67 t ha⁻¹, while those obtained by kriging the Prod-6 data had a smaller amplitude, ranging from 2.49 to 3.30 t ha⁻¹.

An exchange was then made between the fifth element of the sample (Prod = 2.38, Hgt = 36 and N = 28) and the sixth element of the sample (Prod = 5.53, Hgt = 53 and N = 45), located as shown in Figure 1(b).

Figure 8. Diagnostic plots (a) LG, (b) LGQ, (c) Di and (d) DQi for perturbed data on soybean productivity.

Both the productivity data and the values of the two co-variables were changed. The change is justified because it targets an element of the sample lying outside the range of the maximum point (element 6) and, at the same time, close to the average (2.987) of the data set. The fifth element of the sample lies between the first quartile and the average, and the distance between the two elements is 456.21 m. We denote the new data set by Prod-p. The Matérn family model was fitted to the new data set, and the best fitted model had κ = 0.5 (exponential model) and degrees of freedom equal to 10 (ν = 10). A comparison of the estimates of the parameter vector for Prod-p with those for Prod and Prod-6 (Table 1) shows an increase in the contribution and the range. Figures 6–8 show the plots of local influence after the elements were changed. The 5th, 38th and 61st elements appear as influential on the likelihood displacement, the covariance function and the linear predictor under Cook’s method. The 5th and 69th elements were identified as influential only by the generalized leverage, and the 5th and 36th elements were identified as influential by Cook’s distance. The amplitude of the kriged values increased ([2.30, 4.20]) and, once again, the map showed a small area with values ranging between 3.44 and 3.70 on the lower right. Kriging gave the perturbed element a value of 4.20, i.e. closer to the actual value (5.53). With kriging performed on the Prod data, the value at the sixth element was 3.70, well below the real value.
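The construction of the perturbed data set can be sketched in a few lines. The sketch reads the change as an exchange of the attribute values (Prod, Hgt and N) between the fifth and sixth elements while each sampling location stays fixed; this reading, the record layout, the function name and the coordinates are assumptions made for illustration. Only the attribute values of the two elements are taken from the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Sample:
    x: float     # easting in metres (hypothetical)
    y: float     # northing in metres (hypothetical)
    prod: float  # soybean productivity, t/ha
    hgt: float   # co-variable Hgt
    n: float     # co-variable N


def swap_attributes(samples, i, j):
    """Return a copy of `samples` (dict keyed by element number) in which
    elements i and j exchange prod, hgt and n but keep their coordinates."""
    a, b = samples[i], samples[j]
    out = dict(samples)
    out[i] = replace(a, prod=b.prod, hgt=b.hgt, n=b.n)
    out[j] = replace(b, prod=a.prod, hgt=a.hgt, n=a.n)
    return out


# Attribute values as reported in the text; coordinates are placeholders.
prod_data = {
    5: Sample(x=100.0, y=200.0, prod=2.38, hgt=36.0, n=28.0),
    6: Sample(x=400.0, y=500.0, prod=5.53, hgt=53.0, n=45.0),
}

prod_p = swap_attributes(prod_data, 5, 6)
print(prod_p[5])
print(prod_p[6])
```

Because the original dictionary is left untouched, the same spatial model can be fitted to `prod_data` and `prod_p` and the resulting diagnostics compared side by side.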

6. Conclusions

We conclude that it is possible to model spatial data using Student’s t-distribution with fixed degrees of freedom (ν > 0) and the EM algorithm, since heavy-tailed distributions are robust alternatives to the normal distribution. The analysis of local influence proved effective under Student’s t-distribution with the two methodologies presented, because they allowed the identification of influential values and the assessment of the effects of such values on the estimation of parameters and the construction of thematic maps. The perturbation of the data has shown that, in geostatistics, which takes spatial dependence into consideration, whether a value is influential on the likelihood displacement, the covariance function, the linear predictor and Cook’s distance does not depend on its location.

Acknowledgements

We would like to thank CAPES (Coordination for the Improvement of Higher Education Personnel), CNPq (National Council for Scientific and Technological Development) and Fundação Araucária, Brazil, for their financial support. The third author also acknowledges financial support from Proyecto FONDECYT 1110318, Chile.

References

[1] J.A. Borssoi, M.A. Uribe-Opazo, and M. Galea, Diagnostics techniques applied in geostatistics for agricultural data analysis, Rev. Brasil. de Ciên. do Solo 6 (2009), pp. 1561–1570.
[2] J.A. Borssoi, M.A. Uribe-Opazo, and M. Galea, Técnicas de diagnóstico de influência local na análise espacial da produtividade da soja, Eng. Agrícola 31 (2011), pp. 376–387.
[3] N.G. Cardigan and P.J. Farrel, Generalized local influence with applications to fish stock cohort analysis, J. R. Stat. Soc. C 51 (2002), pp. 469–483.
[4] R. Christensen, L.M. Pearson, and W. Johnson, Prediction diagnostics for spatial linear models, Biometrics 79 (1992), pp. 583–591.
[5] R. Christensen, L.M. Pearson, and W. Johnson, Case-deletion diagnostics for mixed models, Technometrics 34 (1992), pp. 38–45.
[6] R. Christensen, W. Johnson, and L.M. Pearson, Covariance function diagnostics for spatial linear models, Int. Assoc. Math. Geol. 25 (1993), pp. 145–160.
[7] F.J. Cysneiros, G.A. Paula, and M. Galea, Modelos Simétricos Aplicados, 9a Escola de Modelos de Regressão, Águas de São Pedro, SP, 2005, 100p.
[8] R.D. Cook, Influence assessment, J. Appl. Stat. 14 (1977), pp. 117–131.
[9] R.D. Cook, Assessment of local influence (with discussion), J. R. Stat. Soc. B 48 (1986), pp. 133–169.
[10] A. Dempster, N. Laird, and D.B. Rubin, Maximum likelihood estimation from incomplete data via the EM algorithm, J. R. Stat. Soc. B 39 (1977), pp. 1–38.
[11] K.T. Fang, S. Kotz, and K.W. Ng, Symmetric Multivariate and Related Distributions, Chapman & Hall, London, 1990, 319p.
[12] C.S. Ferreira, Inferência e diagnóstico em modelos assimétricos, IME-USP, 2008.
[13] K. Lange and J. Sinsheimer, Normal/independent distributions and their applications in robust regression, J. Comput. Graph. Stat. 2 (1993), pp. 175–198.
[14] C.H. Liu and D.B. Rubin, The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence, Biometrika 81 (1994), pp. 633–648.
[15] C.H. Liu and D.B. Rubin, ML estimation of the t distribution using EM and its extensions, ECM and ECME, Statist. Sinica 5 (1995), pp. 19–39.
[16] J.R. Magnus and H. Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics, Vol. 2, John Wiley and Sons, New York, 1999.
[17] J.S. Nobre and J.M. Singer, Fixed and random effects leverage for influence analysis in linear mixed models, Departamento de Estatística IME-USP (2003).
[18] R.St. Laurent and D. Cook, Leverage, local influence and curvature in nonlinear regression, J. Amer. Statist. Assoc. 87 (1992), pp. 985–990.
[19] G. Paula, Leverage in inequality-constrained regression models, J. R. Stat. Soc. Sér. D 48 (1999), pp. 529–538.
[20] W.Y. Poon and Y.S. Poon, Conformal normal curvature and assessment of local influence, J. R. Stat. Soc. B 61 (1999), pp. 51–61.
[21] R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, 2010.
[22] P.J. Ribeiro Jr and P.J. Diggle, geoR: A package for geostatistical analysis, R-NEWS 01 (2001). Available at http://cran.r-project.org/doc/Rnews.
[23] B. Rowlingson and P.J. Diggle, Splancs: Spatial point pattern analysis code in S-Plus, Comput. Geosci. Canada 19 (1993), pp. 627–655.
[24] M.A. Uribe-Opazo, J.A. Borssoi, and M. Galea, Influence diagnostics in Gaussian spatial linear models, J. Appl. Stat. 39 (2012), pp. 615–630.
[25] B. Wei, Y. Hu, and W. Fung, Generalized leverage and its applications, Scand. J. Stat. 25 (1998), pp. 25–37.
[26] F.C. Xie, B.C. Wei, and J.G. Lin, Case-deletion influence measures for the data from multivariate t distributions, J. Appl. Stat. 34 (2007), pp. 907–921.
[27] Y. Zhao and A.H. Lee, Theory and methods: Influence diagnostics for simultaneous equations models, Aust. N.Z. J. Stat. 40 (1998), pp. 345–358.
[28] H.T. Zhu and S.Y. Lee, Local influence for incomplete-data models, J. R. Stat. Soc. B 63 (2001), pp. 111–126.
[29] H.T. Zhu, S.Y. Lee, B.C. Wei, and J. Zhou, Case-deletion measures for models with incomplete data, Biometrika 88 (2001), pp. 727–737.