Testing for Spatial Group-Wise Heteroskedasticity. A specification Scan test procedure.

Authors and contact e-mail: Coro Chasco (Universidad Autónoma de Madrid), Julie Le Gallo (Université de Franche-Comté), Fernando A. López (Universidad Politécnica de Cartagena)

Department: Métodos Cuantitativos e Informáticos

University: Universidad Politécnica de Cartagena


Abstract: Spatial heterogeneity is an important topic in the modelling of regional economies. In regression models, spatial heterogeneity can be reflected by varying coefficients (structural instability) and/or by varying error variances across observations forming blocks, which is called Spatial Group-Wise Heteroskedasticity. Instability in the mean implies local clustering in the values of the variable and has been extensively studied in the literature. Unfortunately, testing for Spatial Group-Wise Heteroskedasticity is a less developed field. In this paper we present a novel and powerful procedure for detecting Spatial Group-Wise Heteroskedasticity based on the Scan methodology.

Keywords: Spatial Group-Wise Heteroskedasticity; Scan tests; Permutational approach; Monte Carlo.

JEL classification: C21; C50; R15

1. Introduction

Spatial heterogeneity means that the behaviour of a certain spatial process is not uniform over space. We can think of different sources of instability: (i) in the mean, (ii) in the variance or (iii) in both moments. Mean instability implies local clustering in the values of the variable, and this topic has been studied extensively in the literature (see Lesage and Pace, 2009, for a recent overview). In other cases, the source of instability is the variance. We are interested in the case of Spatial Group-Wise Heteroskedasticity, which means that the variability of the spatial data is systematically higher in some areas than in others. Obviously, it is also possible that both sources of instability occur simultaneously; we leave this case aside for the moment. There is a huge literature on spatial dependence but, unfortunately, the detection and modelling of Spatial Group-Wise Heteroskedasticity (SGWH from now on) is less developed. However, we suspect that SGWH is a frequent phenomenon when working with real data, and one that raises serious inference problems.

Several heteroskedasticity tests, usually based on a likelihood approach, can be adapted to check for SGWH (the GQ test of Goldfeld and Quandt, 1965, and the BP test of Breusch and Pagan, 1979, are obvious candidates). Unfortunately, these tests generally need a priori information about the spatial structure present in the data, which the researcher must supply. This type of information is not always available. In the same spirit, Kelejian and Robinson (1998) introduce a test for spatial heteroskedasticity assuming that the variance can be modelled using some regressors that must be identified beforehand. More recently, Ord and Getis (2012) consider the problem of local instability in the variance and introduce a new statistic, called Hi. The aim of the local Hi is to identify the limits of the area where the variance changes to another value. The authors draw attention to the lack of papers examining the spatial structure of the variance (p. 530): ‘Spatial statistics’ cluster identification is now common to many fields. (…However) these studies have focused attention upon local means, to the extent that variability is considered at all it is typically assumed that the process has a constant variance (i.e., that it is homoscedastic). A moment’s thought indicates that such an assumption could overlook important information’.

Our contribution tries to fill this gap by developing a flexible and powerful statistic based on the Scan methodology (Kulldorff and Nagarwalla, 1995; Kulldorff et al., 2009) to detect group-wise heteroskedasticity. Our procedure systematically explores the entire spatial surface looking for groups of connected regions where the difference between the variability inside and outside the group is relevant. The origin of this proposal can be traced back to Openshaw et al. (1987) and the so-called Geographical Analysis Machine. The inference framework is computationally intensive because it is based on permutational bootstrapping.

The paper is organized as follows. Section 2 introduces some basic results from the Scan methodology, including our proposal to detect SGWH. The design of a Monte Carlo experiment is presented in Section 3, together with the main results on estimated size and power. Section 4 focuses on what we call accuracy, that is, the ability to identify exactly the location and composition of the clusters of heteroskedasticity. The main conclusions appear in Section 5.

2. A Scan Test for Spatial Group-Wise Heteroskedasticity (SGWH)

This section introduces a technique to detect the presence of SGWH in a spatial data set. The proposal has two objectives: (i) to test the null hypothesis of homoskedasticity and, if the null is rejected, (ii) to identify the spatially linked points that share the same variance.

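Suppose $\{x_i\}$ is a spatial process, where $i = 1, \ldots, R$ indexes a set of spatial coordinates. We are interested in testing the hypothesis (which, implicitly, assumes normality and spatial independence):

$$ H_0: x_i \sim \text{i.i.d. } N(\mu; \sigma^2) \qquad (1) $$

The alternative hypothesis says that there is a group of connected observations, $Z$, where the variance is different:

$$ H_A: \begin{cases} x_i \sim \text{i.i.d. } N(\mu; \sigma_Z^2) & \text{for } i \in Z \\ x_i \sim \text{i.i.d. } N(\mu; \sigma_{\bar Z}^2) & \text{for } i \notin Z \end{cases} \qquad (2) $$

In order to proceed with the Scan methodology, it is necessary to obtain the likelihood function under the null and alternative hypotheses, respectively. The log-likelihood function under the null hypothesis is simply:

$$ l(H_0) = \ln L_0(x; \mu, \sigma^2) = -\frac{R}{2}\ln(2\pi) - \frac{R}{2}\ln\sigma^2 - \sum_{i=1}^{R}\frac{(x_i-\mu)^2}{2\sigma^2} \qquad (3) $$

The maximum likelihood estimates of the mean and variance are:

$$ \hat\mu_{H_0} = \frac{1}{R}\sum_{i=1}^{R} x_i, \qquad \hat\sigma^2_{H_0} = \frac{1}{R}\sum_{i=1}^{R}\left(x_i - \hat\mu_{H_0}\right)^2 \qquad (4) $$

which produce a value of the corresponding log-likelihood function of:

$$ l(H_0) = -\frac{R}{2}\left[\ln(2\pi) + 1 + \ln\hat\sigma^2_{H_0}\right] \qquad (5) $$

Under the alternative hypothesis, the log-likelihood is:

$$ l^{I}(H_A) = \ln L_A(x; \mu, \sigma_Z^2, \sigma_{\bar Z}^2) = -\frac{R}{2}\ln(2\pi) - \frac{R_Z}{2}\ln\sigma_Z^2 - \frac{R-R_Z}{2}\ln\sigma_{\bar Z}^2 - \sum_{i \in Z}\frac{(x_i-\mu)^2}{2\sigma_Z^2} - \sum_{i \notin Z}\frac{(x_i-\mu)^2}{2\sigma_{\bar Z}^2} \qquad (6) $$

where $R_Z$ is the number of observations in set $Z$. The maximum likelihood estimates of the mean and variance for this case are:

$$ \hat\mu_{H_A} = \frac{1}{R}\sum_{i=1}^{R} x_i, \qquad \hat\sigma^2_{H_A}(Z) = \frac{1}{R_Z}\sum_{i \in Z}\left(x_i - \hat\mu_{H_A}\right)^2, \qquad \hat\sigma^2_{H_A}(\bar Z) = \frac{1}{R-R_Z}\sum_{i \notin Z}\left(x_i - \hat\mu_{H_A}\right)^2 \qquad (7) $$

The value of the log-likelihood function at this point is:

$$ l^{I}(H_A) = -\frac{R}{2}\left[\ln(2\pi) + 1\right] - \frac{R_Z}{2}\ln\hat\sigma^2_{H_A}(Z) - \frac{R-R_Z}{2}\ln\hat\sigma^2_{H_A}(\bar Z) \qquad (8) $$

Finally, the Scan statistic for the assumption of equal variances can be written as:

$$ \mathrm{Scan}_{\sigma} = \max_{Z \in \Theta}\left[l^{I}(H_A) - l(H_0)\right] = \max_{Z \in \Theta}\frac{1}{2}\left[R\,\ln\frac{\hat\sigma^2_{H_0}}{\hat\sigma^2_{H_A}(\bar Z)} + R_Z\,\ln\frac{\hat\sigma^2_{H_A}(\bar Z)}{\hat\sigma^2_{H_A}(Z)}\right] \qquad (9) $$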

Θ is a set of connected regions Z, called windows, over which the Scan statistic is computed. The size and shape of the windows must be defined in advance by the researcher, seeking a good balance between computational cost and effectiveness. For example, evaluating elliptical windows is more time consuming but provides greater flexibility. The set Z where the Scan test attains its maximum value is usually called the Most Likely Cluster, MLC. The test can be easily extended to the case of simultaneous instability in the mean and in the variance. In this case, the null hypothesis continues to be that of (1), whereas the alternative hypothesis now corresponds to:

$$ H_A: \begin{cases} x_i \sim \text{i.i.d. } N(\mu_Z; \sigma_Z^2) & \text{for } i \in Z \\ x_i \sim \text{i.i.d. } N(\mu_{\bar Z}; \sigma_{\bar Z}^2) & \text{for } i \notin Z \end{cases} \qquad (10) $$

The log-likelihood function is:

$$ l^{II}(H_A) = \ln L_A(x; \mu_Z, \mu_{\bar Z}, \sigma_Z^2, \sigma_{\bar Z}^2) = -\frac{R}{2}\ln(2\pi) - \frac{R_Z}{2}\ln\sigma_Z^2 - \frac{R-R_Z}{2}\ln\sigma_{\bar Z}^2 - \sum_{i \in Z}\frac{(x_i-\mu_Z)^2}{2\sigma_Z^2} - \sum_{i \notin Z}\frac{(x_i-\mu_{\bar Z})^2}{2\sigma_{\bar Z}^2} \qquad (11) $$

The maximum likelihood estimates of the mean and variance are:

$$ \hat\mu_{H_A}(Z) = \frac{1}{R_Z}\sum_{i \in Z} x_i, \qquad \hat\mu_{H_A}(\bar Z) = \frac{1}{R-R_Z}\sum_{i \notin Z} x_i $$

$$ \hat\sigma^2_{H_A}(Z) = \frac{1}{R_Z}\sum_{i \in Z}\left(x_i - \hat\mu_{H_A}(Z)\right)^2, \qquad \hat\sigma^2_{H_A}(\bar Z) = \frac{1}{R-R_Z}\sum_{i \notin Z}\left(x_i - \hat\mu_{H_A}(\bar Z)\right)^2 \qquad (12) $$

Consequently, we define the Scan statistic for the alternative of (10), which is very similar to that obtained in (9) but uses the estimates of (12):

$$ \mathrm{Scan}_{\sigma,\mu} = \max_{Z \in \Theta}\left[l^{II}(H_A) - l(H_0)\right] = \max_{Z \in \Theta}\frac{1}{2}\left[R\,\ln\frac{\hat\sigma^2_{H_0}}{\hat\sigma^2_{H_A}(\bar Z)} + R_Z\,\ln\frac{\hat\sigma^2_{H_A}(\bar Z)}{\hat\sigma^2_{H_A}(Z)}\right] \qquad (13) $$

Let us note that the two statistics in (9) and (13) correspond to classical likelihood ratios, for SGWH in the first case and for SGWH plus a break in the mean in the second (Rao, 1971). In order to obtain a standard Likelihood Ratio test, LR, the researcher must select in advance the area Z where the test is calculated and the significance level of the test, according to the corresponding distribution. In turn, this distribution usually remains unknown, so some assumption is needed. Both decisions may undermine confidence in the LR test, especially in relation to size (see, for example, Engle, 1984, or Drton and Williams, 2011). The Scan tests suffer from a similar weakness in the sense that, in general, the distribution functions of the statistics in (9) and (13) are unknown. Our intention is to solve the inference for the Scan tests in a more robust (and more computationally demanding) permutational framework, which avoids data mining and the assumption of normality. Hence, a p-value is obtained through a Monte Carlo testing procedure (Dwass, 1957), by comparing the value of the Scan statistic for the real data set with a large sequence of values corresponding to purely random data sets, generated according to the null hypothesis of the test. The procedure is as follows:

1. Compute the Scan statistic for the original sample $\{x_i\}_{i \in S}$, where $S$ is a set of spatial coordinates, $S = \{(xc_i; yc_i);\ i = 1, 2, \ldots, R\}$.

2. Relabel the set of locations by randomly drawing, without replacement, the spatial coordinates; $\{x_i^r\}_{i \in S}$ is the new, permuted, series, where $r$ is the permutation index.

3. Compute the Scan statistic for the permuted sample $\{x_i^r\}_{i \in S}$, denoted $\mathrm{Scan}^r$.

4. Repeat steps 2 and 3 (B-1) times to obtain B-1 realizations of the permuted Scan statistic, $\{\mathrm{Scan}^r\}_{r=1}^{B-1}$.

5. Compute the pseudo-probability as:

$$ p_P = \frac{1}{B-1}\sum_{r=1}^{B-1} \mathbb{1}\left(\mathrm{Scan}^r \ge \mathrm{Scan}\right) \qquad (14) $$

where $\mathbb{1}(\cdot)$ is an indicator function which assigns a value of 1 to a true statement and 0 otherwise.

6. Reject the null hypothesis if $p_P < \alpha$ for a nominal size $\alpha$.

Once the MLC has been identified, we can proceed with the secondary cluster using the variability in the variance and/or in the mean of the series as the target.
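The six steps above can be sketched as a short Python routine. This is only an illustrative sketch under stated assumptions: the names `permutation_pvalue` and `half_contrast` are hypothetical, ties are counted with a "greater than or equal" rule, and `half_contrast` is a toy stand-in for a Scan statistic (any callable that returns large values against the null could be passed instead).

```python
import math
import random

def permutation_pvalue(x, statistic, n_perm=999, seed=12345):
    """Pseudo-probability of an observed statistic by relabelling locations.

    x         : list of observations, ordered by location
    statistic : callable mapping such a list to a number
                (large values speak against the null)
    n_perm    : number of permutations, B - 1 in the text
    """
    rng = random.Random(seed)
    observed = statistic(x)                 # step 1
    permuted = list(x)
    exceed = 0
    for _ in range(n_perm):                 # steps 2 to 4
        rng.shuffle(permuted)               # reassign values to locations, no replacement
        if statistic(permuted) >= observed:
            exceed += 1
    return observed, exceed / n_perm        # step 5

def half_contrast(values):
    """Toy statistic: |log variance ratio| between the first half and the rest."""
    h = len(values) // 2

    def var(v):
        m = sum(v) / len(v)
        return sum((a - m) ** 2 for a in v) / len(v)

    return abs(math.log(var(values[:h]) / var(values[h:])))
```

Step 6 then reduces to comparing the returned pseudo-probability with the nominal size. Because each permutation requires a full pass over all windows, the cost of the procedure grows with both the number of permutations and the size of the window set, which is why the text describes it as computationally demanding.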

3. Evaluating Size and Power of Scan tests

This section evaluates the behaviour of the permuted Scan tests introduced in Section 2, applied to the LS residuals of a linear model without spatial effects. To this end, we conduct a Monte Carlo study with the following set-up:

i. A linear equation is specified, including one regressor ($x_i$) plus a constant term:

$$ y_i = 3 + 2x_i + \varepsilon_i \qquad (15) $$

The values of the parameters guarantee an expected R2 coefficient close to 0.8.

ii. The observations of the regressor $x_i$ have been obtained from a unit uniform distribution, $x_i \sim U(0,1)$.

iii. Regarding the error term, we consider nine different situations:

● DGP1: the error terms are distributed as a N(0,1);

● DGP2: the error terms are distributed as a $\chi^2(2)$;

● DGP3: the error terms are distributed as a Beta(0.5,0.5);

● DGP4: the error terms are distributed as a Log-N(0,1);

● DGP5: the error terms are distributed as a Binomial distribution, B(R,0.1);

● DGP6: the error terms are distributed as a weighted average of a $\chi^2(2)$ and a Student’s t distribution with 2 degrees of freedom, t(2); the unit weights are obtained randomly in the interval (0;1)¹;

● DGP7: the error terms are heteroskedastic, with a random spatial structure in the variance;

● DGP8: the error terms follow a spatial autoregressive process, SAR, with a coefficient of 0.5;

● DGP9: the error terms follow a spatial moving-average process, SMA, with a coefficient of 0.5.

¹ The resulting variable is a stochastic mixture of two non-normal distributions, which generates a random variable with an unknown distribution. We call it a “mixture error structure” (see Lin et al., 2010).

DGPs 1 to 6 correspond to the null hypothesis. DGPs 7 to 9 are heteroskedastic but only in DGP8 and DGP9 does the variance have a spatial structure. However, these two processes do not conform to what is usually meant by SGWH (we refer to them as borderline processes).

iv. Square regular lattices of orders (6x6), (7x7), (10x10) and (15x15) have been used as the spatial support for the data, which means sample sizes (R) of 36, 49, 100 and 225 observations, respectively.

v. Θ is the set of all elliptical windows whose centre corresponds to the centroid of each lattice, with an eccentricity of e = 1, 1.5, 2, 3, 4, 5, 10 and a rotation angle of 10º, 20º, ..., 180º. Moreover, the number of locations entering any given window must not exceed 50% of all locations.

vi. Each combination has been repeated 1000 times. The number of permutations for each simulated dataset, in order to compute the pP-values, has been 999, so B=1000.
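To make the window set of item v more concrete, the following Python sketch enumerates elliptical windows on an n x n lattice. Several choices are assumptions made for the illustration and are not taken from the text: windows are centred on every cell centroid, the eccentricity e is read as the ratio of the semi-major to the semi-minor axis (so that e = 1 is a circle), and the grid of semi-minor axis lengths `minor_axes` is hypothetical because the text does not report the window sizes. The function name `elliptical_windows` is also hypothetical.

```python
import math

def elliptical_windows(n, shapes=(1, 1.5, 2, 3, 4, 5, 10),
                       angles_deg=range(10, 181, 10),
                       minor_axes=(0.5, 1.0, 1.5, 2.0, 3.0),
                       max_share=0.5):
    """Enumerate distinct elliptical windows on an n x n regular lattice.

    Each window is a frozenset of cell indices (row * n + col).
    Windows holding more than max_share of all cells are discarded.
    """
    cells = [(r, c) for r in range(n) for c in range(n)]
    windows = set()
    for (r0, c0) in cells:                      # assumed: one centre per cell centroid
        for e in shapes:
            for b in minor_axes:                # semi-minor axis (hypothetical grid)
                a = e * b                       # semi-major axis
                for deg in angles_deg:
                    t = math.radians(deg)
                    cos_t, sin_t = math.cos(t), math.sin(t)
                    members = []
                    for (r, c) in cells:
                        dx, dy = c - c0, r - r0
                        u = dx * cos_t + dy * sin_t       # coordinate along the major axis
                        v = -dx * sin_t + dy * cos_t      # coordinate along the minor axis
                        # small tolerance guards against rounding on the boundary
                        if (u / a) ** 2 + (v / b) ** 2 <= 1.0 + 1e-9:
                            members.append(r * n + c)
                    if len(members) <= max_share * n * n:
                        windows.add(frozenset(members))   # duplicates collapse in the set
    return windows
```

Storing windows in a set removes the many duplicates produced by different shape and angle combinations that select the same cells (for instance, all rotations of a circle), so each distinct window is evaluated only once.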

3.1. Estimated size

Table 1 shows the estimated size for the Scanσ and Scanσ,μ tests under the nine DGPs. Our conclusions can be summarized as follows:

(i) Under DGP1, with ε ~ i.i.d. N(0,1), the estimated size for both tests is close to the nominal value of 0.05, even for very small sample sizes.

(ii) Under DGP7, random heteroskedasticity, the estimated size for the Scanσ and Scanσ,μ tests is also generally correct.

(iii) Overall, the two Scan tests behave properly even if the error terms are non-normally distributed. These tests are quite robust to departures from normality.

(iv) Both tests are oversized in the borderline cases (DGP8 and DGP9), where the error term exhibits patterns of spatial dependence. The impact is clearer for the Scanσ,μ test and increases with the sample size.

(v) These results confirm the difficulty of isolating the symptoms of spatial dependence and heteroskedasticity, as shown by Anselin and Griffith (1988), Kelejian and Robinson (1995), Griffith (2008), Bera and Simlai (2004) or Mur and Angulo (2009).

Table 1: Estimated size. Main results

Table 1a: Estimated size for the Scanσ test. Share of pP-values < 0.05

Lattice   DGP1   DGP2   DGP3   DGP4   DGP5   DGP6   DGP7   DGP8   DGP9
6x6       0.056  0.061  0.084  0.081  0.052  0.044  0.052  0.077  0.074
7x7       0.039  0.073  0.078  0.056  0.056  0.048  0.058  0.062  0.081
10x10     0.049  0.054  0.065  0.060  0.051  0.067  0.042  0.089  0.095
15x15 ²   0.049  0.055  0.056  0.052  0.059  0.049  0.046  0.109  0.111

Table 1b: Estimated size for the Scanσ,μ test. Share of pP-values < 0.05

Lattice   DGP1   DGP2   DGP3   DGP4   DGP5   DGP6   DGP7   DGP8   DGP9
6x6       0.074  0.060  0.070  0.057  0.038  0.061  0.080  0.198  0.211
7x7       0.061  0.054  0.052  0.048  0.081  0.060  0.065  0.198  0.265
10x10     0.042  0.064  0.053  0.055  0.130  0.056  0.044  0.207  0.251
15x15     0.050  0.041  0.041  0.053  0.031  0.052  0.051  0.321  0.373

* 999 bootstrap permutations
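When reading the estimated sizes, it helps to keep the simulation noise in mind. The short Python sketch below computes an empirical rejection rate and an approximate 95% band around the nominal level for a given number of replications. The function names are hypothetical and the band relies on a normal approximation to the binomial, which is an assumption of this sketch rather than part of the paper.

```python
import math

def rejection_rate(p_values, alpha=0.05):
    """Share of replications whose p-value falls below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)

def size_band(nominal=0.05, n_rep=1000, z=1.96):
    """Approximate band for an estimated size when the true size equals nominal.

    Uses the binomial standard error sqrt(nominal * (1 - nominal) / n_rep)
    with a normal approximation (z = 1.96 for roughly 95% coverage).
    """
    se = math.sqrt(nominal * (1.0 - nominal) / n_rep)
    return nominal - z * se, nominal + z * se
```

With 1000 replications the standard error is about 0.007, so estimated sizes roughly between 0.036 and 0.064 are compatible with a true size of 0.05. Values such as those reported for DGP8 and DGP9 lie clearly outside this range.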

3.2. Estimated Power

As pointed out by Anselin and Bera (1998, p***), spatial autocorrelation and heteroskedasticity may be observationally equivalent in cross-sections: ‘For example, a spatial cluster (…) of extreme residuals may be interpreted as due to spatial heterogeneity (e.g., groupwise heteroskedasticity) or to spatial autocorrelation’. There is a high risk of misinterpreting the symptoms, especially if the variance follows some regular spatial pattern (see, for instance, Bera and Simlai, 2004). Obviously, if the symptoms are misinterpreted, decisions will be erroneous and the inference probably wrong. Therefore, it is of great practical importance to develop tests capable of detecting and discriminating between different forms of spatially structured heteroskedasticity and/or spatial dependence. Our impression is that the Scan tests constitute a way of tackling this problem.

² (Table 1a, 15x15 case) 469 computing hours were needed to complete this case using parallel computing with 8 cores.

Overall, we may think of two types of spatially varying variances: discrete or continuous. The first means that there are blocks of locations that share the same variance, which differs from block to block; this corresponds to the traditional interpretation of SGWH (Baumont et al., 2003; Ertur et al., 2006; Ramajo et al., 2008). Continuous instability should be interpreted in analogy with the concept of ‘parameter surface’ in the Geographically Weighted Regression literature, where the parameters associated with each location in space are the image of the corresponding location on the surface (Fotheringham et al., 1999). Yan (2007) introduces the term Spatial Stochastic Volatility in reference to a process, i.e. $\{\varepsilon_i;\ i = 1, 2, \ldots, R\}$, whose variance changes smoothly over space:

$$ V(\varepsilon_i) = h(xc_i; yc_i) \qquad (16) $$

where $(xc_i, yc_i)$ are the spatial coordinates of location i.

In what follows, we investigate the behaviour of the Scan tests for these two categories of variance instability. For the case of discrete SGWH, we consider four different patterns. In the first two cases, the skedastic cluster is formed by 11 locations and has an elliptic shape (SGWH1 and SGWH2 in Figure 1 below). The number of locations included in this skedastic cluster (that is, the size of the cluster) remains the same for the different sample sizes used in the experiment; this means that the symptoms of instability become weaker as the sample size increases. The two other cases exhibit an East-West dichotomy with different values of the variance in each regime (SGWH3 and SGWH4 in Figure 1). The size of the cluster in the last two cases is proportional to the size of the sample.

Three cases of a continuous spatial pattern in the variance have been considered. The first, SGWH5, is inspired by the study of Casetti and Can (1999), where the variance of the error terms is expanded into a monotonic function of the distance of each location to the geographical central point of the system. The second case, SGWH6, reflects a continuous East-West variation, while the third, SGWH7, extends the Casetti and Can (1999) example by considering two central foci.

Figure 1: Spatial skedastic patterns: discrete and continuous processes.

[Columns: DGP; Function; Example map for the 10x10 case]

SGWH1: $h(xc_i, yc_i) = 0.9$ if $(xc_i, yc_i)$ is inside the ellipse (a); $h(xc_i, yc_i) = 0.1$ if $(xc_i, yc_i)$ is outside the ellipse.

SGWH2: $h(xc_i, yc_i) = 0.7$ if $(xc_i, yc_i)$ is inside the ellipse (a); $h(xc_i, yc_i) = 0.3$ if $(xc_i, yc_i)$ is outside the ellipse.

SGWH3: $h(xc_i; yc_i) = 0.1$ if $yc_i \le Me_{yc}$; $h(xc_i; yc_i) = 0.9$ if $yc_i > Me_{yc}$.

SGWH4: $h(xc_i; yc_i) = 0.3$ if $yc_i \le Me_{yc}$; $h(xc_i; yc_i) = 0.7$ if $yc_i > Me_{yc}$.

SGWH5: $h(xc_i; yc_i)$ is an exponential function, with coefficient 0.05, of the squared deviations $(xc_i - Me_{xc})^2$ and $(yc_i - Me_{yc})^2$ from the central point.

SGWH6: $h(xc_i; yc_i) = yc_i / \max(yc_i)$.

SGWH7: $h(xc_i; yc_i)$ combines the factor 0.5, the deviation of $yc_i$ from $Me_{yc}$ and the exponential term in $0.05$, $(xc_i - Me_{xc})^2$ and $(yc_i - Me_{yc})^2$ used in SGWH5.

$Me_{xc}$ is the median of the xc coordinates. $Me_{yc}$ is the median of the yc coordinates. (a) The ellipse has the same size, 11 cells, for all the cases.

Table 3 shows the results of the two Scan tests considered so far. We also include the results obtained for two well-known classical tests of heteroskedasticity, the Breusch-Pagan test, BP, and the test of White (Breusch and Pagan, 1979, and White, 1980, respectively). It is worth recalling that, in all cases, these tests have been applied to the LS residuals of an estimated equation similar to that of (15).

Table 3: Power of the Scan tests. Share of pP-values < 0.05

                  Discrete SGWH                   Continuous SGWH
Test      R       SGWH1  SGWH2  SGWH3  SGWH4     SGWH5  SGWH6  SGWH7
Scanσ     36      0.604  0.148  0.634  0.127     0.340  0.463  0.008
          49      0.725  0.161  0.785  0.184     0.479  0.550  0.032
          100     0.853  0.197  0.997  0.434     0.889  0.896  0.309
          225     0.890  0.170  1.000  0.812     1.000  0.999  0.996
Scanσ,μ   36      0.307  0.081  0.271  0.077     0.090  0.179  0.180
          49      0.455  0.086  0.411  0.078     0.109  0.312  0.504
          100     0.606  0.062  0.948  0.078     0.303  0.729  0.153
          225     0.707  0.050  1.000  0.213     To do  To do  To do
BP        36      0.147  0.082  0.119  0.054     0.099  0.073  0.129
          49      0.256  0.088  0.091  0.063     0.059  0.073  0.088
          100     0.333  0.066  0.147  0.079     0.087  0.091  0.106
          225     0.220  0.053  0.135  0.077     0.166  0.083  0.102
White     36      0.012  0.035  0.031  0.039     0.027  0.024  0.037
          49      0.042  0.052  0.038  0.049     0.019  0.015  0.075
          100     0.072  0.054  0.034  0.044     0.025  0.019  0.037
          225     0.019  0.040  0.031  0.034     0.023  0.049  0.087

* 999 bootstrap permutations; ** 299 bootstrap permutations for the 15x15 lattices

We would like to highlight the following results:

(i) Overall, the estimated power of the Scan tests is higher than that obtained for the classical heteroskedasticity tests.

(ii) The behaviour of the Scan tests improves as the sample size increases (we would require a minimum of 100 observations) and when there is a large difference between the variances of the two regimes.

(iii) The Scanσ test appears to have greater power than the Scanσ,μ test, especially for continuous patterns of heteroskedasticity.

(iv) The behaviour of the classical tests, BP and White, is very poor regardless of the skedastic pattern and/or sample size. They are not designed to deal with the spatial nature of the data and, consequently, they produce bad results.

4. Local Sensibility and Spatial Precision

Section 3 reports the estimated size and power of the Scan tests under different heteroskedasticity patterns; the results seem encouraging. However, when the null hypothesis of homoskedasticity is rejected, another important question emerges: the need to identify, as accurately as possible, the heteroskedasticity pattern present in the data. It is clear that, in order to improve the specification, the researcher needs to know where and how the clusters in the variance arise. We define Local Sensibility (LS) as the proportion of times, in the Monte Carlo experiment that follows, that each cell is selected as a member of the significant cluster (MLC) over the number of times that the test rejects the null; that is:

$$ LS(i) = \frac{\text{Number of times that location } i \text{ is assigned to the MLC}}{\text{Number of times that the test rejects the null hypothesis}} \qquad (17) $$

Indeed, the maximum number of times that a location can be selected as belonging to the MLC is the number of times that the test rejects the null. A procedure such as the Scan technique will be useful for the researcher if its LS is close to 1 for the cells that really belong to the true variance cluster, and close to 0 for the cells that do not. Figure 2 shows the estimated LS corresponding to the four discrete SGWH patterns introduced in Section 3 (SGWH1 to SGWH4) using the Scanσ and Scanσ,μ tests. The data represented in this figure correspond to the percentiles of the decisions taken with respect to each cell in the spatial lattice.
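The ratio in (17) is straightforward to compute once each Monte Carlo replication has stored its Most Likely Cluster and its test decision. The following Python sketch shows one way to do it; the function name `local_sensibility` and the input layout (one set of location indices and one boolean per replication) are assumptions made for the example.

```python
def local_sensibility(mlc_sets, rejected, n_locations):
    """LS(i) of (17) for every location i = 0, ..., n_locations - 1.

    mlc_sets    : list with one set of location indices (the MLC) per replication
    rejected    : list of booleans, True when the test rejected H0 in that replication
    n_locations : number of locations R
    """
    n_reject = sum(rejected)
    if n_reject == 0:
        # LS is undefined when the null is never rejected
        return [float("nan")] * n_locations
    counts = [0] * n_locations
    for mlc, rej in zip(mlc_sets, rejected):
        if rej:                              # only significant clusters are counted
            for i in mlc:
                counts[i] += 1
    return [c / n_reject for c in counts]
```

For instance, with four replications whose clusters are {0, 1}, {1, 2}, {1} and {3}, and with the third replication not rejecting the null, the function returns 1/3, 2/3, 1/3 and 1/3 for locations 0 to 3.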

Figure 2: Local Sensibility of Scanσ and Scanσ,μ for discrete heteroskedasticity patterns. [Panels: one row per pattern (SGWH1 to SGWH4) and test (Scanσ, Scanσ,μ); one column per sample size, R=36, R=49, R=100 and R=225.]

Recall that the Scan tests classify each cell as belonging, or not, to a spatial cluster of common variance. This is a binary decision, probably not very adequate in the case of a continuous pattern of heteroskedasticity, such as those appearing in Figure 3. In any case, it is clear that the Scan tests also offer valuable information about the spatial structure of the variance in this situation.

Figure 3: Local Sensibility of Scanσ and Scanσ,μ for continuous heteroskedasticity patterns. [Panels: one row per pattern (SGWH5 to SGWH7) and test; one column per sample size, R=36, R=49, R=100 and R=225. The Scanσ,μ panels are marked "to do" for all patterns and sample sizes.]

Kulldorff et al. (2009) and Huang et al. (2008) proposed several measures of accuracy to evaluate the ability of the Scan technique to correctly identify spatial clusters. One of these indicators, called Sensitivity, Sens, is simply the proportion of cells belonging to the true cluster that are correctly classified as members of the cluster. Another indicator, called Inverse of Sensitivity, Isens, measures the proportion of cells in the estimated cluster that are wrongly assigned to it. Both indicators range over [0,1]. Sensitivity can be related to the concept of statistical power, with a value of 0 indicating poor precision and 1 perfect precision (i.e., a large value implies that most of the true cells of the cluster have been identified). Inverse of Sensitivity can be related to size, given that it measures false positive assignments to spatial clusters. A value near zero in this index is desirable, although it is very important to attain a good balance between both ratios. For example, a method that declares the entire geographic area to be a significant cluster will show a high Sensitivity (in fact, all the cells in the true cluster will be declared members of the cluster) at the cost of a high Inverse of Sensitivity (many cells would be incorrectly assigned to the cluster), rendering the procedure useless. The main results for the discrete heteroskedasticity patterns appear in Table 4.
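The two accuracy indicators can be written down directly from their verbal definitions. The Python sketch below does so for a single replication; the function name `sens_isens` and the representation of clusters as collections of cell indices are assumptions, and the text does not state how the indicators are aggregated across replications.

```python
def sens_isens(true_cluster, estimated_cluster):
    """Sensitivity and Inverse of Sensitivity for one estimated cluster.

    Sens  : share of the true cluster's cells that were detected
    Isens : share of the estimated cluster's cells that are not in the true cluster
    """
    true_cluster = set(true_cluster)
    estimated_cluster = set(estimated_cluster)
    hits = len(true_cluster & estimated_cluster)
    sens = hits / len(true_cluster)
    isens = (len(estimated_cluster) - hits) / len(estimated_cluster)
    return sens, isens
```

As a worked case, if the true cluster has 11 cells and the estimated cluster has 10 cells of which 8 are correct, then Sens = 8/11 (about 0.73) and Isens = 2/10 = 0.20. Declaring all 100 cells of a 10x10 lattice as the cluster gives Sens = 1 but Isens = 0.89, which is the degenerate situation described above.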

Table 4. Accuracy indicators for the Scan tests for discrete heteroskedasticity patterns

                    Scanσ                           Scanσ,μ
                    SGWH1  SGWH2  SGWH3  SGWH4     SGWH1  SGWH2  SGWH3  SGWH4
R=36   Sens         0.76   0.68   0.78   0.59      0.77   0.58   0.79   0.49
       Isens        0.14   0.37   0.03   0.18      0.13   0.38   0.03   0.22
       Avg. size    10.1   12.6   14.5   13.2      10.1   11.1   14.8   11.6
R=49   Sens         0.77   0.74   0.75   0.61      0.75   0.57   0.76   0.47
       Isens        0.21   0.43   0.01   0.09      0.19   0.48   0.01   0.09
       Avg. size    11.5   16.0   21.2   18.8      10.9   13.5   21.5   14.8
R=100  Sens         0.77   0.75   0.89   0.65      0.79   0.61   0.90   0.59
       Isens        0.22   0.58   0.01   0.06      0.18   0.63   0.01   0.05
       Avg. size    12.7   25.3   45.1   35.0      11.8   21.5   45.2   31.6
R=225  Sens         0.76   0.71   0.91   0.74      0.77   0.41   0.91   0.78
       Isens        0.20   0.60   0.00   0.03      0.17   0.74   0.00   0.02
       Avg. size    12.3   37.0   109.9  91.4      11.4   28.1   109.9  95.5

Avg. size: average size of the cluster identified in each case, in terms of the number of cells assigned to the cluster.

From Table 4 we can note the following points:

(i) Sensitivity does not depend on the sample size.

(ii) Sensitivity reacts positively to the difference between the variances inside and outside the cluster.

(iii) Inverse of Sensitivity increases when the difference between the variances is small, and also when the cluster is small with respect to the whole sample. For example, in SGWH2 with the Scanσ test, with a cluster of 11 cells, Isens is equal to 0.43 with R=49 but increases to 0.60 with R=225.

(iv) The Scan tests tend to overestimate the size of the cluster in SGWH2, especially when the difference between the variances is small.

5. Conclusion

The Scan test is a simple and powerful method to identify SGWH in the residuals of a regression model. The principal advantage of this test is that it does not require prior information about the pattern of instability in the variance. Moreover, the test delivers as an output the spatial cluster of locations with higher (or lower) variance. An obvious problem with the Scan methodology is that it is extremely demanding in terms of computing time. For spatial processes, it is possible to develop similar tests to identify SGWH in the residuals of SAR or SEM models. In these cases, evaluating the p-values by permutational bootstrapping makes the computing time needed to design a full Monte Carlo experiment extremely long. We believe that an approach based on the Gumbel distribution can be explored as an alternative. The Gumbel distribution has been used to obtain Scan p-values in other settings with excellent results (Abrams et al., 2010).

References

Abrams, A.M., Kleinman, K., Kulldorff, M. (2010): Gumbel based p-value approximations for spatial scan statistics. International Journal of Health Geographics, 9(1), 61.

Anselin, L., Bera, A. (1998): Spatial Dependence in Linear Regression Models with an Introduction to Spatial Econometrics. In D. Giles and A. Ullah (eds.): Handbook of Applied Economic Statistics (pp. 237-289). New York: Dekker.

Anselin, L., Griffith, D. (1988): Do spatial effects really matter in regression analysis? Papers of the Regional Science Association, 65, 11-34.

Baumont, C., Ertur, C., Le Gallo, J. (2003): Spatial Convergence Clubs and the European Regional Growth Process, 1980-1995. In B. Fingleton (ed.): European Regional Growth (pp. 131-158). Berlin: Springer.

Bera, A., Simlai, P. (2004): Testing spatial autoregressive models and a formulation of spatial ARCH (SARCH) model with applications. Manuscript. Department of Economics, University of Illinois.

Breusch, T., Pagan, A. (1979): A simple test for heteroscedasticity and random coefficient variation. Econometrica, 47, 1287-1294.

Casetti, E., Can, A. (1999): The econometric estimation and testing of DARP models. Journal of Geographical Systems, 1, 91-106.

Drton, M., Williams, B. (2011): Quantifying the failure of bootstrap likelihood ratio tests. Biometrika, 98, 919-934.

Dwass, M. (1957): Modified Randomization Tests for Nonparametric Hypotheses. Annals of Mathematical Statistics, 28, 181-187.

Engle, R. (1984): Wald, Likelihood Ratio, and Lagrange Multiplier Tests in Econometrics. In Z. Griliches and M. Intriligator (eds.): Handbook of Econometrics, Vol. II (pp. 775-826). Amsterdam: North Holland.

Ertur, C., Le Gallo, J., Baumont, C. (2006): The Regional Convergence Process, 1980-1995: Do Spatial Regimes and Spatial Dependence Matter? International Regional Science Review, 29, 3-34.

Fotheringham, S., Zhan, B. (1996): A Comparison of Three Exploratory Methods for Cluster Detection in Spatial Point Patterns. Geographical Analysis, 28, 200-218.

Fotheringham, A., Charlton, M., Brunsdon, C. (1999): Geographically Weighted Regression: a Natural Evolution of the Expansion Method for Spatial Data Analysis. Environment and Planning A, 30, 1905-1927.

Goldfeld, S., Quandt, R. (1965): Some Tests for Homoscedasticity. Journal of the American Statistical Association, 60, 539-547.

Griffith, D. (2003): Spatial Autocorrelation and Spatial Filtering. Berlin: Springer.

Huang, L., Pickle, W., Das, B. (2008): Evaluating spatial methods for investigating global clustering and cluster detection of cancer cases. Statistics in Medicine, 27, 5111-5142.

Kelejian, H.H., Robinson, D.P. (1995): The influence of spatially correlated heteroskedasticity on tests for spatial correlation. In L. Anselin and R.G.J.M. Florax (eds.): New Directions in Spatial Econometrics. Berlin: Springer-Verlag.

Kelejian, H.H., Robinson, D.P. (1998): A suggested test for spatial autocorrelation and/or heteroskedasticity and corresponding Monte Carlo results. Regional Science and Urban Economics, 28, 389-417.

Kulldorff, M., Nagarwalla, N. (1995): Spatial disease clusters: Detection and Inference. Statistics in Medicine, 14, 799-810.

Kulldorff, M., Huang, L., Konty, K. (2009): A scan statistic for continuous data based on the normal probability model. International Journal of Health Geographics, 8, 58-73.

Lesage, J., Pace, K. (2009): Introduction to Spatial Econometrics. Boca Raton: Chapman & Hall/CRC.

Lin, P., Hartz, S., Zhang, Z., Saccone, S., Wang, J., Tischfield, J., Edenberg, H., Kramer, J., Goate, A., Bierut, L., Rice, J. (2010): A New Statistic to Evaluate Imputation Reliability. PLoS ONE, 5, e9697. doi:10.1371/journal.pone.0009697

Mur, J., Angulo, A. (2009): Model selection strategies in a spatial setting: Some additional results. Regional Science and Urban Economics, 39, 200-213.

Openshaw, S., Charlton, E., Wymer, C., Craft, A. (1987): A Mark 1 Geographical Analysis Machine for the Automated Analysis of Point Data Sets. International Journal of Geographical Information Systems, 1, 359-377.

Ord, J.K., Getis, A. (2012): Local spatial heteroscedasticity (LOSH). Annals of Regional Science, 48, 529-539.

Ramajo, J., Márquez, M., Hewings, G., Salinas, M. (2008): Spatial Heterogeneity and Interregional Spillovers in the European Union: Do Cohesion Policies Encourage Convergence across Regions? European Economic Review, 52, 551-567.

Rao, R. (1971): Statistical Inference and its Applications. New York: Wiley.

White, H. (1980): A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica, 48, 817-838.

Yan, J. (2007): Spatial stochastic volatility for lattice data. Journal of Agricultural, Biological, and Environmental Statistics, 12, 25-40.