Dependence and Regimes in Applied Spatial Regression Analysis
Paper to be presented at the 36th European Regional Science Association Congress, Zurich, 1996

Jørgen Lauridsen
Institute of Economics, Odense University, DK-5230 Odense M. Fax +45 66158790 E-mail [email protected]

Abstract. Recent results have shown that the presence of geographical dependency among regions in a cross section has serious consequences for the reliability of traditional tests for structural (in)stability. In the present paper, it is illustrated how the Chow-test for switching regimes is affected by geographical dependency. Consequently, asymptotic variances of the Chow-test, which implement geographical correlation processes in the error term of a linear regression model, are set up. Applications of these Chow-tests are illustrated on models explaining municipal elderly care services in a cross section of 275 Danish municipalities. The derivation of different regime structures, using natural regimes, univariate sorting and multivariate clustering, is illustrated and evaluated.

1. Introduction.

In empirical regional research, it is usually assumed that the relations under consideration are stable over the spatial structure. Danish political research is no exception to this. Almost all investigations of intermunicipal service variation explain it by linear regression of some service measure on a set of explanatory variables. In these policy-output investigations, the municipalities are assumed to be in an equilibrium state, such that a common model with fixed coefficients can be assumed for all municipalities. Opposed to this, Lauridsen (1995, 1996) provides evidence that there are strong heterogeneities in these models, as the error terms for the models seem to vary systematically with the explanatory variables. In the present paper, the focus is on models accounting for heterogeneity of a spatial nature.
Specifically, spatial heterogeneity will be formulated as regime models. In connection with this, the implications of another spatial effect, spatial dependence among the spatial units, will be considered. It is well known (see Lauridsen (1995, 1996) and Anselin (1988)) that the presence of spatially autocorrelated error terms might have serious consequences for the estimated model parameters, their significance, and misspecification tests. For example, the power of the popular Breusch-Pagan test for heteroscedasticity is uncontrollable if the error terms are spatially autocorrelated. This paper aims to formulate and estimate regime models and to test these models against a common model by a spatial Chow-test, suggested by Anselin (1990), taking into account the implications of spatial dependence.

The paper consists of 6 parts. Following this introduction, part 2 reviews models which implement spatial effects in the form of spatial regimes and spatially autocorrelated error terms. After this, a spatial Chow-test for structural regimes is presented in part 3. As the application of the Chow-test presupposes that the regime structure is defined in advance, part 4 will focus briefly on different heuristics for developing regime structures. In part 5, a practical case is considered, which is a set of models explaining the intermunicipal variation in elderly care. Following Lauridsen (1995, 1996), where strong evidence of spatial dependency and structural instability in these models is provided, structural regimes will be developed and evaluated by the spatial Chow-test. Finally, part 6 consists of a few conclusions and suggestions for future research.

2. Structural instability and spatially autocorrelated errors in regression models.

In linear regression models based on cross section observations, spatial heterogeneity might be defined in a variety of forms. The form considered in this paper is specified by assuming a separate set of regression coefficients for each regime.
In this layout, a test for homogeneity corresponds to a test for equal coefficients across the regimes. Formally, the presence of spatially autocorrelated errors will disturb such tests, as they are based on the assumption that the covariance matrix for the error terms is a diagonal matrix. Spatial autocorrelation implies that errors in contiguous spatial units will have a non-zero covariance, breaking the assumption of a diagonal covariance matrix. For the moment, it will be assumed that the regime structure is known; deduction of regime structures will be discussed in part 4.

On a spatial structure of $n$ units, $G$ spatial regimes are defined, each consisting of $n_g$ spatial units, $g = 1, 2, \dots, G$, such that $n = n_1 + n_2 + \dots + n_G$. Assuming $k$ explanatory variables, of which one might be a constant term, the unlimited model for each regime reads as

$$y_g = X_g \beta_g + \varepsilon_g, \qquad E(\varepsilon_g \varepsilon_g') = \Psi_g, \qquad g = 1, 2, \dots, G,$$

where $y_g$ is an $n_g$ vector of observations for the dependent variable, $X_g$ is an $n_g$ by $k$ matrix of explanatory variables, $\beta_g$ is a $k$ vector of regression coefficients, $\varepsilon_g$ is an $n_g$ vector of error terms, and $\Psi_g$ is an $n_g$ by $n_g$ matrix of variances and covariances. Opposed to the unlimited model, the limited model, assuming equal regression coefficients for all regimes, reads as

$$y_g = X_g \beta + \varepsilon_g, \qquad E(\varepsilon_g \varepsilon_g') = \Psi_g, \qquad g = 1, 2, \dots, G,$$

the only difference being that the vector of regression coefficients, $\beta$, is the same for each regime. For notational simplicity, both models might be written compactly as

$$y = X\beta + \varepsilon, \qquad E(\varepsilon \varepsilon') = \Psi, \tag{1}$$

defining for both models $y = (y_1', \dots, y_G')'$, $\varepsilon = (\varepsilon_1', \dots, \varepsilon_G')'$, and $\Psi = \mathrm{diag}(\Psi_1, \dots, \Psi_G)$. For the unlimited model, define $X = \mathrm{diag}(X_1, \dots, X_G)$ and $\beta = (\beta_1', \dots, \beta_G')'$, whereas the corresponding definitions for the limited model are $X = (X_1', \dots, X_G')'$ and $\beta$ a $k$ vector of coefficients.
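The two stacked forms of model (1) can be sketched numerically. The following is a minimal illustration in Python with NumPy, not the author's implementation; the regime sizes, coefficients, and the helper names `stack_limited` and `stack_unlimited` are hypothetical and chosen only for this example.

```python
import numpy as np

def stack_limited(X_blocks):
    """Limited model: stack the regime blocks; one common coefficient vector."""
    return np.vstack(X_blocks)

def stack_unlimited(X_blocks):
    """Unlimited model: block-diagonal design; one coefficient vector per regime."""
    n = sum(X.shape[0] for X in X_blocks)
    k_total = sum(X.shape[1] for X in X_blocks)
    X = np.zeros((n, k_total))
    r = c = 0
    for Xg in X_blocks:
        ng, k = Xg.shape
        X[r:r + ng, c:c + k] = Xg  # place regime g on the diagonal
        r += ng
        c += k
    return X

# Simulated data (assumption): G = 2 regimes, k = 2 (constant plus one regressor).
rng = np.random.default_rng(0)
n_g = [30, 40]
X_blocks = [np.column_stack([np.ones(m), rng.normal(size=m)]) for m in n_g]
betas = [np.array([1.0, 2.0]), np.array([3.0, -1.0])]
y = np.concatenate([
    Xg @ b + 0.1 * rng.normal(size=Xg.shape[0])
    for Xg, b in zip(X_blocks, betas)
])

X_L = stack_limited(X_blocks)      # n by k
X_U = stack_unlimited(X_blocks)    # n by G*k
beta_L, *_ = np.linalg.lstsq(X_L, y, rcond=None)  # OLS, limited model
beta_U, *_ = np.linalg.lstsq(X_U, y, rcond=None)  # OLS, unlimited model
```

With spherical errors, least squares on the block-diagonal design is the same as running a separate regression within each regime, which is why `beta_U` recovers the regime-specific coefficients.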
Assuming the heterogeneity to be fully captured by the varying coefficients, the covariance matrix for the error terms of each regime reads as

$$\Psi_g = \sigma^2 I_{n_g}, \qquad g = 1, \dots, G,$$

where $\sigma^2$ is the common error variance for all spatial units, and $I_{n_g}$ is an $n_g$ dimensional identity matrix. Following this assumption, the covariance matrix for the entire system becomes the well-known spherical form

$$\Psi = \mathrm{diag}(\Psi_1, \dots, \Psi_G) = \sigma^2 I_n,$$

where $I_n$ is an $n$ dimensional identity matrix, so that models of the form (1) might simply be estimated by Ordinary Least Squares regression (OLS). This is valid for the limited as well as the unlimited model.

If spatial dependence in the error terms is present, OLS loses its validity, as described in Lauridsen (1995) and Anselin (1988). In this case, assuming intra- as well as interdependence among spatial units in consecutive regimes, the error term for the single spatial unit in (1) consists of an autoregressive, interdependent part and an independent part. Formally,

$$\varepsilon = \lambda W \varepsilon + \mu, \qquad \mu \sim N(0, \sigma^2 I_n), \tag{2}$$

where $\lambda$ is an autocorrelation parameter, and the dependencies are specified in the $n$ by $n$ matrix $W$, defined as

$$W_{ij} = \begin{cases} 1, & \text{if regions } i \text{ and } j \text{ are assumed interdependent,} \\ 0, & \text{otherwise.} \end{cases}$$

In this definition, the dependent parts of the error terms are specified in the $n$ vector $W\varepsilon$, the $i$'th term of this product simply being the sum of the error terms in those regions with which region $i$ is assumed to be interdependent. The autocorrelation coefficient $\lambda$, assumed to be less than one in absolute value, measures the sensitivity of the error made in spatial region $i$ to the errors made in contiguous spatial units. When $W$ is row standardized, i.e. each element in $W$ is divided by the sum of the elements in the corresponding row, the product $W\varepsilon$ has as its $i$'th element the average, instead of the simple sum, of the errors in contiguous regions.
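The effect of row standardizing $W$ can be illustrated with a small numerical sketch in Python with NumPy. The four-region contiguity pattern, the error values, and the helper name `row_standardize` are assumptions made only for this example.

```python
import numpy as np

# Hypothetical contiguity for four regions on a line: 0-1, 1-2, 2-3.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

def row_standardize(W):
    """Divide each element of W by the sum of its row."""
    row_sums = W.sum(axis=1, keepdims=True)
    # Regions without neighbours keep a zero row (avoids division by zero).
    safe = np.where(row_sums == 0, 1.0, row_sums)
    return W / safe

eps = np.array([1.0, 2.0, 4.0, 8.0])   # hypothetical error terms
W_std = row_standardize(W)

lag_sum = W @ eps        # sums of neighbouring errors:     [2, 5, 10, 4]
lag_mean = W_std @ eps   # averages of neighbouring errors: [2, 2.5, 5, 4]
```

Region 1, for instance, borders regions 0 and 2, so the binary matrix gives 1 + 4 = 5 while the row-standardized matrix gives the average 2.5.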
This definition of $W$ is rather ad hoc; other specifications, giving rise to weighted averages in the product $W\varepsilon$, might be considered as well. Furthermore, the unsystematic, i.e. independent, error made in spatial unit $i$ is accounted for as element $i$ in the error vector $\mu$. A convenient way of rewriting (2) is

$$\varepsilon - \lambda W \varepsilon = \mu, \quad \text{or} \quad (I_n - \lambda W)\varepsilon = \mu, \quad \text{or, with } B = I_n - \lambda W, \quad \varepsilon = B^{-1}\mu.$$

Assuming the heterogeneity to be fully captured in the varying regression coefficients, the $\beta_g$'s, the covariance matrix for the error term $\varepsilon$ becomes

$$\Psi = E(\varepsilon\varepsilon') = E\big((B^{-1}\mu)(B^{-1}\mu)'\big) = B^{-1} E(\mu\mu') (B^{-1})' = \sigma^2 (B'B)^{-1}.$$

It is obvious from the definition of $B$ that this covariance matrix does not meet the assumption of a diagonal structure, which is fundamental for a series of test procedures in standard non-spatial econometrics. This is also of profound importance for tests for structural instability, which are in general based on the statistical distance between the estimated error terms for the unlimited and the limited models. For the case of the Chow-test, well known from a time series setup, nondiagonality of the $\Psi$ matrix in the spatial setup prevents direct application of the standard F-distributed test in a spatial regimes context. This will be discussed further in the next section.

3. Testing structural instability: A spatial Chow-test.

The Chow-test for structural instability is based on the quadratic form

$$C_G = e_L' \Sigma_U^{-1} e_L - e_U' \Sigma_U^{-1} e_U, \tag{3}$$

where $e_U$ and $e_L$ are the estimates of the error $\varepsilon$ in the unlimited and limited models respectively, and $\Sigma_U$ is the estimate of the covariance matrix $\Psi$ in the unlimited model. Essentially, this quadratic form measures the statistical distance between the sums of squared errors in the two models, relative to the variation in the unlimited model.
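The quadratic form (3) under the spatial error covariance $\sigma^2 (B'B)^{-1}$ can be sketched as follows in Python with NumPy. This is only an illustration under stated assumptions: the autocorrelation parameter is treated as known rather than estimated, the residual vectors are made-up numbers, and the function names `spatial_error_cov` and `chow_quadratic_form` are hypothetical.

```python
import numpy as np

def spatial_error_cov(W, lam, sigma2=1.0):
    """Covariance sigma^2 (B'B)^{-1} with B = I - lam * W."""
    B = np.eye(W.shape[0]) - lam * W
    return sigma2 * np.linalg.inv(B.T @ B)

def chow_quadratic_form(e_L, e_U, cov_U):
    """Quadratic form (3): e_L' S^{-1} e_L - e_U' S^{-1} e_U."""
    cov_inv = np.linalg.inv(cov_U)
    return float(e_L @ cov_inv @ e_L - e_U @ cov_inv @ e_U)

# Row-standardized weights for four regions on a line (assumption).
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])

cov = spatial_error_cov(W, lam=0.5)
off_diag = cov - np.diag(np.diag(cov))   # non-zero: the covariance is not diagonal

# Hypothetical residuals from the limited and unlimited models.
e_L = np.array([0.9, -1.1, 1.2, -0.8])
e_U = np.array([0.3, -0.2, 0.4, -0.1])

c_spatial = chow_quadratic_form(e_L, e_U, cov)
# With lam = 0 the covariance is the identity and the form reduces to the
# difference between the two sums of squared residuals.
c_spherical = chow_quadratic_form(e_L, e_U, spatial_error_cov(W, lam=0.0))
```

In an application, the autocorrelation parameter and the error variance would have to be estimated, and the residuals would come from the fitted limited and unlimited models; the sketch only shows how the covariance structure enters the statistic.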
From asymptotic theory, it is well known that, under the hypothesis of no structural instability, such a distance follows an asymptotic $\chi^2$ distribution with $q$ degrees of freedom, $q$ being the number of restricted coefficients, which for the models in (1) gives $q = k$. In a traditional OLS setup, the covariance estimate $\Sigma_U$ reduces to the diagonal form $s_U^2 I_n$, $s_U^2$ being the estimated common variance for the error terms.
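For the OLS case, where the covariance estimate is $s_U^2 I_n$, the statistic is the difference between the limited and unlimited sums of squared residuals divided by $s_U^2$. The following Python sketch with NumPy computes it on simulated data. The data, the helper names `ols_residuals` and `chow_ols`, and the choice of dividing by $n$ in the variance estimate are assumptions for illustration; dividing by $n - Gk$ is a common alternative.

```python
import numpy as np

def ols_residuals(X, y):
    """Residuals from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def chow_ols(y_blocks, X_blocks):
    """Chow statistic with spherical errors: (e_L'e_L - e_U'e_U) / s_U^2."""
    y = np.concatenate(y_blocks)
    # Limited model: one regression on the pooled data.
    e_L = ols_residuals(np.vstack(X_blocks), y)
    # Unlimited model: separate regressions per regime.
    e_U = np.concatenate(
        [ols_residuals(Xg, yg) for Xg, yg in zip(X_blocks, y_blocks)]
    )
    s2_U = e_U @ e_U / y.size   # assumption: variance estimate divides by n
    return float((e_L @ e_L - e_U @ e_U) / s2_U)

# Simulated example (assumption): two regimes of 100 units, k = 2.
rng = np.random.default_rng(1)
X_blocks = [np.column_stack([np.ones(100), rng.normal(size=100)]) for _ in range(2)]
noise = [rng.normal(size=100) for _ in range(2)]

beta_common = np.array([1.0, 2.0])
y_same = [Xg @ beta_common + u for Xg, u in zip(X_blocks, noise)]

beta_by_regime = [np.array([1.0, 2.0]), np.array([3.0, -1.0])]
y_shift = [Xg @ b + u for Xg, b, u in zip(X_blocks, beta_by_regime, noise)]

stat_same = chow_ols(y_same, X_blocks)    # regimes share coefficients
stat_shift = chow_ols(y_shift, X_blocks)  # regimes differ in coefficients
```

Each statistic would then be compared with a $\chi^2$ critical value with $q$ degrees of freedom; for $q = 2$ the 5 per cent critical value is about 5.99.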