XXXIV Reunión de Estudios Regionales. Política Regional Europea y su incidencia en España. Economía, sociedad y medio ambiente.
X Congreso de la Asociación Andaluza de Ciencia Regional. El olivar andaluz: Territorio y Economía

Baeza-Jaén, 27 to 29 November 2008

A Bottom-Up Approach to Forecast Regional Series: The Case of the Spanish labor market.

Jesús Mur (*), F. Javier Trívez (**), Ana Angulo (***)
Department of Economic Analysis, University of Zaragoza
Gran Vía, 2-4. (50005). Zaragoza.
(*) e-mail: [email protected] (**) e-mail: [email protected] (***) e-mail: [email protected]

Abstract

In recent decades there has been an interesting debate in the regional economics literature about how to obtain regional forecasts. We can distinguish two main approaches, namely the Top-Down and the Bottom-Up. The first is the most popular because it is simple and assures a good forecasting capacity, since the functioning of the model is almost entirely subordinated to the trends originating in the national economy. On the other hand, the design of a standard Bottom-Up model appears more attractive as it relies mainly on spatial relations. Our paper falls into the category of Bottom-Up approaches by using (static and dynamic) spatial panel data models. To this end, we discuss the case of the employment series corresponding to the Spanish provinces, with a very specific objective: forecasting the employment growth in each of the fifty Spanish provinces in the years 2006 and 2007. Our results are mixed in the sense that the best model combines the temporal dynamics of the series with the spatial structure of the labor markets.

Keywords: Regional Modeling; Forecasting; Employment
JEL Classification: C21; C50; R15
Acknowledgements: This work has been carried out with the financial support of project SEJ2006-02328/ECON of the Ministerio de Ciencia y Tecnología del Reino de España.

1- Introduction.

The field of panel data models has received considerable attention during the last decade (see Wooldridge, 2002, Arellano, 2003, Hsiao, 2003, for an overview). This interest may be seen as the confluence of several factors. On the one hand, panel data offer time series specialists a great opportunity to overcome the difficulties caused by small sample sizes.
On the other hand, the popularity of some topics, such as economic growth or demand analysis, combined with the pressure to obtain results, requires the development of more powerful tools, among which panel or VAR models are obvious candidates. From a purely spatial perspective, there is also a growing interest in completing the usual static analysis with aspects related to the temporal dynamics of the cross-sectional relationships. Finally, it must be acknowledged that the sources of statistical information with panel data have improved in number and in quality. The evolution has been particularly strong in the case of the spatial econometrics literature (Baltagi, 2002).

The objective of our paper is to review the current state of the literature dedicated to modeling spatio-temporal data. There have been numerous, and very interesting, proposals in this field in recent years. Most of them pertain to what may be called Bottom-Up methods, which first model the most basic spatial relations and then obtain the regional or national results by simple aggregation. Some of the recent contributions made in the fields of panel data models or spatial VAR support the Bottom-Up approach. This evolution contradicts the dominant Top-Down strategy, in which the regional or local models are only satellites connected to the national system. We think that, at present, there are good-quality data and enough analytical tools to advance in the general philosophy of Bottom-Up modeling. However, we wonder about the applicability and competitiveness of this approach.

In section two we present a general taxonomy for the category of spatio-temporal models. In the third section we focus the discussion on static models (mainly SUR and panel data models). The fourth section deals with the difficulties that arise from the introduction of temporal dynamic elements in these specifications.
In both cases, static and dynamic models, we pay special attention to the problems caused by space. In the fifth section we present an application to the case of Spanish employment by provinces. The specific objective of the application is to obtain forecasts for each of the fifty Spanish provinces in the years 2006 and 2007. For this purpose, we specify and estimate several competing models, among which the best appears to be a specification that combines the temporal dynamics with the spatial structure of the data. The paper finishes with a section of conclusions.

2. Space-Time models. A taxonomy.

In Table 1 we present a taxonomy of space-time models which includes a wide variety of alternatives. The arrangement attends to the time dimension of each alternative, and for this reason we talk about static or dynamic models. The problems associated with both categories of models are similar (identification, stationarity, nonlinearity, etc.), although the resolution of dynamic specifications with cross-sectional relationships is, technically, more difficult.

Table 1: A taxonomy of space-time models.

STATIC MODELS IN TIME
  Panel Data Models
      Spatial structure    Type of effects: FIXED    Type of effects: RANDOM
      SEM                  SEM+EF                    SEM+EA
      SLM                  SLM+EF                    SLM+EA
  SUR Models
      Spatial structure    Type of effects: FIXED    Type of effects: RANDOM
      SEM                  SEM+EF                    SEM+EA
      SLM                  SLM+EF                    SLM+EA
  Random Coefficient Models

DYNAMIC MODELS IN TIME
  Panel Data Models
      Purely Recursive in Space
      Recursive in Space and Time
      Spatially Simultaneous
      General Dynamic Spatiotemporal Model
  VAR Models
      Non-spatial VAR
      Spatial VAR

Panel data and SUR models, in the category of static approaches, are very general specifications which maintain clear relations. In fact, as Chamberlain (1982) points out, a panel data model may be interpreted as a SUR model in which we use an equation for each individual and there are restrictions (the vector of coefficients is the same for the different equations).
Put another way, a SUR model is like a collection of panel data models connected through their random terms. Both specifications may contain unobservable components as well as temporal and spatial mechanisms which are, usually, interrelated:

$$
\begin{aligned}
& y_{rt} = x_{rt}'\beta + u_{rt}; \quad u_{rt} = \mu_r + \varepsilon_{rt} \\
& \varepsilon_{rt} \sim iid(0, \sigma_\varepsilon^2); \quad \mu_r \sim iid\,N(0, \sigma_\mu^2) \\
& Cov(u_{rt_1}; u_{rt_2}) = \sigma_\mu^2; \quad t_1 \neq t_2
\end{aligned}
$$
(1)

where $\mu_r$ is an unobservable random term related to the r-th individual and $\varepsilon_{rt}$ is an idiosyncratic random variable. The presence of the first term results in processes of time dependence. On the other hand, if we introduce a random time effect, associated with some general shock, we obtain situations of spatial dependence:

$$
\begin{aligned}
& y_{rt} = x_{rt}'\beta + u_{rt}; \quad u_{rt} = \gamma_t + \varepsilon_{rt} \\
& \varepsilon_{rt} \sim iid(0, \sigma_\varepsilon^2); \quad \gamma_t \sim iid\,N(0, \sigma_\gamma^2) \\
& Cov(u_{rt_1}; u_{rt_2}) = 0; \quad t_1 \neq t_2 \\
& Cov(u_{rt}; u_{st}) = \sigma_\gamma^2; \quad r \neq s
\end{aligned}
$$
(2)

It is common in the panel data literature to use time effects as a way of accounting for noisy cross-sectional dependencies (Pesaran, 2005). However, on some occasions the interest lies specifically in modeling these horizontal relations; if this is the case, the specification must contain information about the spatial structure of the data.

Another question refers to the nature of the effects that have been omitted from the specification (Mundlak, 1978). The discussion about random or fixed effects appears routinely in all panel estimations, but it is especially important if we are using spatial data. The situation may be summarized according to two different positions. On the one hand, models that include a spatial structure need a very large sample (a large R, number of individuals), because the convergence results are obtained with R tending to infinity. On the other hand, if the omitted effects are non-random, there appears a problem of incidental parameters (that is, the number of parameters grows at the same rate as the number of observations); in that case, a situation of large T and small R is preferable. This last observation leads Anselin et al. (2006) to discard the use of fixed effects in mechanisms of spatial dependence: "Since spatial models rely on asymptotics in the cross-sectional dimension (…), this would preclude the fixed effects model from being extended with a spatial lag or spatial error term". These authors prefer the random effects framework, where the inference is conditional and we only need a very large R (the improvements with T are of minor importance). Elhorst (2003) does not share that view when he states that: "The spatial units of observation should be representative of a larger population, and the number of units should potentially be able to go to infinity in a regular fashion. Moreover, the assumption of zero correlation between $\mu_r$ and the explanatory variables is particularly restrictive.
Hence, the fixed effects model is compelling, even when R is large and T is small". The two approaches are contradictory and reflect strong methodological positions whose implications are evident.

VAR models are little developed in a spatial context, although we can find a collection of very interesting papers in this field (see, for example, Carlino and DeFina, 1998, Di Giacinto, 2003 and 2006, Badinger et al, 2004, or Beenstock and Felsenstein, 2007).

3- Static models with spatial effects

SUR models constitute a flexible way of dealing with situations of moderate interdependence between the individuals (Greene, 1997); at the same time, they are able to deal with the problems caused by heterogeneity in a relatively simple context. For example, excessive instability between the individuals may be treated as the consequence of unobserved effects. Assuming that the model contains M equations, with T observations in the time dimension for R different individuals, we can write:

$$
y_{rtm} = \mu_{rm} + x_{rtm}'\beta_m + \varepsilon_{rtm}; \quad r = 1,\dots,R; \; t = 1,\dots,T; \; m = 1,\dots,M
$$

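(4)

where $\mu_{rm}$ captures the specific effect associated with the r-th individual in the m-th equation. In matrix notation:

$$
\begin{aligned}
& Y_m = X_m\beta_m + \varepsilon_m; \quad m = 1, 2, \dots, M \\
& Y_m = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_R \end{bmatrix}; \quad
y_r = \begin{bmatrix} y_{r1} \\ y_{r2} \\ \vdots \\ y_{rT} \end{bmatrix}; \quad
X_m = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_R \end{bmatrix}; \quad
x_r = \begin{bmatrix}
0 & \cdots & 1 & \cdots & 0 & x_{r21} & \cdots & x_{rk1} \\
0 & \cdots & 1 & \cdots & 0 & x_{r22} & \cdots & x_{rk2} \\
\vdots & & \vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & 1 & \cdots & 0 & x_{r2T} & \cdots & x_{rkT}
\end{bmatrix} \\
& \beta_m = [\mu_{1m} \;\; \mu_{2m} \;\; \cdots \;\; \mu_{Rm} \;\; \beta_{2m} \;\; \cdots \;\; \beta_{km}]'
\end{aligned}
$$
(5)

If we treat these effects as fixed in (5) there are too many parameters, MR+M(k-1). This number reduces to Mk in the case of random effects:

$$
y_{rtm} = x_{rtm}'\beta_m + u_{rtm}; \quad u_{rtm} = \mu_{rm} + \varepsilon_{rtm}; \quad r = 1,\dots,R; \; t = 1,\dots,T; \; m = 1,\dots,M
$$
(6)

where $\mu_{rm}$ is the random omitted term associated with the r-th individual in the m-th equation, which we assume to be constant over time. The estimation of (5) or (6) may be carried out by ML or by two-step GLS, as usual. In the first step, we obtain the LS estimates of each equation separately and then, using the LS residuals, a consistent estimate of the covariance matrix. In the case of model (6):

$$
\begin{aligned}
& E[\varepsilon_{rtm}\varepsilon_{rtn}] = \sigma_{(\varepsilon)mn} \neq 0; \quad
\Sigma_\varepsilon = \begin{bmatrix}
\sigma^2_{(\varepsilon)1} & \sigma_{(\varepsilon)12} & \cdots & \sigma_{(\varepsilon)1M} \\
\sigma_{(\varepsilon)12} & \sigma^2_{(\varepsilon)2} & \cdots & \sigma_{(\varepsilon)2M} \\
\vdots & \vdots & & \vdots \\
\sigma_{(\varepsilon)1M} & \sigma_{(\varepsilon)2M} & \cdots & \sigma^2_{(\varepsilon)M}
\end{bmatrix} \\
& E[\mu_{rm}\mu_{rn}] = \sigma_{(\mu)mn} \neq 0; \quad
\Sigma_\mu = \begin{bmatrix}
\sigma^2_{(\mu)1} & \sigma_{(\mu)12} & \cdots & \sigma_{(\mu)1M} \\
\sigma_{(\mu)12} & \sigma^2_{(\mu)2} & \cdots & \sigma_{(\mu)2M} \\
\vdots & \vdots & & \vdots \\
\sigma_{(\mu)1M} & \sigma_{(\mu)2M} & \cdots & \sigma^2_{(\mu)M}
\end{bmatrix} \\
& Y = X\beta + u; \quad u \sim N(0; \Omega); \quad \Omega = (\Sigma_\varepsilon + T\Sigma_\mu)\otimes P + \Sigma_\varepsilon\otimes Q \\
& P = I_R \otimes \frac{l_T l_T'}{T}; \quad Q = I_{RT} - P \\
& \hat{\beta} = [X'\Omega^{-1}X]^{-1}X'\Omega^{-1}Y
\end{aligned}
$$
(7)

The introduction of mechanisms of cross-sectional dependence poses no specific difficulties for the ML algorithm. For example, assuming that we have only one equation, the SLM version for a SUR model is:

$$
y_t = \lambda_t W y_t + x_t\beta_t + \varepsilon_t \;\Rightarrow\; A_t y_t = x_t\beta_t + \varepsilon_t; \quad A_t = I_R - \lambda_t W
$$
(8)

In matrix notation: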

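$$
\begin{aligned}
& Ay = X\beta + \varepsilon; \quad \varepsilon \sim N(0, \Omega); \quad \Omega = \Sigma\otimes I_R; \quad A = I_{TR} - \Lambda\otimes W \\
& y_{(TR\times 1)} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{bmatrix}; \quad
X_{(TR\times k)} = \begin{bmatrix} x_1 & 0 & \cdots & 0 \\ 0 & x_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & x_T \end{bmatrix}; \quad
\varepsilon_{(TR\times 1)} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_T \end{bmatrix} \\
& \Lambda_{(T\times T)} = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_T \end{bmatrix}; \quad
\Sigma_{(T\times T)} = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1T} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2T} \\ \vdots & & & \vdots \\ \sigma_{T1} & \sigma_{T2} & \cdots & \sigma_{TT} \end{bmatrix}
\end{aligned}
$$
(9)

The log-likelihood function is: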

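$$
l(y; \theta) = -\frac{RT}{2}\ln 2\pi - \frac{R}{2}\ln|\Sigma| + \sum_{t=1}^{T}\ln|A_t| - \frac{1}{2}(Ay - X\beta)'(\Sigma^{-1}\otimes I_R)(Ay - X\beta)
$$
(10)

where $\theta' = [\beta'; \lambda_1; \dots; \lambda_T; \sigma_{ij}]$ is the (k+T+T(T+1)/2)x1 vector of parameters of the model. The score and the information matrix are highly nonlinear (Mur and Lopez, 2008); for example, in the case of the score: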

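$$
g(\theta) = \begin{bmatrix}
\dfrac{\partial l}{\partial\beta} = X'(\Sigma^{-1}\otimes I_R)(Ay - X\beta) \\[2ex]
\dfrac{\partial l}{\partial\lambda_t} = -tr(A_t^{-1}W) + (Ay - X\beta)'(\Sigma^{-1}E_{tt}\otimes W)\,y \\[2ex]
\dfrac{\partial l}{\partial\sigma_{ij}} = -\dfrac{R}{2}tr(\Sigma^{-1}E_{ij}) + \dfrac{1}{2}(Ay - X\beta)'(\Sigma^{-1}E_{ij}\Sigma^{-1}\otimes I_R)(Ay - X\beta)
\end{bmatrix}
$$
(11)

$E_{ij}$ (analogously $E_{tt}$) is a (TxT) matrix whose elements are zero, except the (i,j) and the (j,i) elements, which are 1. The results for the SEM case are also simple. The specification is:

$$
\begin{aligned}
& y = X\beta + u; \quad Bu = \varepsilon; \quad \varepsilon \sim N(0, \Omega); \quad \Omega = \Sigma\otimes I_R; \quad B = I_{TR} - \Gamma\otimes W \\
& y_{(TR\times 1)} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{bmatrix}; \quad
X_{(TR\times k)} = \begin{bmatrix} x_1 & 0 & \cdots & 0 \\ 0 & x_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & x_T \end{bmatrix}; \quad
u_{(TR\times 1)} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_T \end{bmatrix}; \quad
\varepsilon_{(TR\times 1)} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_T \end{bmatrix} \\
& \Gamma_{(T\times T)} = \begin{bmatrix} \rho_1 & 0 & \cdots & 0 \\ 0 & \rho_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \rho_T \end{bmatrix}; \quad
\Sigma_{(T\times T)} = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1T} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2T} \\ \vdots & & & \vdots \\ \sigma_{T1} & \sigma_{T2} & \cdots & \sigma_{TT} \end{bmatrix}
\end{aligned}
$$
(12)

whose log-likelihood function is: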

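$$
l(y; \theta) = -\frac{RT}{2}\ln 2\pi - \frac{R}{2}\ln|\Sigma| + \sum_{t=1}^{T}\ln|B_t| - \frac{1}{2}(y - X\beta)'B'(\Sigma^{-1}\otimes I_R)B(y - X\beta)
$$
(13)

where $B_t = I_R - \rho_t W$ and $\theta' = [\beta'; \rho_1; \dots; \rho_T; \sigma_{ij}]$ is the (k+T+T(T+1)/2)x1 vector of parameters. As before, the ML algorithm is highly nonlinear; for example, the score vector is: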

$$
g(\theta) = \begin{bmatrix}
\dfrac{\partial l}{\partial\beta} = X'B'(\Sigma^{-1}\otimes I_R)B(y - X\beta) \\[2ex]
\dfrac{\partial l}{\partial\rho_t} = -tr(B_t^{-1}W) + (y - X\beta)'B'(\Sigma^{-1}E_{tt}\otimes W)(y - X\beta) \\[2ex]
\dfrac{\partial l}{\partial\sigma_{ij}} = -\dfrac{R}{2}tr(\Sigma^{-1}E_{ij}) + \dfrac{1}{2}(y - X\beta)'B'(\Sigma^{-1}E_{ij}\Sigma^{-1}\otimes I_R)B(y - X\beta)
\end{bmatrix}
$$
(14)

In both cases we allow for heterogeneity in the parameters of spatial dependence of the different equations, as well as heteroskedasticity in the variance and serial dependence of the idiosyncratic error.

As said before, panel data models may be seen as a simplified version of SUR structures. The nature of the unobserved effects, fixed or random, is a main question that, in our case, interferes with the problem of deciding the spatial dependence mechanism for the data, SEM or SLM. Consequently, we have at least four alternatives.

Fixed unobserved effects and SEM spatial dependence.

The main elements of this specification are the following:

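$$
\begin{aligned}
& y_t = x_t\beta + \mu + u_t; \quad u_t = \rho W u_t + \varepsilon_t; \quad B = I_R - \rho W \\
& Y = X\beta + (l_T\otimes\mu) + u; \quad u = (I_T\otimes B^{-1})\varepsilon \\
& \varepsilon_t \sim N(0, \sigma^2 I_R); \quad \varepsilon \sim N(0, \sigma^2 I_{RT}); \quad u \sim N(0, \Omega); \quad \Omega = \sigma^2[I_T\otimes(B'B)^{-1}] \\
& y_t = \begin{bmatrix} y_{1t} \\ y_{2t} \\ y_{3t} \\ \vdots \\ y_{Rt} \end{bmatrix}; \quad
x_t = \begin{bmatrix} x_{11t} & x_{21t} & \cdots & x_{k1t} \\ x_{12t} & x_{22t} & \cdots & x_{k2t} \\ x_{13t} & x_{23t} & \cdots & x_{k3t} \\ \vdots & \vdots & & \vdots \\ x_{1Rt} & x_{2Rt} & \cdots & x_{kRt} \end{bmatrix}; \quad
\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \\ \vdots \\ \mu_R \end{bmatrix}
\end{aligned}
$$

Log-likelihood:

$$
l = -\frac{1}{2}\ln|\Omega| - \frac{1}{2}u'\Omega^{-1}u = -\frac{RT}{2}\ln\sigma^2 + T\ln|B| - \frac{1}{2\sigma^2}u'[I_T\otimes B'B]u
$$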



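(15)

This specification brings together the incidental parameter problem and the need for a large cross-sectional sample. The proposal of Elhorst (2004) is to express the data as deviations from the mean of each individual (the so-called demeaned equation). The transformation amounts to the Within estimation:

$$
l = -\frac{RT}{2}\ln\sigma^2 + T\sum_{r}\ln(1 - \rho\omega_r) - \frac{1}{2\sigma^2}\sum_{t}\left[B\big((y_t - \bar{y}) - (x_t - \bar{x})\beta\big)\right]'\left[B\big((y_t - \bar{y}) - (x_t - \bar{x})\beta\big)\right]
$$

with $\omega_r$ the eigenvalues of $W$, and $\bar{y}$, $\bar{x}$ the time means of each individual.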

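(16)

The ML estimators of (16) will be consistent provided that the sample mean is a sufficient statistic with respect to the population mean.

Fixed unobserved effects and SLM spatial dependence.

The main elements of this specification are:

$$
\begin{aligned}
& y_t = \lambda W y_t + x_t\beta + \mu + \varepsilon_t; \quad B = I_R - \lambda W \\
& Y = \lambda(I_T\otimes W)Y + X\beta + (l_T\otimes\mu) + \varepsilon \\
& Y = [I_{RT} - \lambda(I_T\otimes W)]^{-1}[X\beta + (l_T\otimes\mu) + \varepsilon] \\
& \varepsilon_t \sim N(0, \sigma^2 I_R); \quad \varepsilon \sim N(0, \sigma^2 I_{RT}) \\
& y_t = \begin{bmatrix} y_{1t} \\ y_{2t} \\ y_{3t} \\ \vdots \\ y_{Rt} \end{bmatrix}; \quad
x_t = \begin{bmatrix} x_{11t} & x_{21t} & \cdots & x_{k1t} \\ x_{12t} & x_{22t} & \cdots & x_{k2t} \\ x_{13t} & x_{23t} & \cdots & x_{k3t} \\ \vdots & \vdots & & \vdots \\ x_{1Rt} & x_{2Rt} & \cdots & x_{kRt} \end{bmatrix}; \quad
\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \\ \vdots \\ \mu_R \end{bmatrix}
\end{aligned}
$$

Log-likelihood:

$$
\begin{aligned}
l &= -\frac{1}{2}\ln|\Omega| - \frac{1}{2}\varepsilon'\Omega^{-1}\varepsilon = -\frac{RT}{2}\ln\sigma^2 + T\ln|B| \\
&\quad - \frac{1}{2\sigma^2}\left[(I_{RT} - \lambda(I_T\otimes W))Y - X\beta - (l_T\otimes\mu)\right]'\left[(I_{RT} - \lambda(I_T\otimes W))Y - X\beta - (l_T\otimes\mu)\right]
\end{aligned}
$$
(17)

The incidental parameter problem remains, so the demeaned equation will be necessary once again:

$$
l = -\frac{RT}{2}\ln\sigma^2 + T\sum_{r}\ln(1 - \lambda\omega_r) - \frac{1}{2\sigma^2}\sum_{t}\left[B(y_t - \bar{y}) - (x_t - \bar{x})\beta\right]'\left[B(y_t - \bar{y}) - (x_t - \bar{x})\beta\right]
$$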

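(18)

Random unobserved effects and SEM spatial dependence.

The change in the treatment of the unobserved effects renders the equation apparently more complex because now we combine two different sources of error, one of them of a spatial type. In any case, the results are standard:

$$
\begin{aligned}
& y_t = x_t\beta + u_t; \quad u_t = \mu + B^{-1}\varepsilon_t; \quad B = I_R - \rho W \\
& Y = X\beta + u; \quad u = (l_T\otimes\mu) + (I_T\otimes B^{-1})\varepsilon \\
& \varepsilon_t \sim N(0, \sigma_\varepsilon^2 I_R); \quad \mu \sim N(0, \sigma_\mu^2 I_R); \quad u \sim N(0, \Omega_{RT}) \\
& \Omega_{RT} = \sigma_\mu^2(l_T l_T'\otimes I_R) + \sigma_\varepsilon^2[I_T\otimes(B'B)^{-1}] \\
& |\Omega_{RT}| = \sigma_\varepsilon^{2RT}\,\left|(B'B)^{-1} + T\theta I_R\right|\,|B|^{-2(T-1)}; \quad \theta = \sigma_\mu^2/\sigma_\varepsilon^2 \\
& \Omega_{RT}^{-1} = \frac{1}{\sigma_\varepsilon^2}\left\{\frac{l_T l_T'}{T}\otimes\left[(B'B)^{-1} + T\theta I_R\right]^{-1} + Q\otimes B'B\right\}; \quad Q = I_T - \frac{l_T l_T'}{T}
\end{aligned}
$$

Log-likelihood:

$$
\begin{aligned}
l &= -\frac{RT}{2}\ln\sigma_\varepsilon^2 + (T-1)\ln|B| - \frac{1}{2}\ln\left|(B'B)^{-1} + T\theta I_R\right| \\
&\quad - \frac{1}{2\sigma_\varepsilon^2}\left\{u'\left(\frac{l_T l_T'}{T}\otimes\left[(B'B)^{-1} + T\theta I_R\right]^{-1}\right)u + u'(Q\otimes B'B)u\right\}
\end{aligned}
$$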

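(19)

Random unobserved effects and SLM spatial dependence.

The results corresponding to this specification are similar to the previous cases:

$$
\begin{aligned}
& y_t = \lambda W y_t + x_t\beta + u_t; \quad Y = \lambda(I_T\otimes W)Y + X\beta + u; \quad u = (l_T\otimes\mu) + \varepsilon \\
& \varepsilon_t \sim N(0, \sigma_\varepsilon^2 I_R); \quad \mu \sim N(0, \sigma_\mu^2 I_R); \quad u \sim N(0, \Omega_{RT}) \\
& \Omega_{RT} = \sigma_\mu^2(l_T l_T'\otimes I_R) + \sigma_\varepsilon^2 I_{RT} = (T\sigma_\mu^2 + \sigma_\varepsilon^2)\left(\frac{l_T l_T'}{T}\otimes I_R\right) + \sigma_\varepsilon^2(Q\otimes I_R) \\
& \Omega_{RT}^{-1} = \frac{1}{T\sigma_\mu^2 + \sigma_\varepsilon^2}\left(\frac{l_T l_T'}{T}\otimes I_R\right) + \frac{1}{\sigma_\varepsilon^2}(Q\otimes I_R); \quad Q = I_T - \frac{l_T l_T'}{T}
\end{aligned}
$$

Log-likelihood:

$$
\begin{aligned}
l &= -\frac{RT}{2}\ln\sigma_\varepsilon^2 - \frac{R}{2}\ln\frac{T\sigma_\mu^2 + \sigma_\varepsilon^2}{\sigma_\varepsilon^2} + T\ln|B| \\
&\quad - \frac{1}{2}\left[(I_{RT} - \lambda(I_T\otimes W))Y - X\beta\right]'\Omega_{RT}^{-1}\left[(I_{RT} - \lambda(I_T\otimes W))Y - X\beta\right]; \quad B = I_R - \lambda W
\end{aligned}
$$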

(20)

All the above specifications may be generalized in different ways. For example, the assumption of parameter constancy causes problems when the data are very heterogeneous. In this case, the alternative of random coefficients (Swamy, 1970) is a useful solution that is not very demanding in terms of computing time. Some other alternatives, like those based on GWR approaches (Fotheringham et al, 1999) or the SALE estimation algorithm (Lesage and Pace, 2004; see Mur et al., 2008, for an intermediate position), are equally acceptable.

4- Dynamic models with spatial effects

In general terms, we must distinguish between dynamic panel data equations and VAR models (Anderson and Hsiao, 1981). The two approaches have been developed in a time series context but admit the introduction of spatial elements. As a counterweight, we have to mention the increased complexity of the relations (see Elhorst, 2005, or Anselin et al, 2006) and the difficulty of isolating the various effects (time or spatial) that intervene in the specification.

Purely recursive spatial panel models

This specification adapts to several relevant problems of spatial economics, for example spatial diffusion processes (Upton and Fingleton, 1988) where the agents react with some delay to their neighbors. It suffices to write:

$$
\begin{aligned}
& y_t = \eta W y_{t-1} + x_t\beta + \varepsilon_t; \quad t = 1, 2, \dots, T \\
& y_t = \begin{bmatrix} y_{t1} \\ y_{t2} \\ \vdots \\ y_{tR} \end{bmatrix}; \quad
x_t = \begin{bmatrix} 1 & x_{1t1} & \cdots & x_{kt1} \\ 1 & x_{1t2} & \cdots & x_{kt2} \\ \vdots & \vdots & & \vdots \\ 1 & x_{1tR} & \cdots & x_{ktR} \end{bmatrix}; \quad
\varepsilon_t = \begin{bmatrix} \varepsilon_{t1} \\ \varepsilon_{t2} \\ \vdots \\ \varepsilon_{tR} \end{bmatrix}
\end{aligned}
$$
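(21)

The time dynamics of this equation is not direct but operates through the time lag of the spatial lag of the endogenous variable. The incidental parameter problem will arise if we treat the unobserved effects as fixed and, if we decide to maintain them as random, we will create a problem of endogeneity. However, the estimation of (21) can be easily solved by means of, for example, ML or IV algorithms.

Mixed recursive spatio-temporal panel models

Now we specify an equation with explicit elements of time dynamics, although maintaining a time lag in the spatial lag:

$$
\begin{aligned}
& y_t = \tau y_{t-1} + \eta W y_{t-1} + x_t\beta + \varepsilon_t; \quad t = 1, 2, \dots, T \\
& y_t = \begin{bmatrix} y_{t1} \\ y_{t2} \\ \vdots \\ y_{tR} \end{bmatrix}; \quad
x_t = \begin{bmatrix} 1 & x_{1t1} & \cdots & x_{kt1} \\ 1 & x_{1t2} & \cdots & x_{kt2} \\ \vdots & \vdots & & \vdots \\ 1 & x_{1tR} & \cdots & x_{ktR} \end{bmatrix}; \quad
\varepsilon_t = \begin{bmatrix} \varepsilon_{t1} \\ \varepsilon_{t2} \\ \vdots \\ \varepsilon_{tR} \end{bmatrix}
\end{aligned}
$$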
(22)

The assumption is that the agents follow their own time path while watching the decisions taken by their neighbors in the recent past. There is a multiplicity of effects acting at the same time: the spatial, the temporal and the unobserved effects, which results in a complex system of multipliers. The hypotheses of stationarity in the time dimension (Brockwell and Davies, 2001) and of stability in the spatial dimension (Kelejian and Robinson, 1995) play an important role here. As shown by Giacomini and Granger (2004), this kind of specification assures a good forecasting performance, both in time and in space.

Simultaneous spatio-temporal panel models

The distinctive feature of this case is that the mechanisms of spatial interaction are contemporaneous to the decisions of the agents. That is:

$$
\begin{aligned}
& y_t = \tau y_{t-1} + \delta W y_t + x_t\beta + \varepsilon_t; \quad t = 1, 2, \dots, T \\
& y_t = \begin{bmatrix} y_{t1} \\ y_{t2} \\ \vdots \\ y_{tR} \end{bmatrix}; \quad
x_t = \begin{bmatrix} 1 & x_{1t1} & \cdots & x_{kt1} \\ 1 & x_{1t2} & \cdots & x_{kt2} \\ \vdots & \vdots & & \vdots \\ 1 & x_{1tR} & \cdots & x_{ktR} \end{bmatrix}; \quad
\varepsilon_t = \begin{bmatrix} \varepsilon_{t1} \\ \varepsilon_{t2} \\ \vdots \\ \varepsilon_{tR} \end{bmatrix}
\end{aligned}
$$
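(23)

The cross-sectional ties are reinforced in the specification, but the simultaneity makes it difficult to use this model for purely forecasting purposes.

General spatio-temporal panel models

The previous discussion leads to a more general model which combines two unrestricted temporal and spatial dynamic processes. Some restrictions of homogeneity in the parameters remain in order to assure the panel structure of the model:

$$
\begin{aligned}
& y_t = \tau y_{t-1} + \delta W y_t + \eta W y_{t-1} + x_t\beta_1 + x_{t-1}\beta_2 + W x_t\beta_3 + W x_{t-1}\beta_4 + \varepsilon_t; \quad t = 1, 2, \dots, T \\
& y_t = \begin{bmatrix} y_{t1} \\ y_{t2} \\ \vdots \\ y_{tR} \end{bmatrix}; \quad
x_t = \begin{bmatrix} 1 & x_{1t1} & \cdots & x_{kt1} \\ 1 & x_{1t2} & \cdots & x_{kt2} \\ \vdots & \vdots & & \vdots \\ 1 & x_{1tR} & \cdots & x_{ktR} \end{bmatrix}; \quad
\varepsilon_t = \begin{bmatrix} \varepsilon_{t1} \\ \varepsilon_{t2} \\ \vdots \\ \varepsilon_{tR} \end{bmatrix}
\end{aligned}
$$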
(24)

The unobserved individual effects are included in the error term of the equation, $\varepsilon_t$, and they may be fixed or random. All the observations made earlier also apply to this case. For example, there is a quite complex dynamic structure, in which the temporal and spatial elements interact. The equations must satisfy the restrictions of stationarity and stability, and the model has a limited capacity to be used for forecasting purposes.

The estimation of model (24) may be difficult. Elhorst (2005) uses a maximum-likelihood approach, developing an iterative algorithm based on obtaining the reduced and the final forms of the model. Assuming that there are no unobserved effects in the equation (which implies that $\varepsilon_t$ is a white noise vector), the reduced form is:

$$
\begin{aligned}
& By_t = Ay_{t-1} + \tilde{x}_t + \varepsilon_t \;\Rightarrow\; y_t = B^{-1}Ay_{t-1} + B^{-1}\tilde{x}_t + B^{-1}\varepsilon_t \\
& B = I - \delta W; \quad A = \tau I + \eta W \\
& \tilde{x}_t = x_t\beta_1 + x_{t-1}\beta_2 + W x_t\beta_3 + W x_{t-1}\beta_4
\end{aligned}
$$

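(25)

whereas the final form is:

$$
\begin{aligned}
y_t &= B^{-1}Ay_{t-1} + B^{-1}\tilde{x}_t + B^{-1}\varepsilon_t \\
&= (B^{-1}A)^m y_{t-m} + \sum_{j=0}^{m-1}(B^{-1}A)^j B^{-1}\tilde{x}_{t-j} + \sum_{j=0}^{m-1}(B^{-1}A)^j B^{-1}\varepsilon_{t-j}
\end{aligned}
$$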

(26)

Matrix $\Pi = B^{-1}A$ plays a very important role in the last two expressions because it guarantees the overall stability of the model: a necessary condition for the stationarity of the system (24) is that this matrix must be convergent. In this case, the system will move towards a long-run equilibrium. That is:

$$
\text{If } |B^{-1}A| < 1 \;\Rightarrow\; \lim_{m\to\infty}\Pi^m = \lim_{m\to\infty}(B^{-1}A)^m = 0
$$

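(27)

This property of convergence depends (as shown by Elhorst, 2005) on the autoregressive parameters of the model ($\tau$, $\delta$ and $\eta$), and also on the eigenvalues $\omega_r$ of the weighting matrix $W$:

$$
B^{-1}A = (I - \delta W)^{-1}(\tau I + \eta W) \;\Rightarrow\; \left|\frac{\tau + \eta\,\omega_r}{1 - \delta\,\omega_r}\right| < 1; \quad r = 1, \dots, R
$$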

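(28)

Assuming stationarity and a normal distribution for the random term, we obtain the following results for the first- and second-order moments (with $\Pi = B^{-1}A$, $L$ the lag operator and $\tilde{x}_t = x_t\beta_1 + x_{t-1}\beta_2 + W x_t\beta_3 + W x_{t-1}\beta_4$):

$$
\begin{aligned}
& E[By_t] = (I - \Pi L)^{-1}\tilde{x}_t; \quad E[y_t] = (I - \Pi L)^{-1}B^{-1}\tilde{x}_t \\
& V[By_t] = \sigma^2(I - \Pi\Pi')^{-1}; \quad Cov[By_t; By_{t-m}] = \sigma^2\,\Pi^m(I - \Pi\Pi')^{-1} \\
& V[y_t] = \sigma^2 B^{-1}(I - \Pi\Pi')^{-1}B'^{-1}; \quad Cov[y_t; y_{t-m}] = \sigma^2 B^{-1}\Pi^m(I - \Pi\Pi')^{-1}B'^{-1}
\end{aligned}
$$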

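(29)

The log-likelihood function, conditional on $y_1$, is:

$$
\begin{aligned}
& l(y_T; y_{T-1}; \dots; y_2 \mid y_1) = -\frac{R(T-1)}{2}\ln(2\pi\sigma^2) + (T-1)\ln|B| - \frac{1}{2\sigma^2}\sum_{t=2}^{T}\varepsilon_t'\varepsilon_t \\
& \varepsilon_t = By_t - Ay_{t-1} - \tilde{x}_t = y_t - \tau y_{t-1} - \delta W y_t - \eta W y_{t-1} - x_t\beta_1 - x_{t-1}\beta_2 - W x_t\beta_3 - W x_{t-1}\beta_4
\end{aligned}
$$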

(30)

The treatment of the first cross-section may be simplified if we can assume an initial situation of stationarity (Hsiao, 2003), which allows us to use the long-run moments of the unconditional distribution in the marginal distribution of vector $y_1$. That is:

$$
\begin{aligned}
& l(y_T; y_{T-1}; \dots; y_2; y_1) = l(y_T; y_{T-1}; \dots; y_2 \mid y_1) + l(y_1) \\
& l(y_T; y_{T-1}; \dots; y_2 \mid y_1) = -\frac{R(T-1)}{2}\ln(2\pi\sigma^2) + (T-1)\ln|B| - \frac{1}{2\sigma^2}\sum_{t=2}^{T}\varepsilon_t'\varepsilon_t \\
& l(y_1) = -\frac{R}{2}\ln(2\pi\sigma^2) - \frac{1}{2}\ln\left|B^{-1}(I - \Pi\Pi')^{-1}B'^{-1}\right| - \frac{1}{2}(y_1 - E[y_1])'V(y_1)^{-1}(y_1 - E[y_1]) \\
& V(y_1) = \sigma^2 B^{-1}(I - \Pi\Pi')^{-1}B'^{-1}; \quad E[y_1] = (I - \Pi L)^{-1}B^{-1}\tilde{x}_1; \quad \Pi = B^{-1}A \\
& l(y_T; y_{T-1}; \dots; y_2; y_1) = -\frac{RT}{2}\ln(2\pi\sigma^2) + (T-1)\ln|B| - \frac{1}{2}\ln\left|B^{-1}(I - \Pi\Pi')^{-1}B'^{-1}\right| \\
& \qquad\qquad - \frac{1}{2\sigma^2}\sum_{t=2}^{T}\varepsilon_t'\varepsilon_t - \frac{1}{2}(y_1 - E[y_1])'V(y_1)^{-1}(y_1 - E[y_1])
\end{aligned}
$$
(31)

Elhorst (2005) is particularly skeptical about this solution: "Conditioning on the cross-section of the first observation is an undesirable feature especially when the time dimension of the space time data is short". Moreover, the assumption of stationarity of the process, including the first observation, is not acceptable on many occasions. On the contrary, the problem is what to do with, and how to interpret, the initial conditions (Blundell and Bond, 1998).

Besides, there remains the question of the unobserved effects which we omitted from the specification of (25). If these effects are, in fact, present in the model, we have to use the model in first differences (Arellano, 2003) to eliminate them:

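$$
\begin{aligned}
& \text{MODEL IN LEVELS:} \quad By_t = Ay_{t-1} + \tilde{x}_t + \mu + v_t; \quad v_t \sim N(0, \sigma_v^2 I_R); \quad \mu = [\mu_1 \;\; \mu_2 \;\; \cdots \;\; \mu_R]' \\
& \text{FIRST DIFFERENCED MODEL:} \quad B\Delta y_t = A\Delta y_{t-1} + \Delta\tilde{x}_t + \Delta v_t \\
& \text{where:} \quad \Delta y_t = B^{-1}A\,\Delta y_{t-1} + B^{-1}\Delta\tilde{x}_t + B^{-1}\Delta v_t \\
& \qquad\quad\; = (B^{-1}A)^m\Delta y_{t-m} + \sum_{j=0}^{m-1}(B^{-1}A)^j B^{-1}\Delta\tilde{x}_{t-j} + \sum_{j=0}^{m-1}(B^{-1}A)^j B^{-1}\Delta v_{t-j}
\end{aligned}
$$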

(32)

Next, Elhorst (2005) proposes a parametric solution to the problem of the initial conditions, following the discussion of Hsiao et al. (2002). This alternative consists in modelling the set of initial conditions by using some (ideally) less restrictive hypothesis; for example, that "the expected changes in the initial endowments of the spatial units follows a first-order spatial autoregressive lag model" (Elhorst, 2005). If the process is purely dynamic, without exogenous variables on the right-hand side of the equation ($x_t = 0$), the hypothesis amounts to:

$$
\begin{aligned}
& E[B\Delta y_1] = \pi l \;\Leftrightarrow\; E[\Delta y_1] = \delta W E[\Delta y_1] + \pi l \\
& \text{Initial condition:} \quad \Delta v_1 = B\Delta y_1 - \pi l \quad (\pi \text{ is a parameter to estimate}) \\
& E[\Delta v_1] = 0; \quad V[\Delta v_1] = \sigma_v^2 V_b
\end{aligned}
$$

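(33)

Under these conditions, the full log-likelihood function is:

$$
\begin{aligned}
& f(\Delta y_T; \Delta y_{T-1}; \dots; \Delta y_2; \Delta y_1) = (2\pi)^{-TR/2}\,|V(\Delta v)|^{-1/2}\exp\left\{-\frac{1}{2}\Delta v'\,V(\Delta v)^{-1}\Delta v\right\} \\
& V(\Delta v) = \sigma_v^2\,(I_T\otimes B^{-1})\,H_{V_b}\,(I_T\otimes B'^{-1}) \\
& H_{V_b} = \begin{bmatrix}
V_b & -I_R & 0 & \cdots & 0 & 0 \\
-I_R & 2I_R & -I_R & \cdots & 0 & 0 \\
0 & -I_R & 2I_R & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 2I_R & -I_R \\
0 & 0 & 0 & \cdots & -I_R & 2I_R
\end{bmatrix}; \quad
\Delta v = \begin{bmatrix}
B\Delta y_1 - \pi l_R \\
B\Delta y_2 - A\Delta y_1 \\
B\Delta y_3 - A\Delta y_2 \\
\vdots \\
B\Delta y_{T-1} - A\Delta y_{T-2} \\
B\Delta y_T - A\Delta y_{T-1}
\end{bmatrix}
\end{aligned}
$$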

(34)

There are 5 parameters, $\tau$, $\delta$, $\eta$, $\pi$ and $\sigma_v^2$, and the unobserved effects cannot be recovered.

The general case in which a set of regressors appears on the right-hand side of the equation ($x_t \neq 0$) is a bit more complex, because the data generating process of the first observation will also depend on these variables. From (32) we can observe that:

$$
B\Delta y_1 = A\Delta y_0 + \Delta\tilde{x}_1 + \Delta v_1 \;\Rightarrow\; \Delta v_1 = B\Delta y_1 - A\Delta y_0 - \Delta\tilde{x}_1
$$
(35)

We need additional assumptions in relation to the behavior of the x variables; for example, we may assume that the x are stationary and exogenous:

$$
E[\Delta x_t] = 0; \;\forall t \;\Rightarrow\; E[\Delta\tilde{x}_t] = 0; \;\forall t \;\Rightarrow\; E[\Delta v_1] = 0
$$
(36)
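The stationarity invoked throughout this section rests on the convergence condition stated in (27) and (28). As a minimal numerical sketch, the condition can be checked by comparing the eigenvalues of B^{-1}A with the closed-form ratio built from the eigenvalues of W. Everything specific here is an assumption made for illustration: the four-region ring matrix, the parameter values, the names `tau`, `delta` and `eta` for the coefficients of y(t-1), W y(t) and W y(t-1), and the definitions B = I - delta W and A = tau I + eta W.

```python
import numpy as np


def stability_roots(W, tau, delta, eta):
    """Eigenvalues of B^{-1} A, with B = I - delta*W and A = tau*I + eta*W."""
    R = W.shape[0]
    B = np.eye(R) - delta * W
    A = tau * np.eye(R) + eta * W
    # solve(B, A) returns B^{-1} A without forming the inverse explicitly
    return np.linalg.eigvals(np.linalg.solve(B, A))


# Hypothetical contiguity structure: four regions arranged in a ring
C = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
W = C / C.sum(axis=1, keepdims=True)  # row-normalised weights

# Illustrative parameter values (not estimates from the paper)
tau, delta, eta = 0.5, 0.2, 0.1

roots = stability_roots(W, tau, delta, eta)

# Closed-form counterpart: (tau + eta*w_r) / (1 - delta*w_r) for each eigenvalue w_r of W
omega = np.linalg.eigvals(W)
closed_form = (tau + eta * omega) / (1 - delta * omega)

# The system is convergent when every root lies inside the unit circle
is_stable = bool(np.max(np.abs(roots)) < 1)
print(np.sort(np.abs(roots)), is_stable)
```

Because A and B^{-1} are both functions of the same matrix W, the eigenvalues of their product follow directly from those of W, which is why the two computations agree.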

The variance of $\Delta v_1$ also depends on some unobserved elements, and Balestra and Nerlove (1996) propose a rather tedious, but consistent, estimator:

1 j 1 j X X m j 1 1 m j 1 1 m 1j 0 j 0 1j m m j 1 1 m j 1 1 1 j 0 j 0 1j b 1 '1 1 ' 1 1 1 b x x 2v 2v B A B A B A V B V A B V A B V V BVB B B V B y v y y v

y y

%

%

6

6

(37) Where:

X 1 m m 1 X 1 2 3 4 X I I ' I 'I' ; ; ; ; ; ; '

Sampling covariance matrix of the observed X

6

5 5

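The unconditional log-likelihood function presents a quite complex structure:

$$
\begin{aligned}
& f(\Delta y_T; \Delta y_{T-1}; \dots; \Delta y_2; \Delta y_1) = (2\pi)^{-TR/2}\,|V(\Delta v)|^{-1/2}\exp\left\{-\frac{1}{2}\Delta v'\,V(\Delta v)^{-1}\Delta v\right\} \\
& V(\Delta v) = \sigma_v^2\,(I_T\otimes B^{-1})\,H_{V_{BN}}\,(I_T\otimes B'^{-1}) \\
& H_{V_{BN}} = \begin{bmatrix}
V_{BN} & -I_R & 0 & \cdots & 0 & 0 \\
-I_R & 2I_R & -I_R & \cdots & 0 & 0 \\
0 & -I_R & 2I_R & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 2I_R & -I_R \\
0 & 0 & 0 & \cdots & -I_R & 2I_R
\end{bmatrix}; \quad
\Delta v = \begin{bmatrix}
B\Delta y_1 - \pi l_R \\
B\Delta y_2 - A\Delta y_1 - \Delta\tilde{x}_2 \\
B\Delta y_3 - A\Delta y_2 - \Delta\tilde{x}_3 \\
\vdots \\
B\Delta y_{T-1} - A\Delta y_{T-2} - \Delta\tilde{x}_{T-1} \\
B\Delta y_T - A\Delta y_{T-1} - \Delta\tilde{x}_T
\end{bmatrix}
\end{aligned}
$$
(38)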
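The pattern of 2I and -I blocks in the H matrices of (34) and (38) follows from first-differencing a white-noise error. A minimal sketch, assuming unit error variance and hypothetical small dimensions T and R, reproduces that pattern. The first diagonal block, which in (34) and (38) is replaced by the matrix that models the initial condition, is left here in its plain differenced form.

```python
import numpy as np


def difference_matrix(T):
    """(T x (T+1)) matrix D such that D @ e stacks e_t - e_{t-1} for T periods."""
    D = np.zeros((T, T + 1))
    for t in range(T):
        D[t, t] = -1.0     # coefficient of e_{t-1}
        D[t, t + 1] = 1.0  # coefficient of e_t
    return D


T, R = 4, 3  # hypothetical numbers of differenced periods and of regions

D = difference_matrix(T)

# Covariance pattern of differenced white noise with unit variance:
# 2 on the diagonal, -1 on the first off-diagonals, 0 elsewhere
G = D @ D.T

# Block version for R regions with spatially uncorrelated errors
H_plain = np.kron(G, np.eye(R))
print(G)
```

The negative off-diagonal blocks reflect the fact that consecutive differences share one error term with opposite signs.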

The number of parameters is 4k+5: $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$, $\tau$, $\delta$, $\eta$, $\pi$ and $\sigma_v^2$.

5- The case of the Spanish employment by provinces.

In this section we apply some of the models presented in the previous sections to the case of Spanish employment by provinces (NUTS III administrative spatial units in Eurostat terms). The data consist of series of aggregate employment for each of the fifty Spanish provinces for the period 1980 to 2007. The series come from the Encuesta de Población Activa (EPA), as published by the Instituto Nacional de Estadística (INE), and are obtained on a quarterly basis. In our application we use the average level of employment for each year and province. The objective of the application is to compare the different approaches in terms of their forecasting capacity for the years 2006 and 2007. The period 1980-2005 allows us to estimate and check the models which, in a second step, are used to forecast the series of employment by provinces in the years 2006-2007.

The spatial distribution of the variable of interest, expressed as the percentage that each province represents in relation to total national employment in the initial and final estimation periods, appears in Figure 1. One of the most interesting aspects of these maps is the shift of the distribution of employment in favor of the provinces situated in the Mediterranean basin, as well as on the Andalusian coast in the South of the peninsula. A second area of attraction is the corridor that links Madrid, in the centre of the peninsula, with the upper part of the Ebro valley and, specifically, the province of in the North of Spain. The provinces situated in the interior part of the peninsula and outside this corridor, in the North along the Bay, and in the West following the Portuguese border suffered the most severe share losses.

Figure 1: Spatial distribution of the employment. Share variations. Panels: percentage of each province in 1980; percentage of each province in 2005; variation in the share of each province (white: increases; grey: decreases).

Figure 2: Spatial distribution of the Gross Value Added. Share variations. Panels: percentage of each province in 1980; percentage of each province in 2005; variation in the share of each province (white: increases; grey: decreases).

In Figure 2 we include the data corresponding to the distribution by provinces of the Gross Value Added. The similarities between both variables are very clear. The correlation between the shares is almost one and amounts to 0.89 in terms of share variations. The same can be said with respect to the total population, as appears in Figure 3: there is a nearly perfect correlation between the provincial shares of employment and population, and it remains very high, 0.87, in terms of variations of the shares.

Figure 3: Spatial distribution of the population. Share variations. Panels: percentage of each province in 1980; percentage of each province in 2005; variation in the share of each province (white: increases; grey: decreases).

The relation is less evident with respect to per capita income, as shown in Figure 4. In this case there is a sharp difference between the scores corresponding to Southern and Northern provinces. The personal income of the provinces situated below an imaginary line crossing from Lisbon to Madrid and Valencia was, approximately, 30% below the national average in 1980 (the lowest rate corresponds to Granada, 64%) and 20% in 2005 (the worst result now corresponds to the province of , 66%). The correlation of this index with the employment shares is very weak: near zero in 1980, 0.30 in 2005, and 0.20 in terms of variations in the shares and in the indices of personal income.

In Figure 5 we present the distribution of unemployment among the Spanish provinces. The general appearance of these maps maintains the North-South dichotomy, but now the highest rates are in the Southern provinces.
In any case, the variables of employment and unemployment are highly related, as reflected by the correlation in their provincial shares, which is 0.96 in 1980 and 0.92 in 2005. The situation changes drastically in terms of variation in the shares, with a correlation of -0.20.

Figure 4: Spatial distribution of per capita income. Share variations. Panels: provincial index in 1980 (national average=100); provincial index in 2005 (national average=100); variation in the provincial indices (white: increases; grey: decreases).

The purpose of these maps is to give support to the equations specified for the case of Spanish employment. The different equations share the same nucleus, as follows:

$$
e_{rt} = f(p_{rt-1}; y_{rt-1}; u_{rt-1}; \text{other factors}; v_{rt})
$$
(39)

where $e_{rt}$ is the growth rate observed in the employment of province r in period t;

$p_{rt}$, $y_{rt}$ and $u_{rt}$ are the growth rates of population, per capita income and unemployment, respectively; $v_{rt}$ is the error term, possibly composed of an idiosyncratic error term, $\varepsilon_{rt}$, plus an individual unobservable factor, $\mu_r$ ($v_{rt} = \varepsilon_{rt} + \mu_r$). The term "other factors" refers to (time or spatial) dynamic factors as well as other nonstochastic elements required in the different equations (for example, in the static models we introduced a sequence of time dummies to account for national shocks). Moreover, the regressors on the right-hand side of (39) are lagged one period with respect to the endogenous variable in order to cope with problems of simultaneity among the four variables. As expressed in (39), the regressors are predetermined with respect to $v_{rt}$.

Figure 5: Spatial distribution of unemployment. Share variations. Panels: percentage of each province in 1980; percentage of each province in 2005; variation in the share of each province (white: increases; grey: decreases).

In Table 1 we present the main results of the estimation of various static panel data models. Model 1 is the basic model, without spatial effects; Model 2 includes an SLM correction on the right-hand side of the equation, whereas Model 3 introduces an SEM structure. In the three models, the unobserved effects are treated as fixed and time dummies are included.

Table 1. ML estimation of STATIC panel data models for the case of Employment

| Variable | MODEL 1 Coeff. | MODEL 1 t-stat* | MODEL 2 (SLM) Coeff. | MODEL 2 (SLM) t-stat* | MODEL 3 (SEM) Coeff. | MODEL 3 (SEM) t-stat* |
|---|---|---|---|---|---|---|
| $p_{rt-1}$ | 0.082 | 4.265 | 0.212 | 2.396 | 0.136 | 7.665 |
| $y_{rt-1}$ | 0.112 | 5.226 | 0.099 | 2.117 | 0.109 | 4.666 |
| $u_{rt-1}$ | -0.061 | -2.016 | -0.006 | -1.965 | -0.104 | -3.129 |
| Spatial parameter (SLM) | - | - | 0.185 | 3.110 | - | - |
| Spatial parameter (SEM) | - | - | - | - | 0.122 | 4.918 |
| Hausman | 32.61 | 0.000 | - | - | - | - |
| Spat. Correl. | 6.231 | 0.000 | 0.320 | 0.665 | 3.222 | 0.151 |

NOTE: In MODEL 1, the test of spatial correlation is Moran's I; in MODEL 2 and in MODEL 3 it is the corresponding RS test. (*) p-value in the cases of the Hausman and spatial correlation tests.

There is clearly a very strong spatial structure in the distribution of Spanish employment across the provinces. According to Moran's I, the residuals of the basic model are not randomly distributed over the Spanish map. This leads us to Models 2 and 3, which contain spatial elements. The SLM or the SEM mechanisms are enough to account for the cross-sectional relationships found in Model 1. The ML estimates of the parameters are significant and have the expected sign. Moreover, we cannot accept the restriction of common factors (through the LRCOM test, not included in the table), which favors the SLM specification of Model 2.

The point that we want to underline is that, after introducing the spatial framework in the specification, there remains a very strong dynamic structure in the residuals. In Table 2 we include the autocorrelation coefficients corresponding to the 23 cross-sections of ML residuals obtained from the three models estimated in Table 1. Observing these estimates, our impression is that the spatial structure is not enough to explain the data on employment for the Spanish provinces; we need to take into account the temporal dynamics of the local labor markets.

Table 2. Autocorrelation coefficients of the ML residuals from the static panel data models.

| Lag | MODEL 1 | MODEL 2 (SLM) | MODEL 3 (SEM) |
|---|---|---|---|
| 1 | 0.919 | 0.984 | 0.976 |
| 2 | 0.855 | 0.931 | 0.942 |
| 3 | 0.786 | 0.943 | 0.915 |
| 4 | 0.725 | 0.930 | 0.886 |
| 5 | 0.657 | 0.913 | 0.850 |
| 6 | 0.447 | 0.859 | 0.666 |

Table 3 presents the results of the ML estimation of the dynamic versions of the models of Table 1, in addition to the other dynamic specifications mentioned in Section 2. For the moment, we do not have results for the category of VAR (time or spatial) models. Apparently the six models are satisfactory.
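The two residual diagnostics used for the static models, Moran's I (Table 1) and the autocorrelation coefficients of the residuals (Table 2), can be sketched as follows. This is only an illustration under stated assumptions: the function names `morans_i`, `panel_autocorr` and `simulate_ar1_residuals` are hypothetical, the residuals are simulated rather than taken from the estimated models, and the pooled correlation used for the autocorrelation coefficient is one plausible reading of how such coefficients may be computed, not necessarily the procedure behind Table 2.

```python
import numpy as np


def morans_i(values, W):
    """Moran's I of one cross-section for a spatial weights matrix W.

    I = (N / S0) * (z' W z) / (z' z), where z holds the values in
    deviations from their mean and S0 is the sum of all weights.
    """
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    W = np.asarray(W, dtype=float)
    n = z.size
    if W.shape != (n, n):
        raise ValueError("W must be an N x N matrix")
    s0 = W.sum()
    return float((n / s0) * (z @ W @ z) / (z @ z))


def panel_autocorr(resid, lag):
    """Pooled autocorrelation of a (T, N) residual array at a given lag.

    Assumption: every residual e[t, r] is paired with e[t - lag, r] and
    the Pearson correlation is taken over all such pairs.
    """
    resid = np.asarray(resid, dtype=float)
    if not 0 < lag < resid.shape[0]:
        raise ValueError("lag must be between 1 and T - 1")
    current = resid[lag:].ravel()
    lagged = resid[:-lag].ravel()
    return float(np.corrcoef(current, lagged)[0, 1])


def simulate_ar1_residuals(T=23, N=50, phi=0.9, seed=0):
    """Hypothetical residuals: N independent AR(1) series of length T."""
    rng = np.random.default_rng(seed)
    e = np.zeros((T, N))
    e[0] = rng.standard_normal(N) / np.sqrt(1.0 - phi ** 2)  # stationary start
    for t in range(1, T):
        e[t] = phi * e[t - 1] + rng.standard_normal(N)
    return e


# Autocorrelation coefficients for lags 1 to 6, as in the layout of Table 2.
resid = simulate_ar1_residuals()
coeffs = {lag: panel_autocorr(resid, lag) for lag in range(1, 7)}
```

With strongly persistent residuals, the coefficients in `coeffs` decay slowly with the lag, which is the pattern that motivates the move to dynamic specifications.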
One or two lags of the endogenous variable are enough in the different equations; beyond this point, further lags add nothing to the equation. The same applies to the predetermined variables: in no case was it necessary to include more than two lags. In general, the ML estimates are significant and have the expected sign. The most important variables in all equations are the lags of provincial employment growth, whereas the spatial structure, in terms of contemporaneous or delayed spatial lags of the endogenous variable, contributes to a lesser extent. The exception is the Purely recursive equation of Model 4. Among the predetermined variables, the most important appears to be the variation in the population of each province, followed by unemployment. The impact of personal income is most evident in the short run, although it weakens very rapidly. Increases in population and in per capita income tend to stimulate employment in each province; the impact of unemployment, used as an indicator of the economic climate of the province, is persistently negative.

Finally, the two specification tests that we report at the bottom of Table 3 must be interpreted with caution because they are only approximations to the asymptotic ones. However, the six equations appear to be well specified in terms of the temporal dynamics of the model, whereas the symptoms of spatial dependence remain under control, although unexpectedly high.

Table 3. ML estimation of DYNAMIC panel data models for the case of Employment

Columns: MODEL 1, MODEL 2 (SLM), MODEL 3 (SEM), MODEL 4, MODEL 5, MODEL 6; each model reports Coeff. and t-stat*.

$e_{rt-1}$: 0.116 3.662 0.158 0.082 0.332 4.662 0.194 3.755 0.106 5.996

$e_{rt-2}$: 0.078 4.117 0.066 0.987

$p_{rt-1}$: 0.174 3.652 0.226 4.663 0.612 1.322 0.255 4.965 0.332 6.221 0.163 6.774

$p_{rt-2}$: - - -0.066 -5.584 0.147 2.665 0.084 1.874 - -

$y_{rt-1}$: 0.226 6.887 0.332 4.698 0.148 3.987 0.213 6.227 0.187 2.116 0.096 6.978

$y_{rt-2}$: - - 0.334 4.668 0.023 1.226 0.326 2.652

$u_{rt-1}$: -0.096 -3.558 -0.111 -2.668 -0.198 -2.108 -0.554 6.441 -0.003 0.963 -0.144 -3.997

$u_{rt-2}$: -0.018 -2.449 -0.094 -1.718 -0.372 2.017 - -

- - 0.044 1.668 - - 0.099 2.639

Spatial parameter (SLM): -0.036 -1.268 -0.049 -1.887

Spatial parameter (SEM): - - - - -0.344 1.667

Moran's I: 1.831 0.067 0.697 0.486 1.347 0.178 2.017 0.044 1.654 0.098 1.225 0.221

$r_1$: 0.123 0.384 0.098 0.488 0.165 0.243 0.056 0.692 0.044 0.756 0.087 0.538

NOTE: MODEL 4 refers to the Purely recursive in Space; MODEL 5 refers to the Recursive in Space and Time; MODEL 6 refers to the Spatially Simultaneous. (*) p-value in the cases of the Moran and first order autocorrelation tests. The p-values estimated for Moran's I and $r_1$ (first order autocorrelation coefficient between the residuals of the panel) are an approximation to their asymptotic values, assuming that R tends to infinity.

The next step is to forecast the employment growth in each of the fifty Spanish provinces for the years 2006 and 2007. Given that these data are known but have not been used in the estimation of the models, we can employ them to compare the forecasting performance of the six models. The comparison will be made in terms of the Root Mean Squared Error for each forecasting year, that is:

$$RMSE^{j}(k) = \sqrt{\frac{\sum_{r=1}^{50}\left(e_{r,2005+k} - \hat{e}^{\,j}_{r,2005}(k)\right)^{2}}{50}}; \quad j = 1,\dots,6; \quad k = 1,2 \qquad (40)$$

where $e_{r,2005+k}$ is the employment growth observed in province r in period 2006 (k=1) or 2007 (k=2), and $\hat{e}^{\,j}_{r,2005}(k)$ is the corresponding forecast obtained with model j. In general, the preferred model should be the model with the lowest RMSE. Table 4 presents a summary of the final results.

Table 4. A comparison of the forecasting performance. RMSE indicator.

| Year | MODEL 1 | MODEL 2 | MODEL 3 | MODEL 4 | MODEL 5 | MODEL 6 |
|---|---|---|---|---|---|---|
| 2006 | 4.251 | 5.860 | 7.554 | 7.943 | 7.227 | 4.675 |
| 2007 | 6.225 | 7.114 | 9.439 | 11.002 | 8.967 | 5.783 |
| MEAN | 5.328 | 6.487 | 8.497 | 9.473 | 8.097 | 5.229 |

The best model for predicting the variations in the employment of the Spanish provinces in the years 2006 and 2007 appears to be Model 6, the so-called Spatially Simultaneous Model. This equation combines the temporal dynamics of the employment series with the (contemporaneous) spatial structure of the labor markets. The differences with respect to Model 1, which does not include the spatial structure, are small. In the short run the forecasts of both models are almost the same, although Model 1 does a little better. The predictions tend to diverge in the long run (our forecast horizon is limited to just two periods) and, apparently, Model 6 is superior in this context.

The forecasts obtained with Model 6 for the years 2006 and 2007 are represented in Figures 6 and 7. In each case, we include a map of observed growth, a second map with the predictions and a third one summarizing the forecasting errors in terms of over- or under-prediction. From our perspective, the most intriguing feature of these maps is the apparent tendency to form clusters of provinces for which the model tends to over- or under-predict, in spite of the (verified) absence of spatial dependence in the residuals. The composition and shape of the clusters change every year, which corroborates the absence of serial dependence.

Figure 6: Employment Growth Forecasts with MODEL 6. Year 2006. Panels: Observed Employment Growth; Employment Growth Forecasts; Prediction errors.
(white: under-prediction; grey: over-prediction)

Figure 7: Employment Growth Forecasts with MODEL 6. Year 2007. Panels: Observed Employment Growth; Employment Growth Forecasts; Prediction errors (white: under-prediction; grey: over-prediction).

6- Conclusions.

The importance of using a wide temporal perspective to better appreciate the evolution of economic facts has been recognized from the very beginning of the discipline of quantitative economics. At present, the same can be said of the spatial dimension of the data: it is also important to pay attention to the spatial breakdown of the variables of interest. It should be recognized that, during the last two decades, the situation has greatly improved. Researchers involved in regional analysis demand larger and more complete data sets, and the Statistical Offices are trying to satisfy these needs. On the other hand, the literature on applied economics is full of proposals and methods aiming to combine the temporal and spatial dynamics of economic facts.

In this paper we have tried to illustrate the possibilities of the growing literature on spatio-temporal modeling. We have focused specifically on panel data models. The objective of our application is to forecast the evolution of Spanish employment by province. From our point of view, the results obtained are quite satisfactory: the best model combines the spatial and the temporal structure of the series. The elements associated with the temporal dynamics seem to play a more prominent role in the specification, especially in the short run. The spatial structure of the employment data plays a secondary role, whose importance increases in the long run.
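As an illustration of the forecast comparison based on the RMSE of equation (40), the following minimal sketch computes the indicator for each model and horizon and selects the model with the lowest mean RMSE. The function names `rmse_by_horizon` and `best_model`, the model labels and the toy figures (three provinces instead of fifty) are hypothetical and are used only to show the arithmetic; they are not the data of the application.

```python
import numpy as np


def rmse_by_horizon(observed, forecasts):
    """RMSE per forecast horizon for each model.

    observed:  array of shape (K, R), K horizons and R provinces.
    forecasts: dict mapping a model label to an array of the same shape.
    Returns a dict mapping each label to an array with K RMSE values.
    """
    observed = np.asarray(observed, dtype=float)
    scores = {}
    for label, values in forecasts.items():
        values = np.asarray(values, dtype=float)
        if values.shape != observed.shape:
            raise ValueError(f"shape mismatch for model {label!r}")
        # Average the squared errors over provinces, then take the root.
        scores[label] = np.sqrt(np.mean((observed - values) ** 2, axis=1))
    return scores


def best_model(scores):
    """Label of the model with the lowest RMSE averaged over horizons."""
    return min(scores, key=lambda label: float(np.mean(scores[label])))


# Toy example: two horizons (k = 1, 2) and three provinces.
observed = np.array([[1.0, 2.0, 3.0],
                     [2.0, 4.0, 6.0]])
forecasts = {
    "A": np.array([[2.0, 3.0, 4.0], [2.0, 4.0, 6.0]]),
    "B": np.array([[1.0, 2.0, 3.0], [5.0, 4.0, 2.0]]),
}
scores = rmse_by_horizon(observed, forecasts)
best = best_model(scores)
```

Averaging over horizons mirrors the MEAN row of Table 4; comparing the per-horizon values instead shows whether a model's advantage comes from the first or the second forecast year.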