A Service of
Leibniz-Informationszentrum econstor Wirtschaft Leibniz Information Centre Make Your Publications Visible. zbw for Economics
Dahlberg, Matz; Eklöf, Matias
Working Paper Relaxing the IIA Assumption in Locational Choice Models: A Comparison Between Conditional Logit, Mixed Logit, and Multinomial Probit Models
Working Paper, No. 2003:9
Provided in Cooperation with: Department of Economics, Uppsala University
Suggested Citation: Dahlberg, Matz; Eklöf, Matias (2003) : Relaxing the IIA Assumption in Locational Choice Models: A Comparison Between Conditional Logit, Mixed Logit, and Multinomial Probit Models, Working Paper, No. 2003:9, Uppsala University, Department of Economics, Uppsala, http://nbn-resolving.de/urn:nbn:se:uu:diva-4463
This Version is available at: http://hdl.handle.net/10419/82744
Standard-Nutzungsbedingungen: Terms of use:
Die Dokumente auf EconStor dürfen zu eigenen wissenschaftlichen Documents in EconStor may be saved and copied for your Zwecken und zum Privatgebrauch gespeichert und kopiert werden. personal and scholarly purposes.
Sie dürfen die Dokumente nicht für öffentliche oder kommerzielle You are not to copy documents for public or commercial Zwecke vervielfältigen, öffentlich ausstellen, öffentlich zugänglich purposes, to exhibit the documents publicly, to make them machen, vertreiben oder anderweitig nutzen. publicly available on the internet, or to distribute or otherwise use the documents in public. Sofern die Verfasser die Dokumente unter Open-Content-Lizenzen (insbesondere CC-Lizenzen) zur Verfügung gestellt haben sollten, If the documents have been made available under an Open gelten abweichend von diesen Nutzungsbedingungen die in der dort Content Licence (especially Creative Commons Licences), you genannten Lizenz gewährten Nutzungsrechte. may exercise further usage rights as specified in the indicated licence. www.econstor.eu Relaxing the IIA Assumption in Locational Choice Models: A Comparison Between Conditional Logit, Mixed Logit, and Multinomial Probit Models∗
Matz Dahlberg and Matias Ekl¨of
First version: February 2002 This version: February 2003
Abstract This paper estimates a locational choice model to assess the demand for local public services, using a data set where individuals chooses be- tween 26 municipalities within a local labor market. We assess the importance of the IIA assumption by comparing the predictions of three difference models; the conditional logit (CL) model, the mixed logit (MXL) model, and the multinomial probit (MNP) model. Our main finding is that a MXL or a MNP estimator leads to exactly the same conclusions as the traditional CL estimator. That is, given the data used here, the IIA assumption, and hence the use of a CL esti- mator, seems to be valid when estimating Tiebout-related migration. The only instance when we get somewhat different results when using the MXL or MNP estimator compared with the CL estimator is when we have a too parsimonious model. One possible hypothesis explaining this result is that omitted variables are captured by the distribution parameters of the coefficients of the included variables, leading to the false conclusion that the coefficients are not fixed. This hypothesis is supported by the results from a Monte Carlo investigation.
JEL Classification: C15, C25, H72, H73 Keywords: Locational choice, Tiebout migration, Mixed logit, Multi- nomial probit
∗We are grateful for comments from seminar participants at Uppsala University, G¨oteborg University, and the 2002 EEA meeting in Venice, Italy. The authors can be reached at: Department of Economics, Uppsala University, PO Box 513, SE-751 20 Up- psala, Sweden; [email protected]; [email protected]. Matz Dahlberg grate- fully acknowledges financial support from The Swedish Research Council. Matias Ekl¨of gratefully acknowledges financial support from the Jan Wallander foundation. 1 Introduction
Since there is no market for local public services, it is not obvious how to estimate preferences for these services. In the literature, there exist several approaches to this problem. These include the median voter model (e.g., Bergstrom & Goodman (1973)), survey data approaches (e.g., Bergstrom, Rubinfeld & Shapiro (1982)), hedonic price models (e.g., Rosen & Fullerton (1977)), and discrete choice approaches estimating Tiebout-related migra- tion based on random utility models (e.g., McFadden (1978)). All earlier studies of Tiebout-related migration have assumed that the In- dependence from Irrelevant Alternatives (IIA)-assumption is valid since they have used conditional logit models (Friedman (1981), Quigley (1985), Boije & Dahlberg (1997), Nechyba & Strauss (1998), and Dahlberg & Fredriksson (2001)). This is potentially problematic since the IIA-assumption implies that the odds-ratio between two alternatives does not change by the inclu- sion (or exclusion) of any other alternative. The purpose of this paper is to reexamine the question of the impor- tance of local public services for community choice by adopting estimation methods that allow for more flexible substitution patterns between the al- ternatives. More specifically, we will relax the IIA-assumption by using simulation techniques (multinomial probit and mixed logit models) and in- vestigate the effects this might have on the results. Neither the multinomial probit nor the mixed logit model has been used to analyze the economic problem under study in this paper. Furthermore, since McFadden & Train (2000) claim that, based on theoretical grounds, mixed logit and multino- mial probit models are very similar, a more general aim of the paper is to investigate to what extent this claim is true also in real applications with many alternatives. As far as we know, this is the first time simulation based estimators are used in applications on micro-data with as many as 26 al- ternatives. A further question is then if simulation based estimators can be practically used in applications with many alternatives. Swedish data are very suitable for the purpose of this paper. First, the quality of the data is exceptional. Second, local governments comprise a sizable fraction of aggregate economic activity in Sweden: in 1992, local government expenditure amounted to around 27 percent of GDP; by com- parison, expenditures at the federal and local level in the US amounted to 15 percent (OECD (1994)). Third, local governments have important responsi- bilities such as the provision of day care, education, elderly care, and social welfare services. Finally, local governments have a large degree of autonomy regarding spending, taxing, and borrowing decisions. We have access to a unique individual data set - LINDA; see Edin & Fredriksson (2000). LINDA contains the characteristics of a large panel of individuals and is representative for the Swedish population. From these data we have selected all individuals who moved to a new municipality within
1 the local labor market of Stockholm between 1990 and 1991. To these data we match a set of (destination) characteristics of the local public sector and other characteristics of the municipality, such as housing. The same data is used in Dahlberg & Fredriksson (2001). Our main finding is that a mixed logit or a multinomial probit estima- tor leads to exactly the same conclusions as the traditional conditional logit estimator: When we relax the traditional assumption that the coefficients are the same for all individuals and estimate distribution parameters for coefficients that are assumed to vary randomly in the population, we cannot reject the hypothesis of fixed coefficients. That is, the IIA-assumption, and hence the use of a conditional logit estimator, seems to be valid when esti- mating Tiebout-related migration, at least when using Swedish data. The only instance when we get somewhat different results when using the mixed logit or multinomial probit estimator compared with the conditional logit estimator is when we have a too parsimonious model. One possible hy- pothesis explaining this result is that omitted variables are captured by the distribution parameters of the coefficients of the included variables, leading to the false conclusion that the coefficients are not fixed. This hypothesis is supported by the results from a Monte Carlo investigation. The paper is outlined as follows. The next section describes the theoret- ical framework and section 3 presents the econometric methods to be used in the paper. Section 4 presents the data, section 5 the empirical results, and section 6 the Monte Carlo investigation. Section 7 concludes.
2 Theoretical Framework
To fix ideas, we will present a simple theoretical model based on the ran- dom utility model. The random utility model assumes a stochastic consumer decision process in which goods are treated as discrete. Households are as- sumed to choose one alternative out of a set of discrete alternatives that maximizes their utility. The utility function is assumed to be composed of a systematic part and a stochastic component. Assuming a specific distribu- tion of the stochastic component makes it possible to estimate the unknown parameters of the utility function. Consider an individual who is confronted with a discrete set of location alternatives (communities) within a local labor market.1 When maximizing over this discrete set of alternatives she takes the attributes of the commu- nities into consideration. In the spirit of Tiebout (1956), we mainly have local public services (gj) in mind when characterizing the attributes of the community (j). The individual has additively separable preferences over the consumption of private goods (xij) (housing consumption is subsumed into xij) and public
1We assume that the choice of local labor market has been made in a prior stage.
2 goods. We assume that the utility function is given by
uij = aj + z(xij) + m(gj) + εij (1) where aj denotes community amenities apart from local public services. The random component of (1), εij, captures random preferences for the (j)’th alternative. The individual budget constraint takes the form
yi(1 − τj) = ρjxij (2) where yi denotes income, ρj the price of private goods, and τj the local income tax rate. Thus, local public services are financed by income taxes.2 For estimation purposes, we will assume that the functions z(·) and m(·) in (1) are logarithmic. So a stylized version of utility would be
uij = β0 ln yi + β1 ln (1 − τj) + β2 ln ρj + β3 ln gj + aj + εij (3) where yi can be ignored since it does not vary by j. In the empirical part of the paper, we will think of ρj as primarily reflecting differences in housing prices across communities. The utility actually observed is the maximum over the set of all possibilities and (in principle) the coefficients have the interpretation of marginal utilities.3
3 Econometric Methods
When estimating the random utility model in (3), we will use three different approaches. In the first approach, we assume that the IIA assumption holds, and apply McFadden’s conditional logit estimator. This is the approach taken in earlier locational choice studies. In the second and third approach, we will use simulation estimation techniques to estimate a mixed logit specification and a multinomial probit model. This allow us to relax two strong assumptions used in the first ap- proach: the IIA-assumption and the assumption of fixed coefficients. Hence, we do not have to assume that the alternatives are independent of each other (meaning that we can estimate our model with flexible substitution patterns) and estimate distribution parameters for the coefficients in our model. By comparing the mixed logit and multinomial probit results with the results from the first approach, we can get an indication on how sensitive the results are to those two assumptions, and by comparing the mixed logit with the multinomial probit results, we can (i) examine whether there are any differences between the multinomial probit and the mixed logit model
2In Sweden, 99 % of the taxes raised at the municipal level come from income taxation. Moreover, the local tax rate is proportional so, abstracting from savings, there is not much abuse of reality in specifying the left-hand side of (2). 3 The simple model outlined here of course implies the restriction β1 = −β2.
3 in real applications (according to McFadden & Train (2000), they shall be very similar) and (ii) investigate whether there are any practical reasons to chose either the multinomial probit or the mixed logit model in an appli- cation with many alternatives. Furthermore, the mixed logit is interesting to use since it nests the conditional logit estimator, thereby allowing us to conduct a direct test to whether the conditional logit or the mixed logit is the appropriate estimator to use. As noted in the introduction, neither the multinomial probit nor the mixed logit model has been used to analyze the economic problem under study in this paper. These three approaches will be briefly described below. To simplify notation, let us rewrite (3) in a general form as
0 Uij = xijβij + εij
3.1 Conditional logit (McFadden 1973) and (McFadden 1978) showed that if the systematic part of the utility function has an additively separable, linear in parameters, form and the residuals εij are independently and identically distributed with the type I extreme-value distribution, then the probability that household i will choose the j’th municipality is given by
0 exp xijβ Pr (Yi = j) = PJ 0 j=1 exp xijβ
The assumption of independence of εij requires that there are no sim- ilarities among the alternatives, implying that the odds ratio between two alternatives does not change by the inclusion or exclusion of any other al- ternative. This is a property that has been labelled the “independence from irrelevant alternatives“ (the IIA-property).
3.2 Mixed logit An obvious (theoretical) way to handle the IIA property is to allow the unobserved part of the utility function to follow, e.g., a multivariate normal distribution, allowing the residuals to be correlated with each other, and estimate the model with a multinomial probit model. This approach has, however, been less obvious in empirical applications since multiple integrals then have to be evaluated. The improvements in computer speed and in our understanding of the use of simulation techniques in estimation have however made other approaches than the traditional one as viable alternatives. In this paper, we will adopt both the multinomial probit model and an approach
4 that has recently been suggested in the econometrics literature: the mixed logit model.4 When using a mixed logit model, we relax the assumption that the co- efficients are the same for all individuals. More specifically, we estimate distribution parameters for coefficients that are assumed to vary randomly in the population. For a general characterization of the mixed logit model in a cross-sectional setting, let the utility functions take the form (where i denotes individuals and j choice alternatives)
0 Uij = xijβi + εij (4) where βi is unobserved for each i. Let βi vary in the population with density f (βi|θ), where θ are the true parameters of the distribution. Furthermore, assume that εij are iid extreme value distributed. If we knew the value of βi, the conditional probability that person i chooses alternative j is standard logit:
0 xij βi ∗ e Lij (βi) = 0 (5) P xij βi j e We do not, however, know the persons’ individual tastes. Therefore, we need to calculate the unconditional probability, which is obtained by integrating (5) over all possible values of βi:
Z Lij (θ) = Lij (βi) f (βi|θ) dβi (6)
x0 β Z e ij i = f (β |θ) dβ (7) x0 β i i P e ij i j
0 0 0 0 Brownstone & Train (1999) assume that xijβi = xij (b + ηi) = xijb+xijη, where b is the population mean and ηi is the stochastic deviation which rep- resents the individual’s tastes relative to the average tastes in the population. 0 0 This means that the utility function takes the form Uij = xijb + xijηi + εij , 0 where xijηi are error components that induce heteroscedasticity and corre- lation over alternatives in the unobserved portion of the utility. This means that an important implication of the mixed logit specification is that we do not have to assume that the IIA property holds. Let g (ηi|θ) denote the
4The mixed logit model is described in, e.g., Brownstone & Train (1999) and Brown- stone & Train (1999) consider the model in a cross-sectional setting, Revelt & Train (1998) consider it in a panel data setting. Several names have been used in the literature for this model: random coefficient logit, random parameters logit, mixed multinomial logit, error components logit, probit with a logit kernel, and mixed logit. These names label the same underlying model. We stick with the name ‘mixed logit’.
5 density for η. Then different patterns of correlation, and hence different sub- stitution patterns, can be obtained through different specifications of g (·) and xij. For some further results for the mixed logit model, see below. Since the integral in (7) cannot be evaluated analytically, exact maxi- mum likelihood estimation is not possible. Instead the probability is ap- proximated through simulation.5 Maximization is then conducted on the simulated log-likelihood function.6 As noted above, an alternative to the mixed logit model is the multino- mial probit model. There are however some results in the literature indicat- ing that the mixed logit model might be preferable in situations where the aim is to estimate distribution coefficients for parameters in a model. These and other results will be discussed in the rest of this section. McFadden & Train (2000) establish, among other results, the following: 7 (1) Under mild regularity conditions, any discrete choice model derived from random utility maximization has choice probabilities that can be approxi- mated as closely as one pleases by a mixed logit model, (2) A mixed logit model with normally distributed coefficients can approximate a multinomial probit model as closely as one pleases, and (3) Non-parametric estimation of a random utility model for choice can be approached by successive approx- imations by mixed logit models with finite mixing distributions; e.g., latent class models. From an economic point of view, result (1) is interesting since we often want to put a utility maximizing perspective on the problem at hand. Furthermore, if we want to make welfare analysis (e.g., calculate will- ingness to pay), it is crucial that the observed choice probabilities can be motivated as the outcome of a utility maximization problem. If not, welfare analysis cannot be conducted. Result (2) is useful since it implies that mixed logit can be used wherever multinomial probit has been suggested and/or
5Since, in the model described in equations (5) and (7), the dimension of the inte- grals to be evaluated increases with the number of coefficients that are allowed to vary in the population, approaches using Gaussian quadrature to evaluate integrals must be considered being of limited value. A more fruitful approach is then to use simulation methods. 6The algorithm we use to obtain the simulated maximum likelihood results is described in, e.g., Brownstone & Train (1999) and Revelt & Train (1998). The estimation method typically proposed and used for mixed logit models in earlier studies is the maximum simulated likelihood (MSL). However, McFadden & Train (2000) suggest that the method of simulated moments (MSM) might be a useful alternative when estimating mixed logit models. According to Stern (1997), the MSL and MSM are two of four existing simulation based estimation methods. The other two are the method of simulated scores (MSS), and the Monte Carlo Markov Chain (MCMC) method (where one of the most known MCMC methods is Gibbs sampling). The most common methods are MSL and MSM, with some advantages for MSL (see B¨orsch-Supan & Hajivassiliou (1993) and Hajivassiliou, McFadden & Ruud (1996)). According to Stern, MSS is the least developed of the four, but it holds some significant promise. There are mixed evidence regarding the properties of Gibbs sampling methods (see Stern (1997)). It is left for future research to decide if any of the less developed methods is to be preferred over the MSL (or MSM) method. 7These results are obtained from McFadden (1996).
6 used.8 Advantages with the mixed logit specification does then include:
• The model does not exhibit the IIA property. • The model can, as closely as one wishes, approximate multinomial probit models. • Unlike pure probit, mixed logit can represent situations where the coefficients follow other distributions than the normal. • If the dimension of the mixing distribution is less than the number of alternatives, the mixed logit might have an advantage over the multi- nomial probit model simply because the simulation is over fewer di- mensions. • The model can be derived from utility maximizing behavior.
3.3 Multinomial Probit Another alternative model is hence the multinomial probit model. Let an individual face J mutually exclusive alternatives, each associated with an unobserved utility 0 Uij = xijβi + εij 0 where βi ∼ N(b, Σβ), εi = (εi1, . . . , εiJ ) ∼ N(0, Σε). This form allows for a high degree of flexibility including unobserved heterogeneity in tastes and arbitrary substitution patterns across alternatives. The goal of the analysis is to estimate the parameters b, Σβ, Σε using observed choices made by the individuals. The utility can be partitioned into a deterministic component and a stochastic component as follows 0 0 ˜ Uij = xijb + xijβi + εij (8) 0 Uij = xijb + ηij (9) 0 ηi = (ηi1, . . . , ηiJ ) ∼ N(0, Σηi ) (10) ˜ 0 0 where β = β − b, Σηi = xiΣβxi + Σε where xi = (xi1, . . . , xiJ ) . Hence, the covariance matrix of the stochastic utility component may vary across individuals even if Σβ and Σε are constant.
8Additional evidence on this point is given by Ben-Akiva & Bolduc (1996) (as re- ported by McFadden (1996)) and by Brownstone & Train (1999). Ben-Akiva & Bolduc (1996) find in Monte Carlo experiments that the mixed logit model gives approximation to multinomial probit probabilities that are comparable to the Geweke-Hajivassiliou-Keane simulator. Brownstone & Train (1999) find in an application that the mixed logit model can approximate multinomial probit probabilities more accurately than a direct Geweke- Hajivassiliou-Keane simulator, when both are constrained to use the same amount of computer time.
7 The individual is assumed to choose alternative j if this alternative gives her the highest utility among the present alternatives, i.e,
Uij > Uik, ∀k 6= j (11) 0 0 xijb + ηij > xikb + ηik, ∀k 6= j (12) Note that the observable choice is invariant to location and scale of the latent utility levels; only the differences of utility levels are important. An additive or (positive) multiplicative constant can not be identified in the model. Therefore, we need to normalize the model w.r.t. location and scale. By taking differences w.r.t. to a reference alternative r, we normalize w.r.t. location and measures the utilities in terms of differences. Fixing an element of the resulting covariance matrix will then set the scale, and we have achieved identification. The covariance of the transformed stochastic components can be de- rived from the original structure (Σβ, Σε) using the difference operator, Dr, defined as the J − 1 identity matrix with a vector of −1 inserted in the r’th column. Let r = 1, then the vector of transformed residuals is ∗ 0 0 0 ηi = (ηij − ηi1)j=1,...,J = (0, ηiD1) with covariance matrix 0 ··· ! Σ∗ = (13) ηi . 0 . D1Σηi D1 ∗ ∗ ∗ ∗ 0 Let xij = xij − xi1, xi = (xi1, . . . , xiJ ) , then the choice probability of alter- native j can be written in terms of our normalized (and identified) model as