Multinomial Probit and Multinomial Logit: a Comparison of Choice Models for Voting Research Jay K
Total Page:16
File Type:pdf, Size:1020Kb
Electoral Studies 23 (2004) 107–122 www.elsevier.com/locate/electstud Multinomial probit and multinomial logit: a comparison of choice models for voting research Jay K. Dow ∗, James W. Endersby Department of Political Science, University of Missouri, Columbia, MO 65201-0630, USA Abstract Several recent studies of voter choice in multiparty elections point to the advantages of multinomial probit (MNP) relative to multinomial/conditional logit (MNL). We compare the MNP and MNL models and argue that the simpler logit is often preferable to the more complex probit for the study of voter choice in multi-party elections. Our argument rests on three areas of comparison between MNP and MNL. First, within the limits of typical data—a small sample of revealed voter choices among a few candidates or parties—neither model will clearly appear to have generated the observed data. Second, MNP is susceptible to a number of estimation problems, the most serious of which is that the MNP is often weakly identified in application. Weak identification is difficult to diagnose and may lead to plausible, yet arbitrary or mislead- ing inferences. Finally, the logit model is criticized because it imposes the independence of irrelevant alternatives (IIA) property on voter choice. For most applications the IIA property is neither relevant nor particularly restrictive. We illustrate our arguments using data from recent US and French presidential elections. 2003 Elsevier Ltd. All rights reserved. 1. Introduction Models of voter choice among more than two parties or candidates rest on underly- ing assumptions about the nature of individual decision-making. Statistical methods commonly used to estimate models of multiparty vote choice often impose restrictive assumptions about these choices that render inferences suspect. For example, binary logit or probit analyses are sometimes used to represent voter choice as a decision ∗ Corresponding author. Tel.: +1-573-882-0047; fax: +1-573-884-5131. E-mail addresses: [email protected] (J.K. Dow); [email protected] (J.W. Endersby). 0261-3794/$ - see front matter 2003 Elsevier Ltd. All rights reserved. doi:10.1016/S0261-3794(03)00040-4 108 J.K. Dow, J.W. Endersby / Electoral Studies 23 (2004) 107–122 between government and opposition. This ignores potentially important differences within the opposition. Less restrictive are ordered models including ordered logit, ordered probit and least squares. These represent voter choice among multiple candi- dates or parties, but impose the assumption that political competition takes place along a single, ordered axis. The unidimensionality assumption is typically invalid for multiparty systems (Lijphart, 1984; Taagepera and Shugart, 1989: pp. 92–98). Stokes (1963) provided an early criticism of the ordered axis assumption. Alvarez and Nagler (1995, 1998, 2001), Alvarez et al. (2000), Schofield et al. (1998), Lacey and Burden (1999) and Quinn et al. (1999) identify advantages of multinomial probit (MNP) analysis in these applications. MNP fits within a class of multinomial choice models that includes the multinomial/conditional logit (MNL), and more flexible logit-type specifications including the generalized extreme value (GEV) and mixed logit (MXL) models. These represent voter choice as a decision among unordered alternatives (parties, candidates) as a function of chooser (voter) and choice (party/candidate) attributes. Since the MNL specification is tractable and simple to estimate, it is included in many commercial software packages and is used regularly in electoral research.1 MNL, however, imposes the restrictive assumption that choices are independent across alternatives. MNP does not impose the indepen- dence assumption and advances in computer technology make its estimation increas- ingly practical. Thus, one might reasonably argue that MNP should be the benchmark methodology in the study of voter choice in multiparty elections. Although political methodologists understand the relative merits of MNP and MNL in application, our experience suggests that many scholars investigating voter choice in multiparty elections are not fully aware of the assumptions and implications of the two models. We seek to inform applied researchers about the benefits and liabilities of MNP and MNL and to compare the two models in typical application. We assess MNP and MNL in the study of voter choice in multicandidate/multiparty elections and argue that for most purposes the simpler logit is preferable to probit. This is particularly true for applications estimating the probability that a voter casts a ballot for a candidate or a party selected from a fixed, stable pool of alternatives and the marginal changes in the probability associated with changes in the values of explanatory variables. We are motivated by our perception that electoral scholars are sometimes uncertain about MNP and associated issues such as the independence of irrelevant alternatives property (IIA). In particular, researchers are unsure as to whether their statistical analyses provide useful insights or are subject to challenge if based on MNL. We demonstrate that for the most common application in which these concerns arise—the study of voter choice in multiparty/candidate elections— these concerns are exaggerated and, under most circumstances, logit estimation per- forms as well or better than MNP. Our argument rests on three areas of comparison between the MNP and MNL models. First, since the statistical properties motivating MNP and MNL are asymp- 1 We do not distinguish between strictly multinomial (MNL) and conditional (CL) logit models as most electoral applications use a logit specification that combines the two. J.K. Dow, J.W. Endersby / Electoral Studies 23 (2004) 107–122 109 totic, there are few a priori reasons to believe that one method or the other is superior in typical application. The relative merits of any statistical model primarily depend on the method that best represents the underlying choice process that generates the observed data. Within the limits of typical data—a relatively small sample vector of observed choices among a few candidates or parties—neither model will clearly appear to have produced the sample of observed data. The estimated MNP and MNL coefficients and standard errors (adjusted for scale), probabilities and marginal effects will be indistinguishable. Second, MNP imparts a number of potentially serious problems. These are suf- ficiently difficult to detect that, in the absence of investing exceptional effort in model diagnostics, researchers are justified in using the MNL specification. The most important problem is that even formally identified MNP specifications are often weakly identified in application. This is serious because weak identification is diffi- cult to diagnose and may lead to plausible, yet arbitrary or misleading inferences. The MNP presents a difficult maximum likelihood optimization problem that some- times fails to converge at a global optimum or produces parameter estimates that are sufficiently imprecise as to make statistical inferences suspect. Except for cases of profound mis-specification, the logit likelihood will optimize at its global maximum and is not prone to optimization errors. Finally, the logit model is criticized because it imposes the IIA property on voter choice. The IIA property imposes the restriction that the relative odds of selecting between any two candidates or parties is independent of the number of alternatives. Behavioral theories of voter choice sometimes assume IIA is violated, so logit esti- mation is suspect. For much of empirical voting research, the IIA property is not particularly restrictive. We illustrate our arguments using data from recent US and French presidential elections. 2. The multinomial logit and multinomial probit models 2.1. Overview Scholars of voting behavior must adopt theoretically appropriate models of voter choice among multiple alternatives (Whitten and Palmer, 1996; Alvarez and Nagler, 1998; Adams and Merrill, 2000). The MNL and MNP models fit this requirement as each may be derived from economic theories of utility maximization. In multican- = = didate elections, assume voter i’s utility for candidate j, Uij,(i 1,…,n; j 1,…,p) is a function of both voter and candidate attributes and a stochastic error. A typical representation (Maddala, 1983: p. 61; Alvarez and Nagler, 1998; Powers and Xie, 2000: p. 248) is: ϭ Ј ϩ Ј ϩ Uij b Xij a jZi eij where Xij is a vector of candidate attributes as perceived by voter i, and Zi is a vector of voter characteristics. We seek to estimate the coefficients b and aj and their standard errors. 110 J.K. Dow, J.W. Endersby / Electoral Studies 23 (2004) 107–122 If each voter is a utility maximizer, the probability that voter i casts a ballot for candidate 1 is: Pi1 ϭ P[Ui1 Ͼ Ui2,Ui1 Ͼ Ui3,…,Ui1 Ͼ Uip] For any “m” in the set of 1,...,p candidates: ϭ Ϫ Ͻ Ј ϩ Ј Ϫ Ј ϩ Ј P(m) P[eim eij (b Xij a jZi) (b Xim a mZi), j m] MNL and MNP differ according to the assumed error structure for the eij. The logit assumes the errors are independent, identically distributed (iid) with type I extreme value distribution (log Weibull). The probability that a voter i votes for candidate j is given by: exp(bЈX ϩ aЈ Z ) P(vote ϭ j͉ , ,X ,Z ) ϭ ij j i b aj ij i p Ј ϩ Ј exp(b Xik a kZi) k ϭ 1 The MNP assumes the errors are distributed multivariate normal, with mean 0 and 2 s1 s12 . s1n s s2 .. covariance matrix Σ= 12 2 . The probabilities are written: ΄....΅ 2 s1n ..sn ϭ ͉ ⌺∗ P(vote j b,aj,Xij,Zi, ) ∗ ∗ ϩ ∗ ∗ ∗ ∗ ϩ ∗ b X1 a1 Z b XjϪ1 ajϪ1Z ∗ ∗ ∗ ∗ ϭ ͵ ͵ Ϫ ∂ ∂ Ϫ … f(ei1,…,eij 1) ei1,…, eij 1 Ϫϱ Ϫϱ where f(.) is the probability density function of the multivariate normal distribution.2 2.2. General comparisons Given the effort invested to argue the merits of one method over the other, the MNP and MNL are remarkably similar in important respects.