The Random Coefficients Logit Model Is Identified
Journal of Econometrics 166 (2012) 204–212

Jeremy T. Fox (a,*), Kyoo il Kim (b), Stephen P. Ryan (c), Patrick Bajari (d)

a University of Michigan & NBER, United States
b University of Minnesota, United States
c MIT & NBER, United States
d University of Minnesota & NBER, United States

* Correspondence to: University of Michigan, 48109-1220 Ann Arbor, MI, United States. Tel.: +1 734 330 2854; fax: +1 734 274 2331. E-mail address: [email protected] (J.T. Fox).

doi:10.1016/j.jeconom.2011.09.002

Article history: Received 11 November 2009. Received in revised form 21 August 2011. Accepted 4 September 2011. Available online 16 September 2011.

Abstract

The random coefficients multinomial choice logit model, also known as the mixed logit, has been widely used in empirical choice analysis for the last thirty years. We prove that the distribution of random coefficients in the multinomial logit model is nonparametrically identified. Our approach requires variation in product characteristics only locally and does not rely on the special regressors with large supports used in related papers. One of our two identification arguments is constructive. Both approaches may be applied to other choice models with random coefficients.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

One of the most commonly used models in applied choice analysis is the random coefficients logit model, also known as the mixed logit, which describes choice between one of a finite number of competing alternatives. Domencich and McFadden (1975), Heckman and Willis (1977) and Hausman and Wise (1978) introduced flexible specifications for discrete choice models, while the random coefficients logit model was proposed by Boyd and Mellman (1980) and Cardell and Dunbar (1980). Currently, the random coefficients logit model is widely used to model consumer choice in environmental economics, industrial organization, marketing, public economics, transportation economics and other fields.

In the random coefficients logit, each consumer can choose between $j = 1, \ldots, J$ mutually exclusive inside goods and one outside good (good 0). The exogenous variables for choice $j$ are in the $K \times 1$ vector $x_j = (x_{j,1}, \ldots, x_{j,K})'$. In the example of demand estimation, $x_j$ might include the product characteristics, the price of good $j$ and the interactions of product characteristics with the demographics of agent $i$. A source of variation in product characteristics is the differences in characteristics across markets, which could be indexed by $t$. So potentially $x_j$ can depend on consumers $i$ and markets $t$, but we suppress these subscripts for transparency. In this paper, for some results we assume that the support of $x_j$ includes the 0 vector, which can occur by centering each element of $x_j$ around constants common across the inside goods. For example, for each $k$ we may redefine $\tilde{x}_{j,k} = x_{j,k} - E[x_{j,k}]$, where the mean is taken over the inside goods and across markets or consumers when the exogenous variables vary across markets or consumers. This location normalization for the covariates is not without loss of generality in a model, like the logit, where the additive error is treated parametrically. However, one normalization is just as arbitrary as any other. Let $x = (x_1', \ldots, x_J')'$ denote the stacked vector of all the $x_j$.

Each consumer $i$ has a preference parameter $\beta_i$, which is a vector of $K$ marginal utilities that gives $i$'s preferences over the $K$ product characteristics. There is also a homogeneous term for each choice $j$, denoted as $\alpha_j$. Each $\alpha_j$ could be an intercept common to product $j$ or a term that captures product characteristics without random coefficients (homogeneous coefficients), as in $\alpha_j = \alpha + w_j' \gamma_w$. Here $\alpha$ is a common intercept for the inside goods contributing to the utility of purchasing an inside good instead of the outside good, $\gamma_w$ is a vector of parameters on the characteristics in the vector $w_j$, and $w = (w_1', \ldots, w_J')'$ denotes the stacked vector of other product characteristics. Let $\alpha^J$ be the vector of the $J$ $\alpha_j$'s. Agent $i$'s utility for choice $j$ is equal to

$$u_{i,j} = \alpha_j + x_j' \beta_i + \epsilon_{i,j}. \tag{1}$$

The outside good has a utility of $u_{i,0} = 0 + \epsilon_{i,0}$, as a location normalization for the utilities. The type I extreme value distribution gives the scale normalization for utility values.

The logit model is defined when the errors $\epsilon_{i,j}$ are i.i.d. across choices and each error has the type I extreme value distribution, which has a cumulative distribution function of $\exp(-\exp(-\epsilon_{i,j}))$. The random coefficients logit arises when $\beta_i$ varies across the population, with unknown distribution $F(\beta_i)$. The unknown objects of estimation are the distribution $F(\beta_i)$ and the homogeneous coefficients $(\alpha, \gamma_w')$ or $\alpha^J$. Under the standard assumption that $\beta_i$ is independent of $x_j$ (we discuss endogeneity below), utility maximization leads to a choice probability for the good $j$,

$$\Pr(j \mid x; F, \alpha^J) \equiv \int \frac{\exp(\alpha_j + x_j' \beta_i)}{1 + \sum_{j'=1}^{J} \exp(\alpha_{j'} + x_{j'}' \beta_i)} \, dF(\beta_i). \tag{2}$$

This specification is popular with empirical researchers because the resulting choice probabilities are relatively flexible. In terms of modeling own- and cross-price elasticities, the random coefficients logit model allows products with similar $x$'s to be closer substitutes, which the logit model without random coefficients does not allow.

In this paper, we prove that the distribution $F(\beta)$ is nonparametrically identified, in the sense that the true $(F^0, \alpha^{J,0})$ is the only pair $(F, \alpha^J)$ that solves $\Pr(j \mid x, w) \equiv \Pr(j \mid x; F, \alpha^J)$ in (2) for all $j$ and all $(x, w)$, where $\Pr(j \mid x, w)$ denotes the population choice probabilities. We first recover the homogeneous terms, $\alpha^J$ (or the fixed parameters $\alpha$ and $\gamma_w$) in (2). We then provide two identification arguments for the distribution of random coefficients, one of which is constructive and the other of which is non-constructive. The non-constructive identification theorem leverages results from Hornik (1991) on function approximation and Stinchcombe and White (1998) on consistent specification testing. The theorem requires variation in $x$ in only an open set.

Our other identification argument is constructive. We demonstrate how to iteratively find all moments of $\beta$, which is sufficient to identify the distribution $F$ within the class of distributions that are uniquely determined by all of their moments. This class is the set of probability distribution functions that satisfy the Carleman condition, which we review below.

Both the constructive and non-constructive proof strategies are not unique to the logit: they could be applied to identify the distribution of heterogeneity in many differentiable economic models where covariates enter as linear indices. We outline the main theorems using generic notation and verify their conditions for the multinomial logit model.

While one of our identification approaches is constructive, we do not recommend that empirical researchers adopt an analog estimator to our identification argument.

[...] crucial to identification. We show that indeed the random coefficients logit model is identified, which provides a more solid econometric foundation for its application in applied microeconomics.

The paper is organized as follows: Section 2 discusses the related literature, Section 3 introduces notation for a generic mixtures model, Section 4 states the general non-constructive identification result, Section 5 states the general constructive identification result, Section 6 shows how the results apply to the random coefficients logit, Section 7 explores extensions, and Section 8 concludes.

2. Related literature

There is a growing literature on the identification of binary and multinomial choice models with unobserved heterogeneity. These papers differ in the set of assumptions made in order to obtain identification results and also differ in the objects of interest in identification.

Ichimura and Thompson (1998) study the case of binary choice: one inside good ($J = 1$) and one outside good. This restriction makes their method inapplicable for most empirical applications of demand analysis, which study markets with two or more inside goods. Ichimura and Thompson identify the cumulative distribution function (CDF) of, in our notation, $(\beta, \epsilon_{i,1} - \epsilon_{i,0})$. They use a theorem due to Cramér and Wold (1936) and do not exploit the structure of the extreme value assumptions on the $\epsilon_{i,1}$ and $\epsilon_{i,0}$. Consequently, they need stronger assumptions: (1) a monotonicity assumption (sign restriction) on one of the $K$ components of $\beta$ ($\beta_{i,k} > 0 \ \forall i$) and (2) a full support assumption for all $K$ elements of $x_1$. Similar assumptions will appear in many of the papers below. Gautier and Kitamura (2009) provide a computationally simple estimator and some alternative identification arguments for the same binary choice model as Ichimura and Thompson. To our knowledge, no one has generalized Ichimura and Thompson to the case of multinomial choice.

We will refer to a regressor whose (random) coefficient has a sign restriction and that has full support as a "special regressor". The sign restriction for a special regressor may not be too restrictive because it is testable and the sign can be determined from a pre-model analysis as described in Lewbel (2000) and his