Confidence Intervals for Choice Probabilities of the Multinomial Logit Model

Joel Horowitz, U.S. Environmental Protection Agency

This paper describes three methods for developing confidence intervals for the choice probabilities in multinomial logit models. The confidence intervals reflect the effects of sampling errors in the parameters of the models. The first method is based on the asymptotic sampling distribution of the choice probabilities and leads to a joint confidence region for these probabilities. This confidence region is not rectangular and is useful mainly for testing hypotheses about the values of the choice probabilities. The second method is based on an asymptotic linear approximation of the relation between errors in models' parameters and errors in choice probabilities. The method yields confidence intervals for individual choice probabilities as well as rectangular joint confidence regions for all of the choice probabilities. However, the linear approximation on which the method is based can yield erroneous results, thus limiting the applicability of the method. A procedure for setting an upper bound on the error caused by the linear approximation is described. The third method is based on nonlinear programming. This method also leads to rectangular joint confidence regions for the choice probabilities. The nonlinear programming method is exact and, therefore, more generally applicable than the linear approximation method. However, when the linear approximation is accurate, it tends to produce narrower confidence intervals than does the nonlinear programming method, except in cases where the number of alternatives in the choice set is either two or very large. Several numerical examples are given in which the nonlinear programming method is illustrated and compared with the linear approximation method.

The multinomial logit formulation of urban travel-demand models has a variety of theoretical and computational advantages over other demand-model formulations and is receiving widespread use both for research purposes and as a practical demand-forecasting tool (1-3). However, travel-demand forecasts derived from logit models, like forecasts derived from other types of econometric models, are subject to errors that arise from several sources, including sampling errors in the estimated values of parameters of the models, errors in the values of explanatory variables, and errors in the functional specifications of the models. Knowledge of the magnitudes of forecasting errors can be important in practice, particularly if either the errors themselves or the costs of making erroneous decisions are large. This paper deals with the problem of estimating the magnitudes of forecasting errors that result from sampling errors in the estimated values of the parameters of logit models. Specifically, the paper describes techniques for developing confidence intervals for choice probabilities and functions of choice probabilities (e.g., aggregate market shares, changes in choice probabilities caused by changes in independent variables) derived from logit models, conditional on correct functional specification of the models and use of correct values of the explanatory variables.

A model's forecasting error can be characterized in a variety of ways, including average forecasting error and root-mean-square forecasting error, in addition to confidence intervals for the forecast. Among the various error characterizations, only the confidence interval provides a range in which the true value of the forecast quantity is likely to lie.

Methods for developing confidence intervals for the forecasts of linear econometric models are well known (4). However, these methods are not applicable to logit models, which are nonlinear in parameters. Koppelman (5, 6) has analyzed the forecasting errors of logit models and has described the ways in which various sources of error contribute to total error in forecasts of choice probabilities. Koppelman's error measures do not include confidence intervals for the choice probabilities although, as will be shown later in this paper, one of his error measures can be used to derive approximate confidence intervals.

Three methods for estimating confidence intervals for the choice probabilities of logit models are described in this paper. All of the methods lead to asymptotic confidence intervals in that they are based on the large-sample properties of the estimated parameters of the models.

The first method is based on the exact asymptotic sampling distribution of the choice probabilities and leads to a joint confidence region for these probabilities. This region is useful mainly for testing hypotheses about the values of the choice probabilities. The region is not rectangular and, therefore, is difficult to use in practical forecasting. Moreover, the methods used to derive the confidence region cannot be readily extended to functions of the choice probabilities.

The second method is based on an asymptotic linear approximation of the relation between sampling errors in models' parameters and sampling errors in choice probabilities. The linear approximation method yields confidence intervals for individual choice probabilities as well as rectangular joint confidence regions for all of the choice probabilities. The method can easily be extended to functions of the choice probabilities. However, the linear approximation on which the method is based can yield erroneous results, thus limiting the method's applicability. A procedure for placing an upper bound on the error caused by the linear approximation is described.

The third method is based on nonlinear programming. This method yields rectangular joint confidence regions for the choice probabilities and can be extended to functions of the choice probabilities. The method does not require approximation of the relations between sampling errors in models' parameters and sampling errors in choice probabilities and, therefore, is more generally applicable than is the linear approximation method.

Several numerical examples are given in which the nonlinear programming method is illustrated and compared with the linear approximation method.

PROPERTIES OF THE LOGIT MODEL

In the multinomial logit model, the probability that individual $n$ selects alternative $i$ from a set of $J_n$ available alternatives is given by

$$P_{in} = \frac{\exp(V_{in})}{\sum_{j=1}^{J_n} \exp(V_{jn})} \tag{1}$$

where $P_{in}$ is the probability that alternative $i$ is chosen by individual $n$, and $V_{jn}$ $(j = 1, \ldots, J_n)$ is the systematic component of the utility of alternative $j$ to individual $n$. For each alternative $i$, $V_{in}$ is assumed to be a linear function of appropriate explanatory variables. Thus

$$V_{in} = \sum_{m=1}^{M} X_{imn}\,\alpha_m \tag{2}$$

where

$M$ = the number of explanatory variables,
$X_{imn}$ = the value of the $m$th explanatory variable for alternative $i$ and individual $n$, and
$\alpha_m$ = the coefficient of explanatory variable $m$.

The values of the coefficients (or parameters) $\alpha_m$ ordinarily are not known a priori and are estimated from observations of individuals' choices by using the method of maximum likelihood. Details of the estimation procedure and the statistical properties of the estimated coefficients are described by McFadden (7).

Denote the estimated coefficients by $\{a_m;\ m = 1, \ldots, M\}$. For each alternative $i$ and individual $n$ define $\hat{V}_{in}$ by

$$\hat{V}_{in} = \sum_{m=1}^{M} X_{imn}\,a_m \tag{3}$$

$\hat{V}_{in}$ is the estimated systematic utility function for alternative $i$ and individual $n$. $\hat{V}_{in}$ is a random variable by virtue of its dependence on the random variables $\{a_m\}$. Define

$$\hat{P}_{in} = \frac{\exp(\hat{V}_{in})}{\sum_{j=1}^{J_n} \exp(\hat{V}_{jn})} \qquad (i = 1, \ldots, J_n;\ n = 1, \ldots, N) \tag{4}$$

$\hat{P}_{in}$ estimates the probability that individual $n$ makes choice $i$ and is the forecast of the choice probability that is used in applications of the logit model.

The estimated coefficients $\{a_m\}$ are asymptotically jointly normally distributed with mean values $\{\alpha_m\}$ and covariance matrix $A^{-1}$, where

$$A_{rs} = \sum_{n=1}^{N} \sum_{j=1}^{J_n} (X_{jrn} - \bar{X}_{rn})(X_{jsn} - \bar{X}_{sn})\,P_{jn} \qquad (r, s = 1, \ldots, M) \tag{5}$$

and

$$\bar{X}_{rn} = \sum_{j=1}^{J_n} X_{jrn}\,P_{jn} \tag{6}$$

In addition, the quadratic form

$$Q(\mathbf{a}, \boldsymbol{\alpha}) = \sum_{i=1}^{M} \sum_{j=1}^{M} (a_i - \alpha_i)\,A_{ij}\,(a_j - \alpha_j) \tag{7}$$

tends asymptotically to the chi-square distribution with $M$ degrees of freedom.

Let one of the $J_n$ alternatives available to individual $n$ be considered a numeraire, and denote this alternative by $t$. Then the random variables $\{\hat{V}_{in} - \hat{V}_{tn};\ i = 1, \ldots, J_n;\ i \neq t\}$ are linear combinations of the asymptotically normally distributed random variables $\{a_m\}$ and are themselves asymptotically jointly normally distributed with mean values $\{V_{in} - V_{tn};\ i = 1, \ldots, J_n;\ i \neq t\}$ and covariance matrix $C_n^{-1}$, where

$$(C_n^{-1})_{ij} = \sum_{r=1}^{M} \sum_{s=1}^{M} (A^{-1})_{rs}\,(X_{irn} - X_{trn})(X_{jsn} - X_{tsn}) \qquad (i, j = 1, \ldots, J_n;\ i, j \neq t) \tag{8}$$

In addition, the quadratic form

$$R(\hat{\mathbf{V}}_n, \mathbf{V}_n) = \sum_{\substack{i=1 \\ i \neq t}}^{J_n} \sum_{\substack{j=1 \\ j \neq t}}^{J_n} (C_n)_{ij}\,\bigl[(\hat{V}_{in} - \hat{V}_{tn}) - (V_{in} - V_{tn})\bigr] \times \bigl[(\hat{V}_{jn} - \hat{V}_{tn}) - (V_{jn} - V_{tn})\bigr] \tag{9}$$

is asymptotically distributed as chi-square with $J_n - 1$ degrees of freedom.

In practical applications of logit models, the probabilities $P_{in}$ and, therefore, the matrix $A$ in Equation 5 are not known because they depend on the unknown coefficients $\{\alpha_m\}$. Therefore, $P_{in}$ is approximated by $\hat{P}_{in}$ in Equation 5. This approximation is used without further comment in the rest of this paper.

In the following discussion the subscript $n$, which denotes the individual, will not be used unless needed to prevent confusion. The choice probabilities will be understood to apply to an individual. The explanatory variables $X_{imn}$ will be assumed to have known, fixed values. All uncertainty in the choice probabilities will be due to their dependence on the unknown coefficients $\{\alpha_m\}$.

Confidence Intervals for Choice Probabilities in Binary Logit Models

If there are only two alternatives in the choice set ($J = 2$), then $C^{-1}$ is a scalar. Therefore, if $Z_{\epsilon/2}$ is the $100(1 - \epsilon/2)$ percentile of the standard normal distribution, a $100(1 - \epsilon)$ percent confidence interval for $V_1 - V_2$ is