Lecture 6 Multiple Choice Models Part II – MN Probit, Ordered Choice
DCM: Different Models

• Popular models:
1. Probit Model
2. Binary Logit Model
3. Multinomial Logit Model
4. Nested Logit Model
5. Ordered Logit Model

• Relevant literature:
- Train (2003): Discrete Choice Methods with Simulation
- Franses and Paap (2001): Quantitative Models in Market Research
- Hensher, Rose and Greene (2005): Applied Choice Analysis

MNL Model – IIA: Alternative Models

• In the MNL model we assumed independent $\varepsilon_{nj}$ with extreme value distributions. This independence assumption is what created the IIA property.

• This is the main weakness of the MNL model.

• The solution to the IIA problem is to relax the independence between the unobserved components of the latent utility, $\varepsilon_i$.

• Solutions to IIA:
– Nested Logit Model, allowing correlation between some choices.
– Models allowing correlation among the $\varepsilon_i$'s, such as MP models.
– Mixed or random coefficients models, where the marginal utilities associated with choice characteristics vary between individuals.

Multinomial Probit Model

• Changing the distribution of the error term in the RUM equation leads to alternative models.

• A popular alternative: the $\varepsilon_{ij}$'s follow independent standard normal distributions for all $i, j$.

• We retain independence across subjects, but we can allow dependence across alternatives by assuming that the vector $\varepsilon_i = (\varepsilon_{i1}, \varepsilon_{i2}, \ldots, \varepsilon_{iJ})$ follows a multivariate normal distribution with arbitrary covariance matrix $\Omega$.

• This model is called the Multinomial Probit (MP) model. It produces results similar to the MNL model after standardization.

• Some restrictions (a normalization) on $\Omega$ are needed: as usual with latent variable formulations, the variance of the error term cannot be separated from the regression coefficients. Setting the variances to one means that we work with a correlation matrix rather than a covariance matrix.

MP Model – Pros & Cons

• Main advantages:
- Using ML, joint estimation of all parameters is possible.
- It allows correlation between the utilities that an individual assigns to the various alternatives (it relaxes IIA).
- It does not rely on grouping choices: there are no restrictions on which choices are close substitutes.
- It can also allow for heterogeneity in the (marginal) distributions of the $\varepsilon_i$.

• Main difficulty: estimation.
- ML estimation involves evaluating probabilities given by multidimensional normal integrals, a limitation that restricts practical applications to a few alternatives ($J = 3, 4$). Quadrature methods can be used to approximate the integrals, but for large $J$ they are often imprecise.

MP Model – Estimation

• Probit problem:

$P_{nj} = \text{Prob}[Y_{nj} = 1 \mid X] = \int \cdots \int I[V_{nj} - V_{ni} > \xi_{nji}, \; \forall i \neq j] \, f(\varepsilon_n) \, d\varepsilon_n$

• This J-dimensional integral involves the error differences $\xi_{jk} = \varepsilon_k - \varepsilon_j$, which are normally distributed with covariance matrix $\tilde{\Omega}$ (derived from $\Omega$). We can rewrite the probability as

$P[y_j = 1 \mid X] = P(\xi_j < V_j)$

where $V_j$ is the vector with $k$th element $V_{jk} = x_j'\beta - x_k'\beta$.

• Let $\theta = \{\beta, \Omega\}$. To get the MLE, we need to evaluate this integral for any $\beta$ and $\Omega$. The MLE of $\theta$ maximizes

$L = \sum_n \sum_j y_{nj} \log P(\xi_j < V_j)$  ⟸ we need to integrate

• We need to integrate to get $\log P(\xi_j < V_j)$:
– If $J = 3$, we need to evaluate a bivariate normal: no problem.
– If $J > 3$, we need to evaluate an integral of dimension 3 or higher. A usual approach is to use Gaussian quadrature (recall Math Review, Lecture 12). Most current software programs use the Butler and Moffitt (1982) method, based on Hermite quadrature.

• Practical considerations: if $J > 4$, numerical procedures get complicated and, often, imprecise. For these cases, we rely on simulation-based estimation: simulated maximum likelihood (SML).
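To make the $J = 3$ case concrete, here is a minimal sketch (assuming Python with numpy and scipy; the function mnp_prob and the example values of V and Omega are hypothetical illustrations, not part of the lecture) that evaluates $P(\xi_j < V_j)$ as a bivariate normal CDF after differencing the errors:

```python
import numpy as np
from scipy.stats import multivariate_normal

def mnp_prob(j, V, Omega):
    """P(alternative j is chosen) = P(xi_j < Vbar_j), where
    xi_jk = eps_k - eps_j and Vbar_j has kth element V_j - V_k."""
    J = len(V)
    others = [k for k in range(J) if k != j]
    M = np.zeros((J - 1, J))                 # differencing matrix
    for r, k in enumerate(others):
        M[r, k], M[r, j] = 1.0, -1.0         # row r computes eps_k - eps_j
    Omega_tilde = M @ Omega @ M.T            # covariance of the differences
    upper = np.array([V[j] - V[k] for k in others])
    # (J-1)-dimensional normal CDF; for J = 3 this is a bivariate normal
    return multivariate_normal(mean=np.zeros(J - 1), cov=Omega_tilde).cdf(upper)

# Hypothetical systematic utilities and error covariance, J = 3
V = np.array([0.5, 0.2, 0.0])
Omega = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
probs = [mnp_prob(j, V, Omega) for j in range(3)]
print(probs, sum(probs))                     # three probabilities summing to 1
```

For larger $J$, this CDF evaluation is exactly the bottleneck described above, which is where quadrature and, eventually, simulation-based methods (SML) come in.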
Review: Gaussian Quadratures

• Newton–Cotes formulae:
– Nodes: use evenly-spaced functional values.
– Weights: use Lagrange interpolation. Best, given the nodes.
– It can explode for large n (Runge's phenomenon).

• Gaussian quadratures:
– Select functional values at non-uniformly distributed points to achieve higher accuracy. The points are not predetermined, but unknowns to be determined.
– Nodes and weights are both "best": they give an exact answer if f is a (2n−1)th-order polynomial. Legendre polynomials are used.
– A change of variables makes the interval of integration [−1, 1].

Review: Gaussian Quadratures

• The Gauss–Legendre quadrature formula is stated as

$\int_{-1}^{1} f(x)\,dx = \sum_{i=1}^{n} c_i f(x_i) + \varepsilon$

where the $c_i$'s are called the weights and the $x_i$'s are called the quadrature nodes. The approximation error term, $\varepsilon$, is called the truncation error for integration. For Gauss–Legendre quadrature, the nodes are chosen to be zeros of certain Legendre (orthogonal) polynomials.

Change of Interval for Gaussian Quadrature

• Coordinate transformation from [a, b] to [−1, 1]: this can be done by an affine transformation on t and a change of variables:

$t = \frac{b-a}{2}x + \frac{b+a}{2}, \qquad dt = \frac{b-a}{2}\,dx$

so that $x = -1 \Rightarrow t = a$ and $x = 1 \Rightarrow t = b$. Then

$\int_a^b f(t)\,dt = \int_{-1}^{1} f\!\left(\frac{b-a}{2}x + \frac{b+a}{2}\right)\frac{b-a}{2}\,dx \approx \frac{b-a}{2}\sum_{i=1}^{n} c_i f(x_i)$

Review: Gaussian Quadrature on [−1, 1]

• General formulation:

$\int_{-1}^{1} f(x)\,dx \approx \sum_{i=1}^{n} c_i f(x_i) = c_1 f(x_1) + c_2 f(x_2) + \cdots + c_n f(x_n)$

• Case n = 2:

$\int_{-1}^{1} f(x)\,dx \approx c_1 f(x_1) + c_2 f(x_2), \qquad -1 \le x_1 < x_2 \le 1$

• For n = 2, we have four unknowns $(c_1, c_2, x_1, x_2)$. These are found by requiring that the formula give exact results for integrating a general 3rd-order polynomial, i.e., by choosing $(c_1, c_2, x_1, x_2)$ such that the rule yields the exact integral for $f(x) = x^0, x^1, x^2, x^3$. This gives four equations for four unknowns:

$f = 1:\quad \int_{-1}^{1} 1\,dx = 2 = c_1 + c_2$
$f = x:\quad \int_{-1}^{1} x\,dx = 0 = c_1 x_1 + c_2 x_2$
$f = x^2:\quad \int_{-1}^{1} x^2\,dx = \tfrac{2}{3} = c_1 x_1^2 + c_2 x_2^2$
$f = x^3:\quad \int_{-1}^{1} x^3\,dx = 0 = c_1 x_1^3 + c_2 x_2^3$

with solution $c_1 = c_2 = 1$, $x_1 = -1/\sqrt{3}$, $x_2 = 1/\sqrt{3}$. Hence

$I = \int_{-1}^{1} f(x)\,dx \approx f\!\left(-\tfrac{1}{\sqrt{3}}\right) + f\!\left(\tfrac{1}{\sqrt{3}}\right)$
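The derived n = 2 nodes and weights can be checked numerically (a minimal sketch assuming Python with numpy; numpy.polynomial.legendre.leggauss is the standard routine returning Gauss–Legendre nodes and weights, and the cubic below is an arbitrary test case):

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

x, c = leggauss(2)                 # nodes and weights for n = 2
print(x)                           # [-0.57735027  0.57735027] = -1/sqrt(3), +1/sqrt(3)
print(c)                           # [1. 1.]

# The 2-point rule is exact for any cubic on [-1, 1]
f = lambda t: 7 * t**3 - 2 * t**2 + 5 * t - 1   # arbitrary cubic test function
print(np.sum(c * f(x)))            # -3.3333...
print(-2 * (2 / 3) - 2)            # exact value: -4/3 - 2 = -10/3
```

Calling leggauss(3) similarly reproduces the n = 3 nodes and weights derived next.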
Review: Gaussian Quadrature on [−1, 1]

• Case n = 3:

$\int_{-1}^{1} f(x)\,dx \approx c_1 f(x_1) + c_2 f(x_2) + c_3 f(x_3), \qquad -1 \le x_1 < x_2 < x_3 \le 1$

• Now choose $(c_1, c_2, c_3, x_1, x_2, x_3)$ such that the method yields the exact integral for $f(x) = x^0, x^1, x^2, x^3, x^4, x^5$. (Again, $(c_1, c_2, c_3, x_1, x_2, x_3)$ are calculated by assuming the formula gives exact expressions for integrating a fifth-order polynomial.)

$f = 1:\quad \int_{-1}^{1} 1\,dx = 2 = c_1 + c_2 + c_3$
$f = x:\quad \int_{-1}^{1} x\,dx = 0 = c_1 x_1 + c_2 x_2 + c_3 x_3$
$f = x^2:\quad \int_{-1}^{1} x^2\,dx = \tfrac{2}{3} = c_1 x_1^2 + c_2 x_2^2 + c_3 x_3^2$
$f = x^3:\quad \int_{-1}^{1} x^3\,dx = 0 = c_1 x_1^3 + c_2 x_2^3 + c_3 x_3^3$
$f = x^4:\quad \int_{-1}^{1} x^4\,dx = \tfrac{2}{5} = c_1 x_1^4 + c_2 x_2^4 + c_3 x_3^4$
$f = x^5:\quad \int_{-1}^{1} x^5\,dx = 0 = c_1 x_1^5 + c_2 x_2^5 + c_3 x_3^5$

with solution $c_1 = c_3 = 5/9$, $c_2 = 8/9$, $x_1 = -\sqrt{3/5}$, $x_2 = 0$, $x_3 = \sqrt{3/5}$.

• Approximation formula for n = 3:

$I = \int_{-1}^{1} f(x)\,dx \approx \frac{5}{9} f\!\left(-\sqrt{\tfrac{3}{5}}\right) + \frac{8}{9} f(0) + \frac{5}{9} f\!\left(\sqrt{\tfrac{3}{5}}\right)$

Review: Gaussian Quadrature – Example 1

• Evaluate

$I = \int_0^4 t e^{2t}\,dt = 5216.926477$

– Coordinate transformation:

$t = \frac{b-a}{2}x + \frac{b+a}{2} = 2x + 2, \qquad dt = 2\,dx$

$I = \int_0^4 t e^{2t}\,dt = \int_{-1}^{1} (4x + 4)\,e^{4x+4}\,dx = \int_{-1}^{1} f(x)\,dx$

– Two-point formula (n = 2):

$I \approx f\!\left(-\tfrac{1}{\sqrt{3}}\right) + f\!\left(\tfrac{1}{\sqrt{3}}\right) = \left(4 - \tfrac{4}{\sqrt{3}}\right) e^{4 - 4/\sqrt{3}} + \left(4 + \tfrac{4}{\sqrt{3}}\right) e^{4 + 4/\sqrt{3}}$
$= 9.167657324 + 3468.376279 = 3477.543936 \quad (\text{error} \approx -33.34\%)$

– Three-point formula (n = 3):

$I \approx \frac{5}{9} f(-\sqrt{0.6}) + \frac{8}{9} f(0) + \frac{5}{9} f(\sqrt{0.6})$
$= \frac{5}{9}\left(4 - 4\sqrt{0.6}\right) e^{4 - 4\sqrt{0.6}} + \frac{8}{9}\,4 e^{4} + \frac{5}{9}\left(4 + 4\sqrt{0.6}\right) e^{4 + 4\sqrt{0.6}}$
$= \frac{5}{9}(2.221191545) + \frac{8}{9}(218.3926001) + \frac{5}{9}(8589.142689) = 4967.106689 \quad (\text{error} \approx -4.79\%)$

– Four-point formula (n = 4):

$I \approx 0.34785\,[f(0.861136) + f(-0.861136)] + 0.652145\,[f(0.339981) + f(-0.339981)]$
$= 5197.54375 \quad (\text{error} \approx -0.37\%)$

Review: Gaussian Quadrature – Example 2

• Evaluate

$I = \frac{1}{\sqrt{2\pi}} \int_0^{1.64} e^{-t^2/2}\,dt = 0.44949742$

– Coordinate transformation:

$t = \frac{b-a}{2}x + \frac{b+a}{2} = 0.82x + 0.82 = 0.82(1 + x), \qquad dt = 0.82\,dx$

$I = \frac{1}{\sqrt{2\pi}} \int_0^{1.64} e^{-t^2/2}\,dt = \frac{0.82}{\sqrt{2\pi}} \int_{-1}^{1} e^{-[0.82(1+x)]^2/2}\,dx = \frac{0.82}{\sqrt{2\pi}} \int_{-1}^{1} f(x)\,dx$

– Two-point formula (n = 2):

$I \approx \frac{0.82}{\sqrt{2\pi}} \left[ f\!\left(-\tfrac{1}{\sqrt{3}}\right) + f\!\left(\tfrac{1}{\sqrt{3}}\right) \right] = \frac{0.82}{\sqrt{2\pi}} \left[ e^{-[0.82(1 - 1/\sqrt{3})]^2/2} + e^{-[0.82(1 + 1/\sqrt{3})]^2/2} \right]$
$= 0.32713267 \times (0.94171147 + 0.43323413) = 0.44978962 \quad (\text{error} \approx +0.065\%)$

– Three-point formula (n = 3):

$I \approx \frac{0.82}{\sqrt{2\pi}} \left[ \frac{5}{9} f(-\sqrt{0.6}) + \frac{8}{9} f(0) + \frac{5}{9} f(\sqrt{0.6}) \right]$
$= \frac{0.82}{\sqrt{2\pi}} \left[ \frac{5}{9} e^{-[0.82(1-\sqrt{0.6})]^2/2} + \frac{8}{9} e^{-(0.82)^2/2} + \frac{5}{9} e^{-[0.82(1+\sqrt{0.6})]^2/2} \right]$
$= 0.32713267 \times (0.54614659 + 0.63509351 + 0.19271450) = 0.44946544 \quad (\text{error} \approx -0.007\%)$

Review: Multidimensional Integrals

• In the review, we concentrated on one-dimensional integrals.
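Before moving on, the one-dimensional examples above are easy to reproduce programmatically (a minimal sketch assuming Python with numpy; the helper gauss_legendre is an illustration, not lecture code):

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def gauss_legendre(f, a, b, n):
    """n-point Gauss-Legendre approximation of the integral of f on [a, b],
    using the affine map t = (b-a)/2 * x + (b+a)/2."""
    x, c = leggauss(n)
    t = 0.5 * (b - a) * x + 0.5 * (b + a)
    return 0.5 * (b - a) * np.sum(c * f(t))

# Example 1: integral of t*exp(2t) on [0, 4] = 5216.926477
f1 = lambda t: t * np.exp(2 * t)
for n in (2, 3, 4):
    I = gauss_legendre(f1, 0.0, 4.0, n)
    print(n, round(I, 6), f"{I / 5216.926477 - 1:+.2%}")  # -33.34%, -4.79%, -0.37%

# Example 2: (1/sqrt(2*pi)) * integral of exp(-t^2/2) on [0, 1.64] = 0.44949742
f2 = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)
for n in (2, 3):
    I = gauss_legendre(f2, 0.0, 1.64, n)
    print(n, round(I, 8), f"{I / 0.44949742 - 1:+.3%}")   # +0.065%, -0.007%
```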