Multinomial Logistic Regression Models with SAS Example
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Small-Sample Properties of Tests for Heteroscedasticity in the Conditional Logit Model
HEDG Working Paper 06/04 Small-sample properties of tests for heteroscedasticity in the conditional logit model Arne Risa Hole May 2006 ISSN 1751-1976 york.ac.uk/res/herc/hedgwp Small-sample properties of tests for heteroscedasticity in the conditional logit model Arne Risa Hole National Primary Care Research and Development Centre Centre for Health Economics University of York May 16, 2006 Abstract This paper compares the small-sample properties of several asymp- totically equivalent tests for heteroscedasticity in the conditional logit model. While no test outperforms the others in all of the experiments conducted, the likelihood ratio test and a particular variety of the Wald test are found to have good properties in moderate samples as well as being relatively powerful. Keywords: conditional logit, heteroscedasticity JEL classi…cation: C25 National Primary Care Research and Development Centre, Centre for Health Eco- nomics, Alcuin ’A’Block, University of York, York YO10 5DD, UK. Tel.: +44 1904 321404; fax: +44 1904 321402. E-mail address: [email protected]. NPCRDC receives funding from the Department of Health. The views expressed are not necessarily those of the funders. 1 1 Introduction In most applications of the conditional logit model the error term is assumed to be homoscedastic. Recently, however, there has been a growing interest in testing the homoscedasticity assumption in applied work (Hensher et al., 1999; DeShazo and Fermo, 2002). This is partly due to the well-known result that heteroscedasticity causes the coe¢ cient estimates in discrete choice models to be inconsistent (Yatchew and Griliches, 1985), but also re‡ects a behavioural interest in factors in‡uencing the variance of the latent variables in the model (Louviere, 2001; Louviere et al., 2002). -
MNP: R Package for Fitting the Multinomial Probit Model
JSS Journal of Statistical Software May 2005, Volume 14, Issue 3. http://www.jstatsoft.org/ MNP: R Package for Fitting the Multinomial Probit Model Kosuke Imai David A. van Dyk Princeton University University of California, Irvine Abstract MNP is a publicly available R package that fits the Bayesian multinomial probit model via Markov chain Monte Carlo. The multinomial probit model is often used to analyze the discrete choices made by individuals recorded in survey data. Examples where the multi- nomial probit model may be useful include the analysis of product choice by consumers in market research and the analysis of candidate or party choice by voters in electoral studies. The MNP software can also fit the model with different choice sets for each in- dividual, and complete or partial individual choice orderings of the available alternatives from the choice set. The estimation is based on the efficient marginal data augmentation algorithm that is developed by Imai and van Dyk (2005). Keywords: data augmentation, discrete choice models, Markov chain Monte Carlo, preference data. 1. Introduction This paper illustrates how to use MNP, a publicly available R (R Development Core Team 2005) package, in order to fit the Bayesian multinomial probit model via Markov chain Monte Carlo. The multinomial probit model is often used to analyze the discrete choices made by individuals recorded in survey data. Examples where the multinomial probit model may be useful include the analysis of product choice by consumers in market research and the analysis of candidate or party choice by voters in electoral studies. The MNP software can also fit the model with different choice sets for each individual, and complete or partial individual choice orderings of the available alternatives from the choice set. -
Multinomial Logistic Regression
26-Osborne (Best)-45409.qxd 10/9/2007 5:52 PM Page 390 26 MULTINOMIAL LOGISTIC REGRESSION CAROLYN J. ANDERSON LESLIE RUTKOWSKI hapter 24 presented logistic regression Many of the concepts used in binary logistic models for dichotomous response vari- regression, such as the interpretation of parame- C ables; however, many discrete response ters in terms of odds ratios and modeling prob- variables have three or more categories (e.g., abilities, carry over to multicategory logistic political view, candidate voted for in an elec- regression models; however, two major modifica- tion, preferred mode of transportation, or tions are needed to deal with multiple categories response options on survey items). Multicate- of the response variable. One difference is that gory response variables are found in a wide with three or more levels of the response variable, range of experiments and studies in a variety of there are multiple ways to dichotomize the different fields. A detailed example presented in response variable. If J equals the number of cate- this chapter uses data from 600 students from gories of the response variable, then J(J – 1)/2 dif- the High School and Beyond study (Tatsuoka & ferent ways exist to dichotomize the categories. Lohnes, 1988) to look at the differences among In the High School and Beyond study, the three high school students who attended academic, program types can be dichotomized into pairs general, or vocational programs. The students’ of programs (i.e., academic and general, voca- socioeconomic status (ordinal), achievement tional and general, and academic and vocational). test scores (numerical), and type of school How the response variable is dichotomized (nominal) are all examined as possible explana- depends, in part, on the nature of the variable. -
Logit and Ordered Logit Regression (Ver
Getting Started in Logit and Ordered Logit Regression (ver. 3.1 beta) Oscar Torres-Reyna Data Consultant [email protected] http://dss.princeton.edu/training/ PU/DSS/OTR Logit model • Use logit models whenever your dependent variable is binary (also called dummy) which takes values 0 or 1. • Logit regression is a nonlinear regression model that forces the output (predicted values) to be either 0 or 1. • Logit models estimate the probability of your dependent variable to be 1 (Y=1). This is the probability that some event happens. PU/DSS/OTR Logit odelm From Stock & Watson, key concept 9.3. The logit model is: Pr(YXXXFXX 1 | 1= , 2 ,...=k β ) +0 β ( 1 +2 β 1 +βKKX 2 + ... ) 1 Pr(YXXX 1= | 1 , 2k = ,... ) 1−+(eβ0 + βXX 1 1 + β 2 2 + ...βKKX + ) 1 Pr(YXXX 1= | 1 , 2= ,... ) k ⎛ 1 ⎞ 1+ ⎜ ⎟ (⎝ eβ+0 βXX 1 1 + β 2 2 + ...βKK +X ⎠ ) Logit nd probita models are basically the same, the difference is in the distribution: • Logit – Cumulative standard logistic distribution (F) • Probit – Cumulative standard normal distribution (Φ) Both models provide similar results. PU/DSS/OTR It tests whether the combined effect, of all the variables in the model, is different from zero. If, for example, < 0.05 then the model have some relevant explanatory power, which does not mean it is well specified or at all correct. Logit: predicted probabilities After running the model: logit y_bin x1 x2 x3 x4 x5 x6 x7 Type predict y_bin_hat /*These are the predicted probabilities of Y=1 */ Here are the estimations for the first five cases, type: 1 x2 x3 x4 x5 x6 x7 y_bin_hatbrowse y_bin x Predicted probabilities To estimate the probability of Y=1 for the first row, replace the values of X into the logit regression equation. -
Chapter 3: Discrete Choice
Chapter 3: Discrete Choice Joan Llull Microeconometrics IDEA PhD Program I. Binary Outcome Models A. Introduction In this chapter we analyze several models to deal with discrete outcome vari- ables. These models that predict in which of m mutually exclusive categories the outcome of interest falls. In this section m = 2, this is, we consider binary or dichotomic variables (e.g. whether to participate or not in the labor market, whether to have a child or not, whether to buy a good or not, whether to take our kid to a private or to a public school,...). In the next section we generalize the results to multiple outcomes. It is convenient to assume that the outcome y takes the value of 1 for cat- egory A, and 0 for category B. This is very convenient because, as a result, −1 PN N i=1 yi = Pr[c A is selected]. As a consequence of this property, the coeffi- cients of a linear regression model can be interpreted as marginal effects of the regressors on the probability of choosing alternative A. And, in the non-linear models, it allows us to write the likelihood function in a very compact way. B. The Linear Probability Model A simple approach to estimate the effect of x on the probability of choosing alternative A is the linear regression model. The OLS regression of y on x provides consistent estimates of the sample-average marginal effects of regressors x on the probability of choosing alternative A. As a result, the linear model is very useful for exploratory purposes. -
Lecture 9: Logit/Probit Prof
Lecture 9: Logit/Probit Prof. Sharyn O’Halloran Sustainable Development U9611 Econometrics II Review of Linear Estimation So far, we know how to handle linear estimation models of the type: Y = β0 + β1*X1 + β2*X2 + … + ε ≡ Xβ+ ε Sometimes we had to transform or add variables to get the equation to be linear: Taking logs of Y and/or the X’s Adding squared terms Adding interactions Then we can run our estimation, do model checking, visualize results, etc. Nonlinear Estimation In all these models Y, the dependent variable, was continuous. Independent variables could be dichotomous (dummy variables), but not the dependent var. This week we’ll start our exploration of non- linear estimation with dichotomous Y vars. These arise in many social science problems Legislator Votes: Aye/Nay Regime Type: Autocratic/Democratic Involved in an Armed Conflict: Yes/No Link Functions Before plunging in, let’s introduce the concept of a link function This is a function linking the actual Y to the estimated Y in an econometric model We have one example of this already: logs Start with Y = Xβ+ ε Then change to log(Y) ≡ Y′ = Xβ+ ε Run this like a regular OLS equation Then you have to “back out” the results Link Functions Before plunging in, let’s introduce the concept of a link function This is a function linking the actual Y to the estimated Y in an econometric model We have one example of this already: logs Start with Y = Xβ+ ε Different β’s here Then change to log(Y) ≡ Y′ = Xβ + ε Run this like a regular OLS equation Then you have to “back out” the results Link Functions If the coefficient on some particular X is β, then a 1 unit ∆X Æ β⋅∆(Y′) = β⋅∆[log(Y))] = eβ ⋅∆(Y) Since for small values of β, eβ ≈ 1+β , this is almost the same as saying a β% increase in Y (This is why you should use natural log transformations rather than base-10 logs) In general, a link function is some F(⋅) s.t. -
Lecture 6 Multiple Choice Models Part II – MN Probit, Ordered Choice
RS – Lecture 17 Lecture 6 Multiple Choice Models Part II – MN Probit, Ordered Choice 1 DCM: Different Models • Popular Models: 1. Probit Model 2. Binary Logit Model 3. Multinomial Logit Model 4. Nested Logit model 5. Ordered Logit Model • Relevant literature: - Train (2003): Discrete Choice Methods with Simulation - Franses and Paap (2001): Quantitative Models in Market Research - Hensher, Rose and Greene (2005): Applied Choice Analysis 1 RS – Lecture 17 Model – IIA: Alternative Models • In the MNL model we assumed independent nj with extreme value distributions. This essentially created the IIA property. • This is the main weakness of the MNL model. • The solution to the IIA problem is to relax the independence between the unobserved components of the latent utility, εi. • Solutions to IIA – Nested Logit Model, allowing correlation between some choices. – Models allowing correlation among the εi’s, such as MP Models. – Mixed or random coefficients models, where the marginal utilities associated with choice characteristics vary between individuals. Multinomial Probit Model • Changing the distribution of the error term in the RUM equation leads to alternative models. • A popular alternative: The εij’s follow an independent standard normal distributions for all i,j. • We retain independence across subjects but we allow dependence across alternatives, assuming that the vector εi = (εi1,εi2, , εiJ) follows a multivariate normal distribution, but with arbitrary covariance matrix Ω. 2 RS – Lecture 17 Multinomial Probit Model • The vector εi = (εi1,εi2, , εiJ) follows a multivariate normal distribution, but with arbitrary covariance matrix Ω. • The model is called the Multinomial probit model. It produces results similar results to the MNL model after standardization. -
Generalized Linear Models
CHAPTER 6 Generalized linear models 6.1 Introduction Generalized linear modeling is a framework for statistical analysis that includes linear and logistic regression as special cases. Linear regression directly predicts continuous data y from a linear predictor Xβ = β0 + X1β1 + + Xkβk.Logistic regression predicts Pr(y =1)forbinarydatafromalinearpredictorwithaninverse-··· logit transformation. A generalized linear model involves: 1. A data vector y =(y1,...,yn) 2. Predictors X and coefficients β,formingalinearpredictorXβ 1 3. A link function g,yieldingavectoroftransformeddataˆy = g− (Xβ)thatare used to model the data 4. A data distribution, p(y yˆ) | 5. Possibly other parameters, such as variances, overdispersions, and cutpoints, involved in the predictors, link function, and data distribution. The options in a generalized linear model are the transformation g and the data distribution p. In linear regression,thetransformationistheidentity(thatis,g(u) u)and • the data distribution is normal, with standard deviation σ estimated from≡ data. 1 1 In logistic regression,thetransformationistheinverse-logit,g− (u)=logit− (u) • (see Figure 5.2a on page 80) and the data distribution is defined by the proba- bility for binary data: Pr(y =1)=y ˆ. This chapter discusses several other classes of generalized linear model, which we list here for convenience: The Poisson model (Section 6.2) is used for count data; that is, where each • data point yi can equal 0, 1, 2, ....Theusualtransformationg used here is the logarithmic, so that g(u)=exp(u)transformsacontinuouslinearpredictorXiβ to a positivey ˆi.ThedatadistributionisPoisson. It is usually a good idea to add a parameter to this model to capture overdis- persion,thatis,variationinthedatabeyondwhatwouldbepredictedfromthe Poisson distribution alone. -
A Multinomial Probit Model with Latent Factors: Identification and Interpretation Without a Measurement System
DISCUSSION PAPER SERIES IZA DP No. 11042 A Multinomial Probit Model with Latent Factors: Identification and Interpretation without a Measurement System Rémi Piatek Miriam Gensowski SEPTEMBER 2017 DISCUSSION PAPER SERIES IZA DP No. 11042 A Multinomial Probit Model with Latent Factors: Identification and Interpretation without a Measurement System Rémi Piatek University of Copenhagen Miriam Gensowski University of Copenhagen and IZA SEPTEMBER 2017 Any opinions expressed in this paper are those of the author(s) and not those of IZA. Research published in this series may include views on policy, but IZA takes no institutional policy positions. The IZA research network is committed to the IZA Guiding Principles of Research Integrity. The IZA Institute of Labor Economics is an independent economic research institute that conducts research in labor economics and offers evidence-based policy advice on labor market issues. Supported by the Deutsche Post Foundation, IZA runs the world’s largest network of economists, whose research aims to provide answers to the global labor market challenges of our time. Our key objective is to build bridges between academic research, policymakers and society. IZA Discussion Papers often represent preliminary work and are circulated to encourage discussion. Citation of such a paper should account for its provisional character. A revised version may be available directly from the author. IZA – Institute of Labor Economics Schaumburg-Lippe-Straße 5–9 Phone: +49-228-3894-0 53113 Bonn, Germany Email: [email protected] www.iza.org IZA DP No. 11042 SEPTEMBER 2017 ABSTRACT A Multinomial Probit Model with Latent Factors: Identification and Interpretation without a Measurement System* We develop a parametrization of the multinomial probit model that yields greater insight into the underlying decision-making process, by decomposing the error terms of the utilities into latent factors and noise. -
Nlogit — Nested Logit Regression
Title stata.com nlogit — Nested logit regression Description Quick start Menu Syntax Options Remarks and examples Stored results Methods and formulas References Also see Description nlogit performs full information maximum-likelihood estimation for nested logit models. These models relax the assumption of independently distributed errors and the independence of irrelevant alternatives inherent in conditional and multinomial logit models by clustering similar alternatives into nests. By default, nlogit uses a parameterization that is consistent with a random utility model (RUM). Before version 10 of Stata, a nonnormalized version of the nested logit model was fit, which you can request by specifying the nonnormalized option. You must use nlogitgen to generate a new categorical variable to specify the branches of the nested logit tree before calling nlogit. Quick start Create variables identifying alternatives at higher levels For a three-level nesting structure, create l2alt identifying four alternatives at level two based on the variable balt, which identifies eight bottom-level alternatives nlogitgen l2alt = balt(l2alt1: balt1 | balt2, l2alt2: balt3 | balt4, /// l2alt3: balt5 | balt6, l2alt4: balt7 | balt8) Same as above, defining l2alt from values of balt rather than from value labels nlogitgen l2alt = balt(l2alt1: 1|2, l2alt2: 3|4, l2alt3: 5|6, /// l2alt4: 7|8) Create talt identifying top-level alternatives based on the four alternatives in l2alt nlogitgen talt = l2alt(talt1: l2alt1 | l2alt2, talt2: l2alt3 | l2alt4) Examine tree structure -
A Generalized Gaussian Process Model for Computer Experiments with Binary Time Series
A Generalized Gaussian Process Model for Computer Experiments with Binary Time Series Chih-Li Sunga 1, Ying Hungb 1, William Rittasec, Cheng Zhuc, C. F. J. Wua 2 aSchool of Industrial and Systems Engineering, Georgia Institute of Technology bDepartment of Statistics, Rutgers, the State University of New Jersey cDepartment of Biomedical Engineering, Georgia Institute of Technology Abstract Non-Gaussian observations such as binary responses are common in some computer experiments. Motivated by the analysis of a class of cell adhesion experiments, we introduce a generalized Gaussian process model for binary responses, which shares some common features with standard GP models. In addition, the proposed model incorporates a flexible mean function that can capture different types of time series structures. Asymptotic properties of the estimators are derived, and an optimal predictor as well as its predictive distribution are constructed. Their performance is examined via two simulation studies. The methodology is applied to study computer simulations for cell adhesion experiments. The fitted model reveals important biological information in repeated cell bindings, which is not directly observable in lab experiments. Keywords: Computer experiment, Gaussian process model, Single molecule experiment, Uncertainty quantification arXiv:1705.02511v4 [stat.ME] 24 Sep 2018 1Joint first authors. 2Corresponding author. 1 1 Introduction Cell adhesion plays an important role in many physiological and pathological processes. This research is motivated by the analysis of a class of cell adhesion experiments called micropipette adhesion frequency assays, which is a method for measuring the kinetic rates between molecules in their native membrane environment. In a micropipette adhesion frequency assay, a red blood coated in a specific ligand is brought into contact with cell containing the native receptor for a predetermined duration, then retracted. -
Week 12: Linear Probability Models, Logistic and Probit
Week 12: Linear Probability Models, Logistic and Probit Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2019 These slides are part of a forthcoming book to be published by Cambridge University Press. For more information, go to perraillon.com/PLH. c This material is copyrighted. Please see the entire copyright notice on the book's website. Updated notes are here: https://clas.ucdenver.edu/marcelo-perraillon/ teaching/health-services-research-methods-i-hsmp-7607 1 Outline Modeling 1/0 outcomes The \wrong" but super useful model: Linear Probability Model Deriving logistic regression Probit regression as an alternative 2 Binary outcomes Binary outcomes are everywhere: whether a person died or not, broke a hip, has hypertension or diabetes, etc We typically want to understand what is the probability of the binary outcome given explanatory variables It's exactly the same type of models we have seen during the semester, the difference is that we have been modeling the conditional expectation given covariates: E[Y jX ] = β0 + β1X1 + ··· + βpXp Now, we want to model the probability given covariates: P(Y = 1jX ) = f (β0 + β1X1 + ··· + βpXp) Note the function f() in there 3 Linear Probability Models We could actually use our vanilla linear model to do so If Y is an indicator or dummy variable, then E[Y jX ] is the proportion of 1s given X , which we interpret as the probability of Y given X The parameters are changes/effects/differences in the probability of Y by a unit change in X or for a small change in X If an indicator variable, then change from 0 to 1 For example, if we model diedi = β0 + β1agei + i , we could interpret β1 as the change in the probability of death for an additional year of age 4 Linear Probability Models The problem is that we know that this model is not entirely correct.