Ordered Choice Models

Modeling Ordered Choices

William H. Greene (1), David A. Hensher (2)
January 2009

(1) Department of Economics, Stern School of Business, New York University, New York, NY 10012, [email protected]
(2) Institute of Transport and Logistics Studies, Faculty of Economics and Business, University of Sydney, NSW 2006 Australia, [email protected]

Brief Contents

List of Tables
List of Figures
Preface
Chapter 1 Introduction
Chapter 2 Modeling Binary Choices
Chapter 3 An Ordered Choice Model for Social Science Applications
Chapter 4 Antecedents and Contemporary Counterparts
Chapter 5 Estimation, Inference and Analysis Using the Ordered Choice Model
Chapter 6 Specification Issues in Ordered Choice Models
Chapter 7 Accommodating Individual Heterogeneity
Chapter 8 Parameter Variation and a Generalized Ordered Choice Model
Chapter 9 Ordered Choice Modeling with Panel and Time Series Data
Chapter 10 Bivariate and Multivariate Ordered Choice Models
Chapter 11 Two Part and Sample Selection Models
Chapter 12 Semiparametric and Nonparametric Estimators and Analyses
References
Index

Contents

List of Tables
List of Figures
Preface

Chapter 1 Introduction: Random Utility Models

Chapter 2 Modeling Binary Choices
2.1 Random Utility Formulation of a Model for Binary Choice
2.2 Probability Models for Binary Choices
  2.2.1 Nonparametric and Semiparametric Specifications
  2.2.2 The Linear Probability Model
  2.2.3 The Probit and Logit Models
2.3 Estimation and Inference
  2.3.1 Maximum Likelihood Estimation
  2.3.2 Maximizing the Log Likelihood Function
  2.3.3 The EM Algorithm
  2.3.4 Bayesian Estimation by Gibbs Sampling and MCMC
  2.3.5 Estimation with Grouped Data and Iteratively Reweighted Least Squares
  2.3.6 The Minimum Chi Squared Estimator
2.4 Covariance Matrix Estimation
    Robust Covariance Matrix Estimation
2.5 Application of the Binary Choice Model to Health Satisfaction
2.6 Partial Effects in a Binary Choice Model
  2.6.1 Partial Effect for a Dummy Variable
  2.6.2 Odds Ratios
  2.6.3 Elasticities
  2.6.4 Inference for Partial Effects
  2.6.5 Standard Errors for Estimated Odds Ratios
  2.6.6 Average Partial Effects
  2.6.7 Standard Errors for Marginal Effects Using the Krinsky and Robb Method
  2.6.8 Fitted Probabilities
2.7 Hypothesis Testing
  2.7.1 Wald Tests
  2.7.2 Likelihood Ratio Tests
  2.7.3 Lagrange Multiplier Tests
  2.7.4 Application of Hypothesis Tests
2.8 Goodness of Fit Measures
  2.8.1 Perfect Prediction
  2.8.2 Dummy Variables with Empty Cells
  2.8.3 Explaining Variation in the Implied Regression
  2.8.4 Fit Measures Based on Predicted Probabilities
  2.8.5 Assessing the Model’s Ability to Predict
  2.8.6 A Specification Test Based on Fit
  2.8.7 ROC Plots for Binary Choice Models
2.9 Heteroscedasticity
2.10 Panel Data
  2.10.1 Pooled Estimation, Clustering and Robust Covariance Matrix Estimation
  2.10.2 Fixed Effects
  2.10.3 Random Effects
    The Pooled Estimator
    The Maximum Likelihood Estimator
    GMM Estimation
    Heckman and Singer’s Semiparametric Approach
  2.10.4 Mundlak’s Correction for the Probit and Logit Models
  2.10.5 Testing for Heterogeneity
  2.10.6 Testing for Fixed or Random Effects
2.11 Parameter Heterogeneity
2.12 Endogeneity of a Right Hand Side Variable
2.13 Bivariate Probit Models
  2.13.1 Tetrachoric Correlation
  2.13.2 Testing for Zero Correlation
  2.13.3 Marginal Effects in a Bivariate Probit Model
  2.13.4 Recursive Bivariate Probit Models
  2.13.5 A Sample Selection Model
2.14 The Multivariate Probit and Panel Probit Models
2.15 Endogenous Sampling and Case Control Studies

Chapter 3 An Ordered Choice Model for Social Science Applications
3.1 A Latent Regression Model for a Continuous Measure
3.2 Ordered Choice as an Outcome of Utility Maximization
3.3 The Observed Discrete Outcome
3.4 Probabilities
3.5 Log Likelihood Function
3.6 Analysis of Data on Ordered Choices

Chapter 4 Antecedents and Contemporary Counterparts
4.1 The Origin of Probit Analysis: Bliss (1934), Finney (1947)
4.2 Social Science Data and Regression Analysis for Binary Outcomes
4.3 Analysis of Binary Choice
4.4 Ordered Outcomes: Aitchison and Silvey (1957), Snell (1964)
4.5 Minimum Chi Squared Estimation of an Ordered Response Model: Gurland et al. (1960)
4.6 Individual Data and Polychotomous Outcomes: Walker and Duncan (1967)
4.7 McKelvey and Zavoina (1975)
4.8 Developments Since McKelvey and Zavoina
4.9 Other Related Models
  4.9.1 Known Thresholds
  4.9.2 Nonparallel Regressions

Chapter 5 Estimation, Inference and Analysis Using the Ordered Choice Model
5.1 Application of the Ordered Choice Model to Self Assessed Health Status
5.2 Distributional Assumptions
5.3 The Estimated Ordered Probit (Logit) Model
5.4 The Estimated Threshold Parameters
5.5 Interpretation of the Model – Partial Effects and Scaled Coefficients
  5.5.1 Nonlinearities in the Variables
  5.5.2 Average Partial Effects
  5.5.3 Interpreting the Threshold Parameters
  5.5.4 The Underlying Regression
5.6 Inference
  5.6.1 Inference about Coefficients
  5.6.2 Testing for Structural Change or Homogeneity of Strata
  5.6.3 Robust Covariance Matrix Estimation
  5.6.4 Inference About Partial Effects
5.7 Prediction – Computing Probabilities
5.8 Measuring Fit
5.9 Estimation Issues
  5.9.1 Grouped Data
  5.9.2 Perfect Prediction
  5.9.3 Different Normalizations
  5.9.4 Censoring of the Dependent Variable
  5.9.5 Maximum Likelihood Estimation of the Ordered Choice Model
  5.9.6 Bayesian (MCMC) Estimation of Ordered Choice Models
  5.9.7 Software For Estimation of Ordered Choice Models

Chapter 6 Specification Issues in Ordered Choice Models
6.1 Functional Form Issues and the Generalized Ordered Choice Model (1)
  6.1.1 Parallel Regressions
  6.1.2 Testing the Parallel Regressions Assumption – The Brant (1990) Test
  6.1.3 Generalized Ordered Logit Model (1)
6.2 Model Implications for Partial Effects
  6.2.1 The Single Crossing Feature of the Ordered Choice Model
  6.2.2 Choice Invariant Ratios of Partial Effects
6.3 Methodological Issues
6.4 Specification Tests for Ordered Choice Models
  6.4.1 Model Specifications – Missing Variables and Heteroscedasticity
  6.4.2 Testing Against the Logistic and Normal Distribution
  6.4.3 Unspecified Alternatives

Chapter 7 Accommodating Individual Heterogeneity
7.1 Threshold Models – The Generalized Ordered Probit Model (2)
7.2 Nonlinear Specifications – A Hierarchical Ordered Probit Model
7.3 Thresholds and Heterogeneity – Anchoring Vignettes
  7.3.1 Using Anchoring Vignettes in the Ordered Probit Model
    Self Assessment Component
    Vignette Component
  7.3.2 Log Likelihood and Model Identification Through the Anchoring Vignettes
  7.3.3 Testing the Assumptions of the Model
  7.3.4 Application
  7.3.5 Multiple Self-Assessment Equations
7.4 Heterogeneous Scaling (Heteroscedasticity) of Random Utility
7.4 Individually Heterogeneous Marginal Utilities
Appendix: Equivalence of the Vignette and HOPIT Models

Chapter 8 Parameter Variation and a Generalized Ordered Choice Model
8.1 Random Parameters Models
  8.1.1 Implied Heteroscedasticity
  8.1.2 Maximum Simulated Likelihood Estimation
  8.1.3 Conditional Mean Estimation in the Random Parameters Model
8.2 Latent Class and Finite Mixture Modeling
  8.2.1 The Latent Class Ordered Choice Model
  8.2.2 Estimation by Maximum Likelihood
  8.2.3 The EM Algorithm
  8.2.4 Estimating the Class Assignments
  8.2.5 A Latent Class Model Extension
  8.2.6 Application
  8.2.7 Endogenous Class Assignment and A Generalized Ordered Choice Model
8.3 Generalized Ordered Choice Model with Random Thresholds (3)

Chapter 9 Ordered Choice Modeling with Panel and Time Series Data
9.1 Ordered Choice Models with Fixed Effects
9.2 Ordered Choice Models with Random Effects
9.3 Testing for Random or Fixed Effects
9.4 Extending Parameter Heterogeneity Models to Ordered Choices
9.5 Dynamic Models

Chapter 10 Bivariate and Multivariate Ordered Choice Models
10.1 Bivariate Ordered Probit Models
10.2 Polychoric Correlation
10.3 Semi-Ordered Bivariate Probit Model
10.4 Applications of the Bivariate Ordered Probit Model
10.5 A Panel Data Version of the Bivariate Ordered Probit Model
10.6 Trivariate and Multivariate Ordered Probit Models

Chapter 11 Two Part and Sample Selection Models
11.1 Inflation Models
11.2 Sample Selection Models
  11.2.1 A Sample Selected Ordered Probit Model
  11.2.2 Models of Sample Selection with an Ordered Probit Selection Rule
  11.2.3 A Sample Selected Bivariate Ordered Probit Model
11.3 An Ordered Probit Model with Endogenous Treatment Effects

Chapter 12 Semiparametric and Nonparametric Estimators and Analyses
12.1 Heteroscedasticity
12.2 A Distribution Free Estimator with Unknown Heteroscedasticity
12.3 A Semi-nonparametric Approach
12.4 A Partially Linear Model
12.5 Semiparametric Analysis
12.6 A Nonparametric Duration Model
  12.6.1 Unobserved Heterogeneity
  12.6.2 Application

References
Index

List of Tables

2.1 Data Used in Binary Choice Application
2.2 Estimated Probit and Logit Models
2.3 Alternative Estimated Standard Errors for the Probit Model
2.4 Partial Effects for Probit and Logit Models at Means of x
2.5 Marginal Effects and Average Partial Effects
2.6 Hypothesis Tests
2.7 Homogeneity Test
2.8 Fit Measures for Probit Model
2.9 Prediction Success for Probit Model
2.10 Success Measures for Predictions by Estimated Probit Model
2.11 Heteroscedastic Probit Model
2.12 Cluster Corrected Covariance Matrix (7293 Groups)
2.13 Fixed Effects Probit Model
2.14 Estimated Fixed Effects Logit Models
2.15 Estimated Random Effects Probit Models
2.16a Semiparametric Random Effects Probit Model
2.16b Estimated Parameters for 4 Class Latent Class Model
2.17 Random Effects Model with Mundlak Correction
2.18 Estimated Random Parameter Models
2.19 Estimated Partial Effects
2.20 Cross Tabulation of Healthy and Working
2.21 Estimated Bivariate Probit Model
2.22 Estimated Sample Selection Model
5.1 Estimated Ordered Choice Models: Probit and Logit
5.2 Estimated Partial Effects for Ordered Choice Models
5.3 Estimated Expanded Ordered Probit Model
5.4 Transformed Latent Regression Coefficients
5.5 Estimated Partial Effects with Asymptotic Standard Errors
5.6 Mean Predicted Probabilities by Kids
5.7 Predicted vs.

Recommended publications
  • Ologit — Ordered Logistic Regression
    ologit — Ordered logistic regression (stata.com)

    Description
    ologit fits ordered logit models of ordinal variable depvar on the independent variables indepvars. The actual values taken on by the dependent variable are irrelevant, except that larger values are assumed to correspond to “higher” outcomes.

    Quick start
    Ordinal logit model of y on x1 and categorical variables a and b
        ologit y x1 i.a i.b
    As above, and include interaction between a and b and report results as odds ratios
        ologit y x1 a##b, or
    With bootstrap standard errors
        ologit y x1 i.a i.b, vce(bootstrap)
    Analysis restricted to cases where catvar = 0 using svyset data with replicate weights
        svy bootstrap, subpop(if catvar==0): ologit y x1 i.a i.b

    Menu
    Statistics > Ordinal outcomes > Ordered logistic regression

    Syntax
        ologit depvar [indepvars] [if] [in] [weight] [, options]

    options                       Description
    Model
      offset(varname)             include varname in model with coefficient constrained to 1
      constraints(constraints)    apply specified linear constraints
    SE/Robust
      vce(vcetype)                vcetype may be oim, robust, cluster clustvar, bootstrap, or jackknife
    Reporting
      level(#)                    set confidence level; default is level(95)
      or                          report odds ratios
      nocnsreport                 do not display constraints
      display_options             control columns and column formats, row spacing, line width, display of omitted variables and base and empty cells, and factor-variable labeling
    Maximization
      maximize_options            control the maximization process; seldom used
      collinear                   keep collinear variables
      coeflegend                  display legend instead of statistics

    indepvars may contain factor variables; see [U] 11.4.3 Factor variables.
  • Logit and Ordered Logit Regression (Ver
    Getting Started in Logit and Ordered Logit Regression (ver. 3.1 beta)
    Oscar Torres-Reyna, Data Consultant, [email protected], http://dss.princeton.edu/training/ PU/DSS/OTR

    Logit model
    • Use logit models whenever your dependent variable is binary (also called dummy), taking the values 0 or 1.
    • Logit regression is a nonlinear regression model that forces the output (predicted values) to lie between 0 and 1.
    • Logit models estimate the probability that your dependent variable equals 1 (Y=1). This is the probability that some event happens.

    Logit model
    From Stock & Watson, key concept 9.3. The logit model is:

    $$\Pr(Y = 1 \mid X_1, X_2, \ldots, X_k) = F(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)$$
    $$\Pr(Y = 1 \mid X_1, X_2, \ldots, X_k) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)}}$$
    $$\Pr(Y = 1 \mid X_1, X_2, \ldots, X_k) = \frac{1}{1 + \left( \dfrac{1}{e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k}} \right)}$$

    Logit and probit models are basically the same; the difference is in the distribution:
    • Logit – Cumulative standard logistic distribution (F)
    • Probit – Cumulative standard normal distribution (Φ)
    Both models provide similar results.

    It tests whether the combined effect of all the variables in the model is different from zero. If the p-value is, for example, < 0.05, then the model has some relevant explanatory power, which does not mean it is well specified or at all correct.

    Logit: predicted probabilities
    After running the model:
        logit y_bin x1 x2 x3 x4 x5 x6 x7
    Type
        predict y_bin_hat /*These are the predicted probabilities of Y=1 */
    Here are the estimations for the first five cases, type:
        browse y_bin x1 x2 x3 x4 x5 x6 x7 y_bin_hat

    Predicted probabilities
    To estimate the probability of Y=1 for the first row, replace the values of X into the logit regression equation.
  • Estimating Heterogeneous Choice Models with Oglm
    Estimating heterogeneous choice models with oglm
    Richard Williams, Department of Sociology, University of Notre Dame, Notre Dame, IN, [email protected]
    Last revised October 17, 2010 – Forthcoming in The Stata Journal

    Abstract. When a binary or ordinal regression model incorrectly assumes that error variances are the same for all cases, the standard errors are wrong and (unlike OLS regression) the parameter estimates are biased. Heterogeneous choice (also known as location-scale or heteroskedastic ordered) models explicitly specify the determinants of heteroskedasticity in an attempt to correct for it. Such models are also useful when the variance itself is of substantive interest. This paper illustrates how the author’s Stata program oglm (Ordinal Generalized Linear Models) can be used to estimate heterogeneous choice and related models. It shows that two other models that have appeared in the literature (Allison’s model for group comparisons and Hauser and Andrew’s logistic response model with proportionality constraints) are special cases of a heterogeneous choice model and alternative parameterizations of it. The paper further argues that heterogeneous choice models may sometimes be an attractive alternative to other ordinal regression models, such as the generalized ordered logit model estimated by gologit2. Finally, the paper offers guidelines on how to interpret, test and modify heterogeneous choice models.

    Keywords. oglm, heterogeneous choice model, location-scale model, gologit2, ordinal regression, heteroskedasticity, generalized ordered logit model

    1 Introduction
    When a binary or ordinal regression model incorrectly assumes that error variances are the same for all cases, the standard errors are wrong and (unlike OLS regression) the parameter estimates are biased (Yatchew & Griliches 1985).
  • Generalized Linear Models
    CHAPTER 6 Generalized linear models

    6.1 Introduction
    Generalized linear modeling is a framework for statistical analysis that includes linear and logistic regression as special cases. Linear regression directly predicts continuous data $y$ from a linear predictor $X\beta = \beta_0 + X_1\beta_1 + \cdots + X_k\beta_k$. Logistic regression predicts $\Pr(y = 1)$ for binary data from a linear predictor with an inverse-logit transformation. A generalized linear model involves:
    1. A data vector $y = (y_1, \ldots, y_n)$
    2. Predictors $X$ and coefficients $\beta$, forming a linear predictor $X\beta$
    3. A link function $g$, yielding a vector of transformed data $\hat{y} = g^{-1}(X\beta)$ that are used to model the data
    4. A data distribution, $p(y \mid \hat{y})$
    5. Possibly other parameters, such as variances, overdispersions, and cutpoints, involved in the predictors, link function, and data distribution.
    The options in a generalized linear model are the transformation $g$ and the data distribution $p$.
    • In linear regression, the transformation is the identity (that is, $g(u) \equiv u$) and the data distribution is normal, with standard deviation $\sigma$ estimated from data.
    • In logistic regression, the transformation is the inverse-logit, $g^{-1}(u) = \mathrm{logit}^{-1}(u)$ (see Figure 5.2a on page 80) and the data distribution is defined by the probability for binary data: $\Pr(y = 1) = \hat{y}$.
    This chapter discusses several other classes of generalized linear model, which we list here for convenience:
    • The Poisson model (Section 6.2) is used for count data; that is, where each data point $y_i$ can equal 0, 1, 2, .... The usual transformation $g$ used here is the logarithmic, so that $g(u) = \exp(u)$ transforms a continuous linear predictor $X_i\beta$ to a positive $\hat{y}_i$. The data distribution is Poisson. It is usually a good idea to add a parameter to this model to capture overdispersion, that is, variation in the data beyond what would be predicted from the Poisson distribution alone.
  • Ordered/Ordinal Logistic Regression with SAS and Stata
    Ordered/Ordinal Logistic Regression with SAS and Stata

    This document will describe the use of Ordered Logistic Regression (OLR), a statistical technique that can sometimes be used with an ordered (from low to high) dependent variable. The dependent variable used in this document will be the fear of crime, with values of:
    1 = not at all fearful
    2 = not very fearful
    3 = somewhat fearful
    4 = very fearful

    The ordered logit model is known as the proportional-odds model because the odds ratio of the event is independent of the category j. The odds ratio is assumed to be constant for all categories. Source: http://www.indiana.edu/~statmath/stat/all/cat/2b1.html

    Syntax and results using both SAS and Stata will be discussed. OLR models cumulative probability. It simultaneously estimates multiple equations. The number of equations it estimates will be the number of categories in the dependent variable minus one. So, for our example, three equations will be estimated. The equations are:

                    Pooled categories   compared to   Pooled categories
    Equation 1:     1                                 2 3 4
    Equation 2:     1 2                               3 4
    Equation 3:     1 2 3                             4

    Each equation models the odds of being in the set of categories on the left versus the set of categories on the right. OLR provides only one set of coefficients for each independent variable. Therefore, there is an assumption of parallel regression. That is, the coefficients for the variables in the equations would not vary significantly if they were estimated separately. The intercepts would be different, but the slopes would be essentially the same.
  • Using a General Ordered Logit Model to Explain the Influence of Hotel
    sustainability, Article
    Using a General Ordered Logit Model to Explain the Influence of Hotel Facilities, General and Sustainability-Related, on Customer Ratings
    Ioana-Nicoleta Abrudan 1,*, Ciprian-Marcel Pop 1 and Paul-Sorin Lazăr 2
    1 Faculty of Economics and Business Administration, Babeș-Bolyai University, 58-60 T. Mihali St, 400591 Cluj-Napoca, Romania; [email protected]
    2 Faculty of Business, Babeș-Bolyai University, 7 Horea St., 400174 Cluj-Napoca, Romania; [email protected]
    * Correspondence: [email protected]; Tel.: +40-752-028-500
    Received: 7 October 2020; Accepted: 3 November 2020; Published: 9 November 2020

    Abstract: The hotel market has become extremely competitive over the past years. Hotels try to differentiate themselves through their services and facilities. To make the best choice when searching for accommodation, guests increasingly use rating systems of booking sites. Using an ordered logit model (OLM) on a sample of 635 hotels from Romania, we identify the hotel facilities that significantly influence customer review scores (as an expression of customer satisfaction) on booking.com, the most widespread rating system. We also identify whether their impact varies across intervals of satisfaction levels. For some explanatory variables, the Brant test rejects the proportional odds assumption. Thus, for the final estimates, we use a generalized ordered logit model (GOLOGIT). The results show that food-related facilities, restaurants, and complimentary breakfasts are very significant for customer ratings. Relevant hotel common facilities are the pool and parking spaces, while for the room it is the flat-screen TV. It is interesting to note the negative influence of pets, which seem to disturb other tourists.
  • Logit and Probit Models for Categorical Response Variables
    Applied Statistics With R
    Logit and Probit Models for Categorical Response Variables
    John Fox, WU Wien, May/June 2006. © 2006 by John Fox

    1. Goals:
    • To show how models similar to linear models can be developed for qualitative/categorical response variables.
    • To introduce logit (and probit) models for dichotomous response variables.
    • To introduce similar statistical models for polytomous response variables, including ordered categories.
    • To describe how logit models can be applied to contingency tables.

    2. Models for Dichotomous Data
    • To understand why special models for qualitative data are required, let us begin by examining a representative problem, attempting to apply linear regression to it:
      – In September of 1988, 15 years after the coup of 1973, the people of Chile voted in a plebiscite to decide the future of the military government. A ‘yes’ vote would represent eight more years of military rule; a ‘no’ vote would return the country to civilian government. The no side won the plebiscite, by a clear if not overwhelming margin.
      – Six months before the plebiscite, FLACSO/Chile conducted a national survey of 2,700 randomly selected Chilean voters.
        ∗ Of these individuals, 868 said that they were planning to vote yes, and 889 said that they were planning to vote no.
        ∗ Of the remainder, 558 said that they were undecided, 187 said that they planned to abstain, and 168 did not answer the question.
    I will look only at those who expressed a preference.
  • Ologit — Ordered Logistic Regression
    ologit — Ordered logistic regression (stata.com)

    Syntax
        ologit depvar [indepvars] [if] [in] [weight] [, options]

    options                       Description
    Model
      offset(varname)             include varname in model with coefficient constrained to 1
      constraints(constraints)    apply specified linear constraints
      collinear                   keep collinear variables
    SE/Robust
      vce(vcetype)                vcetype may be oim, robust, cluster clustvar, bootstrap, or jackknife
    Reporting
      level(#)                    set confidence level; default is level(95)
      or                          report odds ratios
      nocnsreport                 do not display constraints
      display_options             control column formats, row spacing, line width, display of omitted variables and base and empty cells, and factor-variable labeling
    Maximization
      maximize_options            control the maximization process; seldom used
      coeflegend                  display legend instead of statistics

    indepvars may contain factor variables; see [U] 11.4.3 Factor variables.
    depvar and indepvars may contain time-series operators; see [U] 11.4.4 Time-series varlists.
    bootstrap, by, fp, jackknife, mfp, mi estimate, nestreg, rolling, statsby, stepwise, and svy are allowed; see [U] 11.1.10 Prefix commands.
    vce(bootstrap) and vce(jackknife) are not allowed with the mi estimate prefix; see [MI] mi estimate.
    Weights are not allowed with the bootstrap prefix; see [R] bootstrap.
    vce() and weights are not allowed with the svy prefix; see [SVY] svy.
    fweights, iweights, and pweights are allowed; see [U] 11.1.6 weight.
    coeflegend does not appear in the dialog box.
    See [U] 20 Estimation and postestimation commands for more capabilities of estimation commands.

    Menu
    Statistics > Ordinal outcomes > Ordered logistic regression

    Description
    ologit fits ordered logit models of ordinal variable depvar on the independent variables indepvars.
  • Ordinal Regression
    Newsom, Psy 525/625 Categorical Data Analysis, Spring 2021

    Ordinal Regression
    Earlier (“Analysis of Ordinal Contingency Tables”), we considered ordinal variables in contingency tables. Such models do not generally assume designation of an explanatory (independent) and response (dependent) variable, but they are also limited in their inclusion of covariates and of more complex models that involve interactions between continuous and categorical predictors and so on. In discussing regression models thus far, the focus has been on binary response variables. Both logistic and probit regression models, however, can be applied to ordinal response variables with more than two ordered categories, such as response options of "never," "sometimes," and "a lot," which do not necessarily have equal distance between the values. The application of these models is typically to response variables with 3 or 4 rank-ordered categories, and when there are 5 or more categories, moderate sample size, and fairly symmetrically distributed variables, there will be minimal loss of power if ordinary least squares is used instead of ordinal regression (Kromrey & Rendina-Gobioff, 2002; Taylor, West, & Aiken, 2006). For outcomes that can be considered ordinal, it is generally better to use all of the ordinal values rather than collapsing into fewer categories or dichotomizing variables, even with a sparse number of responses in some categories. Collapsing categories has been shown to reduce statistical power (Ananth & Kleinbaum 1997; Manor, Mathews, & Power, 2000) and increase Type I error rates (Murad, Fleischman, Sadetzki, Geyer, & Freedman, 2003).

    Ordered Logit
    Ordered logit models are logistic regressions that model the change among the several ordered values as a function of each unit increase in the predictor.
  • Ordered Logit Model
    ORDERED LOGIT MODEL*
    Leonardo Grilli, Carla Rampichini
    Dipartimento di Statistica, Informatica, Applicazioni “G. Parenti” – Università di Firenze
    [email protected], [email protected]

    The ordered logit model is a regression model for an ordinal response variable. The model is based on the cumulative probabilities of the response variable: in particular, the logit of each cumulative probability is assumed to be a linear function of the covariates with regression coefficients constant across response categories. Questions relating to satisfaction with life assessment and expectations are usually ordinal in nature. For example, the answer to the question on how satisfied a person is with her quality of life can range from 1 to 10, with 1 being very dissatisfied and 10 being very satisfied (e.g. Schaafsma and Osoba, 1994; Anderson et al. 2009). It is tempting to analyse ordinal outcomes with the linear regression model, assuming equal distances between categories. However, this approach has several drawbacks which are well known in the literature (see, for example, McKelvey and Zavoina, 1975; Winship and Mare, 1984; Lu, 1999). When the response variable of interest is ordinal, it is advisable to use a specific model such as the ordered logit model.

    Let $Y_i$ be an ordinal response variable with $C$ categories for the $i$-th subject, along with a vector of covariates $x_i$. A regression model establishes a relationship between the covariates and the set of probabilities of the categories $p_{ci} = \Pr(Y_i = y_c \mid x_i)$, $c = 1, \ldots, C$. Usually, regression models for ordinal responses are not expressed in terms of probabilities of the categories, but they refer to convenient one-to-one transformations, such as the cumulative probabilities $g_{ci} = \Pr(Y_i \le y_c \mid x_i)$, $c = 1, \ldots, C$.
  • Regression Models with Ordinal Variables*
    REGRESSION MODELS WITH ORDINAL VARIABLES*
    CHRISTOPHER WINSHIP, Northwestern University and Economics Research Center/NORC
    ROBERT D. MARE, University of Wisconsin-Madison

    Most discussions of ordinal variables in the sociological literature debate the suitability of linear regression and structural equation methods when some variables are ordinal. Largely ignored in these discussions are methods for ordinal variables that are natural extensions of probit and logit models for dichotomous variables. If ordinal variables are discrete realizations of unmeasured continuous variables, these methods allow one to include ordinal dependent and independent variables into structural equation models in a way that (1) explicitly recognizes their ordinality, (2) avoids arbitrary assumptions about their scale, and (3) allows for analysis of continuous, dichotomous, and ordinal variables within a common statistical framework. These models rely on assumed probability distributions of the continuous variables that underly the observed ordinal variables, but these assumptions are testable. The models can be estimated using a number of commonly used statistical programs. As is illustrated by an empirical example, ordered probit and logit models, like their dichotomous counterparts, take account of the ceiling and floor restrictions on models that include ordinal variables, whereas the linear regression model does not.

    Empirical social research has benefited during the past two decades from the application of structural equation models for statistical analysis and causal interpretation of multivariate relationships (e.g., Goldberger and Duncan, 1973; Bielby and Hauser, 1977).

    ... 1982) that discuss whether, on the one hand, ordinal variables can be safely treated as if they were continuous variables and thus ordinary linear model techniques applied to them, or, on the other hand, ordinal variables require spe-
  • Using Generalized Ordinal Logistic Regression Models to Estimate Educational Data
    Ordinal Regression Analysis: Using Generalized Ordinal Logistic Regression Models to Estimate Educational Data
    Xing Liu, Eastern Connecticut State University, [email protected]
    Hari Koirala, Eastern Connecticut State University
    Eastern Connecticut State University, Willimantic, CT

    Journal of Modern Applied Statistical Methods, May 2012, Vol. 11, No. 1, 242-254 (Volume 11, Issue 1, Article 21). Copyright © 2012 JMASM, Inc.

    Recommended Citation: Liu, Xing and Koirala, Hari (2012) "Ordinal Regression Analysis: Using Generalized Ordinal Logistic Regression Models to Estimate Educational Data," Journal of Modern Applied Statistical Methods: Vol. 11 : Iss. 1 , Article 21. DOI: 10.22237/jmasm/1335846000 Available at: http://digitalcommons.wayne.edu/jmasm/vol11/iss1/21

    The proportional odds (PO) assumption for ordinal regression analysis is often violated because it is strongly affected by sample size and the number of covariate patterns. To address this issue, the partial proportional odds (PPO) model and the generalized ordinal logit model were developed. However, these models are not typically used in research. One likely reason for this is the restriction of current statistical software packages: SPSS cannot perform the generalized ordinal logit model analysis and SAS requires data restructuring.
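
Several of the excerpts above describe the ordered logit model through its cumulative probabilities: the logit of Pr(Y ≤ c | x) is linear in the covariates, with one set of slopes shared by every category and a separate cutpoint for each split. A minimal Python sketch of that calculation follows. It assumes the parameterization Pr(Y ≤ c | x) = F(κ_c − x'β), where F is the standard logistic CDF; other sources use the opposite sign on x'β. The function names, the cutpoints, and the value of the linear predictor are all hypothetical and are not taken from any of the listed documents.

```python
import math


def logistic(z):
    """Standard logistic CDF, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(xb, cutpoints):
    """Category probabilities for one observation (illustrative helper).

    xb        -- linear predictor x'beta, with no intercept
    cutpoints -- increasing thresholds; C - 1 cutpoints give C categories
    Assumes Pr(Y <= c | x) = logistic(cutpoint_c - xb).
    """
    # Cumulative probabilities for categories 1..C-1, then Pr(Y <= C) = 1.
    cumulative = [logistic(k - xb) for k in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for g in cumulative:
        # Each category probability is a difference of adjacent cumulatives.
        probs.append(g - previous)
        previous = g
    return probs


# Hypothetical values for a four-category outcome (three cutpoints).
probs = ordered_logit_probs(xb=0.5, cutpoints=[-1.0, 0.0, 1.5])
print([round(p, 4) for p in probs])
```

Under this assumed parameterization the cumulative odds Pr(Y ≤ c) / Pr(Y > c) equal exp(κ_c − x'β), so a one-unit change in the linear predictor multiplies the odds by the same factor at every cutpoint. That is the proportional-odds (parallel regression) property the excerpts refer to.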