Learn About Multinomial Logit Regression in R with Data from the General Social Survey (2016)

© 2019 SAGE Publications Ltd. All Rights Reserved. This PDF has been generated from SAGE Research Methods Datasets.

SAGE Research Methods Datasets Part 1

Student Guide

Introduction

This dataset example introduces multinomial logit. This technique allows researchers to evaluate whether a categorical variable with three or more unordered categories is a function of one or more independent variables. Examples of unordered categorical variables include gender, race, and birthplace. Race, for example, can take values such as African American, Asian, and White (and maybe others depending on the context), but those values don’t follow a general numerical order, and hence they are unordered categories. The multinomial logit model is most commonly estimated via Maximum Likelihood Estimation (MLE).

This example describes multinomial logit, discusses the assumptions underlying it, and shows how to estimate and interpret multinomial logit models. We illustrate multinomial logit using a subset of data from the 2016 General Social Survey (http://gss.norc.org/). Specifically, we test whether a 5-category measure of employment status is predicted by gender, age, and education. An analysis like this allows researchers to evaluate factors that influence labor force status, which may be useful in policy design.

What Is Multinomial Logit?

Multinomial logit models explain variation in a categorical variable that consists of three or more unordered categories as a function of one or more independent variables.
Categories for the dependent variable need not follow any order (if they do, you can still estimate a multinomial logit model without biasing your results, but you might consider an ordered logit model as a more statistically efficient alternative). Multinomial logit models estimate every possible two-way comparison of categories on the dependent variable, which means the number of parameters to estimate increases rapidly as the number of categories for the dependent variable increases. Multinomial logit models are therefore typically used when the dependent variable has 3 to 5 unordered categories. With more categories than that, researchers often consider combining some categories or imposing some other restrictions. If the dependent variable has only two categories, the multinomial logit model reduces to simple logit.

Multinomial logit is one example from the family of Generalized Linear Models (GLMs). GLMs connect a linear combination of independent variables and estimated parameters, often called the linear predictor, to a dependent variable using a link function. The link function typically involves some sort of non-linear transformation, which in the case of multinomial logit means that the probabilities that a given observation in the dataset falls into each of the categories of the dependent variable are non-linear functions of the independent variables. The parameters of GLMs are typically estimated using MLE.

Because multinomial logit models are estimated via MLE, it is best if the dataset has a sufficiently large number of observations. Just how many is open to debate, but in his book Regression Models for Categorical and Limited Dependent Variables (SAGE, 1997), J. Scott Long suggests trying to meet two criteria: (1) have at least 100 observations in total, and (2) have at least 10 observations for each coefficient estimated in the model.
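The two sample-size criteria above can be checked with a few lines of R. The sketch below is only illustrative: the helper n_coefs is hypothetical (it is not part of the guide), and it assumes that intercepts count as coefficients and that each of the three predictors in the guide's example (gender, age, education) enters the model as a single column.

```r
# Hypothetical helper (not from the guide): number of estimated coefficients
# in a multinomial logit with J outcome categories and K predictor columns.
# (J - 1) intercepts plus (J - 1) * K slopes.
n_coefs <- function(J, K) (J - 1) * (K + 1)

# The guide's example: 5 employment-status categories and 3 predictors,
# assuming one column per predictor.
J <- 5
K <- 3
p <- n_coefs(J, K)
p                          # 16 coefficients

# Long's rule of thumb: at least 100 observations in total and
# at least 10 observations per estimated coefficient.
min_n <- max(100, 10 * p)
min_n                      # 160 observations

# With two categories the count matches a simple logit:
# one intercept plus K slopes.
n_coefs(2, K)              # 4
```

Under these assumptions, adding a sixth category would raise the count to 20 coefficients, which illustrates how quickly the model grows with the number of categories.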
In simple terms, MLE is an iterative process that searches for the coefficient estimates that maximize the fit of the model to the sample of data. By maximizing fit, MLE also minimizes the unexplained variance in the dependent variable. In that sense, MLE accomplishes the same objective as ordinary least squares (OLS) does for standard regression.

When computing statistical tests, it is customary to define the null hypothesis (H0) to be tested. In multinomial logit, the standard null hypothesis is that each coefficient is equal to zero. The actual coefficient estimates will not be exactly equal to zero in any particular sample of data, simply due to random chance in sampling. The t-tests conducted on each individual coefficient are designed to help determine whether the coefficients are different enough from zero to be declared statistically significant. “Different enough” is typically defined as producing a test statistic with a level of statistical significance, or p-value, that is less than .05. This would lead us to reject the null hypothesis (H0) that the coefficient in question equals zero.

Assumptions Behind the Method

Nearly every statistical model or test relies on some underlying assumptions, and they are all affected by the mix of data you happen to have. Different textbooks present the assumptions for a multinomial logit model in different ways. Here are the key factors to consider when estimating a multinomial logit:

• The dependent variable must consist of categories which are assumed to be unordered; the categories could be ordered, but the ordering information will be ignored.
• The model is correctly specified (e.g., we have the right independent variables in the model, properly measured).
• The values of the independent variables are fixed in repeated samples.
• The individual residuals are independent of each other and follow a logistic distribution.
• Because it is generally estimated via MLE, multinomial logit regression requires moderate to large sample sizes.

While not a formal assumption, researchers should consider how many parameters a multinomial logit model estimates. With J categories on the dependent variable, the model estimates J − 1 intercepts, and each additional independent variable adds J − 1 slope coefficients. Similarly, for each additional category on the dependent variable, a multinomial logit model must estimate another full set of slopes for all of the independent variables plus another intercept. Thus, the size and complexity of a multinomial logit model grows quickly as the number of categories on the dependent variable or the number of independent variables increases.

Estimating a Multinomial Logit Model

One way to understand the multinomial logit model is as an extension of the simple logit model. The simple logit model is designed to evaluate whether values of the independent variables in a model help in sorting observations on the dependent variable into one of two categories. The multinomial logit extends this logic to sorting observations into one of three or more categories on the dependent variable. One of the categories of the dependent variable is selected as the baseline, and then parameters are estimated that predict the probability of being in each of the remaining categories compared to the baseline.

Suppose we have a dependent variable Y with categories labeled A, B, and C that we believe is affected by values of an independent variable named X. We might arbitrarily select Category A as the baseline category.
If we then estimate a multinomial logit model, we will estimate an intercept and slope that describe how X is related to the probability of an observation being in Category B versus A, and another intercept and slope that describe how X is related to the probability of an observation being in Category C versus A. In other words, a multinomial logit model where Y takes on three values is similar to simultaneously estimating two simple logit models. We say “similar” because the parameter estimates of a multinomial logit model are constrained (appropriately so) by the requirement that the probabilities of an observation being in Category A, B, or C must sum to 1. Because of this restriction, if we know how Category B compares to A and how Category C compares to A, we know by definition how Category B compares to Category C.

It is important to select an appropriate baseline category in your analysis, as it will allow for more straightforward interpretations of the results. For example, if you are modeling the side effects of a drug as a function of age, gender, race, etc., then it may be helpful to choose “no side effect” as the baseline category. Assuming the other two possible side effects are headache and cough, a positive effect of age on headache versus the baseline (i.e., no side effect) can be interpreted as older people being more likely to have the side effect of headache. Now consider the case where you choose “cough” as the baseline; a positive effect of age on headache versus cough is neither as easy to understand nor as informative as the earlier result with no side effect as the baseline.

Multinomial logit models still express the dependent variable as a function of one or more independent variables.
We can start with the linear predictor that includes a single independent variable as shown in Equation (1):

$$\eta_{ij} = \beta_{0j} + \beta_{1j} X_{1i} \tag{1}$$

The subscript i refers to individual observations and the subscript j refers to one of the categories of the dependent variable. Next, we need a link function for the multinomial logit model that shows how the linear predictor relates to the probability that the dependent variable falls into category j relative to a baseline category.
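To make the role of the baseline category concrete, here is a minimal R sketch. It assumes the standard baseline-category logit form, in which the baseline's linear predictor is fixed at zero and each probability is exp(η_j) divided by the sum of exp(η_k) over all categories; the link function itself is not shown above, so this form is an assumption rather than the guide's own equation. The coefficient values and the helper mnl_probs are made up for illustration, using the three-category A, B, C example with a single X.

```r
# Illustrative (made-up) coefficients for categories B and C versus baseline A.
# Columns: intercept (beta_0j) and slope on X (beta_1j), as in Equation (1).
beta <- rbind(
  A = c(0.0, 0.0),    # baseline: linear predictor fixed at 0 (assumed form)
  B = c(0.5, 0.2),
  C = c(-1.0, 0.6)
)

# Hypothetical helper: predicted probabilities for one value of X.
mnl_probs <- function(x, beta) {
  eta <- beta[, 1] + beta[, 2] * x   # linear predictor for each category
  exp(eta) / sum(exp(eta))           # probabilities sum to 1 by construction
}

p <- mnl_probs(x = 2, beta = beta)
p
sum(p)                               # 1

# The log-odds of B versus A recover the linear predictor for B:
# 0.5 + 0.2 * 2 = 0.9
log(p["B"] / p["A"])

# The B-versus-C comparison is implied by the two baseline comparisons:
# its intercept and slope are the differences of the B and C coefficients.
beta["B", ] - beta["C", ]            # 1.5 and -0.4
log(p["B"] / p["C"])                 # 1.5 - 0.4 * 2 = 0.7
```

The last two lines illustrate why knowing how B and C each compare to A is enough to know how B compares to C: no separate set of parameters is needed for that contrast.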
