<<

REGRESSION MODELS WITH ORDINAL VARIABLES*

CHRISTOPHER WINSHIP ROBERT D. MARE Northwestern University and Economics University of Wisconsin-Madison Research Center/NORC

Most discussions of ordinal variables in the sociological literature debate the suitability of and structural equation methods when some variables are ordinal. Largely ignored in these discussions are methods for ordinal variables that are natural extensions of probit and models for dichotomous variables. If ordinal variables are discrete realizations of unmeasured continuous variables, these methods allow one to include ordinal dependent and independent variables into structural equation models in a way that (I) explicitly recognizes their ordinality, (2) avoids arbitrary assumptions about their scale, and (3) allows for analysis of continuous, dichotomous, and ordinal variables within a common statistical framework. These models rely on assumed probability distributions of the continuous variables that underly the observed ordinal variables, but these assumptions are testable. The models can be estimated using a number of commonly used statistical programs. As is illustrated by an empirical example, and logit models, like their dichotomous counterparts, take account of the ceiling andfloor restrictions on models that include ordinal variables, whereas the linear regression model does not.

Empirical social research has benefited dur- 1982) that discuss whether, on the one hand, ing the past two decades from the application ordinal variables can be safely treated as if they of structural equation models for statistical were continuous variables and thus ordinary analysis and causal interpretation of mul- linear model techniques applied to them, or, on tivariate relationships (e.g., Goldberger and the other hand, ordinal variables require spe- Duncan, 1973; Bielby and Hauser, 1977). cial statistical methods or should be replaced Structural equation methods have mainly been with truly continuous variables in causal mod- applied to problems in which variables are els. Allan (1976), Borgatta (1968), Kim (1975, measured on a continuous scale, a reflection of 1978), Labovitz (1967, 1970), and O'Brien the availability of the theories of multivariate (1979a), among others, claim that multivariate analysis and general linear models for continu- methods for interval-level variables should be ous variables. A recurring methodological used for ordinal variables because the power issue has been how to treat variables measured and flexibility gained from these methods out- on an ordinal scale when multiple regression weigh the small biases that they may entail. and structural equation methods would other- Hawkes (1971), Morris (1970), O'Brien (1982), wise be appropriate tools. Many articles have Reynolds (1973), Somers (1974), and Smith appeared in this journal (e.g., Bollen and Barb, (1974), among others, suggest that the biases in 1981, 1983; Henry, 1982; Johnson and Creech, using continuous-variable methods for ordinal 1983; O'Brien, 1979a, 1983) and elsewhere variables are large and that special techniques (e.g., Blalock, 1974; Kim, 1975, 1978; Mayer for ordinal variables are required. and Robinson, 1978; O'Brien 1979b, 1981, Although the literature on ordinal variables in sociology is vast, its practical implications *Direct all correspondence to: Robert D. Mare, have been few. Most researchers apply regres- Department of Sociology, University of Wisconsin, sion, MIMIC, LISREL, and other multivariate 1180 Observatory Dr., Madison, WI 53706. models for continuous variables to ordinal The authors contributed equally to this article. variables, sometimes claiming support from This research was supported by the National Science studies that find little bias from assuming inter- Foundation. Mare was supported by the University val measurement for ordinal variables. Yet of Wisconsin-Madison Graduate School, the Center these studies as well as the ones that they crit- for Advanced Study in the Behavioral Sciences icize provide no solid guidance because they (through the National Science Foundation), and the are typically atheoretical simulations of limited Wisconsin Center for Education Research. Winship was supported by NORC and a Faculty Research scope. Somp researchers apply recently devel- Grant from Northwestern University. The authors oped techniques for categorical-data analysis are grateful to Ian Domowitz and Henry Farber for that take account of the ordering of the helpful advice, and to Lynn Gale, Ann Kremers, and categories of variables in cross-classifications Ernie Woodward for research assistance. (e.g., Agresti, 1983; Clogg, 1982; Goodman,

512 American Sociological Review, 1984, Vol. 49 (August:512-525) REGRESSION MODELS WITH ORDINAL VARIABLES 513

1980). These methods, however, while elegant ordinal variable. For example, a and well grounded in statistical theory, are dif- may place individuals in one of a number of ficult to use in the cases where regression ranked categories, such as, "strongly agree," analysis and its extensions would otherwise "somewhat agree," "neither agree nor dis- apply: that is, where data are nontabular; in- agree," "somewhat disagree," or " strongly clude continuous, discrete, and ordinal vari- disagree" with a statement. An underlying, ables; and apply to a causal model with several continuous variable denoting individuals' de- endogenous variables. grees of agreement is mapped into categories This article draws attention to alternative that are ordered but are separated by unknown methods for estimating regression models and distances. I their generalizations that include ordinal vari- This view of ordinal variables can also apply ables. These methods are extensions of logit to variables that are often treated as continu- and probit models for dichotomous variables ous but might be better viewed as ordinal. and are based on models that include unmea- Counted variables, such as grades of school sured continuous variables for which only or- completed, number of children ever born, or dinal measures are available. Largely ignored number of voluntary-association memberships, in the sociological literature (though see may be regarded as ordinal realizations of un- McKelvey and Zavoina, 1975), they provide derlying continuous variables. Grades of multivariate models with ordinal variables that: school, for example, should be viewed as an (1) take account of noninterval ordinal mea- ordinal measure of an underlying variable, surement; (2) avoid arbitrary assumptions "educational attainment," when one wishes to about the scale of ordinal variables; and, most acknowledge that each grade is not equally importantly, (3) include ordinal variables in easy to attain (e.g., Mare, 1980) or equally structural equation models with variables at all rewarding (e.g., Featherman and Hauser, 1978; levels of measurement. The ordered probit and Jencks et al., 1979). Similarly, when a continu- logit models can, moreover, be implemented ous variable, such as earnings, is measured in with widely available statistical software. Most categories corresponding to dollar intervals of the literature on these methods focuses on and category midpoints are unknown, the mea- estimating equations with ordinal dependent sured variable is an ordinal representation of variables (Aitchison and Silvey, 1957; an underlying continuous variable.2 Amemiya, 1975; Ashford, 1959; Cox, 1970; The measurement model of ordinal variables Gurland et al., 1960; Maddala, 1983; McCul- can be stated formally as follows. Let Y denote lagh, 1980; McKelvey and Zavoina 1975), an unobserved, continuous variable (-w < Y though some of it is relevant to models with < o) and a0, al, . . a, J-1, aj denote cut-points ordinal independent variables (Heckman, 1978; in the distribution of Y, where a0 = - x and Winship and Mare, 1983). Taken together, aj = ? (see Figure 1). Let Y* be an ordinal these contributions imply that ordinal variables variable such that can be analyzed within structural equation y= j if aj-, Y < aj models with the same flexibility and power that (j= 1,..,J). are available for continuous variables. This article summarizes the probit and logit I A less common type of ordinal variable, not dis- models for ordered variables. It describes mea- cussed further inithis article, may result from a strict surement models for ordinal variables and dis- monotonic transformation of an interval variable. cusses specification and estimation of models That is, observations (e.g., of cities, persons, occu- with ordinal dependent and independent vari- pations, etc.) may be ranked according to some un- ables. Then it discusses some tests for model measured criterion (e.g., population size, wealth, misspecification. Finally, it presents an em- rate of pay, etc.). A regression model with a ranked variable requires that the nonlinear map- example which illustrates the models. dependent pirical ping between the unmeasured continuous An appendix discusses several technical topics variable and the ranks themselves be specified. of interest to those who wish to implement the Given the mapping, the model can be estimated by models. nonlinear (e.g., Gallant, 1975). 2 The ordinal-variable model can be extended to take account of measurement error. That is, an ordi- MEASUREMENT OF ORDINAL nal variable is a transformation of a continuous vari- VARIABLES able, but some observations may be misclassified A common view of ordinal variables, which is (O'Brien, 1981; Johnson and Creech, 1983). Al- though this article does not discuss this complica- adopted here, is that they are nonstrict tion, it is a logical extension of the models presented monotonic transformations of interval vari- here. Muthen (1979), Avery and Hotz (1982), and ables (e.g., O'Brien, 1981). That is, one or Winship and Mare (1983) discuss this extension for more values of an interval-level variable are dichotomous variables; Muthen (1983, 1984) discus- mapped into the same value of a transformed, ses it for ordinal variables. 514 AMERICAN SOCIOLOGICAL REVIEW

f(Y) independent variables on an ordinal dependent variable. The following discussion assumes a single independent variable, although /(Y - J) F(=- - ) equations with several independent variables are an obvious extension. For the iPh observa- tion, let Yi be the unobserved continuous de- pendent variable Y (i = . . N), be an 1 j-1 j J-1 1, ., Xi observed independent variable (which may be Figure 1. RelationshipsAmong Latent Continuous either continuous or dichotomous), Ei be a ran- Variable (Y), Observed OrdinalVariable domly distributed error that is uncorrelated (Y*), and Thresholds (aj) with X, and l3 be a slope parameter to be esti- mated, Further, let YtIbe the observed ordinal Since Y is not observed, its mean and variance variable where, as in the measurement model - are unknown and their values must be as- above, Yi =j if aj-, Yi < aj (j = 1,. J). sumed. For the present, assume that Y has Then a regression model is mean of zero and variance of one. The relationship between Y and Y* can be Yi=_8xi + Ei further understood as follows. Consider the (E(Y) = ,3X; Var (Y) = 1). (3) likelihood of obtaining a particular value of Y and the probability that Y* takes on a specific To specify the model fully, it is necessary to value (see Figure 1). If Y follows a probability select a probability distribution for Y, or distribution (for example, normal) with density equivalently for E. If the probability that Y* function f(Y) and cumulative density function takes on successively higher values rises (or F(Y), then the probability that Y* = j is the falls) slowly at small values of X, more rapidly area under the density curve f(Y) between aj-1 for intermediate values of X, and more slowly and aj. That is, again at large values of X, then either the nor- mal or logistic distribution is appropriate for E. ai The former distribution yields the ordered pro- P(Y* =j) = f f(Y)dy=F(aj)-F(aj_), (1) bit model; the latter the model.3 In contrast, a linear model, in which the unob- where F(aj) = 1 and F(a0) = 0. For a sample of served variable Yi is replaced by the observed individuals for whom Y* is observed one can ordinal variable Y* in the regression model, estimate the cutpoints or "thresholds" aj as assumes that the probability that Y* takes suc- cessively higher values rises (falls) a constant &j= F-I(pj), amount over the entire range of X. When Y* takes on only two values, then (3) where pj is the proportion of observations for reduces to a model for a dichotomous depen- which Y* < j, and F-1 is the inverse of the dent variable and the alternative assumptions cumulative density function of Y. Given esti- of normal or logistic distributions yield binary mates of the aj, it is also possible to estimate probit and logit models respectively. Replacing the mean of Y for observations within each the unobserved Yi with the observed binary interval. If Y follows a standardized normal variable yields a linear probability model. As is distribution, then the mean Y for the observa- well known, the probit or logit specifications tions for which Y* = j is are usually preferable to the linear model be- cause the former take account of the ceiling k(a-j- 1)- (ai) and floor effects on the dependent variable Yaj, aji = , (2) whereas the linear model does not (e.g., Hanushek and Jackson, 1977). When Y* is or- dinal and takes on more than two values, the where 4 is the standardized normal probability ordered probit and logit models have a similar density function and 1 is the cumulative stan- advantage over the linear regression model. dardized normal density function (Johnson and Whereas the former take account of ceiling and Kotz, 1970). floor restrictions on the probabilities, the linear model does not. This advantage of the ordered probit and logit over the linear model is MODELS WITH ORDINAL DEPENDENT strongest when Y* is highly skewed or when VARIABLES two or more groups with widely varying Model Specification 3 Other models for binary dependent variables that Given the measurement model for ordinal vari- can be extended to ordinal variables are discussed ables, it is possible to model the effects of by, for example, Cox (1970) and McCullagh (1980). REGRESSION MODELS WITH ORDINAL VARIABLES 515

skewness in Y* are compared (see example mationconsists of findingvalues of ,3and the aj below). The assumptionthat E follows a normal in (4) that make L as large as possible. or logistic distribution, however, while often Binary Probit or Logit. In practice, plausible, may be false. As discussed below, maximumlikelihood estimationof the ordered one can test this assumptionand, in principle, logit or can be expensive, espe- modify the model to take account of departures cially when the numbers of observations, from the assumed distribution. thresholds, or independentvariables are large and the analyst does not know what values of the unknown parameters would be suitable Estimation "start values" for the estimation. Moreover, In practice, one seeks to estimate the slope although the ordered models can be im- parameter(s)/3 and the threshold parameters plemented with widely available computer a1, - . , aj-1.The formerdenotes the effect of a software (see Appendix), such applicationsare unit change in the independentvariable X on harderto masterand apply routinelythan more the unobserved variableY. The latter provide elementary methods. Other methods of esti- information about the distribution of the mation are cheaper and easier to use and con- ordered dependent variable such as whether sistently (though not efficiently) estimate the the categories of the variable are equally unknownparameters. These methods are use- spaced in the probitor logit scale. Because the ful both for exploratoryresearch where many orderedprobit and logit models are nonlinear, models may be estimatedand for obtainingini- exact algebraic expressions for their parame- tial values for maximumlikelihood estimation. ters do not exist. Instead, to compute the pa- One method is to collapse the categories of rameters, iterative estimation methods are re- Y* into a dichotomy, Y* < j versus Y* - j, quired. This section summarizes the logic of say, and to estimate (3) as a binary probit or maximumlikelihood estimationfor these mod- logit by the maximum likelihood methods els as well as a useful non-maximumlikelihood availablein many statisticalpackages or, if the approach. Further technical details, including data are grouped, by informationabout computersoftware, are pre- (e.g., Hanushek and Jackson, 1977). This sented in the Appendix.4 yields consistent estimates of /3 and of ai, Maximum Likelihood. If the unobserved de- though not of the remainingthresholds. This pendentvariable Y has conditionalexpectation method can also be appliedJ-1 times, once for given the independentvariables) E(YIX)= ,3X each of the J-1 splits between adjacent and varianceone, then the measurementmodel categories of Y*, to estimate all of the as's, but (1) can be modifiedto give the probabilitythat this yields J-1 estimates of /3, none of which the ith individualtakes the valuej on the ordinal uses all of the informationin the data. dependent variable as A better alternative is to estimate the J-1 binary or probits simultaneouslyto ob- p(Y* = jjXj) = F(aj - 83Xj) tain estimates of the J-1 thresholdsand a com- - F(a-1 - /3X1), (4) mon slope parameter,3. To do this, replicate the data matrixJ-1 times, once for each of the where F(ao - 83X1)= 0 and F(aj - 83X1)= 1 J- 1 splits between adjacentcategories of Y*, to because a0 = -w and aj = x. If the model is an get a data set with (J-l)N observations. Each orderedprobit, then F is the cumulative stan- of the J-1 data matrices has a differentcoding dardnormal density function. If the model is an of the dependent variable to denote that an orderedlogit, then F is the cumulativelogistic observation is above or below the threshold function. The quantities (4) for each individual that matrix estimates, and J-1 additional col- are combinedto form the sample likelihood as umns are added to the matrix for J-1 dummy follows: variables, denoting which threshold is esti- matedin each of the J-1 data sets. This method L = L1J7Ip(Y*i_=j1Xi)dij (5) 2, which presents a iij is illustrated in Figure hypotheticaldata matrixfor a dependentvari- where dijis a variablethat equals one if Yli = i, able having 4 ordered categories. The total and zero otherwise. Maximumlikelihood esti- matrixhas 3N observations.For clarity, within each of the 3 replicates of the data, observa- 4Models with ordinal dependent variables can tions are orderedin ascendingorder of Y*. The also be estimated by weighted nonlinear least third column denotes the dependent variable squares (e.g., Gurland et al., 1960). For models is based on distributions within the exponential family, for a binary logit or probit model, which such as the logit and probit, weighted nonlinear least coded one if the observation is above the squares and maximum likelihood estimation are threshold and zero otherwise. In the first equivalent (e.g., Nelder and Wedderburn, 1972; panel, observations scoring 2 or above on Y* Bradley, 1973; Jennrich and Moore, 1975). have a one on the dependent variable; in the 516 AMERICAN SOCIOLOGICAL REVIEW

Dep. 2-4/ 3-4/ 4/ Variable Y* var. 1 1-2 1-3 XI X2 Parameter a, a2 a3 31 A Observation 1 1 0 -1 0 0 X11 X21 2 1 0 - 1 0 0 X 12 X22

1 0 -1 0 0 2 1 -1 0 0

2 1 -1 0 0 3 1 -1 0 0

3 1 -1 0 0 4 1 -1 0 0

N 4 1 -1 0 0 XIN X2N

1 1 0 0 1 0 X11 X21 2 1 0 0 1 0 X12 X22

1 0 0 1 0 2 0 0 1 0

2 0 0 1 0 3 1 0 1 0

3 1 0 1 0 4 1 0 1 0

N 4 1 0 1 0 XIN X2N

1 1 0 0 0 1 Xll X21 2 1 0 0 0 1 X12 X22

1 0 0 0 1 2 0 0 0 1

2 0 0 0 1 3 0 0 0 1

3 0 0 0 1 4 1 0 0 1

N 4 1 0 0 1 X1N X2N Figure 2. Hypothetical Data Matrix for Dichotomous Estimation of Ordered Logit or Probit Model.

second, observationsscoring 3 or above on Y* mates (within 10 percent). The standarderrors have a one; and in the third, observations of the parameters are somewhat underesti- scoring 4 have a one. The fourththrough sixth mated because the method assumes that there columns denote which of the three thresholds are (J-1)N observations when only N are are estimated in each of the panels of the data unique. In practice, however, this bias is often matrix.These are effect coded and thus can all small.5 be included as independent variables in the probit or logit model to estimate the three I In the probit model, the rationale for this method thresholds. The final two columns denote two is as follows: An ordered probit is equivalent to J- 1 independent variables, values of which are binary probits in which constants (thresholds) differ, replicated exactly across the three panels. slopes are identical (within variables across Although this method requiresa largerdata equations), and correlations among the disturbances set, it is a flexible of dataand of the J- 1 equations are all equal to one. A binary way exploringthe probit estimated over J- 1 replicates of the data as obtaining preliminary estimates of the aj's described here is equivalent to J- 1 binary probits and /3's for maximum likelihood estimation. with varying constants and identical slopes but with Estimates obtained by this method are often disturbance correlations all equal to zero. In prac- very close to the maximum likelihood esti- tice, the slope and threshold estimates are insensitive REGRESSION MODELS WITH ORDINAL VARIABLES 517

Scaling of Coefficients variable. This strategy, however, is unpar- simonious, fails to use the information that the Most computer programs for ordered probit or categories of Y* are ordered, and may still logit estimation fix the variance of E at I in the yield biased estimates if the correct model is a probit model or at ir2/3 in the logit model rather linear effect of the unobserved variable Y on than fix the variance of Y as in (3) above. the dependent variable. This section considers Although computationally efficient, this prac- several preferable solutions to this problem. tice may lead to ambiguous comparisons be- Consider the following two equations: tween the coefficients of different equations. Adding new independent variables to an equa- Zi = 31X1i + f32Yi + Ezi (6) tion alters the variance of Y and thus the re- Yi = 01X1i + 02X2i + Eyi (7) maining coefficients in the model, even if the new independent variables are uncorrelated where for the ith observation Z is continuous with the original independent variables. When and may be either an observed variable or an estimating several equations with a common unobserved variable that corresponds to an ob- ordinal dependent variable it is advisable to served dichotomous or ordinal variable, say rescale estimated coefficients to a constant Z*; Y is an unobserved continuous variable variance for the latent dependent variable (say corresponding to an observed ordinal variable = Var(Y) 1) across equations. The resulting Y* through the measurement model discussed coefficients will then measure the change in above; X1 and X2 are observed continuous or standard deviations in the latent continuous dichotomous variables; ez and Ey are random variable per unit changes in the independent errors that are uncorrelated with each other variables. and with their respective independent vari- If, for example, the computations assume ables; and the ,3's and 0's are parameters to be that Var(E) = 1 but Var(Y) = 1, then the esti- estimated. If this model is correct, that is, if the mated equation is effect of the ordinal variable Y* on Z is prop- erly viewed as the linear effect of the observed Yi = bXi + ei variable Y, of which Y* is a realization, then several methods of identifying and estimating where b = 1(3E1E, ei = Ei/o-E, and Var(E) = v>. /32 are available. These methods include: (1) = = Then roI 1/[1 + b2Var(X)] and ,3 boa. instrumental-variable estimation; (2) estima- Under this scaling assumption, oe decreases as tion based on the conditional distribution of Y; additional variables that affect Y are included and (3) maximum likelihood estimation. These in the equation, and measures the proportion of methods are summarized in turn. variance in Y that is unexplained by the inde- pendent variables. Thus 1 - o-2 is analogous to R2 in a linear regression. Instrumental Variables One method of estimating (6) and (7) is to use the fact that X2 affects Y but not Z, that is, that MODELS WITH ORDINAL X2 is an instrumental variable for Y. First, es- INDEPENDENT VARIABLES timate (7) as an ordered logit or probit model Ordinal variables may also be independent or by the procedures discussed above and, using intervening variables in structural equation the estimated equation, calculate expected models. For example, job tenure, a continuous values for Y: variable, may depend on job satisfaction, an ordinal variable measured on a Likert scale, as E(YjjXjj, X20) = Yi = 61X1i + 62X2i. well as on other variables. Job satisfaction in turn may depend on characteristics of individ- Then, in a second stage of estimation, replace uals and their jobs. One solution to this prob- Y with Y in (6) and estimate the latter equation lem is to assume that the ordered categories by a method suited to the measurement of Z constitute a continuous scale, but this is inap- ( (OLS), probit, logit, propriate if the ordered variable Y* is non- etc.). This method consistently estimates /3' linearly related to an unobserved continuous and /2 under the assumption that Ezand Ey are variable Y (as in the model discussed above) uncorrelated with each other and with X, and and it is the unobserved variable that linearly X2. Standard errors for estimated parameters affects the dependent variable. Another strat- egy is to represent Y* as J- 1 dummy variables 6 A fourth method is to rely on multiple indicators and to estimate their effects on the dependent of Y, as would be possible if, for example, Y denoted job satisfaction and Ye and Y* were Likert scales of to alternative assumptions about the disturbance satisfaction with specific aspects of a job (pay, op- correlations. portunity for advancement, etc.). 518 AMERICAN SOCIOLOGICAL REVIEW should be computed using the usual formulas tion, (6) and (7) can be estimated simulta- for instrumental-variable estimation (e.g., neously regardless of whether Z results from Johnston, 1972:280). an observed continuous or ordinal variable. This method works only if X2 affects Y but Suppose that 02 = 0 in (7), that is, that there is not Z; otherwise Y would be an exact linear no instrumentalvariable and that Var(E,) = combinationof variables already in (6) and 32 Var(E,) = 1. (Maximum likelihood, like the would not be estimable. In addition, for the methodbased on the conditionalexpectation of method to yield precise estimates, X2 and Y Y, works regardlessof whether 02 = 0.) Then should be strongly associated. When these the reduced form of (6) is conditions are not met, alternative methods should be considered. Zi= aiX1i + Vi (9)

where 8, = /31 + /3201, Vi = Ezi + 32EYi, and Using the Conditional Distribution of Y COv(VEy) = P = 820,Y = 32. The maximum If Z is an observed continuousvariable, and Z likelihood procedure estimates (7) and (9) si- and Y follow a bivariate normal distribution, multaneouslyalong with p. Thus 01, 02, and /2 then an alternativemethod of estimatingY for are estimateddirectly, and 8,3can be calculated as - 201.-7 substitutionin (6) is available. Suppose 02 = 0, 81 that is, there is no instrumentalvariable X2 The estimation procedure itself consists of computingthe joint probabilitiesof obtainingZ through which to identify /2 in (6). One can nonetheless computeexpected values of Y that (or Z* if Z is an unobservedvariable for which Y* are not linearly dependent on variables in (6) only an ordinalvariable is observed)and for by using the relationshipbetween Y and Y*: each individual and forming the likelihood which, assuming a bivariate normal distribu- on the = tion for Z and Y, depends reduced-form E(YilXli, Y*) Yi= 01X1i + E(YyiXli, Y*) parametersin (7) and (9) and on p. The method searches for values of the parameters that = lx1, +4(&?i_ - 1X1i) - ( lXi) make the likelihood as large as possible. See (D(a'j- 01X11)- c(a?i_1- 01X1- ) the Appendix for furthertechnical details. where all parametersare taken from ordered probitestimates of (7) (excludingX2). The sec- Extensions ond term in (8) is an extention of equation (2) Given estimates of equationsthat include ordi- above to the case where each observationhas a nal variables as either dependent or indepen- mean conditionalon its.value of X1 in addition dent variables, it is possible to formulate to Y*. With predicted values Y in hand, one structural equation models with mixtures of can then substitutethem for Y in (6) and esti- continuous, discrete and ordinalvariables. As mate that equationby least-squaresregression. a result one can compute direct and indirect This method is a variant of procedures for effects of exogenous variables on ultimate en- estimating regression equations that are sub- dogenous variableseven when the intervening ject to "sample selection bias" (Berk, 1983; variablesare ordinal.The same proceduresde- Heckman, 1979). It permits identification of scribed by Winshipand Mare (1983:82-86)for the effects of the unmeasuredvariable Y in (6) the path analysis of dichotomousvariables can in the absence of the instrumentalvariable X2 be applied to systems in which some of the because it takes account of the correlation variables are ordinal. between Eyand the disturbanceof the reduced In addition, the models described here can form of (6) (thatis, El + 32EY), which is ignored be extended to allow for the discrete and con- in the instrumental-variableestimation. This tinuous effects of ordinal variables. If, for method relies on the assumedbivariate normal example, a continuousvariable, say, earnings, distribution of the disturbances of the two is affected by an ordinalvariable, say, highest equations. One should be cautious about the grade of school completed, one might elect to degree to which one's results may depend on model the school effect as twofold: (1) as an this distributionalassumption. effect of a latent continuous variable, "educa- tional attainment";and (2) as the effect(s) of milestones or Maximum Likelihood Estimation attaining particular schooling

Identifying,/3 using the conditionaldistribution 7If alternative assumptions about aid of Y requiresthat Z be an observed continuous Var(E,) Var(E,) are made, the estimating formulas are dif- variable. It is not a fully efficient method in ferent, but the model is nonetheless identified pro- that it relies on two separateestimation stages. vided a scale for Y and Z (and thus E, and E,) is If Z and Y follow a bivariatenormal distribu- assumed. REGRESSION MODELS WITH ORDINAL VARIABLES 519 credentials (for example, high school degree, of the augmented model over (3). If the test college degree, etc.). To do this, augment (6) statistic is significant, this is evidence that the above-in which Z denotes earnings, Y de- functional relationship given by (3) is incor- notes "educational attainment," and X1 de- rect.8 notes determinants of earnings-with dummy variables denoting whether or not the school- Test Based on the Dependent Variable ing milestones of interest have been attained. Then by estimating the new equation (6) along An alternative test examines whether the effect with (7), one can assess the independent effects of X on Y varies with Y, that is, whether the of these two aspects of schooling on earnings. threshold parameters a( vary with X. This pos- Thus the alternative formulations of effects of sibility can be explored by expanding the ma- ordinal variables parallel those for binary vari- trix in Figure 2. Separate effects of the inde- ables (Winship and Mare, 1983; Heckman, pendent variables for varying values of the de- 1978; Maddala, 1983). pendent variable can be obtained by augment- ing Figure 2 to include interactions between the indicators for which threshold is estimated by TESTS FOR DISTRIBUTIONAL which panel of the data (2-4/1, 3-4/2, 4/1-3) MISSPECIFICAT1ON and the independent variables X1 and X2. Good Ordered probit and logit models for ordered estimates of such an interactive model can be dependent variables rely on the assumptions of obtained from binary logit or probit analysis. normal and logistic distributed errors respec- These estimates, however, will not provide a tively. Unlike the linear model, where the valid test because the binary model assumes normality assumption for the errors affects the 3N independent observations when only N are validity of significance tests but not the un- independent. To obtain the correct likelihood biasedness of parameter estimates, for the or- statistic to compare to the likelihood corre- dinal models both the parameters and the test sponding to (3), the latter model must be aug- are distorted when distributional as- mented with parameters 8j that denote the sep- sumptions are false. This section summarizes arate effect of X at each threshold. Then the methods for testing the validity of the distribu- contribution to the likelihood for the ith obser- tional assumptions. vation is: If a model such as (3) above is specified to have the correct probability distribution for the p(Y',= jiX) = F(aj - jXi) errors (or the latent continuous variable), then -F(a(j_ 1- j-,Xi)- the estimated parameters) /3 should be in- variant except for sampling variability through These quantities can be combined to form the the full range of both the independent vari- likelihood function and the parameters can be a able(s) X and the dependent variable Y. Con- estimated as described above. Again, signifi- versely, significant variation in estimated 13's cant likelihood-ratio test statistic is evidence among different segments of the range of X or that (3) is misspecified.9 Y, or among weightings of the observations that give different emphasis to different parts of the EXAMPLE: EDUCATIONAL MOBILITY distributions of X or Y, is evidence that the Data model is misspecified. This section discusses an example of ordered probit estimation applied to a model that con- Test Based on the Independent Variable an variable is to A test based on independent 8 White (1981) develops additional, more- partition the observations into k mutually ex- sophisticated versions of this test for nonlinear re- clusive segments defined by X and to create gression models. k - 1 dummy variables denoting into which seg- 9 Varying 13'sacross segments of the X or Y distri- ment each observation falls. For example, a bution may signify a number of specification ills: (1) dummy variable could be formed that takes the incorrect distribution for Y; (2) nonlinearities in the value 1 if an observation is above the mean (or effects of X; (3) interactions among independent median) of X and zero otherwise, or three variables; or (4) improperly excluded independent dummy variables could be formed that indexed variables (Stinchcombe, 1983; Winship and Mare, 1983). White whether or not each observation was in the (1981) proposes methods that, in prin- ciple, empirically distinguish (1), (2), and (3), though first, second, or third quartile. Augment (3) the practical value of these methods remains to be with the dummy variables and their interac- determined. Aranda-Ordaz (1981) and Pregibon tions with X. Then estimate the augmented (1980) suggest alternative distributions for Y that equation by maximum likelihood and perform a may be considered if the logit or probit model is likelihood ratio test for the improvement in fit rejected. 520 AMERICAN SOCIOLOGICAL REVIEW tains ordereddependent and independentvari- Model Specification ables. The data are from the 1962Occupational In the models presented below, the indepen- Changesin a GenerationSurvey (OCG)on the dent variables include a variable for color association of father's and son's grades of equalingone if the son is white and zero other- school completed for white and nonwhite men wise, a continuous variable for father's age 20-64 in the civilian noninstitutional schooling, and a continuous variable for the population (U.S. Bureau of the Census, interaction between father's schooling and 1964:17-18).They are taken from a frequency color. Thus if, for the ith individual,Zi denotes table with the following dimensions: father's son's schooling, Xi denotes a dummy variable schooling (less then 8, 8-11, 12, 13+ grades); for color, Yi denotes father's schooling, Ezide- color white, nonwhite); and son's schooling notes a randomdisturbance, and /3k,/2, and /3 (less than 8, 8-11, 12, 13+ grades). Although denote parametersto be estimated, the model the publishedtable reportspopulation frequen- is: cies, the analyses presentedhere are based on 18,345 sample observationsfor which data are Zi = /18Xi + p2Yi + /33XiYi ? Ezi (10) present on the three variables. To see if the effect of father's schooling on son's schooling differs between whites and Measurement nonwhites, one examines the size and significanceof the interactioncoefficient 3 or, Suppose one wishes to assess the effect of fa- equivalently,compares (10) to a simplerequa- ther's schooling and color on son's schooling tion that excludes the interactionvariable X1Yi. and to see whether the effect differs between In a linear model, Z and Y are observed vari- whites and nonwhites. The effect of father's ables based on the categories of son's and fa- schooling may be stronger for whites than ther's schooling.In the linearmodels presented nonwhites in the cohorts representedin these below, the four categoriesof the two schooling data because nonwhitesons of highly educated variables have the scores 5, 10, 12, and 14 fathers experience economic and discrimi- grades, which approximatethe category mid- natorybarriers to highereducation that are less points. Given these scores, (10) can be esti- severe for elementary and secondary school- mated by OLS. ing. In the orderedprobit model, however, Z and In this analysis both son's and father's Y are unobserved realizations of the ordered schooling are ordered variables with four categorical variables, say, Z* and Y*. Were categories. Grades of schooling is typically only Z unobserved, then (10) could be esti- treatedas a continuousdependent variable and mated using one of the methods for ordered analyzed using a linear model. This may seem dependentvariables discussed above. Because compelling when separate observations are Y is also unobserved, it is necessary to con- availablefor each individual,and it may even sider a second equation in which Y is the de- seem appropriatein the present case, where pendent variable: one can assign scores to the midpointsof the categories. Although estimates based on both oXi + Eyi, (11) the linear and ordered probit models are pre- Yi= sented below, there is nonetheless consider- where Eyiis a random disturbance, 0 is a pa- able reason to prefer the latter to the former. rameterto be estimated, and all other notation As noted above, grades may not be equally is as defined above. This equationpredicts the spaced either in the difficultywith which they orderedvariable, father's schooling, fromcolor are acquiredor in their rewards. Moreover, in and provides a means of estimating the samples with highly skewed schooling distri- covariancesbetween the two unobservedvari- butions or in cross-samplecomparisons where ables Y and Z and between color and the unob- degrees of skewness differ, it is better to use a served variable Y, which are needed to esti- model such as the orderedprobit which takes mate (10)."I ceiling and floor effects into account. In the present case, where schooling is grouped into 1980, 1981).Although these models avoid assigninga four broad categories and the schooling distri- metric to an orderedvariable, take floor and ceiling butions of whites and nonwhitesdiffer greatly, effects into account, and are always mathematically the orderedprobit model may be more appro- feasible, they are best suited to measures that ac- IO cumulate over time (e.g., grades, children, jobs, priate. marriages,etc.). The orderedprobit model, in con- trast, is potentiallyapplicable to all orderedvariables 10 Alternative nonlinear models that are poten- without regardto the process by which their values tially useful for studyingschooling are binaryprobit come about. or logit models that treat schoolingas a sequence of 1I Equation(10) can also be estimatedwith a sim- "continuationdecisions" (e.g., Fienberg 1980;Mare pler specificationof (11), namely, that Y is affected REGRESSION MODELS WITH ORDINAL VARIABLES 521

Estimnation able has the same standard deviation as as- The results reported below are obtained sumed in the linear regression, the coefficient for through simultaneous estimation of (10) and color is 1.088 (0.386 x 2.802). The probit (11) by maximum likelihood under the as- coefficient for father's schooling is the effect of a one sumptions that El.and are uncorrelated and standard deviation change in father's E, schooling on each follows a normal distribution. This proce- son's schooling in standard de- viation units. If the dure is an extension of methods discussed schooling variables are as- sumed above for a single, ordinal independent vari- to have the same scale as in the regres- sion, the coefficient able. That is, ( 11) is estimated along with the for father's schooling is 0.379 (0.432 x reduced form of (10), 2.802/3.194). The ordered probit results also imply that the schooling categories are approximately equally spaced in the probit Z. ` 8X, + vi, scale, as indicated by the roughly equal dis- tances between adjacent thresholds. where 8 = 13 + 1320 +?3:1Xj and v = E1 + /32E2 The ordered probit and linear models, how- + In addition, two disturbance correla- 8:3E,7Xl. ever, yield different results about a possible tions are estimated, say pi and P2, for non- whites and whites respectively. In terms of the interaction effect of father's schooling and color on son s schooling. According to the parameters of (10) and (11), p = [32 and P2 = ordered probit results, the effect of father's /32 + The maximum likelihood 3:30. procedures schooling on described in the Appendix are used to obtain son's schooling is approximately 25 percent larger for white sons than for non- the reduced-form parameters 0, 8, pi, and P2, white sons (.436 vs. and from these are derived the structural pa- .344). Rescaling the probit coefficients to conform to the standard devia- rameters 0, 1i 132, and 1:3. tions for father's and son's schooling assumed in the linear regression models yields a similar Results race difference in the effect of father's school- ing (.383 vs. Table 1 presents ordered probit and linear re- .302). The test statistic for the interaction parameter and gression estimates for the effect of father's the one degree of freedom likelihood ratio chi-square schooling and color on son's schooling for statistic (90067-90047 = 20) are models with and without terms for interaction significant, even tak- ing account of the of father's schooling and color. The first and complexity of the OCB sam- ple. 12 For the linear in third columns of ihe table show that both the model, contrast, the interaction coefficient is ordered probit and linear models indicate much insignificant and of opposite to that of higher levels of schooling for whites and for sign the probit model. This sons of more highly educated fathers. The pro- discrepancy between the ordered probit and bit and linear coefficients are not directly com- linear regression findings results from the parable inasmuch as the former are measured sensitivity of the regression model to the dif- in the scale of z-scores (inverse of the cumula- ferent skewness of the white and nonwhite tive normal distribution), whereas the latter are schooling distributions. Under the assumptions measured in grades of school completed. 12 Nonetheless the two models yield much the Because these data derive from a frequency same results about the effects of the indepen- table, it is possible to compute a likelihood ratio chi-square dent variables. To see this, note that one can statistic to assess the of the model. Under the saturated model, the log likeli- rescale the probit coefficients for color and hood statistic is minus 21;n jklog(Puk), where n,jk is the father's schooling to the same units as the number of observations in the ill category of father's linear regression coefficients. Given the cate- schooling (i= 1,2,3,4), the jth category of son's gory midpoints assumed in the regression schooling (j=1,2,3,4), and the kth color group analysis, the standard deviations of son's and (k=1,2); and Pijk is the proportion of the kth color father's schooling are 2.802 and 3.194 respec- group that is in the ith father's and jth son's schooling tively. As reported in Table 1, the probit coef- categories. For these data this statistic is 89727, ficient for color gives the difference between implying a chi-square fit statistic for the model whites and a latent with the father's schooling-race interaction of nonwhites on variable for 90047-89727=320 with 20 degrees of freedom, indi- son's schooling that has a standard deviation of cating a poor fit. A better-fitting model allows for one. Under the assumption that the latent vari- more complex interactions between father's and son's schooling than the two correlation coefficients only by the random disturbance En. Although com- assumed here. More complex interactions can be putationally feasible, this specification is tantamount included in the ordered probit model, albeit at the to assuming that color and father's schooling are expense of more complex interpretations. The uncorrelated, and thus gives an unsatisfactory esti- goodness-of-fit test is appropriate only when data mate for /,3. Parameter estimates for (I 1) are avail- come from a fixed table and cell frequencies are able from the authors on request. large. 522 AMERICAN SOCIOLOGICAL REVIEW

Table 1. Ordered Probit and Linear of the Effects of Father's Schooling and Color on Son's Schoolinga Ordered Probit (MLE) Linear Regression (OLS) (1) (2) (1) (2) Variable 3 f8/(SE(/3)) /3 f/(SE(f)) 3 f8/(SE(f3)) / 8/(SE(f3)) White (nonwhite) 0.386 17.9 0.375 16.2 1.503 23.1 1.746 10.2 Father's Schooling 0.432 72.3 0.344 16.5 0.332 56.1 0.362 17.4 White x Father's Schooling 0.092 4.2 -0.033 -1.5 Thresholds 8-11/<8 -0.390 -17.2 -0.384 -17.1 12/8-11 0.483 21.2 0.479 21.4 13+/12 1.182 50.8 1.171 51.0 Constant 6.687 89.9 6.468 40.2 -2 x (log likelihood) 90067 90047 R2 0.217b 0.234b 0.1859 0.1860 0-e 0.885 0.875 2.5279 2.5277 a Source: 1962 Occupational Changes in a Generation Survey (U.S. Bureau of the Census, 1964) (N = 18345). b R2 estimated as one minus error variance under assumption that latent continuous variable has variance one.

of the probit model, the ordered schooling The figure, therefore, is not drawnto the scale variable is a realization of a latent, normally dictatedby the results in Table 1, but nonethe- distributedvariable. As shown in Figure 1, a less illustrates the source of discrepancy be- change in an independent variable induces a tween the ordered probit and linear model re- larger shift in the ordered dependent variable sults in the Table. The curves in Figure 3 illus- for observationsconcentrated in the middle of trate the stronger effect of father's schooling the distributionthan for those concentratedat on son's schooling for whites than for non- either end. If the effect of father'sschooling on whites in the probit scale. If whites are more son's schooling is strongerfor whites than for concentratedat the top of the schooling distri- nonwhites, but the former are concentratedat bution whereas nonwhites are concentratedin one extreme of the schooling distribution the middle of the distribution,OLS will esti- whereas the latter are concentratedin the mid- mate slopes that are tangentto the curved lines dle, then the effects of father's schooling as at different points in the schooling distribution, estimated by a linear model may be approx- that is, at a point of relatively lesser slope for imately equal despite the underlyingdifference whites and greater slope for nonwhites. The between the two groups. straight lines in Figure 3 are approximately This distortion is illustrated in Figure 3, parallelfor the two groups despite the greater which presents the hypothetical relationship nonlineareffect for whites. This example illus- between father's and son's schooling by color trates the potential distortion in linear nr.,del underthe orderedprobit and linearmodels. To results when groups differ in the way that their highlightthe point, the race difference in the effects are sensitive to ceilings and floors on distributionof son's schooling is exaggerated. the ordered dependent variable.

P(Y--J) CONCLUSION (S-' SNhN'N1N~g) Much of the sociological literatureon ordinal Schoolin on Son's Schoolingfor variables offers the unhappy compromises of

and N s Nonwhit (1) ignoringordinal measurementand treating ordinalvariables as if they were continuous;(2) adopting special techniques for ordinal vari- ables that are not integrated into established frameworks for multivariate and structural equation analysis; or (3) adopting frequency Sc2hooling) IrN YW table approacheswhen regressionor structural Figure 3. Linear and Nonlinear Effects of Father's equationmodels would otherwisebe desirable. Schooling on Son's Schooling for Whites This article has reviewed methods that enable and Nonwhites one to analyze mixturesof ordered, dichotom- REGRESSION MODELS WITH ORDINAL VARIABLES 523

ous, and continuous variables in structural where c1j = a1i-61Xij and c2P = a2i, - 81X1i. Then equation models while taking account of the the likelihood function is: distinct measurementproperties of these vari- ables. Although the ordered logit and probit L = [I [I [I [p(Y* = j, Z*, = j' X 1)] ijj (A3) models are slightlymore complex than multiple i j j' regression analysis inasmuch as they rely on where dijj, is a variable that equals one if Y* = j and nonlinearestimation methods, they can be im- Z- j' and zero otherwise. Iterative estimation pro- plemented with standard statistical computer cedures pick 06, 81, and the acj that make L as large software. These methods requireassumptions as possible. about the probabilitydistributions of the un- If either Y* or Z* is a dichotomous variable, then measured continuous variables from which the likelihood is just a special case of (A3), where orderedvariables arise, but these assumptions one of the variables has only two ordered categories. are testable. If Y is an unmeasured continuous variable corre- sponding to an observed ordinal variable Y*, but Z is Like many topics in sociological methodol- an observed continuous variable, then no longer as- the ogy, problemof ordinalvariables has been sume o-, = 1, but rather that o-, can be estimated discussed in isolation from broader method- from the data and that t, = (Zi - 81X1i)/o-v.Then the ological issues and with insufficient attention likelihood is: to high-qualityresearch on the problem by Cjjd applied statisticiansin other fields. Whatever L = [I [I f [g (t, ty1) dt7 dty] ' (A4) problems they may have presented Wlj-1 methodologists,ordinal variables need present no special impediment to sound substantive where dij is a variable that equals one if Y*, equals j, research. and zero otherwise.

APPENDIX Statistical Programs for Ordered Probit and Logit Maximum Likelihood Estimation of Two-Equation Models with Ordinal Variables Single equations with ordinal dependent variables can be estimated with "user-defined" functions in a As for single equations, maximum likelihood estima- number of commonly used statistical programs. In tion for multiple equation models consists of most of these programs, the user supplies the for- specifying the probability of obtaining each observa- mula for the appropriate likelihood function and ini- tion as a function of the unknown parameters, form- tial values for the estimated parameters. Initial ing the likelihood as the joint probability of obtaining values can be obtained using the dichotomous-vari- all observations, and searching for parameter values able approach discussed in this article or by ordinary that maximize the likelihood. Consider the two- least squares. equation model given by (6) and (7) above, but for GLIM (Baker and Nelder, 1978), BMDP (Dixon, simplicity assume again that 32 = 0. Then the re- 1983), SAS (SAS Institute Inc., 1982), SPSSX (SPSS duced form of the model is given by (7) and (9). Inc., 1983), LIMDEP (Greene, n.d.), and Further, assume that Y and Z follow a bivariate HOTZTRAN (Avery and Hotz, 1983) permit the user normal distribution, where Var(E,) = 0rEY and Var(v) to specify the ordered logit or probit likelihood func- = or . That is, the joint probability density function tions and to estimate these models by maximum of the disturbances of the equations is likelihood or its equivalent. LIMDEP and HOTZTRAN can also estimate ordered probit mod- g (t,t) = [1(27r1- p)] x els directly without user specification of the likeli- {exp[-(tY - 2ptyt, + t2)/(l - p2)]}, (Al) hood function. Routines for estimating ordered pro- bit models in BMDP or through a FORTRAN pro- where ty = Ey/orEy, tZ = v/c-,, and p denotes the gram that can be run on an IBM personal computer correlation between Ey and v (e.g., Hogg and Craig, are available from the authors.'3 HOTZTRAN can 1970). Given these assumptions it is possible to form also estimate models with multiple equations, the likelihood functions for alternative types of en- ordered independent variables, latent variables with dogenous variables. several ordinal indicators, and structural equation Suppose both Y and Z are unmeasured continuous models with mixtures of continuous, discrete, ordi- variables that correspond to observed ordinal vari- nal, and truncated variables.'4 ables Y* and Z* respectively. That is, let aso, asi,-. aSJ in asi-1 denote thresholds the distribution of Y REFERENCES and Z (s = 1,2), where aso = _x, 9asJ = o,Y* = j if - < aii-1 Y < a:j, andZ* = j' if a2i'-1 Z < aS2j- To Agresti, Alan identify the scales of Y and Z assume that 0rEy = cry 1983 "A survey of strategies for modeling cross- = 1. Then the probability of obtaining an observation classifications having ordinal variables." with category j of Y* and j' of Z* is: 13 This offer expires one year after publication of = P(Y*l j Z*, = j'iX1i) = this article. Jc1' iC2J'ty1) g (tZ19 dtzdtyg (A2) 14 HOTZTRAN is available from Mathematica Clj-1 C2J'-1 Policy Research. 524 AMERICAN SOCIOLOGICAL REVIEW

Journal of the American Statistical Associ- Dixon, W. J. ation 78:184-98. 1983 BMDP Statistical Software. Berkeley: Uni- Aitchison, J. and S. D. Silvey versity of California Press. 1957 "The generalization of probit analysis to the Featherman, David L. and Robert M. Hauser case of multiple responses." Biometrika 1978 Opportunity and Change. New York: Aca- 57:253-62. demic Press. Allan, G. J. B. Fienberg, Stephen E. 1976 "Ordinal-scaled variables and multivariate 1980 The Analysis of Cross-Classified Categori- analysis: comment on Hawkes." American cal Data. Second edition. Cambridge, MA: Journal of Sociology 81:1498-1500. MIT Press. Amemiya, Takeshi Gallant, A. Ronald 1975 "Qualitative response models." Annals of 1975 "." The American Economic and Social Measurement Statistician 29:73-81. 4:363-72. Goldberger, Arthur S. and Otis Dudley Duncan Aranda-Ordaz, F. J. (eds.) 1981 "Two families of transformations to ad- 1973 Structural Equation Models in the Social ditivity for binary response data." Biomet- Sciences. New York: Seminar Press. rika 68:357-63. Goodman, Leo A. Ashford, J. R. 1980 "Three elementary views of log linear mod- 1959 "An approach to the analysis of data for els for the analysis of cross-classifications semi-quantal responses in biological having ordered categories." Pp. 193-239 in assay." Biometrics 15:573-81. Samuel Leinhardt (ed.), Sociological Meth- Avery, Robert B. and V. Joseph Hotz odology 1981. San Francisco: Jossey-Bass. 1982 "Estimation of multiple indicator multiple Greene, William H. cause models with discrete indicators." n.d. "LIMDEP: Estimator for limited and qual- Discussion Paper 82-7, Economics Re- itative dependent variable models and sam- search Center/NORC, University of ple selectivity models." Unpublished man- Chicago. uscript, Graduate School of Business Ad- 1983 "HOTZTRAN User's Manual, Version ministration, New York University. 1.1." Unpublished Manuscript, Economics Gurland, John, Ilbok Lee and Paul A. Dahm Research Center/NORC, University of 1960 "Polychotomous quantal response in Chicago. biological assay." Biometrics 16:382-98. Baker, R. J. and J. A. Nelder Hanushek, Eric A. and John E. Jackson 1978 The GLIM System. Release 3. Generalized 1977 Statistical Methods for Social Scientists. Linear Interactive Modelling. Oxford: New York: Academic Press. Royal Statistical Society. Hawkes, Roland K. Berk, Richard A. 1971 "The multivariate analysis of ordinal mea- 1983 "An introduction to sample selection bias in sures." American Journal of Sociology sociological data." American Sociological 76:908-26. Review 48:386-98. Bielby, William T. and Robert M. Hauser Heckman, James J. simul- 1977 "Structural equation models." Annual Re- 1978 "Dummy endogenous variables in a view of Sociology 3:137-61. taneous equation system." Econometrica Blalock, Hubert M. 46:931-59." 1974 Measurement in the Social Sciences: 1979 "Sample selection bias as a specification Theories and Strategies. Chicago: Aldine. error." Econometrica 47:153-61. Bollen, Kenneth A. and Kenney H. Barb Henry, Frank 1981 "Pearson's R and coarsely categorized 1982 "Multivariate analysis and ." measures." American Sociological Review American Sociological Review 47:299-304. 46:232-39. Hogg, Robert V. and Allen T. Craig 1983 "Collapsing variables and validity coeffi- 1970 Introduction to Mathematical Statistics. cients (reply to O'Brien)." American Third edition. London: Macmillan. Review Sociological 48:286. Jencks, Christopher et al. Borgatta, Edgar 1979 Who Gets Ahead? The Determinants of 1968 "My student, the purist: a lament." Success in America. New York: Basic. Sociological Quarterly 9:29-34. Bradley, Edwin L. Jennrich, Robert I. and Roger H. Moore 1973 "The equivalence of maximum likelihood 1975 "Maximum likelihood by means of non- and weighted least squares estimates in the ." Pp. 57-65 in Pro- exponential family." Journal of the Ameri- ceedings of the Statistical Computing Sec- can Statistical Association 68:199-200. tion of the American Statistical Associa- Clogg, Clifford C. tion. Washington, D.C.: American Statisti- 1982 "Using association models in sociological cal Association. research: some examples." American Jour- Johnson, David R. and James C. Creech nal of Sociology 88:114-34. 1983 "Ordinal measures in multiple indicator Cox, D. R. models: a simulation study of categoriza- 1970 The Analysis of Binary Data. London: tion error." American Sociological Review Methuen. 48:398-407. REGRESSION MODELS WITH ORDINAL VARIABLES 525

Johnson, Norman and Samuel Kotz Nelder, J. A. and R. W. M. Wedderburn 1970 Continuous Univariate Distributions-1. 1972 "Generalized linear models." Journal of the New York: Wiley. Royal Statistical Society, Series A, Part Johnston, J. 3:370-84. 1972 Econometric Methods. Second edition. O'Brien, Robert M. New York: McGraw-Hill. 1979a "The use of Pearson's R with ordinal data." Kim, Jae-On American Sociological Review 44:851-57. 1975 "Multivariate analysis of ordinal vari- 1979b "On Kim's 'multivariate analysis of ordinal ables." American Journal of Sociology variables.' " American Journal of Sociology 81:261-98. 85:668-69. 1978 "Multivariate analysis of ordinal variables 1981 "The relationship between ordinal mea- revisited." American Journal of Sociology sures and their underlying values: why all 84:448-56. the disagreement?" Paper presented to the Labovitz, Sanford Annual Meetings of the American 1967 "Some observations on measurement and Sociological Association, Toronto, Canada. statistics." Social Forces 46:151-60. 1982 "Using rank-order measures to represent 1970 "The assignment of numbers to rank order continuous variables." Social Forces categories." American Sociological Review 61:144-55. 35:515-24. 1983 "Rank order versus rank category measures Maddala, G. S. of continuous variables." American 1983 Limited-Dependent and Qualitative Vari- Sociological Review 48:284-86. ables in Econometrics. Cambridge: Cam- Pregibon, Daryl bridge University Press. 1980 "Goodness of link tests for generalized linear models." Mare, Robert D. Applied Statistics 29:15-24. 1980 "Social background and school continua- Reynolds, H. 1973 "On 'the multivariate analysis of ordinal tion decisions." Journal of, the American measures.' " American Journal of Sociol- Statistical Association 75:295-305. 1981 "Change and stability in educational ogy 78:1513-16. SAS Institute Inc. stratification." American Sociological Re- 1982 SAS User's Guide: Basics. 1982 Edition. view 46:72-87. Cary, NC: SAS Institute Inc. Mayer, Lawrence S. and Jeffrey A. Robinson SPSS Inc. 1978 "Measures of association for multiple re- 1983 SPSSX User's Guide. New York: gression models with ordinal predictor vari- McGraw-Hill. ables." Pp. 141-63 in Karl F. Schuessler Smith, Robert B. (ed.), Sociological Methodology 1978. San 1974 "Continuities in ordinal path analysis." So- Francisco: Jossey-Bass. cial Forces 53:200-29. McCullagh, Peter Somers, Robert H. 1980 "Regression models for ordinal data." 1974 "Analysis of partial mea- Journal of the Royal Statistical Society, sures based on the product-moment model: Series B 42:109-42. part one." Social Forces 53:229-46. Stinchcombe, Arthur McKelvey, Richard D. and William Zavoina 1983 "Linearity in loglinear analysis." 1975 "A statistical model for the analysis of ordi- Pp. 104-25 in Samuel Leinhardt (ed.), nal level dependent variables." Journal of Sociological Methodology 1983-1984. San Mathematical Sociology 4:103-20. Francisco: Jossey-Bass. Morris, R. U. S. Bureau of the Census 1970 "Multiple correlation and ordinally scaled 1964 "Education change in a generation March data." Social Forces 48:299-311. 1962." Current Population Reports Series Muthen, Bengt P-20, No. 132. Washington, D.C.: U.S. 1979 "A structural probit model with latent vari- Government Printing Office. ables." Journal of the American Statistical White, Halbert Association 74:807-11. 1981 "Consequences and detection of 1983 "Latent variable structural equation mod- misspecified nonlinear regression models." eling with categorical data." Journal of Journal of the American Statistical Associ- Econometrics 22:43-65. ation 76:419-33. 1984 "A general structural equation model with Winship, Christopher and Robert D. Mare dichotomous, ordered categorical, and 1983 "Structural equations and path analysis for continuous latent variable indicators." discrete data." American Journal of Sociol- Psychometrika 49:115-32. ogy 89:54-110.