
DIVISION OF THE HUMANITIES AND SOCIAL SCIENCES CALIFORNIA INSTITUTE OF TECHNOLOGY

PASADENA, CALIFORNIA 91125

LOG-LINEAR ANALYSIS OF CONTINGENCY TABLES: AN INTRODUCTION FOR HISTORIANS WITH AN APPLICATION TO THERNSTROM ON THE "FLOATING PROLETARIAT"

J. Morgan Kousser California Institute of Technology

Gary W. Cox California Institute of Technology

David W. Galenson University of Chicago

SOCIAL SCIENCE WORKING PAPER 417

February 1982

SUMMARY

For historians or other social scientists whose data is available in discrete (nominal- or ordinal-level) form, recently developed "log-linear" multivariate statistical techniques offer considerable advantages over commonsensical devices and are in many respects superior to such multivariate methods as multiple classification analysis, weighted least-squares, and logit. Reanalyzing Thernstrom's Boston data on geographic mobility, we explain the ideas behind and the procedures of log-linear analysis explicitly, step-by-step. Intended for people who are already somewhat familiar with statistics (say, through multiple regression), the paper is self-contained and as simple as we could make it. After reading it carefully, one should be well prepared to perform such an analysis himself. Substantively, we sketch a simple economic model which points to age as an important determinant of the decision to move or stay, and our results cast doubt on Thernstrom's tentatively offered notion of a "floating proletariat."

LOG-LINEAR ANALYSIS OF CONTINGENCY TABLES: AN INTRODUCTION FOR HISTORIANS WITH AN APPLICATION TO THERNSTROM ON THE "FLOATING PROLETARIAT"*

J. Morgan Kousser, Gary W. Cox, David W. Galenson

* An earlier version of this paper was presented at the Social Science History Association convention in 1981. We wish to thank Stanley Engerman, Douglas Hibbs, Philip Hoffman, Colin Loftin, Douglas Rivers, Stephan Thernstrom, Quang Vuong, Sally Ward and especially Robert McCaa for comments on various iterations of the piece. Kousser's research was partially supported by grant R0-20225-82 from the National Endowment for the Humanities. We take responsibility for all remaining errors.

Suppose a researcher has information on several attributes of a collection of individuals, and that the data he has is available only in qualitative (synonyms are categorical, discrete, polytomous, or ordinal- or nominal-level), as opposed to quantitative (continuous or interval-level) form. For instance, imagine that his information is about yes or no votes, occupational classes, or age groups, but none is in the form of, say, the dollar amounts of property held (not broken into categories) or the length of residence, in months or years, at a particular location. Then he might construct tables, such as Table 1, which show how many people have each set of traits -- for example, how many young, unskilled, childless men in a sample were
found in both the Boston census schedule in 1880 and the city directory in 1890. When there is very little information available, say, data on only two or three variables, commonsensical methods of analysis may suffice. But what should one do when one is confronted by such monsters as the eighty-celled "four-way" Table 1?

The conventional historical answer to this question has been to combine the categories (or, to put it another way, to collapse the table) into what are called "marginal tables," relating two or perhaps three variables, as in the panels in Table 5, below. While this is useful and often informative, the practice may hide information which is available in the full table. Fortunately, in the past fifteen years statisticians have developed new methods for squeezing many more conclusions out of such tables. Historians have made too little use of the new techniques, generally denominated "log-linear contingency table analysis," probably because the initial articles and books explaining them were somewhat obscurely phrased and were not easily accessible to those who lacked fairly advanced statistical training. Now that there are simpler texts on the market and several computer programs available, it is time that many more historians took advantage of them.

This paper is intended to provide both an intuitive and a practical mathematical understanding of the log-linear technique, demonstrate its usefulness by reexamining an important historical topic, and, by making every step in the development and application of the technique explicit, using the notation now common in the literature, encourage and prepare historians to make use of log-linear analysis as well as to be able to go on to more advanced treatments in statistics texts. Those who desire a brief overview of the subject may wish to save section III, in which we lay out the algebra step by step, for a second reading, while those already familiar with or indifferent to log-linear methods may wish to skip sections II-V. We will also attempt to show how social scientific theory can help to guide data analysis and shall emphasize a hitherto often overlooked facet of a much-studied historical problem. Thus, we hope to blend substantive with methodological points. Written at a fairly elementary level and self-contained, the paper assumes only that one has a speaking acquaintance with such statistical concepts as Chi-Square tests and regression analysis.

I. HISTORICAL MOBILITY STUDIES

Before beginning the statistical discussion, let us briefly introduce the substantive problem with which we shall be concerned throughout this article. During the past two decades, many historians have investigated geographic and social mobility in nineteenth-century American cities. Aimed primarily at systematically describing the characteristics of individuals which were associated with changes in residence and in occupational or social rank, their works have related both types of mobility to such variables as age, occupation, family
social status, property holdings, ethnic origins, and generation of residence in America.1 Drawing on such previously unexploited sources as federal and state manuscript censuses, city directories, and tax assessment rolls, the "new social historians" have attracted a good deal of attention by taking advantage of the chance these sources offer to study the lives of large numbers of individuals who have previously eluded the view of historians.

Yet these scholars have failed to make use of the available statistical methods and social scientific theory as fully as they have ransacked the sources. More specifically, they have generally related only two or three variables to each other, thereby in effect assuming that the numerous "independent variables" in their mobility analyses were uncorrelated with each other; their implicit statistical models, furthermore, generally assume, without testing, that the relationships they seek are linear. By focusing on the different correlates of mobility only one or two at a time, they have generally settled for mere description, instead of confronting directly the problem of building a cohesive explanation. And their analyses have been less well informed by social scientific theory, particularly economic theory, than they might have been. We shall illustrate how these problems might have been largely obviated by reanalyzing data on geographic mobility gathered by Stephan Thernstrom for his study The Other Bostonians, which traces individuals from the 1880 federal manuscript census to the 1890 city directories of Boston and its suburbs.2

From Thernstrom's data set, we have chosen three factors which all plausibly bore on the 1880 Bostonian's decision to move or stay: family status, which we will call "S", and which we cut into two groups -- the first, single or married but childless, and the second, married with children; occupation, or "O," which we broke into five classes -- high white collar, low white collar, skilled, unskilled, and unemployed; and age, or "A," which we cut into four sets -- 14-20, 21-30, 31-60, and over 60.3 The number of people in the sample with each set of traits in both 1880 and 1890 is displayed in the four panels of Table 1. Since it is probable that some of those who were not listed in the 1890 directory were simply overlooked by the canvassers but were still present in the Boston area, we will hereafter refer to the division of the sample as those who were "found" and "not found" rather than as "stayers" and "movers" or "persisters" and "nonpersisters".4 We will label the variable "M".

The relationship of migration with occupation plays a large role in Thernstrom's explanation, that with age is stressed in the economics literature, and that with children taps the notion that familial responsibilities constrain. We also initially included measures of homeownership, the number of generations a man had been in America, and ethnicity in order to attempt to measure some of the effects, respectively, of different levels of transactions costs involved in the decision to move, rootedness, and a possible high degree of employment discrimination (especially against the Irish) in Boston. But these three variables were either so closely related to
age, occupation, and family status or had so little impact that they did not add much to our explanation. In the interest of simplifying the discussion, therefore, we have left them out of the analysis presented here.

(TABLE 1 about here)

II. SIMPLE MANIPULATIONS OF TABLES

Before beginning the analysis, we need to define a few terms and establish some appropriate conventions. To identify each cell in a table, let us refer to each by a set of subscripts, beginning with "1" or, more generally, with "i" and proceeding by integers or alphabetically as long as we need them. Thus, the cells in Table 1 are identified by four letters or integers. Here and throughout this paper, the variables will be considered in the order M, A, S, O. For instance, the entry in the bottom right-hand corner in Table 1 is referred to as the (2, 4, 2, 5) cell, or that in which the value of M is arbitrarily called "2", the value of A is termed 4, the value of S is 2, and the value of O is 5. Substantively, the cell represents the number of people present in 1880 who were not found in 1890 and who had been aged 61 or older, had children, and were unemployed in 1880. We will refer to the actual cell entries by small f's, and we will subscript them by numbers or by i, j, k, etc. For example, $f_{2425} = 5$. Estimates of the cell entries, obtained by procedures to be defined later, will be designated by capital F's. When we sum across all values of a variable -- for example, when we add the people in all age categories together, but preserve our knowledge of their occupations, family statuses, and whether or not they were found in 1890 -- we will replace the relevant subscript with a "plus." Summing across age while holding the other variables at levels i, k, and l would thus be noted as $f_{i+kl}$.

One can often discover a good deal about the relationships in a table by performing quite simple operations on it. Since several of these operations are directly related to the log-linear techniques on which we will focus, a discussion of commonsensical methods will lead naturally into the explanation of these more formal methods. The first step that almost anyone would take after perusing Table 1 would be to form percentages from it, and the first of several possible percentages to calculate would be the percentage "found" within each age (subscripted by j), family (k), and occupational grouping (l). Using the cell entry notation developed above, this percentage would be:

(1)    Percentage found = $100 \times f_{1jkl} / f_{+jkl}$.

For instance, the percentage "found" among low white collar childless men aged between 14 and 20 is 51/(51 + 28) = 65%.

(TABLE 2 about here)
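The subscript and "plus" conventions of this section map directly onto array indexing and axis sums. What follows is a minimal sketch in Python with NumPy, not part of the original analysis. Table 1 is not reproduced in the text, so the array `f` is filled with hypothetical counts, except for the one pair of cells quoted above (51 found and 28 not found among low white collar childless men aged 14-20). The names `f`, `f_i_plus_kl`, and `percent_found`, and the zero-based positions chosen for each category, are assumptions of the sketch.

```python
import numpy as np

# Hypothetical 2 x 4 x 2 x 5 table of counts, axes ordered (M, A, S, O).
# These numbers are placeholders, NOT Thernstrom's data.
rng = np.random.default_rng(0)
f = rng.integers(5, 60, size=(2, 4, 2, 5)).astype(float)

# The one pair of cells quoted in the text: aged 14-20 (A index 0),
# childless (S index 0), low white collar (O index 1, assumed position).
f[0, 0, 0, 1] = 51   # found
f[1, 0, 0, 1] = 28   # not found

# "Plus" notation: f_{i+kl} sums over age (axis 1) and keeps M, S, O.
f_i_plus_kl = f.sum(axis=1)                  # shape (2, 2, 5)

# Equation (1): percentage found within each (A, S, O) group.
percent_found = 100 * f[0] / f.sum(axis=0)   # shape (4, 2, 5)

print(f.size)                                # number of cells in the table
print(percent_found[0, 0, 1])                # the worked example in the text
```

Summing over an axis is all that the "plus" subscript means, so every marginal quantity used later can be obtained from the full array in one call.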

Tables of percentages often reveal more striking relationships than Table 2 does. Whereas for the childless, relationships between age and being found are nearly monotonic and are positive for the two higher occupational classes and negative for the three lower classes, the relationships for men with children are much more mixed across class and age. Looking at the "total" or "marginal" rows and columns, it is clear that childless men in their twenties and unskilled men regardless of age were especially likely not to be found in the 1890 city directory, but that the percentage "found" among the high white collar and skilled and unskilled worker classes depended crucially on family status in 1880. Table 2 thus suggests that the three independent variables interacted with each other to produce a pattern too complex to be decoded with simple percentages and linear assumptions.

But of course there are other ways to compute percentages from the raw data, and they may be more revealing. Tables 3 and 4 are calculated by first summing across M and then dividing each entry by the row marginal (total) for Table 3, or the column marginal for Table 4. In the cell entry notation introduced above, the equations are:

(2)    Table 3 entry = $f_{+jkl} / f_{+jk+}$

and

(3)    Table 4 entry = $f_{+jkl} / f_{++kl}$.

Table 3 reveals that 42 percent of males aged 14-20 in 1880 were unemployed, that among those with children, the unskilled made up a strikingly smaller percentage of those above thirty than of those under thirty years old, that the relationship between age and the percentage who were in the highest occupational class was unambiguously monotonic and positive, while that between age and the percentage in the low white collar class was negative, but weak. Table 4 demonstrates that in every class fathers tended to be middle-aged, and that in all but one class, the single and childless were most likely to be in their twenties. The modal age category for the childless unemployed was the teenage one. The percentages in these tables thus give us some sense of the interrelationships among the independent variables.

(TABLES 3 and 4 about here)

Another way to try to make sense of complex tables is to collapse them into bivariate displays, which is what historians who lack knowledge of multivariate techniques generally do. The panels in Table 5 show the six bivariate tables which can be drawn from Table 1. A convenient way to refer to them is to enclose the symbols for the variables in curled brackets. Thus the shorthand for panel A of Table 5 is {MO}, for panel B, {MA}, and so on. For anyone who has had the most elementary statistics course, the immediate reflex action upon confronting such tables is to compute Chi-Squares, and we have done so, finding such high values in each panel that every table contains a significant relationship at the conventional 0.05 level.
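Collapsing a four-way table into its six bivariate panels amounts to summing over the two variables that are left out. The sketch below, in Python with NumPy, illustrates the idea with hypothetical counts standing in for Table 1. The helper name `collapse`, the random data, and the order in which the panels are generated are assumptions of the sketch, not the authors' procedure. It also forms row percentages of the kind described for Table 3 in equation (2).

```python
from itertools import combinations

import numpy as np

names = ("M", "A", "S", "O")
rng = np.random.default_rng(1)
f = rng.integers(1, 50, size=(2, 4, 2, 5))   # hypothetical counts, not real data


def collapse(table, keep):
    """Sum over every axis not listed in `keep`, giving a marginal table."""
    drop = tuple(ax for ax in range(table.ndim) if ax not in keep)
    return table.sum(axis=drop)


# The six two-way marginal tables, labelled in curled-bracket shorthand.
panels = {
    "{" + names[a] + names[b] + "}": collapse(f, (a, b))
    for a, b in combinations(range(4), 2)
}

# Equation (2): sum across M, then divide by the row total over occupation.
f_plus = f.sum(axis=0)                                    # f_{+jkl}
table3 = 100 * f_plus / f_plus.sum(axis=2, keepdims=True)

for label, panel in panels.items():
    print(label, panel.shape)
```

Every panel contains the same total number of people as the full table; what is lost in collapsing is only the cross-classification by the omitted variables.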

(TABLE 5 about here)

In fact, the reflex in this case is quite desirable, for the Chi-Square distribution can be employed to accomplish much more sophisticated purposes than its usual cookbook use suggests. The Chi-Squares computed in Table 5 test whether the two variables in a panel are "independent." Consider panel C. If M and S were independent in a statistical sense, then the value of each cell would be purely a product of the relevant marginals. For instance, the top left cell would be equal to 790 × 1113 / 1724 = 510. The other entries in panel C would be 602, instead of 625; 280, instead of 302; and 331, instead of 309. As applied here, then, the so-called "Pearson Chi-Square" statistic enables one to compare the observed data to a criterion, that is, the Chi-Square distribution, in order to determine whether the model of independence between the two variables fits the data well or not. Its formula suggests its nature quite clearly:

(4)    Pearson Chi-Square = $\sum (f - F)^2 / F$,

where $\sum$ indicates a summation over all the values for two variables, the small f's refer to the cell entries actually observed, and the large F's, to the frequencies expected under the independence model, in this case, the values 510, 602, 280, and 331, respectively. But since independence is not the only possible model, we can substitute for these particular F's predictions generated by any model which we can specify mathematically. This is one of the keys to log-linear analysis.

A final simple but instructive permutation of Table 1 is shown in Table 6. There, we have calculated the probability of being found divided by that of not being found for each cell in Table 1. That is, instead of using Equation (1), we have calculated the entries by using the following:

(5)    Odds of being found = $f_{1jkl} / f_{2jkl}$.

The "odds", familiar from horse racing, is simply the ratio of the number found to the number not found. Now, to make heads or tails of the probabilities in Table 6, we must compare the cells to each other, and one natural way of doing so is to divide the entry in one cell by that in another, or, to put it another way, to form an "odds ratio." For example, for skilled workers between 31 and 60, the odds of being found rose from 1.10 to 2.47 if they had children in 1880, producing an odds ratio of 2.25; whereas, the analogous odds ratio for high white collar men was (3.24 / 3.50) = 0.93.

(TABLE 6 about here)

To move beyond these commonsensical operations to fully multivariate methods, we need to develop ways of specifying, estimating, and choosing between different models.5 These models, of which the independence model that forms the basis of the traditional Chi-Square test is the most familiar, will yield various estimates of
the cell entries. We can then compare different sets of predictions to the observed entries and find one or more which are both sufficiently close to the reality of the table and sufficiently parsimonious to suit our tastes. Because it is the simplest table presented thus far, let us use panel C of Table 5 to outline the techniques.

The analysis of variance, which is often used to examine cross-classification tables, suggested to statisticians that an equation containing a "grand mean effect," separate effects for each variable, and terms for the possible interactions between variables would be a good place to start. Since, as we will show, a multiplicative (but not a linear, additive) equation allows tests for the statistical independence of two or more variables, we will use an equation in multiplicative form to estimate the entries in panel C of Table 5:

(6)    $F_{ik} = \eta \, \tau^{M}_{i} \, \tau^{S}_{k} \, \tau^{MS}_{ik}$,

where the $\eta$ (eta) is a form of "grand mean effect", the $\tau^{M}_{i}$ (tau) is the effect of being found or not found, $\tau^{S}_{k}$ is the effect of familial status, and $\tau^{MS}_{ik}$ is the effect of the interaction between M and S.6 But because, as the subscripts indicate, equation (6) actually contains nine effects and we have only four cells on which to base our estimates, we have to make additional assumptions in order to be able to estimate anything.7 More formally, without additional constraints, the model is said to be "underidentified." Fortunately, the necessary assumptions are quite natural. Since we are really interested in the effect of having children, for example, and not in determining separate effects for having and not having children, we assume that:

(7)    $\tau^{M}_{1} = 1/\tau^{M}_{2}$,    $\tau^{S}_{1} = 1/\tau^{S}_{2}$,    $\tau^{MS}_{11} = \tau^{MS}_{22} = 1/\tau^{MS}_{12} = 1/\tau^{MS}_{21}$.

These assumptions reduce the number of parameters to be estimated by five, producing what is called a "saturated" model; that is, one containing as many basic parameters to be estimated as there are cells, and therefore having no "degrees of freedom." Furthermore, it is clear from equation (6) that the effect of each tau parameter may be measured as a deviation from a value of 1.00, for if any tau equals 1.00 in the multiplicative form, it has no impact on the value of the function. Note that the taus, unlike the Pearsonian correlation coefficient or such familiar coefficients for cross-classification tables as phi or Yule's Q, do not vary only between zero and one or minus one and plus one. In fact, the taus have no upper or lower bounds.

III. THE ALGEBRA OF LOG-LINEAR MODELS

Equation (6) is directly related to various odds ratios formed from a slightly altered form of panel C of Table 5 (or, more
generally, of any table of counts). Table 7 transforms panel C of Table 5 into proportions by dividing each cell entry by the table total. For instance, 488 / 1724 = 28.3%.

(TABLE 7 about here)

Since it turns out to be more convenient to do computations in logarithmic form, we will take natural logarithms of both sides of equation (6), producing

(8)    $\log p_{ik} = \log \eta + \log \tau^{M}_{i} + \log \tau^{S}_{k} + \log \tau^{MS}_{ik}$,

where $p_{ik}$ refers to an entry in Table 7.8 Each of the terms on the right hand side of equation (8) can now be expressed in terms of the p's:

(9)    $\log \eta = \frac{1}{4} (\log p_{11} + \log p_{12} + \log p_{21} + \log p_{22})$,

(10)    $\log \tau^{M}_{1} = \frac{1}{4} (\log p_{11} + \log p_{12} - \log p_{21} - \log p_{22})$,

(11)    $\log \tau^{S}_{1} = \frac{1}{4} (\log p_{11} - \log p_{12} + \log p_{21} - \log p_{22})$,

(12)    $\log \tau^{MS}_{11} = \frac{1}{4} (\log p_{11} - \log p_{12} - \log p_{21} + \log p_{22})$.

Because of the constraints imposed in (7) above, the values of the tau parameters for other "levels" (i.e., categories) of the associated variable can be immediately derived, since, for instance (from equation 7),

(13)    $\tau^{S}_{1} = 1/\tau^{S}_{2}$.

If we take natural logs, we have

(14)    $\log \tau^{S}_{1} = -\log \tau^{S}_{2}$,

and all we have to do to get $\log \tau^{S}_{2}$ is to change the signs of the p's in equation (11).

Notice also that

(15)    $4 \log \tau^{M}_{1} = \log p_{11} + \log p_{12} - \log p_{21} - \log p_{22} = \log \left[ \left( \frac{p_{11}}{p_{21}} \right) \left( \frac{p_{12}}{p_{22}} \right) \right]$.

The quantity to the right of the last equals sign is the logarithm of the product of two odds ratios, each of which consists of the proportion of males found divided by the proportion not found, the first fraction in parentheses being for adult males without children, the second with children.9 Since both fractions measure part of the "effect" of being found versus not being found for this particular
table, the interpretation of $\log \tau^{M}_{1}$ as the "effect" of variable M is quite natural.

Furthermore, the statement that the odds ratio in equation (12), $p_{11} p_{22} / (p_{12} p_{21})$, equals unity, in which case its log is zero and the value of $\log \tau^{MS}_{11}$ is also zero, corresponds to the definition of the statistical independence between the two variables. To see this, note that if any two variables i and j are independent,

(16)    $p_{ij} = p_{i+} \, p_{+j}$.

Substituting these values into the odds ratio in (12), we have

(17)    $\frac{p_{11} \, p_{22}}{p_{12} \, p_{21}} = \frac{p_{1+} \, p_{+1} \, p_{2+} \, p_{+2}}{p_{1+} \, p_{+2} \, p_{2+} \, p_{+1}} = 1$,

and since each of the summed terms cancels out algebraically, the odds ratio should be equal to one when the two variables are independent.10

It is also easy to show that if we use equations (9) through (12) to predict cell proportions, the model predictions in a saturated model exactly equal the observed proportions in the original table. From equations (9) through (12), we know that

(18)    $\log p_{11} = \log \eta + \log \tau^{M}_{1} + \log \tau^{S}_{1} + \log \tau^{MS}_{11} = \frac{1}{4} [ (\log p_{11} + \log p_{12} + \log p_{21} + \log p_{22}) + (\log p_{11} + \log p_{12} - \log p_{21} - \log p_{22}) + (\log p_{11} - \log p_{12} + \log p_{21} - \log p_{22}) + (\log p_{11} - \log p_{12} - \log p_{21} + \log p_{22}) ]$.

Rearranging terms and taking antilogs, we have

(19)    $p_{11} = \left[ (p_{11} \, p_{12} \, p_{21} \, p_{22}) \cdot \frac{p_{11} \, p_{12}}{p_{21} \, p_{22}} \cdot \frac{p_{11} \, p_{21}}{p_{12} \, p_{22}} \cdot \frac{p_{11} \, p_{22}}{p_{12} \, p_{21}} \right]^{1/4}$.

Cancelling terms algebraically on the right hand side, we obtain

(20)    $p_{11} = (p_{11}^{4})^{1/4} = p_{11}$.

In the same manner, the reader may satisfy himself that the predicted or right hand side proportions $p_{12}$, $p_{21}$, and $p_{22}$ are precisely equal to the observed or left hand side proportions. And when the two sets of proportions (and therefore the corresponding cell frequencies) are equal, the Pearson Chi-Square statistic given in equation (4) is zero, for then

(21)    $\sum (f - F)^2 / F = 0$.

A Chi-Square statistic can therefore be used as a test of the independence between two variables in a log-linear model.11 Moreover, if we want to test an "unsaturated" model -- that is, one containing fewer parameters than cells, such as equation (8) with the last term on the right hand side deleted -- we can simply set the term or terms equal to zero in the logarithmic form or one in the multiplicative form, and use the remaining log odds ratios from equations (9) to (11) to estimate the cell proportions. To determine how well the new model fits the original observations, we can compute
a Chi-Square statistic with the resulting predicted cell proportions as our F's.

When there are more than two categories for a variable, and/or when there are more than two variables, the model estimation procedure becomes more complex. Many estimates which have the statistically desirable property of being "maximum likelihood" cannot be computed from "closed form" expressions. That is to say, while one can always write out such simple expressions as equations (9) to (12), in some cases, the resulting estimates will not be the "best" which can be obtained. In these cases, fortunately, one can use either of two algorithms, which are called "iterative proportional fitting" or the "Newton-Raphson" procedure, to approximate the F's. Because the principles involved in generating and interpreting the models remain basically the same for larger tables as for 2 × 2 tables, and for numerically approximated as for closed form estimates, and because computers can so quickly and accurately run through the algorithms that a data analyst need not really understand those parts of the routines in order to interpret the output, we will not prolong the present discussion by explicating these matters.12

IV. HIERARCHY, CONDITIONAL INDEPENDENCE, AND OTHER MODELS

Log-linear models of the variety with which we are dealing are constructed according to the "hierarchy principle." Consider an equation for a saturated model similar to equation (8), but involving three, rather than two variables. For the sake of simplicity, we will hereafter refer to the log of tau terms as $\lambda$'s (lambdas), drop the subscripts, and temporarily assume that each of the variables is divided into only two classes. Instead of four basic terms, the equation for the three-variable model has eight:

(22)    $\log p_{ikl} = \log \eta + \lambda^{M} + \lambda^{S} + \lambda^{O} + \lambda^{MS} + \lambda^{MO} + \lambda^{SO} + \lambda^{MOS}$.

The hierarchy principle dictates that one cannot consider a model involving only "higher-order" terms without including the corresponding lower-order terms. For example, the following models are forbidden:

(23)    $\log p_{ikl} = \lambda^{MOS}$    (leaves out seven lower-order terms)

(24)    [equation not legible in the source]

(25)    $\log p_{ikl} = \log \eta + \lambda^{M} + \lambda^{S} + \lambda^{SO}$    (leaves out $\lambda^{O}$).

It is, however, perfectly proper to propose

(26)    $\log p_{ikl} = \log \eta + \lambda^{M} + \lambda^{S} + \lambda^{MS}$,

because all the variables involved in the second-order term $\lambda^{MS}$ are represented in first-order terms in the equation.
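The algebra of section III can be checked numerically on panel C of Table 5. The sketch below, in Python with NumPy, uses the four observed counts quoted in the text (488, 625, 302, 309); everything else, including the variable names and the use of a sign vector to encode the constraints in (7), is an illustrative assumption rather than the authors' own computation. It first forms the expected counts under independence and the Pearson Chi-Square of equation (4), then computes the saturated-model parameters from equations (9) through (12) and confirms that they reproduce the observed proportions.

```python
import numpy as np

# Observed counts for panel C as quoted in the text:
# rows are M (found, not found), columns are S (childless, with children).
f = np.array([[488.0, 625.0],
              [302.0, 309.0]])
n = f.sum()

# Expected counts under independence: product of marginals over the total.
F = np.outer(f.sum(axis=1), f.sum(axis=0)) / n
chi_square = ((f - F) ** 2 / F).sum()            # equation (4)

# Saturated log-linear model on the proportions, equations (9)-(12).
logp = np.log(f / n)
log_eta = (logp[0, 0] + logp[0, 1] + logp[1, 0] + logp[1, 1]) / 4
log_tau_M = (logp[0, 0] + logp[0, 1] - logp[1, 0] - logp[1, 1]) / 4
log_tau_S = (logp[0, 0] - logp[0, 1] + logp[1, 0] - logp[1, 1]) / 4
log_tau_MS = (logp[0, 0] - logp[0, 1] - logp[1, 0] + logp[1, 1]) / 4

# Constraints (7): a level-2 effect is the level-1 effect with its sign
# flipped in log form, so `signs` holds +1 for level 1 and -1 for level 2.
signs = np.array([1.0, -1.0])
log_p_hat = (log_eta
             + log_tau_M * signs[:, None]
             + log_tau_S * signs[None, :]
             + log_tau_MS * np.outer(signs, signs))
p_hat = np.exp(log_p_hat)                        # equation (8), all four cells

print(np.round(F, 1))                 # expected counts under independence
print(float(chi_square))              # Pearson Chi-Square for panel C
print(np.allclose(p_hat, f / n))      # saturated model reproduces the table
```

Setting `log_tau_MS` to zero before rebuilding `log_p_hat` gives the unsaturated model described at the end of section III, whose predictions can then be compared with the observed table through the same Chi-Square formula.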

In social scientific applications, the hierarchy principle usually seems quite natural, for it is difficult to see what sense one would make of a model based on the claim that, say, the interactions between M, O, and S were significant, but that those between M and O, or O and S were not. Possibly there are some situations in which the idea that one variable is a social catalyst is more than a metaphor, but in most instances for which we have enough data to require multivariate methods, the principle of hierarchy will not preclude the testing of any model of interest.

In any case, the hierarchy principle enables us to simplify notation. Hereafter, instead of using equations, we will refer to models using the curled bracket or "fitted marginals" mode of expression already introduced, but we will adopt the shorthand method of explicitly noting only the highest order terms we want to put in the model, since by the hierarchy principle, all lower order terms involving those variables must be included.13 For example, {MS} will mean the same thing as equation (26), and {MOS} will be equivalent to equation (22).

Log-linear models can be constructed so as to give mathematical form to other notions than independence. In "square" tables, that is, tables containing two variables which have the same number of categories, one can test for "symmetry," "quasi-symmetry," and a variety of other interesting concepts, several of which have been applied by sociologists to the study of occupational mobility.14 If there are "structural zeroes" in a table of any size or shape -- for example, if one were trying to determine the importance of legal immigration compared to that of the fertility of the American-born in the growth of various ethnic groups in America, there would be periods when the immigration of natives of some Asian countries or Africa was prohibited by law -- one can estimate models of "quasi-independence" which exclude the bothersome cells.

An even more basic idea, useful in analyzing any table with more than two variables, is that of "conditional independence." Put most simply, conditional independence means that if we control for one or more variables, the apparent relationship between two or more other variables disappears, as illustrated for a hypothetical three variable case in Table 8. If M and S were in fact independent when we took A into account, then we could drop the terms involving the interaction of M and S from the model and still get a good fit between the estimated and observed frequencies.15 The more general point here is that historians, who nearly always have a rich sense of the interactions between, for example, different social groups in specific historical contexts, may be able to formalize and test a variety of original models using log-linear methods. The flexibility of the technique and the multiplication in the number of possible models as the number of variables grows -- for example, there are 113 different ones in any analysis of the relationships between four variables -- frees historians to exercise their imagination, rather than being constrained, as they often are with such techniques as multiple regression or factor analysis, to confine themselves to testing
concepts laid down by statisticians who had other purposes in mind.16

(TABLE 8 about here)

In addition to the flexibility they offer, log-linear techniques have more desirable statistical properties than such techniques as regression with dummy variables or multiple classification analysis (MCA), and even than weighted least-squares. As introductory econometrics texts now conventionally warn us, a regression on either continuous or discrete variables of dependent variables which take on only two (or a small number of) values violates the assumption of "homoskedasticity" or equal variances of the errors. Although the resultant estimates are unbiased, the estimated errors do not have the least possible variance, and the usual significance tests should not be used. Since the log-linear model has no such problems, and its associated significance tests are accurate, it is to be preferred over MCA.17

The standard solution to heteroskedasticity, weighted least squares -- known as GSK or Grizzle-Starmer-Koch techniques in the case of discrete variables -- eludes this difficulty, but not another serious one. Unlike log-linear estimates, weighted least squares estimates of probabilities are not constrained to lie between zero and one. Thus, it is possible to obtain estimates of, say, the probability of being found for men who have certain traits which are greater than one or less than zero. When all the grouped observations are clustered between about 20% and 80% on the dependent variable, GSK and log-linear techniques yield quite similar results. But since the log-linear estimates are always at least as good as those from weighted regression and in the cases of many extreme values, the log-linear predictions are better, we see no reason to employ weighted linear least squares at all.18

One final concept should be clarified before we analyze the Boston mobility data -- the difference between the particular genre of log-linear methods which we deal with in this paper and "logit" analysis. To put it simply, logit analysis (which can also be applied where the variables are measured on interval scales) designates one variable as dependent, while what we have been referring to as log-linear analysis treats all variables as jointly dependent.19 In logit, instead of estimating $\log(p_{ikl})$ we estimate $\log(p_{1kl} / p_{2kl})$, which is the log of the ratio between, say, the proportion found and the proportion not found. The relationship between the two models is obviously very close, the logit coefficients are twice those of the relevant lambdas, and the results using the two methods are generally quite similar.20 Choosing one over the other is chiefly a matter of habit or taste. We decided to concentrate on the "log-linear," rather than the logit model here because the former directs attention to the relationships between the independent variables, which are too easily overlooked in a multiple regression or logit approach. In the Boston data set, these relationships seem particularly important. Since we are primarily concerned with explaining geographic mobility, however,
we will somewhat loosely refer to M from time to time as the "dependent" variable and the others as "independent."

V. CHOOSING AMONG LOG-LINEAR HYPOTHESES

After all the preparation, we are ready to return to our substantive example. Using Fay and Goodman's ECTA (Everyman's Contingency Table Analysis) program, we examined the relationships between the variables in Table 1. Available directly from Goodman, ECTA is simple and inexpensive to use.²¹ Basically, the analyst provides a table and format information about it, uses the fitted marginals notation to specify models to be estimated, indicates his desired level of closeness of fit for models which have to be approximated, and chooses which statistics and tables he wants printed. All these commands may be stated in as few as four lines, not including the table.²² Other, similar programs are available from other sources.

Interpretation of ECTA's output should begin with the standardized lambdas estimated from the saturated model, some of which are displayed for the Boston data in Table 9. Standardized lambdas are simply the lambdas of equation (22), defined for the four-variable case, divided by their estimated standard deviations, and are available as an option of ECTA.²³ Since for large samples the standardized lambdas are distributed approximately as standard normal variables, any absolute (that is, either positive or negative) value over about 1.64 is statistically significantly different from zero at the 0.05 level. Absolute values below 1.64 indicate weaker relationships.

(TABLE 9 about here)

The standardized lambdas are of use in deciding which of the many models to test and in determining whether certain categories of particular variables may be consolidated. Of the 119 lambda effects and the one eta effect calculated, we present only the eta and the 31 lambdas which had standardized values of 1.64 or above.²⁴ The most striking fact about the table (which was of course implied by the very large Chi-Squares for panels D through E of Table 5) is the strong interactions among the independent variables A, S, and O. Few men under thirty were fathers yet; few men over thirty weren't (see rows 11-14 of Table 9). Men in high white collar jobs tended to be middle-aged, while the unemployed tended to be teenagers (see rows 17-29). The single-variable effects (rows 2-9) simply reflect the unequal number of persons in each age and occupational bracket, and the fact that more people were "found" than were not. The dearth of significant three- and four-variable lambdas (3 out of 80) is also important, for it indicates that a fairly simple model containing few terms of very high orders will probably suffice.

Finally, and regrettably, it must be noted that only one of
the 59 interaction terms involving M, that measuring the relation between age and being found, was significant. In other words, when all the interactions between variables are taken into account, none of the independent variables, nor any combination of them, predicted very well whether an 1880 Boston resident would be found in the area in 1890.²⁵ It should also be noted that, although Goodman has developed a measure for log-linear analysis which somewhat resembles $R^2$ for regression, there is no index for logit or log-linear analysis which has nearly so appealing an intuitive interpretation as $R^2$ does for regression or as any of the Goodman-Kruskal "proportionate error reduction" statistics do for simple cross-classification tables.²⁶

The principal method for assessing models involving different sets of independent variables is to compute Chi-Square values comparing the actual cell entries, such as those in Table 1, to entries estimated by using a given model.²⁷ For this purpose, the "Likelihood Ratio Chi-Square statistic," given by

$$X^2_L = 2 \sum_{\text{cells}} (\text{actual entry}) \, \log\!\left(\frac{\text{actual entry}}{\text{estimated entry}}\right) \qquad (27)$$

is more useful than the "Pearson Chi-Square" statistic defined in equation (4) above, because $X^2_L$ always gives at least as low an estimate as $X^2_P$ does, and because $X^2_L$, but not $X^2_P$, can be "partitioned," in a sense which will be made clear below.²⁸ We will therefore use $X^2_L$ hereafter.

Table 10 presents a series of $X^2_L$ values for a large subset (36 of 113) of the possible log-linear models applied to Table 1. The models, all hierarchical, are identified in the fitted marginals notation. (Substantive discussion of each model will be put off until section VI.) Whereas an analyst using a Chi-Square test in the traditional manner wishes to find a high value of $X^2$, since that indicates that the "null model" of no relationship may be rejected, here we wish to find low values of $X^2$, because such values indicate that the postulated model yields estimates close to those in the table of actual cell values.

(TABLE 10 about here)

Note first that Model 1, which contains only the grand mean or eta term and which therefore expresses the notion that all cells have an equal proportion of the total number of individuals, fits extremely poorly. By contrast, Model 18, the saturated model, fits perfectly. This will always be the case. Why not, then, stop with this complete, and, in a sense, perfect model, accept the view that everything affects everything else, and be done with it? The reason is that in testing log-linear models, we must simultaneously strive for parsimony and completeness. Indeed, for historians, who have a professional predilection for total or "kitchen sink" explanations -- better to include every influence, however small, on some outcome than later to
be confronted with the criticism that one neglected some factor which [...] this emphasis on parsimony of explanation is one of the chief heuristic virtues of using multivariate statistical methods. Beyond heuristics, log-linear analysis and other multivariate procedures provide statistical criteria to assist us in deciding just where to compromise between parsimony and completeness.

None of the models numbered from 1 to 17, which represent some of the possible poorly-fitting lower-order hypotheses, comes close to any reasonable significance level. Included chiefly for illustrative purposes, they may be largely disregarded.²⁹ But beginning with Model 18, all the subsequent models fit the data adequately at the 0.05 level of significance. How can one choose between them? There are three ways. First, one or more models might encapsulate more coherent theories, or ones more consistent with basic assumptions or with previous studies than other models do. Yet since several, perhaps all such models in any particular instance might plausibly be related to some theory or theories, this criterion may not be of much use. Second, one may adopt the convention that one will prefer one model to another if it either has a lower Chi-Square and the same number of degrees of freedom (e.g., Model 29 would be preferred to Models 26 and 27) or a lower Chi-Square and more degrees of freedom (e.g., Model 28 would win out over Models 26 and 27).

The third, and, we think, the best method, decomposing Chi-Square, allows us to choose between certain other models as well. In particular, it enables comparisons of "nested" models, that is, those which contain the same lower-level terms, but differ by one or more terms at the same or higher levels. This procedure also allows an assessment of the importance of the linkages between specific variables. An example will clarify the notion. Model 23 contains the same terms as Model 22, and {SO}, the relation between family and occupational status, as well. Model 22 is therefore said to be "nested" within Model 23, or to be a subset of Model 23, and the importance of the term {SO} may be gauged by taking the difference in the Chi-Squares for the two models and determining whether that difference, which is also distributed as Chi-Square, is significant. The appropriate number of degrees of freedom for the test is the difference in the degrees of freedom for the two models. In this case, the difference in Chi-Square is 16.95 (64.54 - 47.59 = 16.95) and the difference in degrees of freedom is 4 (48 - 44 = 4). A table of the Chi-Square distribution will show that this is highly significant at the 0.05 level. Model 23 is therefore to be preferred over Model 22, and by this test, at least, the linkage between family status and occupation is judged important.

Table 11 gives the results of a series of similar tests and demonstrates how one chooses between models which are generally acceptable. Test number 1, for example, compares Models 19 and 18 from Table 10. The fact that the difference between the Chi-Squares for the two models is not significant with 12 degrees of freedom means that the models yield about equally good predictions of the internal cell entries. We therefore choose Model 19 over Model 18 for reasons
of parsimony and conclude that the four-variable interaction term is not a necessary part of a satisfactory explanation. Similarly, in the tests numbered 2-9 we reject as unnecessarily complex Models 19, 20, 21, 26, and 27. For tests 10-18 we cannot reject the model containing more terms, since the differences between the Chi-Squares are all significant at the 0.05 level for the appropriate degrees of freedom, but these tests allow us to reject Models 20 and 22 through 26. (Duplication of tests for some models is not strictly necessary, but such tests guard against arithmetic and transcription errors and increase our confidence in the stability of the results.) Panel A of Table 11 thus leaves three models, none of which is nested in either of the others or has the same number of degrees of freedom as the others, still unrejected.³⁰

(TABLE 11 about here)

In Panel B of Table 11, we first compare the models which survived the Panel A comparisons with the models nested in them numbered 31, 33, and 34. Since each of the pairs of models in tests 19-22 is statistically indistinguishable at the 0.05 level, we reject the models containing more higher-order terms and fewer degrees of freedom. Second, we compare models 31 and 32 with their nestmates 34 and 33, and reject 31 and 32.³¹ Third, we compare 33 and 34, again find them statistically similar, and therefore choose the model containing fewer terms, that is, 34. Finally, we present three ways of simplifying model 34, find that all of them fit the data significantly worse than 34, and conclude that 34 is the best we can do.³²

Table 11 also facilitates the assessment of particular interactions, but a set of tests may lead to ambiguous results. For instance, {MAS} is evaluated by Test 9 and found unnecessary, but by Test 12, the same term comes out to be significant. Likewise, {MSO} is rated unimportant in Tests 2 and 6, but important in Test 14. Tests for some of the interactions, fortunately, are less equivocal. {SO}, {MO}, {MA}, and particularly {AS} and {AO} are undoubtedly crucial parts of each model.

Another way of ascertaining the importance of various terms and of determining the extent of superiority of one model over another is to divide the differences in Chi-Square between nested models, such as those given in Table 11, by the Chi-Square in the simpler of the two models. Thus, the second line in Table 12 shows that when we move from Model 25 to Model 29, which amounts to adding the term {SO}, we reduce the Chi-Square by 40.9% (16.96 / 41.44 = 0.409). Although such percentages may be considered as counterparts of the coefficients of "multiple determination" and "partial correlation" in multiple correlation and regression analysis, they are not really directly analogous, for they cannot be interpreted as measuring reductions in the percentage of variance explained. Those percentages do measure the increase in one's ability to reproduce the original cell entries, but those entries can always be predicted exactly by
(saturated) models which may or may not capture causal relationships between the independent and dependent variables at all. We therefore prefer to refer to these measures simply as percentage reductions in Chi-Square.

Table 12 demonstrates that the {AO} and {AS} interactions are particularly important, by this measure, while those of {SO}, {MO}, {MA}, and each of the three-variable interactions are of less consequence, and that including terms such as {ASO} or {MS} reduces the Chi-Square hardly at all. Depending on which models are compared to each other, the assessments of the importance of particular terms may differ, but these differences are usually small. For instance, including {SO} reduces the Chi-Square by 40.9% by Test 11, but by only 26.3% by Test 13. As this example shows, it is possible to rank the interactions somewhat differently in terms of the percentage reduction in Chi-Square, depending on which model comparison is used. According to Test 11, {SO} reduces the Chi-Square by more than {MAS} does by Test 12, but by Test 13, {SO} reduces Chi-Square by less than {MAS} does by Test 12. By Test 9, {MAS} reduces Chi-Square less than {SO} does in either Tests 11 or 12. This observation provides another reason not to rely heavily on the percentage reduction in Chi-Square in evaluating models.

(TABLE 12 about here)

VI. SUBSTANTIVE CONCLUSIONS

If the proof of the methodological pudding is in the substantive pie, where do all these numbers get us? In The Other Bostonians, Thernstrom found that men in the higher occupational groups were more likely to persist from 1880 to 1890 in Boston than were men on the lower rungs of the ladder. (He did control for age, to some extent, by presenting statistics in the relevant table only for men between 19 and 40 years old in 1880.)³³ He then speculated that nineteenth century America contained ". . . a permanent floating proletariat made up of men ever on the move spatially but rarely winning economic gains as a result of spatial mobility" and suggested that this transiency made it difficult to mobilize the urban masses socially and politically and facilitated social control by the prosperous.³⁴ These were among the most striking insights in Thernstrom's stimulating and influential book.

The multivariate analysis which we have presented suggests a more complex picture with somewhat different implications than Thernstrom drew.³⁵ Rather than a floating proletariat, our analysis suggests that the outstanding features of the landscape were youthful mobility, comparatively settled middle age, and the accumulation of human and physical capital over the life cycle. Rather than an irrationally or at least unsuccessfully gyrating whirlpool of movement, our models are consistent with -- though they do not, of course, prove that there were -- patterned searches for opportunity by
rational individuals.

Suppose job chances in each occupation differed somewhat from place to place, and that one were trying to decide whether to move or stay put. Consider two men who could, by moving to a particular city, each raise their salaries by the same amount, whose costs of moving were fairly substantial and roughly equal, but who were of different ages. Then the younger of the two would be more likely to move, for he could expect to realize the higher salary for a longer time (such things as age-specific health being assumed equal between the two men). With luck, the younger man would much more than make up for the costs of relocating, while a much older man might barely cover those costs before disability or death overtook him.

To make the example a bit more realistic, suppose that we do not assume that each man has perfect information about the present discounted value of the wages in the two places. Then if they were each about equally certain of the geographical wage differential, the younger man would still be more likely to move than the older. For if they guessed either too low or correctly on the wage differential, and both moved, the younger man would enjoy the higher wage in the new area for a longer time; if they overestimated the wage differential, and both moved, the younger man, by moving again, could rectify his mistake, while the older man, pushing into the age when health and other concerns make starting out once more increasingly difficult, would be more likely to be stuck with his bad decision. In short, the young can better afford to take risks because they have more to gain.³⁶ They also, on average, have less to lose, both in economic and in social terms. Less likely to own homes and other fixed property, they have fewer transactions costs to bear if they vacate. Likely to have invested less heavily in building up goodwill with employers, employees, or customers, they can move on without discarding so much of this "capital." Since they have had fewer years to make friendships, and, especially if they are single, have a lower probability of belonging to a family unit which has large numbers of friends and relatives in an area, there are also fewer social ties to keep young men in a particular place. Although they were in general less able to make independent economic decisions than men in the nineteenth century, the same considerations would, naturally, apply to women.

A person's stage in the life cycle, furthermore, can be expected to affect his social, as well as his geographical mobility. Young people almost invariably make some investment in human and/or physical capital. Later in life, they may enjoy the returns from their earlier investments in the form of immediate measurable social mobility by buying shops, or by moving from unskilled to skilled or low white collar to high white collar jobs, as well as in the form of consumption by purchasing homes or intergenerational social mobility by increasing their children's life chances. In cross-sectional data, therefore, we should expect to find disproportionate numbers of youths in lower occupational strata and disproportionate numbers of the middle-aged in higher strata.
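The age argument in the two-mover example can be made concrete with a small present-value calculation. The sketch below is only an illustration under assumed figures: the yearly wage gain, the moving cost, the discount rate, and the last working age are hypothetical numbers chosen for the example, not estimates drawn from the Boston data, and the function name `net_gain_from_moving` is our own.

```python
def net_gain_from_moving(age, wage_gain=100.0, moving_cost=600.0,
                         discount_rate=0.05, last_working_age=60):
    """Present discounted value of a yearly wage gain, net of a one-time moving cost.

    Every default value here is hypothetical and serves only to illustrate
    why an identical wage gain is worth more to a younger mover.
    """
    years_left = max(last_working_age - age, 0)
    # Discount each future year's wage gain back to the year of the move.
    present_value = sum(wage_gain / (1.0 + discount_rate) ** t
                        for t in range(1, years_left + 1))
    return present_value - moving_cost


if __name__ == "__main__":
    # Same wage gain and same moving cost; only age differs.
    for age in (25, 40, 55):
        print(age, round(net_gain_from_moving(age), 2))
```

Under these assumed figures the net gain is positive for the 25-year-old and negative for the 55-year-old, which is the pattern the verbal argument predicts. Different assumptions about the discount rate or the working span would shift the break-even age but not the direction of the comparison.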

These theoretical considerations suggested an analysis of geographic mobility that ought to include age and family status among the independent variables.³⁷ Our multivariate analyses generally diminish -- but do not entirely eliminate -- occupational class differentials in persistence, and they underline the importance of age. The preferred model, that numbered 34 in Table 10, includes direct links between age and persistence, as well as between occupation and persistence, and Tables 11 and 12 especially highlight the interactions between age and occupation and between age and marital status. No model of persistence in Boston which is at all satisfactory, at least among those containing the independent variables which we examined, can disregard age as a direct influence, nor can it neglect the interrelationships between age and other independent variables. For example, Model 7 in Table 10, which contains only {MS}, {MO}, {MA}, and lower-order terms, shows a Chi-Square of 1832, which is an extremely poor fit. And Model 8, which includes all the interactions between mobility, status, and occupation, but excludes the direct and indirect effects of age on mobility, also does very badly.

The lambda or effects coefficients for Model 34 from Table 10, displayed in Table 13, further demonstrate these points. Unlike regression coefficients, which measure the impact on a dependent variable of a given change in an independent variable, the lambdas have no simple intuitive interpretation. The essential reason for this is that the relationships in log-linear (or logit or probit) analysis are assumed to be nonlinear and conditional, as opposed to the linear and unconditional effects of the linear regression model. For instance, change due to age in the probability of being found in 1890 varies for each age category, and, within age groups, for each occupational class and family condition. Thus, the effect on mobility of being in one's twenties in 1880 differs for the unskilled and high white collar workers, and within the unskilled category, for those with and without children, and each such effect differs for each age group. The lambdas in essence "average out" all the conditional effects for each category of each variable. Rather than transform them and increase both the number of coefficients, and, perhaps, the reader's confusion, we concentrate on the lambdas alone, considering them simply as measures, without any precise natural meaning, of the relevant effects.

(TABLE 13 about here)

The single-variable or "main" effects in Panel A of Table 13 show only that there were, for instance, fewer people in the 14-20 age group than in other age categories, fewer high than low white collar men in 1880, etc. Included for reasons of completeness only, they may be largely disregarded.

While none of the lambdas in Panel B is significant at the 0.05 level, two of the effects for age -- but none of those for occupation -- are significant at the 0.10 level. Men in their
twenties were about as likely to move as to stay in Boston during the 1880s, if their other traits are statistically controlled, while those between thirty and sixty were much more likely to stay than to move, even taking into account their different class and familial situations. The old apparently died.³⁸ Professionals in 1880 were somewhat more likely than low white collar workers to be found in the Hub City ten years later, and men in blue collar occupations in 1880 were even less likely than clerks were to appear in the 1890 city directory. The effects of occupation on mobility, however, are in each case smaller than their associated standard errors, and none even comes close to statistical significance. Controlling for age and family status, therefore, almost entirely "washes out" the relationship which Thernstrom stressed between occupation and persistence. Furthermore, if death rates were higher for lower class than for upper class men among the older group, and if directory enumerators were more prone to skip lower class than upper class men (because poor neighborhoods were more dangerous, because businessmen often advertised in directories, or simply because it was probably harder to locate every individual in a densely populated tenement than along a tree-lined street of single-family houses), then the "true" effects of occupational class on persistence might vanish entirely.

Panel C displays the relationships among the age, family, and occupation variables. The coefficients for {AS} and {SO} show, not surprisingly, that few of the young or the unemployed had children yet, while those over 30 were generally fathers. It is the twenty {AO} coefficients, one for each combination of age and occupation, which are of most interest. Nearly half of them are statistically significant at the 0.05 level. High white collar men were very likely to be middle-aged or older, while low white collar men tended to be below the age of 30. Skilled workers were overwhelmingly middle-aged, rather than either below 20 or over 60, while the numbers in the unskilled group varied monotonically and inversely with age. The unemployed were usually either teenaged or elderly. All these cross-sectional relationships suggest that many Bostonians who began in lower strata could expect to move up an occupational notch as their human and physical capital matured, and they reinforce Thernstrom's more general picture of the late 19th century American labor market as one of limited but real opportunities.³⁹

For although he found no impermeable division between classes, Thernstrom did, in effect, describe late 19th century society as separated into two basic sorts of men. On the one side were striving workers, clerks, and professionals, who with a little luck and perseverance could reasonably expect to climb at least a small way up the social ladder or to increase their initial wealth somewhat. On the other side were the evanescent wandering workers, disappearing from the sample, assumed never to find a settled place in society, unions, or politics -- a conjectured reserve labor army on the move. By taking age into account in a multivariate analysis of spatial, and, to a very limited extent, of social mobility, we have largely eliminated the necessity for postulating the existence of that second
class, at least to the extent that it emerged out of the Boston data. Of course there were many mobile Americans, some undoubtedly never found a comfortable niche, and disproportionate numbers of them were probably relatively unskilled and poor. But if the class differences in geographic mobility were principally a product of age differences, if age also correlated strongly with a man's place in the occupational strata in 1880 -- both of which we have tried to show -- and if many workers, both blue and white collar, progressed upward during their working lives, which Thernstrom showed, then Thernstrom's second class merges with his first, both apparently engaged in rational searches for job opportunities and a good many enjoying some success at it.

Many of these conclusions would have been missed -- in fact, were missed -- by historians who ignored the interrelationships between independent variables and who were content to use the available data merely to describe bivariate relationships instead of combining appropriate theory with multivariate techniques, such as the log-linear ones, to build and test more comprehensive explanations. Having rendered the technique more accessible than before to historians, we invite them to use it to devise more sophisticated approaches to this and other similar problems in social history.⁴⁰

FOOTNOTES

1. Since Stephan Thernstrom launched this area of study with his Poverty and Progress: Social Mobility in a Nineteenth Century City (Cambridge, MA: Harvard University Press, 1964), the bibliography has become much too long to list here. Some of the leading recent works include Thernstrom's The Other Bostonians: Poverty and Progress in the American Metropolis, 1880-1970 (Cambridge, MA: Harvard University Press, 1973); Michael B. Katz, The People of Hamilton: Family and Class in a Mid-Nineteenth Century City (Cambridge, MA: Harvard University Press, 1975); Clyde and Sally Griffen, Natives and Newcomers: The Ordering of Opportunity in Mid-Nineteenth Century Poughkeepsie (Cambridge, MA: Harvard University Press, 1977).

2. If the example which Thernstrom set by depositing his data at the Historical Data Archives of the Inter-University Consortium for Political and Social Research at the University of Michigan were as widely followed as his path-breaking studies of historical mobility were, the profession would benefit greatly. As we hope to demonstrate, secondary data analyses, too seldom performed by historians, may uncover new facets of the data. We obtained Data Set ICPSR #7550 from the Consortium. Naturally, neither Thernstrom nor the ICPSR bears any responsibility for the analyses we performed.

3. The occupational classifications are Thernstrom's. Age, of course, could be treated as an interval level variable. We cut it into categories only in order to illustrate this particular form of log-linear analysis. The number of cases in the sample was cut from 3362 to 1724 by our decision to exclude the 35 Negroes, white men for whom any data was missing, and most importantly, to exclude all males under 14 years of age in 1880. 84% of our exclusions (1362 of the 1628) were due to age. One indication that eliminating cases for which there was missing data did not seriously distort our findings was that the proportion "found" in the smaller sample differed from that in the larger sample by less than one percent. The age and status variables were collapsed into four and two categories, respectively, to simplify the presentation. Log-linear runs on a 100-cell table with status broken into three categories and age, into five, produced results very similar to those presented below.

4. On this point, see Richard J. Jensen, "Found: Fifty Million Missing Americans," paper delivered at the Social Science History Association Convention, November 8, 1980.

5. We make no claim to originality in our discussion of log-linear analysis. We have merely combined the discussions of other scholars in a way which clarifies the subject, at least to us. We have chiefly relied upon Yvonne M. Bishop, Stephen E. Fienberg, and Paul W. Holland, Discrete Multivariate Analysis: Theory and Practice (Cambridge, MA: The MIT Press, 1975); Stephen E. Fienberg, The Analysis of Cross-Classified Categorical Data (Cambridge, MA, and London: The MIT Press, 1977); Leo A. Goodman, Analyzing Qualitative/Categorical Data: Log-Linear Models and Latent Structure Analysis (Cambridge, MA: Abt Books, 1978); H. T. Reynolds, The Analysis of Cross-Classifications (New York: The Free Press, 1977); and David Knoke and Peter J. Burke, Log-Linear Models (Beverly Hills, CA: Sage Publications, Inc., 1980). We shall hereafter cite these books at only a few points, but we acknowledge our general dependence on them.

6. Note that the superscripts do not mean that, e.g., $\tau^{M}_{i}$ is $\tau_i$ raised to the $M$th power. We show below that in the multiplicative form, a statement that $\tau^{MS}_{ik} = 1$ is equivalent to saying that the two variables are statistically independent. For a proof that that is not the case for a linear, additive form, such as $F_{ik} = \eta + \tau^{S}_{k} + \tau^{MS}_{ik}$, see Bishop, Fienberg, and Holland, Discrete Multivariate Analysis, 23-24. There is a good brief critique of one such technique, AID, in ibid., 360.

7. For example, $\tau^{M}_{i}$ can be divided into $\tau^{M}_{1}$, the effect of being found, and $\tau^{M}_{2}$, the effect of not being found. The total number of effects is therefore one for the eta, two for the $\tau^{M}$, two for
the $\tau^{S}$, and four for the $\tau^{MS}$ (i.e., $\tau^{MS}_{11}$, $\tau^{MS}_{12}$, $\tau^{MS}_{21}$, and $\tau^{MS}_{22}$), for a total of nine.

8. A change from $F_{ik}$ to $p_{ik}$ does not change the taus, but does require a redefinition of the eta. (Note that $F_{ik} = N p_{ik}$, where N is the number of observations.) For details on the redefinition, see Bishop, Fienberg, and Holland, Discrete Multivariate Analysis, 19.

9. The three parallel lines indicate that the quantities are equal by definition.

10. Note that if we make the same substitutions into equations (10) or (11), the resulting quantities do not equal unity. Try it.

11. All equations (18) through (20) really do is illustrate the notion of "closed form" estimates and show that substituting equations (9) to (12) in equation (8) and simplifying gives us the proper identities. In the special case of saturated models, expressions like (9) to (12) are equivalent to the "maximum likelihood" estimates, which may be generated either through a more complex procedure or just through inserting the relevant observed proportions in (9) to (12), or, for cases with larger numbers of variables, analogues of these equations. On the methods used to generate maximum likelihood estimates for cross-classification tables, see Bishop, Fienberg, and Holland, Discrete Multivariate Analysis, chapter 3.

12. The interested reader may consult Bishop, Fienberg, and Holland, Discrete Multivariate Analysis, chapters 2 and 3.

13. The term "fitted marginals" refers to the fact that for hierarchical models, the maximum likelihood estimates insure that the estimated marginals are equal to the observed marginals.

14. See Otis Dudley Duncan, "How Destination Depends on Origin in the Occupational Mobility Table," American Journal of Sociology, 84 (1979), 793-803; Leo A. Goodman, "Multiplicative Models for the Analysis of Occupational Mobility Tables and Other Kinds of Cross-Classification Tables," ibid., 804-819.

15. In the actual analysis, below, of Thernstrom's data, only one of the two-variable interactions can be eliminated. Table 8 is purely hypothetical.

16. The reader should be able to satisfy himself just by permuting combinations of all four letters that the number of possible models is much larger than the 37 given in Table 10, below. The fact that the total number is 113 for the 4-variable case comes from Bishop, Fienberg, and Holland, Discrete Multivariate Analysis, 77.

17. On heteroskedasticity, see, e.g., Eric A. Hanushek and John E. Jackson, Statistical Methods for Social Scientists (New York: Academic Press, 1977), 141-46. For an application of MCA to 19th century geographical mobility data, see Michael B. Katz, Michael

J. Doucet, and Mark J. Stern, "Population Persistence and Early Industrialization in a Canadian City: Hamilton, Ontario, 1851-1871," Social Science History 2 (1978), 208-29. From our textual discussion, it is obvious that we disagree with Richard J. Jensen's statement in "New Presses For Old Grapes: I: Multiple Classification Analysis," Historical Methods, 11 (1978), 175-76, that "... MCA may now be considered the 'technique of choice' for most problems in quantitative social history."

18. Compare Michael Swafford, "Three Parametric Techniques for Contingency Table Analysis: A Nontechnical Commentary," American Sociological Review, 45 (1980), 664-90, with Takeshi Amemiya, "Qualitative Response Models: A Survey," Journal of Economic Literature, 19 (1981), 1486-87.

19. Logit (as well as probit and Tobit) analysis can be applied to left- and/or right-hand side variables measured on nominal, ordinal, or interval scales, as illustrated, for example, in J. Morgan Kousser, "Making Separate Equal: Integration of Black and White School Funds in Kentucky," Journal of Interdisciplinary History, 10 (1980), 399-428.

20. Fienberg, Analysis of Cross-Classified Categorical Data, chapter 6, contains a good treatment of the relationship between logit and the more general log-linear model. Fienberg refers to the logit model in the text as a "linear logistic response model," reserving the term "logit" for models which predict ratios of raw numbers in the cell entries, instead of proportions in the cell entries. Formally, however, the two models are the same, so we make no distinction here. For some empirical results comparing the two, see Swafford, "Three Parametric Techniques," 664-690.

To see the relation between the logit and the log-linear coefficients, start with equation (22) in the text. For $p_{1jk}$, the proportion "found,"

(22.1)  $\log p_{1jk} = \log \eta + \lambda^{M}_{1} + \lambda^{S}_{j} + \lambda^{O}_{k} + \lambda^{MS}_{1j} + \lambda^{MO}_{1k} + \lambda^{SO}_{jk} + \lambda^{MOS}_{1jk}$,

where the subscripts on every term involving "M" indicate that the equation models the effects of the first level of M only. Likewise, for the proportion not found,

(22.2)  $\log p_{2jk} = \log \eta + \lambda^{M}_{2} + \lambda^{S}_{j} + \lambda^{O}_{k} + \lambda^{MS}_{2j} + \lambda^{MO}_{2k} + \lambda^{SO}_{jk} + \lambda^{MOS}_{2jk}$.

Since it is a mathematical fact that $\log (a/b) = \log a - \log b$, and we define the logit of variables i, j, and k, considering the first variable as dependent, as $\log (p_{1jk} / p_{2jk})$, we obtain the equation for that logit by subtracting equation (22.2) from equation (22.1). All terms not involving M drop out, and we have

(22.3)  $\log p_{1jk} - \log p_{2jk} = (\lambda^{M}_{1} - \lambda^{M}_{2}) + (\lambda^{MS}_{1j} - \lambda^{MS}_{2j}) + (\lambda^{MO}_{1k} - \lambda^{MO}_{2k}) + (\lambda^{MOS}_{1jk} - \lambda^{MOS}_{2jk})$,

where the parentheses are inserted for convenience. But by the assumption stated in equation (7) in the text, transformed into logarithms,

(22.4)  $\lambda^{M}_{1} = -\lambda^{M}_{2}$,  $\lambda^{MS}_{1j} = -\lambda^{MS}_{2j}$,  $\lambda^{MO}_{1k} = -\lambda^{MO}_{2k}$,  $\lambda^{MOS}_{1jk} = -\lambda^{MOS}_{2jk}$.

Therefore,

(22.5)  $\log p_{1jk} - \log p_{2jk} = 2\lambda^{M}_{1} + 2\lambda^{MS}_{1j} + 2\lambda^{MO}_{1k} + 2\lambda^{MOS}_{1jk}$.

And if we rename the logit coefficients, and drop the M superscripts and the constant 2's because they appear for every variable on the right hand side of the equation, we have the logit equation

(22.6)  $\log \frac{p_{1jk}}{p_{2jk}} = W + W^{S} + W^{O} + W^{OS}$,

and the W or logit coefficients are equal to twice the relevant log-linear coefficients. Note also that for a two-category dependent variable expressed in proportions,

(22.7)  $p_{2jk} = 1 - p_{1jk}$.

Therefore, the left hand side of equation (22.6) can be expressed as $\log \frac{p_{1jk}}{1 - p_{1jk}}$, and, dropping subscripts, we have an equation which looks very similar to a conventional multiple regression equation:

(22.8)  $\log \frac{p}{1 - p} = W + W^{S} + W^{O} + W^{OS}$.

21. Department of Statistics, University of Chicago, Chicago, IL, 60680.

22. For tables which have zeroes in any cell, many statisticians advise the user to add some small number, such as 0.50, to each cell. This makes it possible to estimate many models which cannot otherwise be estimated because they contain zeroes in the marginals and makes convergence go much faster in tables with zeroes in the cells. It also reduces the "asymptotic bias" and the "mean squared error" for estimates of the lambdas, which, translated into English, means that if one estimated the coefficients over and over again from similar data or once from

an extremely large sample, he would find that the estimated lambdas were on average closer to the values for the population as a whole if he added a small value to each cell than if he didn't. See Goodman, Analyzing Qualitative/Categorical Data, 114. We should note that some statisticians do not approve of this procedure, and that theoretical and simulation work on it is needed.

23. For the two-variable case, if lambda is written as $\sum_{ij} a_{ij} \log f_{ij}$, where $a_{ij}$ is a constant depending on the number of levels for each variable (if i = j = 2, as in equations (9) to (12) in the text, $a_{ij} = 1/4$), then Goodman has shown that the standard deviation of lambda is the square root of $\sum_{ij} a_{ij}^{2} / f_{ij}$ for the saturated case, and that this quantity is a lower bound of the standard deviation of each lambda for unsaturated models. See Goodman, Analyzing Qualitative/Categorical Data, 114, and citations given there.

24. There are separate lambda effects -- not all independent of each other -- calculated for each level of a multi-category variable. Thus, for instance, there are twenty separate effects (4 categories times 5 categories) for the interactions of age and occupation.

25. Table 5 showed weak but significant Chi-Squares between M and A, M and S, and M and O, taken two at a time. But since the independent variables were related to each other, Table 5 distorts the actual nature of the causal relationships. Table 10, by controlling each of the bivariate relations for the effects of the other variables, gives a more accurate picture of the actual effects of S, A, and O on M.

26. No summary statistic based on the idea of "proportionate reduction in errors" (PRE) can be calculated for log-linear models, for all estimation algorithms always fit the marginal relating to the dependent variable exactly. Summary measures based on the PRE concept calculate the number of errors one would make in putting subjects into each class of the dependent variable. For example, if 1113 people in the Boston sub-sample in 1880 were found and 611 were not found in 1890, then the best way for an analyst who knew no more about the people to guess which group each was in would be to put everyone into the "found" category. He would therefore guess wrong 611/1724 = 35.4% of the time. PRE measures are based on gauging how much better the analyst would do if he had information about, say, the subjects' ages, occupations, etc. But since the internal cell estimates in log-linear models are obtained by using information about the marginal cells -- for instance the estimate of the percentage of teenagers found is based on knowing the percentage of all age-groups found -- the marginals will always be fit as closely as one desires. As a consequence, PRE measures cannot be defined for log-linear models.
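The baseline error rate used in the PRE example of note 26 can be checked with a few lines of Python. This is only an illustrative sketch: the function name `baseline_error_rate` is ours, and the counts are the 1113 found and 611 not found cited in the note.

```python
def baseline_error_rate(counts):
    """Share of wrong guesses when every case is assigned to the modal class."""
    total = sum(counts.values())
    modal = max(counts.values())
    return (total - modal) / total

# Marginal counts cited in note 26 (Boston sub-sample, 1880 to 1890).
persistence = {"found": 1113, "not found": 611}
rate = baseline_error_rate(persistence)
print(f"{rate:.1%}")  # prints 35.4%
```

A PRE measure would compare this rate with the error rate achieved once additional information is used; the note's point is that such a comparison is not available for log-linear models.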

27. It is also possible to calculate standardized cell residuals -- i.e., to subtract the actual cell entries in Table 1 from those estimated using a particular model, and then to divide them by some measure of their variance -- in order to assess which cells fit particularly poorly, or to put it in more substantive terms, which combinations of the independent variables do not predict the dependent variable well. Space limitations prevent us from describing the procedure more fully, but see, e.g., Bishop, Fienberg, and Holland, Discrete Multivariate Analysis, 136-55.

28. See ibid., 124-130. $\chi^{2}_{L}$ is asymptotically -- i.e., for very large samples -- distributed as $\chi^{2}$. The partitioning procedure is based on the handy and well-known fact that the sum of two independent $\chi^{2}$ variates also follows the $\chi^{2}$ distribution.

29. This method of testing is actually a likelihood ratio test. By contrasting the fit of a given model with the observed frequencies or proportions, $\chi^{2}_{L}$ in effect compares the fit of some given model with that of the saturated model, since the saturated model fits the observed data exactly. For this and a series of other tests of goodness of fit in what are often referred to in the economics literature as "quantal choice" methods, see Takeshi Amemiya, "Qualitative Response Models," 1502-1507.

30. Since every comparison of an acceptable with an unacceptable model -- i.e., between any of Models 18-30 with any of Models 1-17 which are nested in them -- will show a significant difference, we offer only two of them (tests 17 and 18). Those tests are included in order to allow us to assess the importance of the terms {AO} and {AS}.

31. In logit, as in the usual multiple regression, all possible relationships between independent variables must be allowed for, but analysts do not usually pay much attention to them. In this case, {ASO} would appear in every logit model. That the analyst is not so constrained in log-linear analysis seems to us an advantage.

32. Actually, we tried all five of the models formed by eliminating one two-variable term at a time from model 34, as well as all ten of the models formed by eliminating two terms at a time. All fail the tests against model 34 by larger margins than do 35-37.

33. Thernstrom, Other Bostonians, Table 3.3, p. 40.

34. Ibid., 42, 231-32.

35. In his "Found: Fifty Million Missing Americans," Jensen has suggested that low measured rates of persistence were to a large degree artifacts of the sloppiness of census takers, the employees of city directory companies, and transcribers.

36. Thernstrom does not specifically treat the connection between age

and social mobility, although he does "control for" age, in a fashion, by confining some of his tables to men between 20 and 39. See, e.g., Other Bostonians, table 4.3, p. 53. Some of his tables -- e.g., tables 4.6 and 4.7 on pp. 60, 61 -- imply that people did move up the occupational ladder over time from 1880 to 1900.

37. For a review of the "human capital" and other economic approaches to the topic of geographic mobility, see Michael J. Greenwood, "Research on Internal Migration in the United States: A Survey," Journal of Economic Literature, 13 (1975), 397-433, especially pp. 406-408. On human capital, the (neo)classic starting point is, of course, Gary S. Becker, Human Capital, 2nd ed. (Chicago: University of Chicago Press, 1975).

38. Using national figures on age-specific mortality, we ran analyses in every respect parallel to those presented herein, eliminating the estimated proportion of men of each age who could be expected to have died in the decade. Unfortunately, we know of no occupation-specific estimates, but if they were available, they could hardly help strengthening the general points made in the text. The parallel analyses lead to the same model choice and so generally track the argument as to make their presentation needless.

39. Other Bostonians, 258.

40. There are a great many illustrations of applications of log-linear and related methods in the economics literature listed and cited in Amemiya, "Qualitative Response Models," 1483-1484. For an interesting historical example of the use of log-linear techniques to evaluate the validity of mortality statistics, using a capture-recapture model, see Sheryl B. Dow, "The Mortality Rate in Norfolk, Virginia, in 1870: A New Approach to the Application of the Capture-Recapture Method" (unpub. paper, Harvard University, 1981). The problem of the degree to which occupational mobility is a product merely of a shift in the mix of jobs might also be approached by concentrating attention on the relevant single-letter terms in the sorts of models discussed in our paper. In fact, the range of applications seems limited more by analysts' imaginations than by data or computer availability.

TABLE 1

NUMBER OF MALES PERSISTING, 1880-1890,
BY AGE, FAMILY STATUS, AND OCCUPATION

                        Occupation*
Age       Hi.W.  Lo.W.    SK.  UNSK. UNEMP.  TOTAL

Panel A: Single or Married Without Children, Persistent

14-20         1     51     21     57    106    236
21-30        13     72     50     54     17    206
31-60         7     12     11     13      1     44
61+           1      1      0      0      0      2
TOTAL        22    136     82    124    124    488

Panel B: Single or Married Without Children, Not Found in 1890

14-20         1     28     11     37     46    123
21-30         9     36     37     61      9    152
31-60         2      1     10     11      1     25
61+           0      0      0      0      2      2
TOTAL        12     65     58    109     58    302

Panel C: Married With Children, Persistent

14-20         0      0      0      0      0      0
21-30         2     16     22     33      1     74
31-60        94     98    168    162      6    528
61+           9      3      1      9      1     23
TOTAL       105    117    191    204      8    625

Panel D: Married With Children, Not Found in 1890

14-20         0      0      0      0      0      0
21-30         2      6     14     17      1     40
31-60        29     43     68     99      1    240
61+           8      3      5      8      5     29
TOTAL        39     52     87    124      7    309

*The occupational categories are high white collar, low white collar, skilled, unskilled, and unemployed. The classification of occupation into categories was done by Thernstrom.

TABLE 2

PERCENT FOUND, 1880 AND 1890

Age       Hi.W.  Lo.W.    SK.  UNSK. UNEMP.  TOTAL

No Children

14-20        50     65     66     61     70     66
21-30        59     67     57     47     65     58
31-60        78     92     52     54     50     64
61+         100    100     --     --      0     50
TOTAL        65     68     59     53     68     62

Children

14-20        --     --     --     --     --     --
21-30        50     73     61     66     50     65
31-60        76     70     71     62     86     69
61+          53     50     17     53     17     44
TOTAL        73     69     69     62     53     67
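The percentages in Table 2 follow directly from the counts in Table 1. As a minimal sketch (the variable and function names are ours, and rounding to the nearest whole percent is an assumption about how the table was prepared), the TOTAL row for men without children can be recomputed from the TOTAL rows of Panels A and B:

```python
# Column totals from Table 1, Panels A and B (men without children).
OCCUPATIONS = ["Hi.W.", "Lo.W.", "SK.", "UNSK.", "UNEMP."]
found = [22, 136, 82, 124, 124]
not_found = [12, 65, 58, 109, 58]

def percent_found(n_found, n_missing):
    """Percentage of a group located again in 1890, rounded to a whole percent."""
    return round(100 * n_found / (n_found + n_missing))

pct = [percent_found(f, m) for f, m in zip(found, not_found)]
for occupation, value in zip(OCCUPATIONS, pct):
    print(occupation, value)
```

The same function applied to any pair of corresponding cells in Panels A and B (or C and D) gives the matching cell of Table 2.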

TABLE 3

PERCENT OF TOTAL SAMPLE
SUMMED ACROSS COLUMNS (FOUND PLUS NOT FOUND)

                        Occupation
Age       Hi.W.  Lo.W.    SK.  UNSK. UNEMP.  TOTAL

Panel A: No Children

14-20        0*     22      9     26     42    100
21-30         6     30     24     32      7    100
31-60        13     19     30     35      3    100
61+          25     25     --     --     50    100
All Ages      4     25     18     29     23    100

Panel B: Children

14-20        --     --     --     --     --     --
21-30         4     19     32     44      2    100
31-60        16     18     31     34      1    100
61+          33     12     12     33     12    100
All Ages     15     18     30     35      2    100

* 0 indicates less than 0.5%. -- indicates no cases in cell.

TABLE 4

PERCENT OF TOTAL SAMPLE
SUMMED ACROSS ROWS

Age       Hi.W.  Lo.W.    SK.  UNSK. UNEMP.  TOTAL

Panel A: No Children

14-20         6     39     23     40     84     45
21-30        65     54     62     49     14     45
31-60        26      6     15     10      1      9
61+           3     0*     --     --      1      1
TOTAL       100    100    100    100    100    100

Panel B: Children

14-20        --     --     --     --     --     --
21-30         3     13     13     15     13     12
31-60        85     83     85     80     47     82
61+          12      4      2      5     40      6
TOTAL       100    100    100    100    100    100

* 0 indicates less than 0.5%. -- indicates no cases in cell.

TABLE 5

SIX TWO-WAY MARGINAL TABLES BASED ON TABLE 1

Panel A: Occupation and Persistence

             Hi.W.         Lo.W.        SK.          UNSK.        UNEMP.
Found        127 (71.3)*   253 (68.4)   273 (65.3)   328 (58.5)   132 (67.0)
Not Found     51           117          145          233           65
TOTAL        178 (100.0)   370          418          561          197

$\chi^{2}$ = 15.66

Panel B: Age and Persistence

             14-20         21-30        31-60        61+
Found        236 (65.7)    280 (59.3)   572 (68.3)   25 (44.6)
Not Found    123           192          265          31
TOTAL        359 (100.0)   472          837          56

$\chi^{2}$ = 20.81

Panel C: Family Status and Persistence

             No Children   Children     Total
Found        488 (61.8)    625 (66.9)   1113 (64.6)
Not Found    302           309           611
TOTAL        790 (100.0)   934          1724

$\chi^{2}$ = 4.95

Panel D: Occupation and Age

Age          Hi.W.         Lo.W.        SK.          UNSK.        UNEMP.       TOTAL
14-20          2 (1.1)      79 (21.4)    32 (7.7)     94 (16.8)   152 (77.2)    359
21-30         26 (14.6)    130 (35.1)   123 (29.4)   165 (29.4)    28 (14.2)    472
31-60        132 (74.2)    154 (41.6)   257 (61.5)   285 (50.8)     9 (4.6)     837
61+           18 (10.1)      7 (1.9)      6 (1.4)     17 (3.0)      8 (4.1)      56
TOTAL        178 (100.0)   370          418          561          197

$\chi^{2}$ = 559.21

Panel E: Age and Family Status

             14-20         21-30        31-60        61+
No Children  359 (100.0)   358 (75.8)    69 (8.2)     4 (7.1)
Children       0           114          768          52
TOTAL        359 (100.0)   472          837          56

$\chi^{2}$ = 1105.72

Panel F: Occupation and Family Status

             Hi.W.         Lo.W.        SK.          UNSK.        UNEMP.
No Children   34 (19.1)    201 (54.3)   140 (33.5)   233 (41.5)   182 (92.4)
Children     144           169          278          328           15
TOTAL        178 (100.0)   370          418          561          197

$\chi^{2}$ = 263.77

* Percentages, summed by column, in parentheses.
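The chi-square values attached to the panels of Table 5 can be approximated with a short calculation. The sketch below assumes they are ordinary Pearson statistics computed without a continuity correction (the table itself does not say which statistic was used); the function name is ours. Applied to Panel C, it gives a value that agrees with the reported 4.95 to two decimals.

```python
def pearson_chi_square(table):
    """Pearson chi-square for a two-way table of counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Table 5, Panel C: rows are Found / Not Found, columns are No Children / Children.
panel_c = [[488, 625], [302, 309]]
chi2 = pearson_chi_square(panel_c)
print(round(chi2, 2))
```

With one degree of freedom, any value above 3.84 is significant at the 0.05 level, which is why the text can describe this relationship as weak but significant.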

TABLE 6

ODDS OF BEING FOUND, 1880 AND 1890 (X 100)

Age       Hi.W.  Lo.W.    SK.  UNSK. UNEMP.  TOTAL

Panel A: No Children

14-20        50    182    191    154    230    187
21-30       144    200    135     89    189    136
31-60       350   1200    110    118     50    176
61+                                             50
TOTAL       183    209    141    114    214    162

Panel B: Children

14-20
21-30        50    267    157    194     50    185
31-60       324    228    247    164    600    220
61+         113     50     20    113     20     79
TOTAL       269    225    220    165    114    202

TABLE 7

PANEL C OF TABLE 5 EXPRESSED AS PROPORTIONS

             No Children        Children
Found        $P_{11}$ = 28.3    $P_{12}$ = 36.3    $P_{1+}$ = 64.6
Not Found    $P_{21}$ = 17.5    $P_{22}$ = 17.9    $P_{2+}$ = 35.4
             $P_{+1}$ = 45.8    $P_{+2}$ = 54.2    $P_{++}$ = 100
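The counts behind Table 7 (Panel C of Table 5) make a convenient check on the two-variable formulas quoted in note 23, where lambda is a weighted sum of log frequencies and its standard deviation is the square root of the sum of squared weights over frequencies. The sketch below is hedged: the function name is ours, natural logarithms are assumed, and the sign pattern (+, -, -, +) for the weights of absolute value 1/4 is our assumption about the interaction term, since the note gives only the magnitude.

```python
import math

def two_by_two_interaction(f11, f12, f21, f22):
    """Saturated-model interaction lambda and its standard deviation for a 2x2 table.

    Assumes lambda = sum(a_ij * log f_ij) with |a_ij| = 1/4 and signs +, -, -, +.
    """
    lam = 0.25 * (math.log(f11) - math.log(f12) - math.log(f21) + math.log(f22))
    # Standard deviation: square root of the sum of a_ij squared over f_ij.
    sd = math.sqrt(sum(0.25 ** 2 / f for f in (f11, f12, f21, f22)))
    return lam, sd

# Found / Not Found by No Children / Children (Table 5, Panel C).
lam, sd = two_by_two_interaction(488, 625, 302, 309)
print(round(lam, 4), round(sd, 4), round(lam / sd, 2))
```

Dividing lambda by its standard deviation gives a standardized lambda of the kind listed in Table 9, although the values there come from the full four-variable table rather than this collapsed one.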

TABLE 8

A HYPOTHETICAL EXAMPLE OF CONDITIONAL INDEPENDENCE

Panel A: Apparent Bivariate Relationship

                    Variable A
Variable B      Level 1    Level 2
Level 1           35%        10%
Level 2           15%        40%

Panel B: Variables A and B Conditionally Independent, Given Variable C

                Variable C, Level 1      Variable C, Level 2
                    Variable A               Variable A
Variable B      Level 1    Level 2       Level 1    Level 2
Level 1           35%        35%           20%        20%
Level 2           15%        15%           30%        30%

TABLE 9

STANDARDIZED LAMBDAS FOR SATURATED MODEL FOR DATA IN TABLE 1

Row   Effect of                              Standardized
 #    Variable(s)    Level of Variable [3]   Lambda [2]

 1.   Eta [1]                                   1.780
 2.   M                                         1.874
 3.   A              1 (14-20)                 -3.435
 4.                  2 (21-30)                  8.257
 5.                  3 (31-60)                  8.391
 6.                  4 (61+)                   -6.656
 7.   O              1 (Hi.W.)                 -2.657
 8.                  4 (UNSK.)                  3.793
 9.                  5 (UNEMP.)                -3.445
10.   MA [4]         3 (A)                      2.308
11.   AS [4]         1 (A)                      9.168
12.                  2                          5.351
13.                  3                        -10.801
14.                  4                         -6.134
15.   SO [4]         1 (O)                     -2.395
16.                  5                          2.974
17.   AO [5]         1 (A)    1 (O)            -2.080
18.                  1        5                 3.429
19.                  2        1                -2.253
20.                  2        3                 2.334
21.                  2        4                 1.887
22.                  2        5                -2.134
23.                  3        1                 2.201
24.                  3        3                 3.028
25.                  3        4                 1.852
26.                  3        5                -4.681
27.                  4        1                 2.650
28.                  4        3                -1.795
29.                  4        5                 1.724
30.   ASO [5]        1 (A)    1 (O)            -2.195
31.                  2        1                 2.055
32.                  3        2                -1.828

Notes: 1. There is no standard deviation for eta. This is the unstandardized effect.

2. Only standardized lambdas above 1.64 in absolute value are listed in the table.

3. Definitions of age and occupation categories in parentheses.

4. For 2 variable interactions, the levels listed are for the variable in parentheses. The other variable in the MA, AS, and SO interactions is at level one (found in the case of M, and no children in the case of S).

5. In the AO and ASO interactions the levels are of the variables in parentheses on their right.

TABLE 10

CHI-SQUARE VALUES FOR MODELS BASED ON TABLE 1

Model                                                   Degrees of
  #    Margins Fit                        Chi-Square    Freedom

Panel A: Palpably Unsuitable Models

  1.   Equiprobability                      3134          79
  2.   {M}                                  2989          78
  3.   {M}{A}                               2180          75
  4.   {M}{A}{S}                            2168          74
  5.   {M}{A}{S}{O}                         1872          70
  6.   {MA}{MS}                             2142          70
  7.   {MA}{MS}{MO}                         1832          62
  8.   {MSO}{A}                             1566          57
  9.   {MAO}                                1340          40
 10.   {MAO}{S}                             1329          39
 11.   {MSO}{AO}                            1070          45
 12.   {MAO}{SO}                            1049          35
 13.   {MAS}                                 857          64
 14.   {MAS}{O}                              561          60
 15.   {MAS}{MO}                             546          56
 16.   {MSO}{AS}                             279          54
 17.   {MAS}{MSO}                            260          48

Panel B: Statistically Significant, But Easily Rejected Models

 18.   {MASO}                                  0.0         0
 19.   {MAS}{MAO}{ASO}{MSO}                    6.53       12
 20.   {MAS}{MAO}{ASO}                        10.26       16
 21.   {MAS}{MAO}{MSO}                        16.12       24
 22.   {MAS}{AO}                              64.54       48
 23.   {MAS}{AO}{SO}                          47.59       44
 24.   {MSO}{AS}{AO}                          45.67       42
 25.   {MAO}{AS}                              41.44       36
 26.   {MAS}{ASO}                             38.12       32

Panel C: Final Contenders

 27.   {MAS}{MAO}                             38.08       32
 28.   {MAS}{MSO}{AO}                         29.08       36
 29.   {MAO}{AS}{SO}                          24.49       32
 30.   {MAS}{ASO}{MO}                         21.69       28
 31.   {ASO}{MA}{MO}                          25.92       32
 32.   {ASO}{MA}{MO}{MS}                      23.21       31
 33.   {AS}{AO}{SO}{MA}{MO}{MS}               32.70       43
 34.   {AS}{AO}{SO}{MA}{MO}                   35.39       44
 35.   {AS}{AO}{SO}{MA}                       50.95       48
 36.   {AS}{AO}{SO}{MO}                       57.34       47
 37.   {AS}{AO}{MA}{MO}                       52.34       48

TABLE 11

ASSESSING THE EFFECT OF TERMS IN MODELS FROM TABLE 10

                                                        Difference In
Test   Models              Terms         Difference In  Degrees of     Preferred
  #    (From Table 10)     Assessed      Chi-Square     Freedom        Model

Panel A: Narrowing Down Acceptable Models

  1.   18,19               {MASO}            6.53          12            19
  2.   19,20               {MSO}             3.73           4            20
  3.   20,30               {MAO}            11.43          12            30
  4.   20,29               {MAS}{ASO}       18.82          20            29
  5.   21,28               {MAO}            12.96          12            28
  6.   21,27               {MSO}            21.96           8            27
  7.   26,23               {ASO}             9.47          12            23
  8.   20,29               {MAS}{ASO}       14.23          16            29
  9.   27,25               {MAS}             3.36           4            25
 10.   20,26               {MAO}{MO}        27.86*         16            20
 11.   29,25               {SO}             16.96*          4            29
 12.   28,24               {MAS}            16.59*          6            28
 13.   23,22               {SO}             16.95*          4            23
 14.   28,23               {MSO}            18.51*          8            28
 15.   30,26               {MO}             16.43*          4            30
 16.   30,22               {ASO}{MO}        42.85*         20            30
 17.   28,17               {AO}            231.0*          12            28
 18.   29,12               {AS}           1025.0*           3            29

Panel B: Choosing The Best Model

 19.   28,33               {MAS}{MSO}        3.62           7            33
 20.   30,33               {MAS}{ASO}       11.01          15            33
 21.   30,31               {MAS}             4.23           4            31
 22.   29,34               {MAO}            10.90          12            34
 23.   31,34               {ASO}             9.47          12            34
 24.   32,33               {ASO}             9.49          12            33
 25.   33,34               {MS}              2.69           1            34
 26.   34,35               {MO}             15.56*          4            34
 27.   34,36               {MA}             21.95*          3            34
 28.   34,37               {SO}             16.95*          4            34

Note: * indicates a significant difference in the Chi-Squares at the 0.05 level.
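The entries of Table 11 are differences between the chi-square values and degrees of freedom of nested models in Table 10, and the percentage reductions in Table 12 divide that difference by the chi-square of the less complete model. A minimal sketch of both calculations follows; the function name, the dictionary layout, and the small table of 0.05 critical values (standard chi-square values for 1, 3, and 4 degrees of freedom) are ours, not the authors'.

```python
# (chi-square, degrees of freedom) for a few models, copied from Table 10.
MODELS = {33: (32.70, 43), 34: (35.39, 44), 35: (50.95, 48),
          36: (57.34, 47), 37: (52.34, 48)}

# Standard 0.05 critical values of the chi-square distribution.
CRITICAL_05 = {1: 3.841, 3: 7.815, 4: 9.488}

def compare(fuller, reduced):
    """Chi-square difference between nested models, its df, whether it is
    significant at 0.05, and the percentage reduction in chi-square."""
    chi_full, df_full = MODELS[fuller]
    chi_reduced, df_reduced = MODELS[reduced]
    diff = chi_reduced - chi_full
    df = df_reduced - df_full
    pct_reduction = 100 * diff / chi_reduced
    return round(diff, 2), df, diff > CRITICAL_05[df], round(pct_reduction, 1)

print(compare(33, 34))  # effect of dropping {MS} from model 33 (test 25)
print(compare(34, 35))  # effect of dropping {MO} from model 34 (test 26)
```

A non-significant difference, as in the first comparison, favors the simpler model; a significant one, as in the second, means the term cannot be dropped without a real loss of fit.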

TABLE 12

PERCENTAGE REDUCTION IN $\chi^{2}_{L}$ DUE TO PARTICULAR TERMS

Test #              Model #'s           Terms          % Reduction
(From Table 11)     (From Table 10)     Assessed       In $\chi^{2}_{L}$

 10.                20,26               {MAO}{MO}        73.1
 11.                29,25               {SO}             40.9
 12.                28,24               {MAS}            36.3
 13.                23,22               {SO}             26.3
 14.                28,23               {MSO}            38.9
 15.                30,26               {MO}             43.1
 16.                30,22               {ASO}{MO}        66.4
 17.                28,17               {AO}             88.8
  5.                21,28               {MAO}            44.6
  7.                26,23               {ASO}            19.9
  9.                27,25               {MAS}             8.1
 18.                29,12               {AS}             97.7
 21.                30,31               {MAS}            16.3
 22.                29,34               {MAO}            30.7
 25.                33,34               {MS}              7.6
 26.                34,35               {MO}             30.5
 27.                34,36               {MA}             38.3
 28.                34,37               {SO}             32.4

TABLE 13

LAMBDA OR EFFECTS COEFFICIENTS FOR MODEL 34 OF TABLE 10

                                              Standard
Variable   Level                 Lambda       Error

Panel A: Single-Variable or Unequal Marginal Effects

Eta [1]                           1.765
M          Found                  0.225*      0.126
A          (14-20)               -0.840*      0.338
           (21-30)                0.909*      0.140
           (31-60)                0.981*      0.143
           (61+)                 -1.050*      0.189
S          No Kids                0.192       0.126
O          (Hi.W.)               -0.687*      0.410
           (Lo.W.)                0.307       0.187
           (Sk.)                  0.190       0.211
           (Unsk.)                0.779*      0.165
           (Unemp.)              -0.588*      0.205

Panel B: Interactions of Independent Variables With Dependent Variables

MA         (14-20)                0.110       0.338
           (21-30)                0.004       0.140
           (31-60)                0.197**     0.143
           (61+)                 -0.311**     0.189
MO         Hi.W.                  0.106       0.410
           Lo.W.                  0.056       0.187
           Sk.                   -0.036       0.211
           Unsk.                 -0.164       0.165
           Unemp.                 0.038       0.205

Panel C: Interactions Among Independent Variables

AS         No Kids, 14-20         1.894*      0.338
           No Kids, 21-30         0.465*      0.140
           No Kids, 31-60        -1.244*      0.143
           No Kids, 61+          -1.115*      0.189
SO         No Kids, Hi.W.        -0.141       0.410
           No Kids, Lo.W.         0.037       0.187
           No Kids, Sk.          -0.187       0.211
           No Kids, Unsk.        -0.197       0.165
           No Kids, Unemp.        0.488*      0.205
AO         14-20, Hi.W.          -1.606**     1.192
           21-30, Hi.W.          -0.224       0.433
           31-60, Hi.W.           0.731*      0.427
           61+, Hi.W.             1.098*      0.477
           14-20, Lo.W.           0.255       0.459
           21-30, Lo.W.           0.251       0.207
           31-60, Lo.W.           0.055       0.214
           61+, Lo.W.            -0.561**     0.350
           14-20, Sk.            -0.248       0.505
           21-30, Sk.             0.453*      0.226
           31-60, Sk.             0.535*      0.232
           61+, Sk.              -0.739*      0.415
           14-20, Unsk.           0.240       0.403
           21-30, Unsk.           0.174       0.182
           31-60, Unsk.           0.080       0.189
           61+, Unsk.            -0.494**     0.309
           14-20, Unemp.          1.358*      0.488
           21-30, Unemp.         -0.654*      0.280
           31-60, Unemp.         -1.401*     -4.614
           61+, Unemp.            0.696*      0.313

Notes: 1. The constant or grand mean effect has neither categories nor a standard error associated with it.

* Designates lambdas which are statistically significant at the 0.05 level.

** Designates lambdas which are statistically significant at the 0.10 level.
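Note 20 shows that a logit is twice the sum of the log-linear terms that involve the dependent variable. As a rough sketch of what that implies for Table 13, the lambdas for M, MA, and MO (the only terms of model 34 that contain M) can be combined into a fitted log-odds of being found. The function names are ours, and the sketch assumes that the tabled lambdas are in natural-log units and refer to the "found" level of M.

```python
import math

# Lambdas for the "found" level, copied from Table 13 (model 34).
LAMBDA_M = 0.225
LAMBDA_MA = {"14-20": 0.110, "21-30": 0.004, "31-60": 0.197, "61+": -0.311}
LAMBDA_MO = {"Hi.W.": 0.106, "Lo.W.": 0.056, "Sk.": -0.036,
             "Unsk.": -0.164, "Unemp.": 0.038}

def fitted_logit(age, occupation):
    """Log-odds of being found: twice the sum of the lambdas that involve M."""
    return 2 * (LAMBDA_M + LAMBDA_MA[age] + LAMBDA_MO[occupation])

def fitted_probability(age, occupation):
    """Convert the fitted log-odds into a probability of being found."""
    return 1 / (1 + math.exp(-fitted_logit(age, occupation)))

logit = fitted_logit("31-60", "Hi.W.")
print(round(logit, 3), round(fitted_probability("31-60", "Hi.W."), 3))
```

Because model 34 contains no {MS} term, family status does not enter the fitted log-odds, so the same value applies to men with and without children in a given age and occupation group.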