Sex Bias in Graduate Admissions: Data from Berkeley

Measuring bias is harder than is usually assumed, and the evidence is sometimes contrary to expectation.

P. J. Bickel, E. A. Hammel, J. W. O'Connell

Dr. Bickel is professor of statistics, Dr. Hammel is professor of anthropology and associate dean of the Graduate Division, and Mr. O'Connell is a member of the data processing staff of the Graduate Division, at the University of California, Berkeley 94720.

398 SCIENCE, VOL. 187
7 FEBRUARY 1975 399

Determining whether discrimination because of sex or ethnic identity is being practiced against persons seeking passage from one social status or locus to another is an important problem in our society today. It is legally important and morally important. It is also often quite difficult. This article is an exploration of some of the issues of measurement and assessment involved in one example of the general problem, by means of which we hope to shed some light on the difficulties. We will proceed in a straightforward and indeed naive way, even though we know how misleading an unsophisticated approach to the problem is. We do this because we think it quite likely that other persons interested in questions of bias might proceed in just the same way, and careful exposure of the mistakes in our discovery procedure may be instructive.

Data and Assumptions

The particular body of data chosen for examination here consists of applications for admission to graduate study at the University of California, Berkeley, for the fall 1973 quarter. In the admissions cycle for that quarter, the Graduate Division at Berkeley received approximately 15,000 applications, some of which were later withdrawn or transferred to a different proposed entry quarter by the applicants. Of the applications finally remaining for the fall 1973 cycle 12,763 were sufficiently complete to permit a decision to admit or to deny admission. The question we wish to pursue is whether the decision to admit or to deny was influenced by the sex of the applicant. We cannot know with any certainty the influences on the evaluators in the Graduate Admissions Office, or on the faculty reviewing committees, or on any other administrative personnel participating in the chain of actions that led to a decision on an individual application. We can, however, say that if the admissions decision and the sex of the applicant are statistically associated in the results of a series of applications, we may judge that bias existed, and we may then seek to find whether discrimination existed. By "bias" we mean here a pattern of association between a particular decision and a particular sex of applicant, of sufficient strength to make us confident that it is unlikely to be the result of chance alone. By "discrimination" we mean the exercise of decision influenced by the sex of the applicant when that is immaterial to the qualifications for entry.

The simplest approach (which we shall call approach A) is to examine the aggregate data for the campus. This approach would surely be taken by many persons interested in whether bias in admissions exists on any campus. Table 1 gives the data for all 12,763 applications to the 101 graduate departments and interdepartmental graduate majors to which application was made for fall 1973 (we shall refer to them all as departments). There were 8442 male applicants and 4321 female applicants. About 44 percent of the males and about 35 percent of the females were admitted. Just this kind of simple calculation of proportions impels us to examine the data further. We will pursue the question by using a familiar statistic, chi-square. As already noted, we are aware of the pitfalls ahead in this naive approach, but we intend to stumble into every one of them for didactic reasons.

We must first make clear two assumptions that underlie consideration of the data in this contingency table approach. Assumption 1 is that in any given discipline male and female applicants do not differ in respect of their intelligence, skill, qualifications, promise, or other attribute deemed legitimately pertinent to their acceptance as students. It is precisely this assumption that makes the study of "sex bias" meaningful, for if we did not hold it any differences in acceptance of applicants by sex could be attributed to differences in their qualifications, promise as scholars, and so on. Theoretically one could test the assumption, for example, by examining presumably unbiased estimators of academic qualification such as Graduate Record Examination scores, undergraduate grade point averages, and so on. There are, however, enormous practical difficulties in this. We therefore predicate our discussion on the validity of assumption 1.

Assumption 2 is that the sex ratios of applicants to the various fields of graduate study are not importantly associated with any other factors in admission. We shall have reason to challenge this assumption later, but it is crucial in the first step of our exploration, which is the investigation of bias in the aggregate data.

Tests of Aggregate Data

We pursue this investigation by computing the expected frequencies of male and female applicants admitted and denied, from the marginal totals of Table 1, on the assumption that men and women applicants have equal chances of admission to the university (that is, on the basis of assumptions 1 and 2). This computation, also given in Table 1, shows that 277 fewer women and 277 more men were admitted than we would have expected under the assumptions noted. That is a large number, and it is unlikely that so large a bias to the disadvantage of women would occur by chance alone. The chi-square value for this table is 110.8, and the probability of a chi-square that large (or larger) under the assumptions noted is vanishingly small.

Table 1. Decisions on applications to Graduate Division for fall 1973, by sex of applicant: naive aggregation. Expected frequencies are calculated from the marginal totals of the observed frequencies under the assumptions (1 and 2) given in the text. N = 12,763, $\chi^2$ = 110.8, d.f. = 1, P ≈ 0 (18).

                  Observed outcome     Expected outcome     Difference
Applicants        Admit     Deny       Admit     Deny       Admit     Deny
Men               3738      4704       3460.7    4981.3     277.3     -277.3
Women             1494      2827       1771.3    2549.7     -277.3    277.3

We should on this evidence judge that bias existed in the fall 1973 admissions. On that account, we should look for the responsible parties to see whether they give evidence of discrimination. Now, the outcome of an application for admission to graduate study is determined mainly by the faculty of the department to which the prospective student applies. Let us then examine each of the departments for indications of bias. Among the 101 departments we find 16 that either had no women applicants or denied admission to no applicants of either sex. Our computations, therefore, except where otherwise noted, will be based on the remaining 85. For a start let us identify those of the 85 with bias sufficiently large to occur by chance less than five times in a hundred. There prove to be four such departments. The deficit in the number of women admitted to these four (under the assumptions for calculating expected frequencies as given above) is 26. Looking further, we find six departments biased in the opposite direction, at the same probability levels; these account for a deficit of 64 men.

These results are confusing. After all, if the campus had a shortfall of 277 women in graduate admissions, and we look to see who is responsible, we ought to find somebody. So large a deficit ought not simply to disappear. There is even a suggestion of a surplus of women. Our method of examination must be faulty.

Some Underlying Dependencies

We have stumbled onto a paradox, sometimes referred to as Simpson's in this context (1) or "spurious correlation" in others (2). It is rooted in the falsity of assumption 2 above. We have assumed that if there is bias in the proportion of women applicants admitted it will be because of a link between sex of applicant and decision to admit. We have given much less attention to a prior linkage, that between sex of applicant and department to which admission is sought. The tendency of men and women to seek entry to different departments is marked. For example, in our data almost two-thirds of the applicants to English but only 2 percent of the applicants to mechanical engineering are women. If we cast the application data into a 2 × 101 contingency table, distinguishing department and sex of applicants, we find this table has a chi-square of 3091 and that the probability of obtaining a chi-square value that large or larger by chance is about zero. For the 2 × 85 table on the departments used in most of the analysis, chi-square is 3027 and the probability about zero. Thus the sex distribution of applicants is anything but random among the departments. In examining the data in the aggregate as we did in our initial approach, we pooled data from these very different, independent decision-making units. Of course, such pooling would not nullify assumption 2 if the different departments were equally difficult to enter. We will address ourselves to that question in a moment.

Let us first examine an alternative to aggregating the data across the 85 departments and then computing a statistic, namely, computing a statistic on each department first and aggregating those. Fisher gives a method for aggregating the results of such independent experiments (3). If we apply his method to the chi-square statistics of the 85 individual contingency tables, we obtain a value that has a probability of occurrence by chance alone, that is, if sex and admission are unlinked for any major, of about 29 times in 1000 (4). Another common aggregation procedure, proposed to us in this context by E. Scott, yields a result having a probability of 6 times in 10,000 (5). This is consistent with the evidence of bias in some direction purportedly shown by Table 1. However, when we examine the direction of bias, the picture changes. For instance, if we apply Fisher's method to the one-sided statistics, testing the hypothesis of no bias or of bias in favor of women, we find that we could have obtained a value as large as or larger than the one observed, by chance alone, about 85 times in 100 (6).

Our first, naive approach of examining the aggregate data, computing expected frequencies under certain assumptions, computing a statistic, and deciding therefrom that bias existed in favor of men has now been cast into doubt on at least two grounds. First, we could not find many biased decision-making units by examining them individually. Second, when we take account of the differences among departments in the proportions of men and women applying to them and avoid this problem by computing a statistic on each department separately, and aggregating those statistics, the evidence for campus-wide bias in favor of men is extremely weak; on the contrary, there is evidence of bias in favor of women.

The missing piece of the puzzle is yet another fact: not all departments are equally easy to enter. If we cast the data into a 2 × 101 table, distinguishing department and decision to admit or deny, we find that this table has a chi-square value of 2195, with an associated probability of occurrence by chance (under assumptions 1 and 2) of about zero, showing that the odds of gaining admission to different departments are widely divergent. (For the 2 × 85 table chi-square is 2121 and the probability about zero.) Now, these odds of getting into a graduate program are in fact strongly associated with the tendency of men and women to apply to different departments in different degree. The proportion of women applicants tends to be high in departments that are hard to get into and low in those that are easy to get into. Moreover this phenomenon is more pronounced in departments with large numbers of applicants. Figure 1 is a scattergram of proportion of applicants that are women plotted against proportion of applicants that are admitted. The association is obvious on inspection although the relationship is certainly not linear (7). If we use a weighted correlation (8) as a measure of the relationship for all 85 departments in the plot we obtain $\hat{\rho}$ = .56. If we apply the same measure to the 17 departments with the largest numbers of applicants (accounting for two-thirds of the total population of applicants) we obtain $\hat{\rho}$ = .65, while the remaining 68 departments have a corresponding $\hat{\rho}$ = .39. The significance of $\hat{\rho}$ under the hypothesis of no association can be calculated. All three values obtained are highly significant.

The effect may be clarified by means of an analogy. Picture a fishnet with two different mesh sizes. A school of fish, all of identical size (assumption 1), swim toward the net and seek to pass. The female fish all try to get through the small mesh, while the male fish all try to get through the large mesh. On the other side of the net all the fish are male. Assumption 2 said that the sex of the fish had no relation to the size of the mesh they tried to get through. It is false. To take another example that illustrates the danger of incautious pooling of data, consider two departments of a hypothetical university, machismatics and social warfare. To machismatics there apply 400 men and 200 women; these are admitted in exactly equal proportions, 200 men and 100 women. To social warfare there apply 150 men and 450 women; these are admitted in exactly equal proportions, 50 men and 150 women. Machismatics admitted half the applicants of each sex, social warfare admitted a third of the applicants of each sex. But about 73 percent of the men applied to machismatics and 27 percent to social warfare, while about 69 percent of the women applied to social warfare and 31 percent to machismatics. When these two departments are pooled and expected frequencies are computed in the usual way (with assumption 2), there is a deficit of about 21 women (Table 2). A discrepancy in that direction that large or larger would be expectable less than 2 percent of the time by chance; yet both departments were seen to have been absolutely fair in dealing with their applicants.

The creation of bias in our original situation is, of course, much more complex, since we are aggregating many tables. It results from an interaction of the three factors, choice of department, sex, and admission status, whose broad outlines are suggested by our plot but which cannot be described in any simple way.

Table 2. Admissions data by sex of applicant for two hypothetical departments. For total, $\chi^2$ = 5.71, d.f. = 1, P = 0.19 (one-tailed).

                  Observed outcome     Expected outcome     Difference
Applicants        Admit     Deny       Admit     Deny       Admit     Deny
Department of machismatics
Men               200       200        200       200        0         0
Women             100       100        100       100        0         0
Department of social warfare
Men               50        100        50        100        0         0
Women             150       300        150       300        0         0
Totals
Men               250       300        229.2     320.8      20.8      -20.8
Women             250       400        270.8     379.2      -20.8     20.8
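The pooled computations behind Tables 1 and 2 can be sketched in a few lines. This is a minimal illustration, not the authors' own procedure: the function names are hypothetical, and the continuity (Yates) correction is an assumption, since the text does not name it. It is used here only because the printed values of 110.8 and 5.71 appear to be reproduced when it is applied.

```python
import math


def expected_counts(table):
    """Expected cell counts for a 2x2 table, from its marginal totals."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [[r * k / n for k in cols] for r in rows]


def chi_square_2x2(table, yates=True):
    """Pearson chi-square for a 2x2 table.

    The Yates continuity correction is an assumption of this sketch,
    not something stated in the article.
    """
    total = 0.0
    for obs_row, exp_row in zip(table, expected_counts(table)):
        for o, e in zip(obs_row, exp_row):
            dev = abs(o - e)
            if yates:
                dev = max(dev - 0.5, 0.0)
            total += dev * dev / e
    return total


def upper_tail_1df(x):
    """P(chi-square with 1 d.f. >= x), using the normal distribution."""
    return math.erfc(math.sqrt(x / 2.0))


# Rows are (men, women); columns are (admitted, denied).
table_1 = [[3738, 4704], [1494, 2827]]        # Table 1, campus aggregate
machismatics = [[200, 200], [100, 100]]       # Table 2, hypothetical
social_warfare = [[50, 100], [150, 300]]      # Table 2, hypothetical

# Naive pooling: add the two departmental tables cell by cell.
pooled = [[m + s for m, s in zip(row_m, row_s)]
          for row_m, row_s in zip(machismatics, social_warfare)]

chi2_table_1 = chi_square_2x2(table_1)
chi2_pooled = chi_square_2x2(pooled)
# Expected minus observed women admitted in the pooled table.
women_deficit = expected_counts(pooled)[1][0] - pooled[1][0]

print("Table 1 chi-square:", chi2_table_1)
print("Each department alone:",
      chi_square_2x2(machismatics), chi_square_2x2(social_warfare))
print("Pooled chi-square:", chi2_pooled, "deficit of women:", women_deficit)
print("Pooled upper-tail probability:", upper_tail_1df(chi2_pooled))
```

Each hypothetical department taken alone shows no deviation from its expected counts, while the pooled table shows a deficit of women. That contrast is the point of the machismatics and social warfare example.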
In any case, aggregation in a simple and straightforward way (approach A) is misleading. More sophisticated methods of aggregation that do not rely on assumption 2 are legitimate but have their difficulties. We shall have more to say on this later.

Fig. 1. Proportion of applicants that are women plotted against proportion of applicants admitted, in 85 departments. Size of box indicates relative number of applicants to the department. [Horizontal axis: percent women applicants.]

Disaggregation

The most radical alternative to approach A is to consider the individual graduate departments, one by one. However, this approach (which we may call approach B) also poses difficulties. Either we must sample randomly from the different departments, or we must take account of the probability of obtaining unusual sex ratios of admittees by chance in a number of simultaneously conducted independent experiments. That is, in examining 85 separate departments at the same time for evidence of bias we are conducting 85 simultaneous experiments, and in that many experiments the probability of finding some marked departures from expected frequencies "just by chance" is not insubstantial. The department with the strongest bias against admitting women in the fall 1973 cycle had a bias of sufficient magnitude to be expectable by chance alone only 69 times in 100,000. If we had selected that department for examination on a random basis, we would have been convinced that it was biased. But we did not so select it; we looked at 85 departments at once. The probability of finding a department that biased against women (or more biased) by chance alone in 85 simultaneous trials is about 57 times in 1000. Thus that particular department is not quite so certainly biased as we might have first believed, .057 being a very much larger number than .00069, although still a small enough probability to warrant a closer look. This department was the worst one in respect of bias against women in admissions; the probability of finding departments less biased by chance alone is of course greater than .057.

We can also examine events in the other direction. The department most biased against men had a bias sufficiently large to be expectable by chance alone about 20 times in a million, and the chance of finding a department that biased (or more biased) in that direction by chance alone in 85 simultaneous trials (9) is about .002.

There is a further difficulty in approach B. Although it makes a great deal of sense to examine the individual departments that are in fact the independent decision-making entities in the graduate admissions process, some of them are quite small, and even in some that are of ordinary size the number of women applying is very small. Calculation of the probability of observed deviations from expected frequencies can be carried out for such units, but when the numbers involved are very small the evidence for deciding whether there is no bias or gross bias is really worthless (10). This defect is evident not only in approach B but also if we use some reasonable method of aggregation of test statistics to avoid the pitfalls of approach A such as that of Fisher, or even the approach we suggest below. That is, large biases in small departments or in departments with small numbers of women applicants will not influence a reasonable aggregate measure appreciably.

Pooling

The difficulty we face is not only technical and statistical but also administrative. In some sense the campus is a unit. It operates under general regulations concerning eligibility for admission and procedures for admission. It is a social community that shares certain values and is subject to certain general influences and pressures. It is identifiable as a bureaucratic unit by its own members and also by external agencies and groups. It is, as a social and cultural unit, accountable to its various publics. For all these reasons it makes sense to ask the question, Is there a campus bias by sex in graduate admissions? But this question raises serious conceptual difficulties. Is campus bias to be measured by the net bias across all its constituent subunits? How does one define such a bias? For any definition, it is easy to imagine a situation in which some departments are biased in one direction and other departments in another, so that the net bias of the campus may be zero even though very strong biases are apparent in the subunits. Does one look instead at the outliers, those departments that have divergences so extreme as to call their particular practices into question? How extreme is extreme in such a procedure, and what does one do about units so small as to make such assessment meaningless?

We believe that there are no easy answers to these questions, but we are prepared to offer some suggestions. We propose that examination of campus bias must rest on a method of estimation of expected frequencies that takes into account the falsity of assumption 2 and the apparent propensity of women to apply to departments that are more difficult to enter.

We reanalyze Table 1, using all the data leading to it, by computing the expected frequencies differently than in approach A, since we now know the assumptions underlying that earlier computation to be false. We estimate the number of women expected to be admitted to a department by multiplying the estimated probability of admission of any applicant (regardless of sex) to that department by the number of women applying to it. Thus, if the chances of getting into a department were one-half for all applicants to it, and 100 women applied, we would expect 50 women to be admitted if they were being treated just like the men. We do this computation for each department separately, since each is likely to have a different probability of admission and a different number of women applying, and we sum the results to obtain the number of women expected to be admitted for the campus as a whole (11). This estimate proves to be smaller by 60 than the number of women observed to have been admitted (Table 3).

The computation of Table 3 is as follows: For a four-cell contingency table of the following format:

                  Admit     Deny
Men               $a_i$     $b_i$
Women             $c_i$     $d_i$

the particular cell of interest is $c_i$, containing the number of women admitted. The expected frequency under the hypothesis of no bias is $E = w_i p_i = (a_i + c_i)(c_i + d_i)/N_i$, where $N_i$ is the total of applicants to department $i$. The observed number, $O$, is the number in $c_i$. The difference between these two quantities, $O - E$, summed over $n$ departments is

$$\sum_{i=1}^{n} (O - E) = \mathrm{DIFF}$$

Then,

$$\chi^2 = \frac{(\mathrm{DIFF})^2}{\sum_{i=1}^{n} \dfrac{(a_i + b_i)(a_i + c_i)(c_i + d_i)(b_i + d_i)}{N_i^2 (N_i - 1)}}$$

with d.f. = 1. Ninety-six departments were included in the computation, since 5 of the total 101 each had only 1 applicant. If $N_i - 1$ is replaced by $N_i$ in the denominator, all 101 departments can be included, yielding $\chi^2$ = 8.61; $O - E$ remains 60.1 and the expected and observed female admittees are each increased by 1. (This statistic makes it possible to include contingency tables having an empty cell, so that no information is lost; there is thus an advantage over methods that pool the chi-square values from a set of contingency tables.)

The probability that an observed bias this large or larger in favor of women might occur by chance alone (under these new assumptions) is .0016; the probability of its occurring if there were actual discrimination against women is, of course, even smaller. This is consistent with what we found using Fisher's approach and aggregating the test statistics: there is evidence of bias in favor of women. [The test used here was proposed in another context by Cochran (12) and Mantel and Haenszel (13).]

We would be remiss if we did not point out yet another pitfall of approach A. Whereas the highly significant values of the Mantel-Haenszel or Fisher statistics just mentioned for 1973 are evidence that there is bias in favor of women, the low values obtained in other years (see below) do not indicate that every department was operating more or less without bias. Such low values could equally well arise as a consequence of cancellation. We illustrate with the hypothetical departments of machismatics and social warfare. If machismatics admitted 250 men and 50 women, creating a shortfall of 50 women, while social warfare admitted 200 women and no men, creating an excess of 50 women, the aggregate measure of bias we have introduced would be zero. We only argue that if an aggregate measure of bias is wanted the one we propose is reasonable. Of course, if we combine two-sided statistics by the Fisher method this phenomenon does not occur.

We would conclude from this examination that the campus as a whole did not engage in discrimination against women applicants. This conclusion is strengthened by similarly examining the data for the entire campus for the years 1969 through 1973. In 1969 the number of women admitted exceeded the expected frequency by 24; the probability of a deviation of this size or larger in either direction by chance alone is .196. In 1970 there were four fewer women admitted than expected, the probability of chance occurrence being .833. In 1971 there were 25 more women than expected, with a probability of .249. In 1972 there were seven more women than expected, the probability being .709. For 1973 as shown above the deviation was an excess of 60 women over the expected number: the probability of a chance deviation that large or larger in either direction is .003. These data suggest that there is little evidence of bias of any kind until 1973, when it would seem significant evidence of bias appears, in favor of women. This conclusion is supported by all the other measures we have examined. For instance, pooling the chi-square statistics by Fisher's method yields a probability of .99 in 1969, 1970, and 1971, a probability of .55 in 1972, and a probability of .029 in 1973 (14).

We may also take approach B and look for individual department outliers. Because the numbers of women students applying to some of them in any one year are often small, we aggregated the data for each department over the 5-year span, using the method just explained. (This procedure of course hides the kind of change that the aggregating approach reveals when pursued through time, but it enables us to focus on possible "offenders" in either direction in a campus that is on the average behaving itself.) During the 5-year period there were 94 units that had at least one applicant of each sex and admitted at least one applicant and denied admission to at least one in at least one year. Two of the 94 units, one in the humanities and one in the professions, show a divergence from chance expectations sufficient to arouse interest. One of these admitted 16 fewer women than expected over 5 years, a shortfall of 29 percent; the probability of such a result by chance alone in 94 trials is about .004. The other unit admitted 40 fewer women than expected over the 5-year period, a shortfall of 7 percent, with a probability in 94 trials of about .019. The next most likely result by chance was at a level of .094 and the next after that at .188. Conversely there were two units significantly biased in the opposite direction, with chance probabilities of occurrence of .033 and .047, accounting for a combined shortfall of 50 men, 13 and 24 percent respectively of the expected frequencies in the individual units.

The kinds of statistics we may wish to use in examination of individual departments may differ from those employed in these general screening processes. For example, in one of the cases of a shortfall of women cited above, it seems likely that an intensified drive to recruit minority group members caused a temporary drop in the proportion of women admitted, since most of the minority group admittees were males. In most of the cases involving favored status for women it appears that the admissions committees were seeking to overcome long-established shortages of women in their fields. Overall, however, it seems that the admissions procedure has been quite evenhanded. Where there are divergences from the expected frequencies they are usually small in magnitude (although they may constitute a substantial proportion of the expected frequency), and they more frequently favor women than discriminate against them.

More General Issues

We have already explained why assumption 1, the equivalence of academic qualifications of men and women applicants, is necessary to the statistical examination of bias in admissions. But the assumption is clearly false in its most extensive sense; there are areas of graduate study that men and women simply have not hitherto been equally prepared to enter. One of the principal differentiators is preparation in mathematics, which is prerequisite in an elaborate stepwise fashion to a number of fields of graduate endeavor (15).

This differentiation would have little effect on women's chances to enter graduate school if it were unrelated to difficulty of entry. But it is not. Although it would appear in a logical sense that the departments requiring more mathematics would be more difficult to enter, in fact it appears to be those requiring less mathematics that are the more difficult. (For the 83 graduate programs with matching undergraduate majors, the Pearson r between proportion of applicants admitted and number of recommended or required undergraduate units in mathematics or statistics is .38.) In part this may be because departments requiring less mathematics receive applications from persons who might have preferred to enter others but cannot for lack of mathematical (or similar) background, as well as from persons intrinsically inclined toward nonmathematical subjects. In part it is because in the nonmathematical subjects (that is, the humanities and social sciences) students take longer to get through their programs; in consequence, those departments have lower throughput and thus less room, annually, to accept new students. Just why this is so is a matter of debate and of great complexity. Some of the problem may lie in the very lack of a chain of prerequisites

Table 3. Sum of expected departmental outcomes of women's applications compared with sum of observed outcomes, Graduate Division, Berkeley, fall 1973. $\chi^2$ = 8.55, d.f. = 1, P = .003 (two-tailed).

Expected female admittees     1432.9
Observed female admittees     1493.0
Difference (O - E)            60.1
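The aggregate statistic summarized in Table 3 can be sketched directly from the formulas given for its computation. This is an illustrative sketch only: the function name and the tuple layout are hypothetical, and the departmental data from Berkeley are not available here, so the inputs are the hypothetical machismatics and social warfare departments described in the text, plus one mixed case invented purely for illustration.

```python
def pooled_bias_statistic(departments):
    """Return (DIFF, chi-square) for women admitted, summed over departments.

    Each department is a tuple (a, b, c, d): men admitted, men denied,
    women admitted, women denied. Departments with fewer than two
    applicants are skipped, because the N - 1 factor would be zero.
    """
    diff = 0.0
    variance = 0.0
    for a, b, c, d in departments:
        n = a + b + c + d
        if n < 2:
            continue
        # Expected women admitted if admission ignores sex in this department.
        expected_women = (a + c) * (c + d) / n
        diff += c - expected_women
        variance += (a + b) * (a + c) * (c + d) * (b + d) / (n * n * (n - 1))
    chi2 = diff * diff / variance if variance > 0 else 0.0
    return diff, chi2


# Both departments admit men and women at equal rates (Table 2).
fair = [(200, 200, 100, 100), (50, 100, 150, 300)]

# The cancellation example: a shortfall of 50 women in one department
# and an excess of 50 women in the other.
cancelling = [(250, 150, 50, 150), (0, 150, 200, 250)]

# Hypothetical mix, not from the article: one biased and one fair department.
one_sided = [(250, 150, 50, 150), (50, 100, 150, 300)]

for name, data in (("fair", fair), ("cancelling", cancelling),
                   ("one-sided", one_sided)):
    diff, chi2 = pooled_bias_statistic(data)
    print(name, "DIFF =", diff, "chi-square =", chi2)
```

The fair and cancelling cases both give a DIFF of zero, which shows why a low aggregate value does not by itself establish that every department is unbiased.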
such as that characterizing graduate work in, let us say, the physical sciences. Some may lie in the nature of the subject matter and the intractability of its data and the questions asked of the data. Some may lie in the less favorable career opportunities of these fields and in consequence a lower pull from the professional employment market. Some may lie just in the higher proportion of women enrolled and the possibility that women are under less pressure to complete their studies (having alternative options of social roles not generally open to men) and have less favorable employment possibilities if they do complete, so that the pull of the market is less for them. Whatever the reasons, the lower productivity of these fields is a fact, and it crowds the departments in them and makes them more difficult to enter.

The absence of a demonstrable bias in the graduate admissions system does not give grounds for concluding that there must be no bias anywhere else in the educational process or in its culmination in professional activity. Our intention has been to investigate the general case for bias against women in a specific matter, admission to graduate school, not only because we had the data base to do so but also because allegations of bias in the admissions process had been aired. Our approach in the beginning was naive, as befits an initial investigation. We found that even the naive question could not be answered adequately without recourse to sophisticated methodology and careful examination of underlying processes.

sities in seeking to equalize the progress of men and women toward their degrees (17). A university can use its powers of suasion to equalize the preparation of girls and boys in the primary and secondary schools for entry into all academic fields. By its own objective research it may be able to determine where and how much bias and discrimination exist and what the suitable corrective measures may be.

Summary

Examination of aggregate data on graduate admissions to the University of California, Berkeley, for fall 1973 shows a clear but misleading pattern of bias against female applicants. Examination of the disaggregated data reveals few decision-making units that show statistically significant departures from expected frequencies of female admissions, and about as many units appear to favor women as to favor men. If the data are properly pooled, taking into account the autonomy of departmental decision making, thus correcting for the tendency of women to apply to graduate departments that are more difficult for applicants of either sex to enter, there is a small but statistically significant bias in favor of women. The graduate departments that are easier to enter tend to be those that require more mathematics in the undergraduate preparatory curriculum. The bias in the aggregated data stems not from any pattern of discrimination on the part of admissions committees,

F is referred to the upper tail of a chi-square distribution with 2n degrees of freedom where n = number of experimental results to be aggregated, here 85. In our application here, $T_i$ is the usual contingency table chi-square statistic, with P value obtained from a table of the chi-square distribution with 1 degree of freedom.

5. This method uses as a statistic $\sum_{i=1}^{n} \chi_i^2$ having a chi-square distribution with d.f. = n (= 85 here). $\chi_i^2$ is the usual $\chi^2$ statistic in the ith 2 × 2 table.

6. In this application of Fisher's statistic (4), $T_i$ is ± the square root of the chi-square statistic with sign plus if there is an excess of men admitted and sign minus otherwise; the P value is the probability of a standard normal deviate exceeding $T_i$.

7. Transformation to linearity by simple changes of variable, for example to log (odds), is also not successful.

8. If $\pi_i$, $p_i$, and $p'_i$ represent, respectively, the probability of applying to department i, the probability of being admitted given that application is to department i, and the probability of being a male given that application is to department i, then a reasonable measure of the association of the numbers $p_i$, $p'_i$ is the correlation (weighted according to the share of each major in the applicant pool)

$$\rho = \frac{\sum_i \pi_i (p_i - p_\cdot)(p'_i - p'_\cdot)}{\left[\sum_i \pi_i (p_i - p_\cdot)^2 \, \sum_i \pi_i (p'_i - p'_\cdot)^2\right]^{1/2}}$$

where $p_\cdot$, $p'_\cdot$ are defined by $\sum_i \pi_i p_i$, $\sum_i \pi_i p'_i$, respectively. As usual, $|\rho| = 1$ indicates linear dependence between the $p_i$, $p'_i$ while $\rho = 0$ suggests "no relation." Positive values indicate "positive association" and so on. This correlation can be estimated by substituting the observed proportions of applicants to department i, admitted applicants to department i among applicants to department i, and male applicants to department i among applicants to department i for $\pi_i$, $p_i$ and $p'_i$, respectively. This is the statistic we call $\hat{\rho}$.

We can use $\hat{\rho}$ as a test statistic for the hypothesis that $\rho = 0$. To do so we need the distribution of $\hat{\rho}$ under that hypothesis. It turns out that $\hat{\rho}/(\mathrm{Var}\,\hat{\rho})^{1/2}$ has approximately a standard normal distribution. The expression $\mathrm{Var}\,\hat{\rho}$ is complicated because of the statistical dependence between $p_i$ and $p'_i$. Editorial considerations have prompted its deletion. It is obtainable from the authors.

9. The probability that an observation as extreme as (or more extreme than) the most extreme one would occur by chance alone, where n = number of simultaneous indepen-
We take this which seem quite fair on the whole, dent experiments or observations, and p = prob- to warn those who are ability of occurrence by chance of the most opportunity all but apparently from prior screening extreme observation if it had been selected concerned with problems of bias about at earlier levels of the educational sys- at random for a single observation, is 1 - (1 -p)", and thus for p close to zero is these methodological complexities (16). tem. Women are shunted by their so- approximately np. We also find, beyond this immediate cialization and education toward fields 10. Smallness of numbers of women applicants also invalidates the normal approximation area of concern in graduateadmissions, of graduate study that are generally used in the significance probabilities of ap- that the of bias and discrimi- more less of com- proach B, but this can be remedied. questions crowded, productive 11. This be as nation are more subtle than one may expressed might pleted degrees, and less well funded, 85 have imagined, and we mean this in and that frequently offer poorer pro- E (w)(pi) more than just the methodological fessional employment prospects. sense. If prejudicial treatment is to be where wi is the number of women applying to the ith and is the References and Notes major pi probability of minimized, it must first be located entry of any applicant into the ith major, the latter estimated from We have shown that it is 1. C. R. Blyth, J. Am. Stat. Assoc. 67, 364 being the number accurately. of admittees divided the number of not characteristicof the ad- (1972). by ap- graduate 2. J. Neyman, Lectures and Conferences on plicants. missions here Mathematical Statistics and 12. W. G. Cochran, Biometrics 10, 417 (1954). process examined (al- Probability (U.S. 13. N. J. Am. Stat. Assoc. 690 Department of Agriculture Graduate School, Mantel, 58, though this judgment does not elimi- Washington, D.C., ed. 147. (1963). 2, 1952), p. 14. Further of 3. R. A. 
Fisher, Statistical Methods Re- analysis these data, in particular nate the possibility of individual cases for examination of individual units search Workers (Oliver and Boyd, London, through time, of prejudicial treatment, and it does ed. 4, 1932). is in progress. 4. Fisher's statistic is 15. Research currently being conducted by L. not deal with politically or morally de- Sells at Berkeley shows how drastic this fined null hypotheses). The fairness of a screening process is, particularly with respect the in admissions is an F --2 ln p(T1) to mathematics. faculty impor- i==l 16. There is a real danger in naive determina- tant foundation for further effort. That tion of bias when the action following posi- where p(Ti) is the P value of the test statistic tive determination is punitive. On the basis effort can be made directly by univer- calculated for the ith experiment (department). of Table 1, which we have now shown to be 7 FEBRUARY 1975 403 misleading, regulatory agencies of the federal sciences, and also pioneering women in the 18. If the same naive aggregation is carried out government would have felt themselves justi- physical and biological sciences, where fed- for the 85 departments used in most of the fied in withholding substantial amounts of eral support has been more concentrated. analysis, N = 12,654, X2 = 105.6, d.f. = 1, research funding from the university. A 17. In fact, data in hand at Berkeley suggest a P =0. further danger in punitive action of this kind dramatic decrease in the early dropout rates 19. The investigation was initiated by E.A.H., is that, being concentrated in the research of women and the disappearance of the using data retrievable from a computerized area, which provides an important source differential in dropout rates of men and system developed by V. Aldrich. Advice on of support for graduate students, it punishes women. 
It will be several years before we statistical procedures in the later stages of not only male but also female students- will be able to judge whether this phenome- the investigation was provided by P.J.B., and women in areas in which women have tra- non is one of decreased or simply of delayed programming and other computation was done ditionally been enrolled, such as the social attrition. by J.W.O'C.
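The pooling of P values by Fisher's method, described in note 4, can be sketched in a few lines. This is a minimal illustration rather than the authors' program: the function name fisher_combined and the example P values are assumptions made here. It relies on the fact that the upper tail of a chi-square distribution with an even number of degrees of freedom, 2n, has the closed form exp(-x/2) times the sum of (x/2)^k / k! for k = 0, ..., n - 1.

```python
import math

def fisher_combined(p_values):
    """Pool P values by Fisher's method (hypothetical helper, not from the article).

    Returns (F, tail), where F = -2 * sum(ln p_i) and tail is the upper-tail
    probability of a chi-square distribution with 2n degrees of freedom.
    All P values must be greater than zero.
    """
    n = len(p_values)
    f_stat = -2.0 * sum(math.log(p) for p in p_values)
    # Closed-form survival function for even degrees of freedom 2n:
    # P(X > x) = exp(-x/2) * sum_{k=0}^{n-1} (x/2)^k / k!
    half = f_stat / 2.0
    term = 1.0   # k = 0 term of the series
    total = 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return f_stat, math.exp(-half) * total

# Made-up P values for three departments, for illustration only.
f_stat, pooled = fisher_combined([0.40, 0.07, 0.65])
print(f"F = {f_stat:.3f}, pooled P = {pooled:.3f}")
```

With a single P value the pooled probability reduces to that P value, which is a convenient sanity check on the series.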

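The expected number of women admitted in note 11, the sum of w_i p_i over majors, can be contrasted with the naive expectation taken from aggregate totals. The sketch below uses two invented departments; the counts, the dictionary layout, and the function names are assumptions for illustration only and are not the Berkeley data.

```python
# Hypothetical counts per department:
# (men applied, men admitted, women applied, women admitted)
departments = {
    "A": (800, 480, 100, 60),   # 60 percent admitted for both sexes
    "B": (200, 40, 400, 80),    # 20 percent admitted for both sexes
}

def expected_women_admitted(depts):
    """Sum of w_i * p_i, with p_i = admittees / applicants in department i."""
    total = 0.0
    for men_app, men_adm, wom_app, wom_adm in depts.values():
        p_i = (men_adm + wom_adm) / (men_app + wom_app)
        total += wom_app * p_i
    return total

def naive_expected_women_admitted(depts):
    """Expectation from pooled totals, ignoring which department was chosen."""
    applicants = sum(m + w for m, _, w, _ in depts.values())
    admitted = sum(ma + wa for _, ma, _, wa in depts.values())
    women = sum(w for _, _, w, _ in depts.values())
    return women * admitted / applicants

observed = sum(wa for _, _, _, wa in departments.values())
print("observed women admitted:", observed)
print("expected, department by department:", expected_women_admitted(departments))
print("expected, naive aggregate:", naive_expected_women_admitted(departments))
```

In these invented counts each department admits men and women at the same rate, so the department-wise expectation matches the 140 women observed, while the aggregate expectation of 220 suggests a shortfall of 80 that arises only because more women applied to the department that is harder to enter.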
Crisis Management: Some Opportunities

International emergency cooperation involving governments, technology, and science is now foreseeable.

Robert H. Kupperman, Richard H. Wilcox, Harvey A. Smith

Dr. Kupperman is Chief Scientist and Mr. Wilcox is Chief of Military Affairs of the U.S. Arms Control and Disarmament Agency, Washington, D.C. 20451. Dr. Smith is Professor of Mathematics at Oakland University, Rochester, Michigan 48063.

Many alarming trends of our present culture share common roots. Worldwide inflation, worldwide resource shortages, extensive famine, and the inexorable quest for more deadly weapons may very well reach crisis proportions if these trends continue. They serve already as examples of national and international failures of efficient resource allocation and communications. It is important that we understand the possible future implications that these failures hold and, more important, that we develop means for dealing with them.

In discussing the crisis management demanded by such situations it is tempting to start by defining what is meant by a crisis, but this is a difficult matter. Crises are matters of degree, being emotionally linked to such subjective terms as calamity and emergency. In fact it is not necessary to define crises in order to discuss problems generally common to their management, including the paucity of accurate information, the communications difficulties that persist, and the changing character of the players as the negotiations for relief leave one or more parties dissatisfied.

In a sense, crises are unto the beholder. What is a crisis to one individual or group may not be to another. However, crises are generally distinguished from routine situations by a sense of urgency and a concern that problems will become worse in the absence of action. Vulnerability to the effects of crises lies in an inability to manage available resources in a way that will alleviate the perceived problems tolerably. Crisis management, then, requires that timely action be taken both to avoid or mitigate undesirable developments and to bring about a desirable resolution of the problems.

Crises may arise from natural causes or may be induced by human adversaries, and the nature of the management required in response differs accordingly. Thus the actions required to limit physical damage from a severe hurricane and to expedite recovery from it differ substantially from the tactics needed to minimize the economic effects of a major transportation strike and to moderate the conditions which caused it. Yet each also exhibits some characteristics of the other. For example, recovery from the devastation wrought by the hurricane's wind and floodwaters brings competition among different managers whose conceptions of recovery differ: Is the goal to reestablish the status quo, including slums, or to seize upon the opportunity for urban renewal? Similarly, a transportation strike may cause such economic chaos that the Congress (535 crisis managers) might threaten to pass laws that are detrimental to a union leadership's prestige and control over its members.

It is useful to note the characteristics common to most crisis management. Perhaps the most frustrating is the uncertainty concerning what has happened or is likely to happen, coupled with a strong feeling of the necessity to take some action anyway "before it is too late." This leads to an emphasis on garnering information: military commanders press their intelligence staffs, and civil leaders try to get more out of their field personnel and management information systems. Unfortunately, few conventional information systems are equal to the task of covering unconventional situations, so managers in a crisis must frequently fall back upon experience, intuition, and bias to make ad hoc decisions (1).

The problems of uncertainty are exacerbated by the dynamic nature of many crises. Storms follow unpredictable courses; famine is affected by vagaries in the weather; terrorists perform apparently irrational acts; and foreign leaders, responding to different value systems or simply interpreting situations differently, select unexpected courses of action. Thus, with limited information and resources the manager may find it difficult just to keep up with rapid developments, let alone improve the overall picture of the situation.

During a crisis, not only does an involved manager suffer from poor information, but he has the problem of identifying the objectives he wishes to accomplish and ordering them by priority in accord with his limited re-