When is Sex Normal?

Jennifer Johnson-Hanks
UC Berkeley

Nobody realizes that some people must expend tremendous energy merely to be normal.1

A lot of regression-based sociological research is terribly boring…Throwing away regression methods is no cure for mediocrity and boredom in sociology. The root problem is that a lot of everything is mediocre and boring, and ever will be, because “average” is the literal definition of “mediocre,” and mediocrity is a never-fail formula for boredom.2

If “average” is mediocre and mediocrity is boring, then why write a paper about averages? Well, averages are everywhere: Policy analysts often report on the expected effect of a policy change on the average person or household. Newspaper stories often start with a narrative of the experiences of a presumably average person. Comparative datasets include national-level life expectancy, per-capita GDP, and average years of completed schooling. Whether thinking about social norms or about average treatment effects, much of social science focuses on understanding central tendencies, that is, understanding what is average, or normal.

The words “norm” and “normal” are commonly used in at least two quite different senses. On the one hand, social norms are abstract ideas about what forms of behavior are right and good; they reflect shared values, even ideals, whether or not those preferred forms of action are ever observed in practice. On the other hand, statistical norms are measures of central tendency, of average behavior in a defined population: what is, not what should be. From our first courses in quantitative social science we are taught to distinguish these two ideas, to recognize that statistical norms only correspond to social ones under particular circumstances, and to hesitate to assume a social norm where our only evidence is statistical. The reductio is: if lice are common in a particular elementary school, we should not conclude that there is a social norm in favor of lice. This is good advice, and papers that conflate social with statistical norms often make grave conceptual errors. And yet, it is also important to remember that the average itself was introduced into the social sciences by the Belgian astronomer-demographer Adolphe Quetelet, most famously in his 1835 book Sur l'homme et le développement de ses facultés, specifically in order to quantify social norms. Over the past 180 years, we have come to view this equation skeptically; however, the intuitive attractiveness of

1 A. Camus, Notebooks 1942-1951.
2 Stolzenberg 2003:426.

the isomorphism between averages and ideals also provides Quetelet with a steady stream of (usually unwitting) supporters, who assume that social ideals are transparently revealed through statistical outcomes, or that observed statistical outcomes transparently accord with the intentional projects of social actors3.

In this paper, I consider the relationship between statistical and social norms in the 19th century, through the work of Adolphe Quetelet and Francis Galton, and then again in the latter part of the 20th century, through the US sex surveys of Alfred Kinsey and Edward Laumann4. In both periods, the concepts of norms and the normal stand in close relation to the concepts of distribution and variation. Indeed, much of the development of our contemporary conceptual armature of normalness occurs through the expansion and normalization of ideas about variation and measures of dispersion. My central argument is that the relationship between statistical and social norms varies with time and context, and that this variation reveals something important about those contexts themselves. Social norms can influence statistical ones, and vice versa. And statistical distributions can similarly both shape, and be shaped by, the distribution of social processes.

As quantitative social scientists, we often work with data about statistical norms, whether we use data from the census or from cell phones. Many of the explanations that we offer for those statistical patterns draw on social norms. We know that those two are different things, and are usually careful to distinguish between them. Still, we rarely undertake to explicitly consider how social and statistical norms are related, either in a specific study or over time. By thinking carefully about the relationships between social and statistical norms, we can begin to tease out the relationships between different kinds of human aggregates: between a subgroup and a society, or between a population and a public.

Social and Statistical Norms in the 19th Century: Quetelet and Galton

Although people collected and reported population averages starting in the late 17th or early 18th century, it was Adolphe Quetelet in the 19th century who created quantitative social science as a discipline of averages. Quetelet, the Belgian astronomer-cum-demographer who imported the Gaussian distribution into the social sciences, was a remarkable person, who stood at an extraordinary historical crossroads. If Quetelet had not integrated emerging concepts in probability with a growing body of data about social things, surely someone else would have eventually done so. But it would probably have taken a very long time. Quetelet had a unique biography and a unique intellectual temperament that led him to construct a new kind of quantitative social science.

Adolphe Quetelet was born in Ghent, which was then part of Revolutionary France, in 1796. He was the fifth of nine children, son of a municipal officer named François Quetelet, who died when Adolphe was seven years old. This put the family in financial hardship,

3 For a spirited defense of this latter position, see Goldthorpe 2016, especially pages 124-125. Quetelet himself considered individual choice to be only a minor perturbation.
4 That is, Kinsey et al., Sexual Behavior in the Human Male (1948) and Sexual Behavior in the Human Female (1953); and Laumann et al. 1994, The Social Organization of Sexuality.

contributing to Adolphe’s starting to teach mathematics at the age of seventeen5. Beyond his family milieu, this was a period of exceptional political disruption: Quetelet was eight at Napoleon’s coronation, and 19 when Belgium was annexed to the Netherlands following Napoleon’s defeat at Waterloo. Under the Dutch, a new university was built at Ghent; Quetelet submitted the first doctoral thesis at the new university in 1819, and was appointed to the chair of Elementary Mathematics at the Atheneum in Ghent6. He was 23 years old.

Four years later, Quetelet went to Paris to study astronomy at the Observatory under the guidance of François Arago and Alexis Bouvard7. Through Bouvard, he met Joseph Fourier, Siméon Denis Poisson, and Pierre-Simon Laplace8. These scholars introduced him to new ideas in mathematics and the sciences through which probability theory was coming into conversation with the evaluation of empirical data. Today we think especially of the Central Limit Theorem, but it is useful to think of Quetelet adopting more broadly the Laplacian view of “probability as an instrument for repairing defects in knowledge”9. In the early 19th century, that was a completely new and radical proposal, particularly because of its anti-religious implications. These were unsettled times, politically and scientifically, and Quetelet was part of a cohort of thinkers forging a new understanding of the human world.

Figure 6.1: Quetelet’s drawing of the Normal distribution

5 Information about Quetelet’s early life comes from Hankins, 1908, pages 10-11.
6 The dissertation was in mathematics, on conic sections, titled De quibusdam locis geometricis, necnon de curva focali.
7 Donnelly (2016:95) argues that Quetelet’s respect for, and skills in, scientific administration are also properly attributed to his apprenticeship to Arago and Bouvard.
8 Add a note on Villarme on population here.
9 Gillespie 1997: 14-15.

Quetelet returned to Brussels in 1824 and was promoted to professor of higher mathematics at the Atheneum. That same year, he married Cécile Curtet, the daughter of a French physician working in Brussels10. Quetelet worked tirelessly to see an observatory built in Belgium, and in 1832, the observatory in Brussels finally opened. Quetelet became its first director, and lived at the Observatory, where he worked on statistical, geophysical, and meteorological data.

Although he worked to promote astronomy throughout his life, it is for the application of new scientific ideas to the study of the social that Quetelet is best known. His earliest publications included mortality tables with special reference to actuarial problems of insurance (CITE, 1827) and a general statistical handbook of Belgium (1828). He was instrumental in instigating the 1829 population census11, and indeed became the leading advocate for the normalization of censuses worldwide. In his most famous work, Sur l’homme et le développement de ses facultés12, Quetelet argues for the development of quantitative methods for understanding human society. In particular, he argues for bigger datasets, on the grounds that more data will yield better estimates: "The greater the number of individuals, the more the influence of the individual will is effaced,

10 Lottin 1912: 29. The couple had two children, a son Ernst and a daughter whose name appears not to have survived. Cécile died in 1858, when Quetelet was 62 years old, and his daughter “soon after” (Lottin 1912:101).
11 Of which the results were published later and separately for Holland and Belgium, following the Revolution of 1830.
12 This is the title of the original publication of 1835, which appeared with the subtitle Physique sociale. In 1869 he published a revised version, including an introduction by John Herschel, with the title and subtitle reversed.

being replaced by the series of general facts that depend on the general causes according to which society exists and maintains itself."13 This idea was based in the emerging theory of probability, joined to the assumption that individuals within a population are like independent draws from a single distribution. In Paris, Quetelet had learned that errors in astronomical observation approximate the Normal distribution, and that the best estimate of the actual location of an astronomical body is the mean of the distribution of observations. He then argued that variation in a human population was analogous to errors in observation; both are subject to the same Laplacian Law. The population mean, therefore, was not a mere artifact, but a real thing, the ideal to which all strive, the target at which all are aimed. He called this average l’homme moyen: the average man:

This determination of the average man is not merely a matter of speculative curiosity; it may be of the most important service to the science of man in the social system. It ought necessarily to precede every other inquiry into social physics, since it is, as it were, the basis. The average man, indeed, is in a nation what the center of gravity is in a body; it is by having that central point in view that we arrive at the apprehension of all the phenomena of equilibrium and motion.14

However, Quetelet did not consider all variation to be analogous to error. He distinguished two kinds of forces. Constant forces—under which he included age, sex, profession, educational level, and economic and religious situation—he thought would act in a “continuous manner, with the same intensity and in the same direction”15. These forces he thought acted jointly16, and jointly accounted for the mean of each population distribution17. The other kind of forces Quetelet called accidental or "perturbative" causes. These are the analogue of error: Quetelet argued that perturbative causes do not operate in a constant direction, but “equally in both directions,” and so tend to cancel out when a population is considered at the aggregate level.

The idea that the population mean is like the center of gravity in a body is most plausible when the variation looks similar to the Normal distribution. That is, when the variation is symmetric, not too large, and the distribution is unimodal, such that the population average is also the uniquely most common outcome. This is approximately true for heights, separated by sex, for example, although not for heights when men and women

13 Quetelet 1831:80. Recherches sur le Penchant au Crime aux Différents Âges.
14 Quetelet 1842:96.
15 Quetelet 1849:107.
16 He often considers these constant forces in conjunction, presenting 2-way, 3-way, and even 4-way tables, and thereby anticipating the 20th century approach to partial causation enshrined in multivariate regression.
17 Thus Quetelet’s social physics is quite amenable to policy interventions: by changing levels of education, religion, or profession, it would be possible to change the mean level of an outcome such as suicide or marriage.

are taken together.18 In cases where a variable is nearly normally distributed and the variance is relatively small, the population mean does give a good “feel” for the population as a whole. For example, the average American woman is 5’6” and the standard deviation is about 3.5”: about 70% of American women are between 5’2” and 5’10”. Here, the mean is informative.

Quetelet was very aware that the idea of l’homme moyen would only make sense when distributions were (at least approximately) Normal. But he also felt confident that any natural measure in any natural population would follow that distribution. He was so confident, in fact, that he asserted that any time we observe a distribution that is not Normal, we should infer that the data come not from a single, true population, but rather from multiple populations mixed together (CITE). This is actually a rather strange idea. It is of course true that if you mix two or more Normal distributions, you are likely to end up with a non-Normal one19. But that does not imply that any non-Normal distribution is the product of population mixing, as long as you can accept that some attributes are not normally distributed. And that, Quetelet was unwilling to concede.

His book begins with sex-segregated height and chest circumference and advances through demographic outcomes like fertility and mortality to end with what he calls moral traits, like bravery and criminality. All of these, he asserted, followed the same Law as astronomical observation and games of chance. Although Quetelet had limited data about the population distribution of “moral qualities”, he was adamant that they, too, would follow the law of errors. Working from the data available to him, Quetelet shows regularities in the number of suicides from year to year, in the number and kinds of crimes, and in the rate of marriage for each sex and age cohort.
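The claim that mixing Normal distributions typically yields a non-Normal result can be sketched numerically. Here is a minimal example in Python; the means and common standard deviation are invented for illustration, not Quetelet’s data. An equal-weight mixture of two well-separated Normals has a peak near each component mean and a dip at the overall average:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a Normal distribution with mean mu and SD sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def mixture_pdf(x):
    # Equal-weight mixture of two hypothetical, well-separated Normals
    # (means 60 and 70, common SD 2.5).
    return 0.5 * normal_pdf(x, 60, 2.5) + 0.5 * normal_pdf(x, 70, 2.5)

# The mixture is highest near each component mean and dips at the
# overall mean of 65: the population average is the *least* typical
# value in that neighborhood, and the distribution is not Normal.
for x in (60, 65, 70):
    print(x, round(mixture_pdf(x), 4))
```

With less separation between the component means, as with actual male and female heights, the dip disappears and the mixture is merely “lumped,” as footnote 18 notes.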
Because he was working in a period of relatively limited population growth, and over relatively short periods of time, the difference between working with counts and with rates was arithmetically small. Still, it was conceptually important, because Quetelet anticipated one contemporary approach to rates, writing that the probability of committing a crime at various ages was the propensity to crime—the penchant au crime20. This idea rests on the argument that “causes are proportional to the effects they produce”21. If a man can lift twice as much as another man, he is twice as strong as that man. If a man runs away at half the threat that scares another man, he is half as brave. The theory of penchants therefore allowed Quetelet to argue

18 That is, when men and women are considered separately, height is approximately, but not exactly, normally distributed. It is not exactly Normal because the tails are too heavy and there is a slight amount of skew. Mixing men and women together produces either a “lumped” or a mildly bimodal distribution. See Schilling et al., 2002. The case of male height offered Quetelet an early application of his theory. In 1844 he showed that there was a discrepancy between the distribution of height of 100,000 French conscripts and the theoretical distribution. From this he estimated that some 2,000 men had evaded the draft by getting measured as just below the minimum height.
19 To be clear, it is indeed possible to produce a Normal distribution by mixing other Normal distributions, as shown most dramatically by Galton’s double quincunx a half-century later. The point here is simply that mixing two Normal distributions does not usually produce that effect.
20 And sometimes he writes “tendance au crime”.
21 Quetelet 1831:7.

that the Normal distribution applied even to rare, punctate events like crime: what was Normally distributed was not crime itself, but the propensity to crime, of which the only evidence was the rates. To say that the propensity to crime is Normally distributed in a population is, for Quetelet, equivalent to saying that the mean propensity is a real attribute of the population, the product of the constant causes, and the variance is error, the result of perturbative causes. Thus, when he famously wrote that, “Society prepares the crime, and the guilty person is only the instrument by which it is executed,”22 he was again referring to l’homme moyen. All men are subject to the same forces, and some will happen to have an unlucky draw from the distribution of perturbative causes. For Quetelet, therefore, statistical norms reveal underlying social forces, not through the intentions of individual people—intentions for him are among the more minor of the perturbative causes—but in the same way that gravitational forces act on astronomical bodies. The statistical captures the social as a telescope captures the transit.

In the summer of 1855 Quetelet experienced what appears to have been a stroke, and his mind never fully recovered. He continued to write, but never with the same vigor or clarity. He died in 1874.

Francis Galton

A half-century after Quetelet’s Sur l’homme, Galton published Hereditary Genius, arguing that the population average is not the ideal to which all strive, but the mediocre “mean” to which subsequent generations regress. Francis Galton was born into a remarkable family in Birmingham, England in 1822. More interested in nature than nurture23, Galton shared Quetelet’s faith in the Law of Errors, but saw little role for social forces in crafting it. His grandfathers were Erasmus Darwin, the physician-philosopher, and Samuel Galton, a leading industrialist24.

The youngest of nine children, Francis was born six years after his next older sibling and was the “pet of us all,” according to his sister Elizabeth25. Francis’ earliest years were in the care of his sister Adèle, who taught him to read and write, as well as some Latin and French, basic arithmetic, and the calculation of money and time, all before he was five26. At eight he was sent to boarding school abroad that he might become fluent in French, and at 16 he began as a medical student at Birmingham General Hospital, then at King’s College in London. In 1840, he undertook the first of a series of grand trips, this one to Germany, Austria, Turkey, and Italy, before returning to England to study mathematics at

22 Quetelet 1835:PAGE.
23 A turn of phrase that Galton himself invented, in CITE.
24 Erasmus Darwin, Samuel Galton, and Josiah Wedgwood were all founding members of the “Lunar Society”, so named because they met on the evening of the full moon. It has always been a very small intelligentsia.
25 Forrest, 1974:5.
26 Several authors have noted Francis Galton’s own genius as the source of his interest in the topic, although Forrest corrects Terman’s estimate of Galton’s IQ from 200 down to only 160. Ibid, page 7.

Cambridge as an interlude in his medical studies. At Cambridge he did passably, but not as well as might be expected from his astonishing childhood. He was sickly and nervous, and completed his studies there without distinction. Galton then returned to complete his medical training, which he continued to pursue until his father’s death in 1844 brought him a considerable inheritance and freed him from the obligation of a medical career27.

After the death of his father, Galton travelled and read, spending more than a year (October 1845 to November 1846) exploring Egypt and Syria, and another two full years in south-western Africa (April 1850 until April 1852), in lands that had been previously largely unknown to Europeans. Descriptions of this time are somewhat contradictory. On the one hand, it is described as a time of careless debauchery when Galton “sowed his wild oats”28 and did little of intellectual merit. On the other hand, this was a time when Galton very clearly began to make his scholarship his own, something he pursued for his own reasons, rather than only to please his father. He put his sextant and trigonometry to use in calculating the body shape of a “Hottentot Venus” from a distance29, and his letters—and later his books—describe the people he encountered with such curiosity and passion that it is hard to see these voyages as the “blank years” they are often described as30. He left for Egypt apparently certain only that he did not want to study medicine; he returned from south-western Africa enthralled with empirical questions about human variation.

Galton returned to London in 1852, and settled in with his mother and sister Emma. He was awarded the Gold Medal of the Royal Geographical Society in 1853 for his exploration of “the countries of the Namaaquas, the Damaras, and the Ovampo”31, and he married that same year. Louisa Butler Galton had a lively intelligence and curiosity for travel.
They set off almost immediately for the continent, and travelled extensively, although always in places more comfortable and familiar than Francis had visited alone. After the success of his published account of his southern African adventures, Galton produced a book of advice for wayfarers, called The Art of Travel; or, Shifts and Contrivances Available in Wild Countries. It is a wonderfully practical book, with advice on avoiding blisters and fording rivers, and it went through five editions. With money inherited from his father, an intelligent wife from a scholarly family, and the regard of his colleagues, Galton was able to live the comfortable life of a gentleman scientist of the age, serving on the Council of the British Association for the Advancement of Science and writing extensively on his inventions, on meteorology, anthropology, photography, fingerprints, and—most famously—eugenics and heredity32.

27 This paragraph depends on Bulmer 2003:4-5 and Forrest 1974:10-26.
28 Bulmer 2003:10.
29 Galton 1853:53-54.
30 Pearson 1914:PAGE. Although he equivocates, Forrest (1974, especially page 35) takes a view similar to mine.
31 Galton 1908, Memories of My Life. London. Page 150.
32 That is not to say that his life was uncontroversial or without conflict. As a prominent member of the Royal Geographical Society (RGS), he played a leading role in the “Stanley Affair,” in which H.M. Stanley, a newspaper correspondent, set off to find (and publish the

In his Memories, Francis Galton describes reading Darwin’s On the Origin of Species soon after its publication in 1859: “The publication … made a marked epoch in my own mental development, as it did in that of human thought generally. Its effect was to demolish a multitude of dogmatic barriers by a single stroke, and to arouse a spirit of rebellion against all ancient authorities whose positive and unauthenticated statements were contradicted by modern science”33.

The core of Darwin’s book argues that there are generally more creatures born than can survive to reproduce, and so those who do reproduce must (on average) have some advantage—however slight—over those who do not. If these traits could be passed on somehow to the offspring, then the generation of offspring would have a slightly different distribution of traits than the generation of parents. The force that leads to differential survival and reproduction he calls “natural selection.” But before moving to the case of natural selection, Darwin begins with a chapter on selection under domestication. For many readers in the middle of the 19th century, the idea that breeds of dogs or horses could be improved over generations by careful breeding would have been familiar and even acceptable within their religious worldviews. By starting with selection under domestication, Darwin was thus trying to make his radical ideas somewhat more familiar. For Galton, however, it was this chapter—on domesticated creatures—that opened radical new ideas. If humans could select faster horses or finer dogs by selective breeding, perhaps they could do the same for smarter people. Perhaps not only simple animalian traits, but also complex human ones like genius, were hereditary.

Quetelet came to questions of human variation out of an interest in probability and its application to data.
Human variation provided a rich body of as-yet unexplored data, and he was fascinated by how the probability theory he had learned in the context of astronomy seemed to apply to human society as well. But Galton’s intellectual journey was the reverse: he came to questions of probability out of an interest in the empirical variation between people, both within and across populations. He was an indifferent and not particularly distinguished student of mathematics at Cambridge, and his first passion was for exploration. The fact that Galton ended up making important contributions to statistics (particularly through Edgeworth), while Quetelet shaped social theory (through Halbwachs and Durkheim), is therefore a double irony.

Galton wrote extensively: at least 20 books and over 300 other papers, notes, and circulars. He is best known for his 1869 book Hereditary Genius, for the idea of, and the term, “eugenics,” and for contributions to the methods of correlation and regression, particularly as

journal of) Britain’s beloved explorer Dr. Livingstone, who was funded by the RGS, overturning what the RGS perceived as proper protocol and legitimate scientific procedure. See the amusing discussion in Forrest 1974:114-121.
33 Galton 1908: 287. This passage is a poetic complement to Darwin’s discussion of his own inspiration upon reading the work of Thomas Malthus: "In October 1838, that is, fifteen months after I had begun my systematic inquiry, I happened to read for amusement Malthus on Population, and being well prepared to appreciate the struggle for existence which everywhere goes on from long-continued observation of the habits of animals and plants, it at once struck me that under these circumstances favourable variations would tend to be preserved, and unfavourable ones to be destroyed. The results of this would be the formation of a new species. Here, then, I had at last got a theory by which to work". (1876: PAGE)

developed in an 1885 paper in the Journal of the Anthropological Institute, but also in other papers. We will consider him here for his interest in population distributions, and especially the role that Normal distributions played in his thinking about variance.

Hereditary Genius argues that “man’s natural abilities are derived by inheritance, under exactly the same limitations as are the form and physical features of the whole organic world”34. Galton supports this claim by tracing in great detail the genealogies of eminent British men in a large number of domains: judges, military commanders, literary men, men of science, and so forth, showing that their relatives are more eminent than would be predicted by chance. For his model of how ability would be distributed by chance, he takes the Normal distribution, as described by Quetelet in a paper translated into English and published in 184935. After discussing the heights of men of an isolated and intermarrying population, he continues:

This law of deviation from an average is perfectly general in its application. Thus, if the marks had been made by bullets fired at a horizontal line stretched in front of the target [instead of by marking men’s heights], they would have been distributed according to the same law. Wherever there is a large number of similar events, each due to the resultant influences of the same variable conditions, two effects will follow. First, the average value of those events will be constant; and, secondly, the deviations of the several events from the average, will be governed by this law (which is, in principle, the same as that which governs runs of luck at a gaming-table).36

This is precisely Quetelet’s model of human variation as a form of error, following the same natural law as any other chance process.
In the next paragraph, Galton continues along this same path, asserting—again like Quetelet—that when one observes a distribution that does not follow this law, one should infer that the data come from two different “races” mixed together.

Galton’s next move, however, breaks with Quetelet. Whereas Quetelet sees the mean of the Normal distribution as the ideal to which all strive, like a target at which the bullets are aimed, Galton sees the mean as the “mediocrity”37 to which subsequent generations “regress”. High ability—the far right tail of the distribution—is Galton’s interest. He presents data from the admissions exam for the Royal Military College at Sandhurst from 1868, compared to a

34 Galton 1869:1.
35 Ibid, pages 26-33 and pages 377-383. Galton is purported by his biographers to have read French very well. Still, he refers only to the one paper of Quetelet’s that was translated into English, not to any of the other work, and his reference to “La Place” (page 378) shows no evidence that Galton read his work.
36 Ibid, pages 28-29.
37 “The meaning of the word ‘mediocrity’ admits of little doubt. It defines the standard of intellectual power found in most provincial gatherings, because the attractions of a more stirring life in the metropolis and elsewhere, are apt to draw away the abler classes of men, and the silly and the imbecile do not take part in the gatherings. Hence, the residuum that forms the bulk of the general society in small provincial places, is commonly very pure in its mediocrity.” Galton 1869:35.

theoretical distribution based on Quetelet’s tables. These two, he asserts, accord “quite as closely as the small number of persons could have led us to expect.”38

Figure 6.2: Galton’s original Sandhurst table, Page 33:

Despite Galton’s assertion, the two distributions only accord at the top, as is clear when you graph them39. Galton argues that men of low ability did not compete, accounting for the differences on the left side of the distribution. Perhaps. But the centers of the distributions differ as well, and in the opposite direction. If Galton were interested in any part of the distribution other than the right tail, he would surely have noted that the observed data are too concentrated and are skewed, in contrast to the theoretical symmetry. Indeed, perhaps he did notice in the privacy of his own office. But the text nonetheless asserts that the two distributions are indistinguishable, and the book proceeds on that assumption.

Figure 6.3: Galton’s Sandhurst table, in graphical form

38 Ibid, page 32.
39 I have scaled the “in theory” numbers to the sum of the “in fact” numbers.

[Bar chart of candidate counts by score band, from “below 400” to “5800 to 6500”, comparing “According to Fact” with “According to Theory”.]

In later work, Galton thought about the relationship between two Normal distributions, such as the distributions of fathers’ and sons’ heights, and through those observations developed the concepts of correlation and regression. These discoveries remain enormously influential, and underlie much of contemporary practical statistics. The mathematics behind his ideas is simple enough, and even at the time, it was not difficult for other people of science to understand once Galton published it. Galton’s gift was to see it first40. Yet despite his remarkable abilities of insight, he apparently never explored the idea that some human distributions might not be Normal. Like Quetelet, Galton considered human variation to strictly follow the Law of Errors, and allowed for no exceptions.

Galton’s statistical norms are remarkably devoid of a social context. He showed, with his double quincunx, that a Normal mixture of Normal distributions was itself Normal, and argued—sometimes against his own data—that all forms of ability were Normally distributed generation after generation. From this he inferred (or perhaps he assumed all along?) that marriages were not assortative by ability, despite the fact that his own life and family belied this inference. Similarly, Galton shows very little interest in the ways that ability might be shaped by training or context, although his own training surely attested to the importance of connections and money in acquiring the skills and credentials necessary for the forms of accomplishment he calls genius. The social world Galton depicts is one governed by the laws of heredity and chance, and not by laws or norms.

Louisa Butler Galton died in 1897. Following her death, Francis was increasingly unwell, and from about 1907 was largely infirm. He died in London in 1911 at age 88. They had no children.
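Galton’s “regression” toward the mean admits a small numerical sketch. For two jointly Normal traits with equal spread and correlation r, the expected value of the second trait lies a fraction r of the way from the population mean to the first trait’s value. The mean of 69 inches and the correlation of 0.5 below are invented for illustration, not Galton’s own estimates:

```python
# Sketch of regression toward the mean for two jointly Normal traits
# with equal standard deviations (e.g., fathers' and sons' heights).
# The mean and correlation are hypothetical, chosen for illustration.
def expected_child_height(parent, mean=69.0, r=0.5):
    """Conditional expectation under a bivariate Normal with equal SDs:
    the child is expected to fall a fraction r of the way from the
    population mean toward the parent's value."""
    return mean + r * (parent - mean)

# A very tall father (76") is predicted a son above the mean, but
# closer to it: "regression toward mediocrity."
print(expected_child_height(76.0))  # 72.5

# The pull is symmetric: a short father (62") is predicted a son
# above his own height, again toward the mean.
print(expected_child_height(62.0))  # 65.5
```

The same formula, run in reverse, predicts that the fathers of tall sons were themselves closer to the mean, which is why the regression is a property of the joint distribution rather than a causal force.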

40 Keynes 1991:99 makes this same point.

Why should the distributions of social traits be Normal? And why are they mostly NOT Normal? Galton took his probability theory from Quetelet, and Quetelet based his ideas about human probabilities on models from coin tosses (the Binomial) and errors of observation (the Normal). These two distributions are extremely similar, especially when the sample size is large. Why might analogies to coins or astronomy work for understanding the social world, and what would it mean if they did? Let us first consider the Binomial distribution, which describes the number of heads in a set of coin tosses, or the distribution of balls in a quincunx. If we toss ten fair coins at once and count up the number of heads, we will get a number from zero to ten. If we do this a large number of times and graph the results on a histogram, we will get a distribution heaped up in the middle and declining at either end: there is only one way to get ten heads in ten coins, but there are ten different ways to get exactly nine heads, 45 ways to get exactly two, and 252 ways to get exactly five.

Figure 6.5: The Binomial distribution when 10 coins are thrown

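The combinatorial counts above are just binomial coefficients, and can be verified directly; a minimal sketch in Python:

```python
from math import comb

# Number of distinct ways to get exactly k heads in ten coin tosses: C(10, k).
ways = [comb(10, k) for k in range(11)]

print(ways[10])  # 1: only one way to get ten heads
print(ways[9])   # 10 ways to get exactly nine
print(ways[2])   # 45 ways to get exactly two
print(ways[5])   # 252 ways to get exactly five
```

Summing all eleven counts gives 2^10 = 1024, the total number of equally likely sequences of ten tosses.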

Now, if we go from tossing ten coins at a time to tossing 100 coins at a time, the numbers become enormous but the shape becomes very familiar41:

Figure 6.6: The Binomial distribution when 100 coins are thrown

41 Note that although the number of times we expect to see exactly 10, 20, or even 30 heads out of 100 tosses is so minuscule as to be invisible on this scale, it is not exactly zero. We would expect to get 33 or fewer heads from 100 coins about 4 times in 10,000 runs.


As the number of cases increases, the Binomial distribution comes to approximate the Normal (or Gaussian) distribution that was so dear to Quetelet. In the Normal as in the Binomial, the expected number of cases in the tails is minuscule but not exactly zero, and the majority of the cases are bunched up symmetrically in the middle. For tosses of fair coins, the reason for the shape is combinatorial: each coin is independent, and so there are simply many more combinations that yield a moderate number of heads than that yield a very low or very high number. When we think about Gaussian error, the principle is similar: the myriad little forces that act to perturb an observation or outcome are—in theory at least—independent and unbiased. Therefore, in the large majority of cases, some of these little forces will perturb up and others down, or some left and others right, with the net effect being a distribution of errors that closely resembles the Binomial. This is what we mean when we say that “errors cancel”: like the differences in air resistance, pressure, and force that act on a fair coin, each of the myriad little forces of error is no more likely to push left than right, and so the joint effect is most often neutral, resulting in the familiar bunched-up shape. Any time that an outcome is the result of a large number of small, symmetric, and independent forces, we might plausibly—following Quetelet and the tradition since him—hypothesize that the population distribution will be roughly Normal. In fact, most of us—demographers and lay-people alike—assume that most outcomes are Normally distributed at the population level. Grounded in theory and enshrined in over 150 years of social science, it just feels intuitively right that the average is the middle of a symmetrical distribution with very light tails.
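Footnote 41's order-of-magnitude claim, and the quality of the Normal approximation to the Binomial, can both be checked numerically; a sketch using only the Python standard library:

```python
from math import comb, erf, sqrt

# Exact probability of 33 or fewer heads in 100 tosses of a fair coin.
p_exact = sum(comb(100, k) for k in range(34)) / 2**100

# Normal approximation: mean 50, standard deviation 5, with a
# continuity correction at 33.5.
def normal_cdf(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

p_normal = normal_cdf((33.5 - 50) / 5)

# Both values are on the order of 4 in 10,000.
print(p_exact, p_normal)
```

The two answers agree closely, which is the point of the approximation: with 100 independent coins, the combinatorial distribution is already nearly Gaussian.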
However, it turns out that most traits are not Normally distributed across most populations: weight, income, wealth, years of education, speed at which you can throw a baseball, speed at which you can complete a marathon, completed fertility, age at first birth, days of school absence, age at death, annual days of hospitalization, lifetime number of residential moves, distance of a residential move, number of violent crimes committed, AP scores, reaction time, learning time42… almost every quantitative measure that comes to mind proves on inspection not to be Normally distributed. Our common intuition is simply wrong. Two familiar graphs will suffice to illustrate the general point:

Figure 6.7: US family income, 2012, From the Washington Post

Figure 6.8: Age at death, UK females through 201043

42 Measured by either the total-errors-adjusted method or the mean-tries-to-success method; see Bento-Torres et al. 2017, figure 1. 43 Source: https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/articles/mortalityinenglandandwales/2012-12-17

How can this be? All of the examples in the last paragraph would seem, at first glance, to be complex outcomes that are the result of a myriad of little unbiased and independent factors: many different genes play a role, as do various aspects of early childhood experience, different choices, many aspects of local conditions, and brutal chance. Those seem like just the myriad little forces that theory predicts should produce a Normal distribution at the population level. But they don’t. So why doesn’t it work? There are at least four possible explanations, which we will call mixing, truncation, correlation, and structure. Quetelet argued that in every natural population, any trait would indeed be Normally distributed, and so any population distribution that deviated from Normal indicated a mixing of two or more underlying natural populations. Sometimes this happens. For example, this is almost certainly the right explanation (minus the “natural” attribution) for first birth rates in the United States: there is one population, overwhelmingly less educated and less privileged, that has first births early, and a second population, overwhelmingly more educated and more privileged, that has first births late, as seen in figure 6.9. Rachel Sullivan Robinson has argued that first birth rates in the US are becoming bimodal, and here we see evidence that this is true. Or if not bimodal, then something like a lumpy hat, or the elephant inside a snake from Le Petit Prince. However, Quetelet’s explanation in fact applies to relatively few situations: most of the time we do not see the bimodal or over-dispersed pattern that would plausibly indicate population mixing. Instead, we more commonly see right-skewed distributions, where the most common outcomes are at or near the very bottom of the scale.
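The mixing mechanism is easy to see in simulation. The sketch below pools draws from two hypothetical Normal subpopulations (the means, spreads, and group sizes are invented purely for illustration, not taken from the CDC data) and shows that the pooled distribution dips between two humps:

```python
import random

random.seed(0)

# Hypothetical subpopulations, for illustration only: an "early" group with
# first births centered near age 21 and a "late" group centered near age 31.
early = [random.gauss(21, 2.5) for _ in range(5000)]
late = [random.gauss(31, 3.0) for _ in range(5000)]
pooled = early + late

# Count first births by single year of age.
counts = {}
for x in pooled:
    age = int(x)
    counts[age] = counts.get(age, 0) + 1

# Bimodality: the count near age 26 is lower than the counts near
# either subpopulation mean.
print(counts[21], counts[26], counts[31])
```

Each subpopulation is Normal on its own, yet the pooled histogram is the "lumpy hat" the text describes: a valley between two peaks that no single Normal distribution can produce.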

Figure 6.9: First birth rates by age in the US

First birth rate by age, data from CDC 2005


The fact that for many outcomes the most common values are at or near the bottom of the scale—as we saw above for income—suggests that we should think about how truncation matters. You cannot have a negative income (in the way we count income), or miss a negative number of days of work, or bear a negative number of children. As a result, there should be heaping at zero. But in most of these cases, we see not just heaping at zero but also a broad pattern of skew—the mode is to the left of the median, and the median is to the left of the mean. For many human traits, this approximate shape makes sense when we remember that the average person is pretty poor at most things. Most of us are very bad at throwing a baseball or running long distances, and so there is no room between the average and the bottom of the distribution for anyone to be as much worse than the average as elite athletes are better. If the average person can run, for example, for two miles before they have to stop and rest, and the best athletes can run 50, then it is not arithmetically possible for anyone to be as much weaker than average as the strong are stronger. For these kinds of outcomes, we commonly use Exponential, Lognormal, Poisson, or Gamma distributions, which allow or require a dense mass near the left edge of the distribution and a strong right tail, as in figure 6.10. Each has a somewhat different logic: the Gamma distribution should describe waiting times between certain kinds of events, but it also fits the size of loan defaults and the amount of water in a reservoir44. The Lognormal is logically the same as the Normal, except that the relationship between the myriad little factors is multiplicative instead of additive. This helps a great deal in thinking about something like athletic ability, where the benefit of one physical attribute may interact with the benefit of another: throwing speed likely depends not only on muscle contraction speed and arm length, but also on their interaction.
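The multiplicative logic behind the Lognormal can be illustrated with a short simulation: when an outcome is the product, rather than the sum, of many small independent factors, its log is a sum, and the right-skewed shape emerges. The factor range below is arbitrary, chosen only for illustration:

```python
import random
import statistics

random.seed(1)

def outcome():
    # Multiply together 30 small, independent, positive factors.
    x = 1.0
    for _ in range(30):
        x *= random.uniform(0.7, 1.4)
    return x

draws = sorted(outcome() for _ in range(20_000))
mean = statistics.fmean(draws)
median = draws[len(draws) // 2]

# Right skew: the long right tail pulls the mean above the median.
print(median < mean)  # True
```

The same factors added instead of multiplied would give a symmetric, roughly Normal pile; multiplication alone is enough to produce the mode-median-mean ordering shown in figure 6.10.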

44 http://wiki.stat.ucla.edu/socr/index.php/AP_Statistics_Curriculum_2007_Gamma, last accessed July 9, 2017.

Figure 6.10: The Lognormal distribution, showing the relative locations of the mode, median, and mean45

Because it maintains the central logic of the Normal distribution but allows for a bit more interaction among the myriad little factors that underlie it, the Lognormal distribution quite often fits social data better than does the Normal. But it also opens up a larger question: what if the myriad little factors are not independent and unbiased? What if, in addition to having multiplicative effects, they are themselves correlated? Genes that are co-located on the same arm of a chromosome are co-inherited more often, despite recombination. Many of the social factors that might influence age at death, for example, are strongly correlated. Education, employment, lifestyle, neighborhood, and civil status: many of the correlates (and perhaps even causes) of differential life expectancy are associated with each other through the differential social sorting of people into categories. As a result, the myriad little forces that push mortality up or down are not independent and unbiased, but rather work in often subtle but consistent ways. Whenever the myriad little forces are dependent or biased, the resulting distribution will not be Normal. Truncation, correlation, and interaction of the myriad underlying factors are useful in thinking about intelligence. IQ, of course, is Normally distributed, because it is a measure that results from data that have been transformed into a Normal distribution. To get from test scores to IQ, people take tests, which are then normed (so a score of 127 answers right is transformed to the 88th percentile, for example). The percentiles are then translated into z-scores and rescaled to a mean of 100 and a standard deviation of 15. That is, regardless of what the underlying distribution of test scores looks like, IQ is by definition Normal. This point is often missed in discussions about the heritability of IQ, or IQ and race. When Murray and Herrnstein wrote The Bell Curve in 1994, many people criticized their methods and metrics, arguing that racial differences in IQ, for example, are social products, not biological imperatives46. But almost no one pointed out that IQ only follows a bell curve because it is parameterized to produce a bell curve; the assumption that intelligence, and not only IQ, is Normally distributed was apparently accepted, even by Murray and Herrnstein’s critics. So let us look now at some unparameterized, that is, raw, scores from intelligence tests. It is surprisingly difficult to get data on raw scores from any intelligence test—standardized scores fly around everywhere, but the raw ones are rarely released. In a 1997 paper, Miyaguchi proposed the extension and cultural adaptation of the Wechsler test to Japan. In arguing his case, he shows both the unparameterized scores and the standardization to IQ. The data come from a set of Japanese high schools, selected to be representative. This pattern, with a heavy right tail, seems consistent with what we would expect for most kinds of physical ability—throwing a baseball, or playing the piano. Most people are just not very good, a few people are exceptionally good, and there is not any room on the down side for some people to be as bad as a few are good.

45 Redraw this. Citation for current figure: https://math.stackexchange.com/questions/1775796/what-does-it-mean-that-log-normal-distribution-is-positively-skewed
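The norming procedure described above (raw score, to percentile, to a rescaled Normal) can be sketched in a few lines. The raw scores below are simulated with a heavy right tail purely for illustration; they are not Miyaguchi's data:

```python
import random
import statistics

random.seed(2)

# Simulated raw test scores with a heavy right tail (illustrative only).
raw = [random.expovariate(1 / 20) for _ in range(1000)]

def norm_to_iq(scores):
    """Rank-based norming: percentile -> z-score -> IQ scale (mean 100, sd 15)."""
    nd = statistics.NormalDist()  # standard Normal, supplies the inverse CDF
    order = {s: r for r, s in enumerate(sorted(scores))}
    n = len(scores)
    # Mid-rank percentiles avoid the impossible values 0 and 1.
    return [100 + 15 * nd.inv_cdf((order[s] + 0.5) / n) for s in scores]

iqs = norm_to_iq(raw)

# Whatever shape the raw scores had, the normed scores are Normal by construction.
print(round(statistics.fmean(iqs)), round(statistics.stdev(iqs)))
```

The raw distribution here is strongly skewed, yet the output has a mean near 100 and a standard deviation near 15, which is exactly the sense in which a bell-curved IQ tells us nothing about the shape of underlying ability.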

Figure 6.11: Intelligence test raw scores, from Miyaguchi (1997)

46 CITE.

Finally, population distributions of human traits or outcomes are not Normally distributed because the traits or outcomes are themselves structured in ways that the count of heads in 100 coin tosses is not. When we reduce a set of coin tosses to the count of heads, we explicitly decide that it makes no difference to the measured “outcome” whether it is coin #27 or coin #72 that shows heads. Whether the heads all appear in the upper left cluster or are evenly distributed across the board does not matter. The configuration of the coins is not relevant, only the count. But many social outcomes work differently. The structured configurations matter a lot, so that even the valence of one factor’s effect on age at first birth (say, a gene associated with risk aversion) can differ based on the configuration of other factors, like education47. This is not the same as the correlation discussed above—education is not necessarily correlated with the genetic propensity for risk aversion—rather, the effects of risk aversion on age at first birth depend on the configuration of other factors, including education. Since Quetelet, we have come to accept the idea that people are like independent draws from a single distribution, and so that human traits within a population will follow the Central Limit Theorem, approximating a Normal distribution as the sample gets large. Indeed, population-level regularities are often remarkable. However, most distributions are not Normal. Right (positive) skew is the most common, as we saw in income, raw intelligence scores, and first birth by age; left (negative) skew is also common, as with age at death. The general point here is that the equation of human variation with coin tosses and astronomical errors is an imperfect analogy. While useful for some purposes, the analogy also breaks down, and we must be careful to remember that it is only an analogy. Not all variation is error: social differences often have social causes.

Social and statistical norms in the 20th century: Kinsey and Laumann
Quetelet, Galton, and others in the 19th century created quantitative social science in much of its modern form. Quetelet gave us the idea of a statistical norm, and a conceptual framework for linking it to social norms. Galton gave us the beginnings of a set of tools to analyze statistical norms, through his work on correlation and regression. Over the next half-century, anthropologists and sociologists developed a framework for thinking about social norms, albeit often without (or with limited) reference to statistical norms. Boas and his students, Durkheim, Parsons48, and others explored how people acquire social values, habits, and expectations. This literature is often quite consistent with Quetelet’s idea that “society prepares the crime,” and with the image of the individual as little more than the incarnation of the social system. But statistical and social norms are not always equivalent. What is most common is not always what is most admired or accepted. And the relationship between social and statistical norms can take a number of different forms. In this part of the paper, we will explore two major projects in 20th century American quantitative social science, namely the sex surveys of Alfred Kinsey and Edward Laumann. We do so with an eye to the relationships between social and statistical norms that the studies

47 The example comes from Schmidt 2008. 48 Add lots of citations here!

assume, and also that the studies create. With the Kinsey at least, it appears that the statistical norms were not only the product of certain social practices, but also an instigator of certain kinds of social change. On many dimensions, the Laumann study is not directly comparable to the Kinsey reports: the data collection, questions, and analysis differ in important ways. Considered separately, each study makes clear how the statistical distributions it reports are in part the product of the social expectations of the day: statistical norms follow social ones. This is true even though the Kinsey reports were considered shocking in their day for discussing subjects that had been considered private and taboo: that is, what was shocking was the explicit description, sometimes more than the actions themselves. But although the Kinsey and Laumann studies are in many ways incomparable, on a number of questions the two studies collected and reported data that are similar enough in form to be compared—while acknowledging that the samples and methods are different. This is not a time series so much as a different view of a similar terrain. For example, both collected and reported coital frequency among married women. Viewed as a time series, it appears that the mean frequency of intercourse reported by women in married couples changed little between the two surveys, but the variance declined from the Kinsey to the Laumann. It is as if people learned the “right” answers from the widely publicized Kinsey study, and adjusted either their actions or their description of their actions to match the reported statistical norm. That is, social norms may follow published statistical ones as much as statistical norms may follow social ones.

The Kinsey Reports
What we know as the Kinsey Reports are two substantive monographs: Sexual Behavior in the Human Male, published in 1948, and Sexual Behavior in the Human Female, published in 1953. Alfred Kinsey, the lead author, was a professor of Zoology at Indiana University, and the director of a research center there called the Institute for Sex Research. Although the work of Kinsey and his colleagues may have had radical consequences for how sex and sexuality were publicly conceived, the research itself was solidly within the scientific mainstream, at least in an institutional sense. In addition to support from the Rockefeller Foundation, the Institute for Sex Research was primarily funded by the National Research Council, which is part of the National Academy of Sciences. Kinsey trained at Harvard as an entomologist, earning his doctorate for a dissertation on gall wasps in 1919. His research on gall wasps was well regarded by specialists, and his introductory biology textbook was widely used. But Kinsey would likely not be so well known were it not for a group of students who saw the university quite differently than did the administration. In 1938, as Hitler was rising in Europe, the students at Indiana University presented the then-new university president with a petition requesting a new course on marriage and the family, to include sex education. As James H. Jones describes, the president brought the petition to the board of trustees, who debated long before they approved it49. Kinsey became the lead for the new team-taught course, and taught the materials on the

49 Jones 1997:325.

physiology and endocrinology of sex and reproduction50. The syllabus also included the legal framework of marriage and divorce, the economics of marriage, the sociology of the family, demographic approaches to fertility, and the ethics of family life—topics that became important to how he thought about sex in America. The course transformed his career; from that point until his death in 1956, Kinsey focused his work on human sexuality. Both volumes combine a wide-ranging literature review on aspects of human sexuality with descriptive statistics and illustrative examples from interview materials. Kinsey and his team interviewed some 12,000 men and nearly 8,000 women, and Kinsey himself interviewed more than 7,000 men. The approach to statistical representativeness is oddly contemporary: as is often the case with new kinds of digital detritus data, Kinsey collected as many interviews as he could with anyone who would agree, and relied on post-enumeration weighting to adjust the sample toward national representativeness, at least for men. The sample of women was not only smaller but also more homogeneous, so that Kinsey did not consider the results for non-white women adequate for separate interpretation, and instead dropped them from the statistical analysis. The sample is not, itself, remotely like a probability sample, as is clear from the map of interview locations that Kinsey includes51. Kinsey wanted to get people to talk about private and personal topics, and he believed that to do that, he had to approach people in the right way: sensitively, and often through a contact. These are therefore somewhere between snowball samples and convenience samples, augmented by the idea from entomology that you should introduce variation into your sample when you can. For the post-enumeration weighting, Kinsey had only the census. That is, his weighted sample is representative by race, age, and household income, but he had no way to know whether it was representative on any of the variables he presents.

Figure 6.12: NYT Cartoon about Kinsey

50 Ibid, page 327. 51 Cite to page.

Both volumes became national best-sellers, and sparked huge debate. It is hard to imagine 800-page books in quantitative social science gaining the kind of social centrality that Kinsey’s did. Certainly I will never write anything that gets its own cartoon in the New York Times. These books were bought, read, discussed, and became cultural objects. The books discuss masturbation, what he calls “petting,” premarital sex, marital sex, extramarital sex, homosexuality, orgasm, and “animal contact”. Remember that Kinsey was a zoologist—there are long discussions of the mechanics of human sexual response, so the cartoon about the husband buying Kinsey’s book for some advice is not entirely ridiculous. Section after section showed very high variance in the outcomes: some married people reported sex less than once a month, and others multiple times per week; some people reported homosexual contacts even within an apparently “normal” heterosexual lifestyle, and others did not. The take-away message from the whole work was, “You are normal. Lots of things are normal. Whatever you are experiencing or doing, lots of other people are doing it, too.” This was especially true in the book about women. In the men’s book, Kinsey presents many of the results as an average and standard deviation: great for science, but hard for lay-people to use to picture themselves in relation to others. In the women’s book, many of the results are presented as histograms, allowing women to easily compare themselves to the distribution of other women’s experiences.

Figure 6.13: Sample page from Kinsey’s book on Women

The Kinsey reports became important cultural documents in the early post-war period. They are empirical descriptions, whether accurate or inaccurate as many have debated, but they are also, for many people, guidebooks. In her compelling book, From Local to National Communities, Susan Watkins wrote, “Even when the couple is literally alone in their bedroom, the echoes of conversations with kin and neighbors influence their actions.”52 For at least a generation, the published Kinsey reports were part of those echoed conversations. And forty years would pass before there was another study of American sexual practice on a similar scale.

The Laumann Study
That later study was conducted by Edward Laumann, professor of Sociology at the University of Chicago. He earned his PhD at Harvard in 1964 with a dissertation on urban social stratification, under Harrison White and Talcott Parsons. As an Assistant Professor at the University of Michigan, he ran the Detroit Area Study; he moved to Chicago in 1973. He has been an important figure in the development and expansion of social science survey research, and a proponent of the idea that social norms are visible in statistical ones. The field research for the Laumann study began in 1992, against the backdrop of HIV/AIDS, which made social research on sexual practices and sexual networks both more important and even more politicized. What WWII was to Kinsey, AIDS was to Laumann. The Laumann project is quite different from the Kinsey studies, in all the ways that quantitative social science was different in the 1990s compared to the 1950s, and then some. Although both rely on face-to-face interviews, nearly all other aspects of the methodology are different. The Laumann study used stratified cluster probability sampling, and devotes nearly 30 pages of main text plus 50 pages of appendix to a discussion of the theory and practice of probability sampling.53 As a result of challenges in securing funding, the sample was relatively small: 1,330 men and 1,664 women in the main sample54, or about 1/7th the sample size from Kinsey. None of the Kinsey respondents were paid, but in the Laumann study many respondents were paid, between $10 and $100 in some cases, to meet response rate targets in each cluster. Beyond the survey methods, the books really differ in “feel”. The Kinsey reports are in the spirit of the courses that inspired them: multidisciplinary explorations by a broadly read biologist, with citations from Ovid to Van Gennep to Freud to Westermarck, in French, German, and occasionally other languages.
Laumann is writing a defense of social surveys as a tool applicable to all topics, even the most intimate. Whereas Kinsey relies in part on his own personal charisma to get people to speak honestly, Laumann relies on making sex as safe and sterile as any other topic that NORC researches. He is known to quip, “leave it to guys from the University of Chicago to make sex boring.”55 What is strong about the Kinsey study is the remarkable personal rapport between interviewer and interviewee that shines through in the personal stories and is echoed in the striking diversity of reported behaviors; but Kinsey’s sample is likely not representative even after the post-enumeration weighting. What is strong about the Laumann is the rigorous sampling procedure, but there is little evidence of comfort or rapport between interviewers and interviewees: it is hard to imagine that the variance in sexual behavior in the 1980s was really as narrow as the Laumann data suggest.

52 Watkins 1990:242. 53 Laumann et al. are openly derisive of Kinsey’s hope that post-enumeration adjustment could correct for a non-probability sample, and indeed of the idea that there could ever be any viable scientific alternative to classical probability samples (see for example, page 45). Of course, contemporary work in data science takes a radically different approach: each era has its own methodological commitments. 54 Plus an additional 273 from an oversample of Blacks and Hispanics that are included in some tables but not others.

Comparing Kinsey and Laumann
Because the two studies differ so much—one is strong on sampling and weak on rapport, the other the reverse—we have to be very careful in making direct comparisons between the Kinsey and Laumann studies. These are not really a time series. Rather, we can think of the Kinsey as part of the background of social norms that people had available to them as they responded to the Laumann survey. Before Kinsey, most people in the United States would have had no idea what other people considered normal. After Kinsey, most everyone had at least a few numbers in mind, even if those numbers were very wrong. The statistical norm became part of the social one56. The first thing to notice about the distributions of experience in these two studies is that almost none of them are Normal. Shape is easier to observe in the Kinsey, which reported data in much more precise categories (less lumping); however, even in the Laumann it is clear that sexual experiences as reported to survey-takers are not symmetric around the mean, but skewed. The second thing to notice is that in nearly every case where we can directly compare the results reported by Kinsey and Laumann, the variance in the outcome is smaller—sometimes much smaller—in the Laumann than in the Kinsey. For a number of variables, the mean differs little or not at all, but the difference in the variance is dramatic. To put the data from the two on the same graph requires reducing the large number of categories in the Kinsey down to the much smaller number in the Laumann, but even still, the shapes differ considerably. For example, let us consider coital frequency for married women. Marital sex is widely considered normal and acceptable, and so coital frequency should not be subject to as much response bias as other questions. The key pattern is that in the Kinsey we see a relatively wide range of answers—over ten percent of married women reported sex four times a week or more.
In the Laumann, the mean has not changed much, but the concentration on a few times a month has grown enormously. It is as if people learned the right answer, and changed either their marital sexual practices or at least their reporting to match it.

55 Remember the Stolzenberg quote at the beginning of the paper, about how the average is the mediocre, and the mediocre is boring… he was also from Chicago. What is it with the University of Chicago and boredom? 56 The factoid that 10% of people are gay is an interesting example. This became a social truth for a time, cited and re-cited until accepted as fact, and when people offer a source, it is most often the Kinsey studies that they cite. However, the 10% figure does not appear in either of the Kinsey reports, and indeed Kinsey generally resists this kind of classification of people by what we would now call sexual orientation.

Figure 6.14: Coital Frequency for Married Women (age-standardized)

[Bar chart: percent of married women by coital frequency (Not at all; A few times a year; A few times a month; Two or three times a week; Four times a week or more), Kinsey vs. Laumann samples]

For variables where we might expect more response bias—those regarding masturbation, homosexuality, anal sex, animal contact, and others—we continue to see that the variance is far lower in the Laumann than in the Kinsey, but also now see a shift in the mean toward zero. These distributions are dramatic. According to the Laumann data, over 50% of women age 30-34 did not masturbate at all in the last year, and fewer than 10% reported that they did so on average more than once a week.

Figure 6.15: Female Masturbation in the last year, according to Kinsey and Laumann

[Bar chart: percent of women age 30-34 by masturbation frequency in the last year (Never; Less than once a week; More than once a week), Kinsey vs. Laumann samples]

Similarly, just over 20% of women aged 40 and older reported ever having had any homosexual experience in the Kinsey study; 40 years later, Laumann finds 4%. If the two studies were more like a true time series, for example with the high-quality sampling of the Laumann and the high-quality interview technique and interpersonal trust of the Kinsey, these data would be very surprising. Between 1950 and 1990, the world changed in ways that made variation in sexual experience much more socially acceptable. Since we know so little about the activities of sex before the sexual revolution57, it is hard to say how much practices changed, but discourses certainly did change, and in the opposite direction from that implied by the differences between the two studies. We cannot know for sure whether the publication and public discussion of the statistical norms of sex in the Kinsey study in fact came to shape social norms over the subsequent forty years. But the lower variance, even when the means are the same, would certainly suggest that for variables without an obvious “right” answer built into the scale—variables like marital coital frequency, for example—the Kinsey data became the “right” answers to which people oriented their Laumann answers, if not their actions. Statistical averages may—as Quetelet argued—be the consequence of social norms. But not necessarily. And just sometimes, the statistical norms may in turn produce social ones.

On sex and new data

57 The wonderful book by Simon Szreter and Kate Fisher (2010) by this title is an obvious social science exception. Beyond that, rich sources like Laqueur (1990) or Foucault (1990) tell us much more about discourses of sex than about the distribution of practices.

The problems with the Kinsey and Laumann studies are clear: when a topic is only indirectly observable58 and potentially embarrassing, how can we get good data? For the better part of a century, we relied on surveys, and on some combination of Kinsey's personal-rapport approach and Laumann's cool, neutral scientism to increase truthfulness. A number of scholars have suggested in recent years that new data sources, sometimes combined with new statistical techniques, will solve this problem. For example, Seth Stephens-Davidowitz argues in his dissertation that Google search is a good proxy for off-line events59, and that "big data" sources related to sex in particular are better than survey sources60. The argument relies on the fact that Google searches are private, or at least experienced in private, and that their privacy makes them more truthful or sincere. People lie in public, the argument goes, but they are honest to their computers when no one else is around. Perhaps. But there is also an important social structure behind what we ask our computers, as opposed to other people, or even no one at all. In particular, the things that are most common in everyday life are likely to be the least searched for online, because there is no need to ask Google "how do I make toast?" or "what is an alarm clock?" Things that are easy and obvious, things that are subject to such strong social norms that we all know what to do, and things that are idiosyncratic and therefore best discussed in person are all likely to appear less often in Google searches than in real life. Things that are scary, embarrassing, shocking, or shameful are likely to appear more often. In this way, Google search amplifies the standard human habit of overestimating rare events and underestimating common ones.
As a result, the frequency of Google searches does not reliably predict the frequency of events; for example, there are approximately ten times more vasectomies annually in the US than suicides, but the latter is searched ten times as often61. Changes in the frequency of searches may indicate changes in social awareness of or interest in a phenomenon, but they do not necessarily indicate changes in underlying rates: searches for "diabetes" have fallen steadily since 2005, while incidence first surged and then fell. Searches for "suicide" fell between 2005 and 2014, although actual rates of death by suicide rose over that same period. Search data may or may not be informative, depending on what search frequencies represent and why they differ or change.

In the 19th century, we began to have evidence that many, even most, social phenomena conformed to statistical regularities. At first, scholars like Quetelet and Galton believed that this was simply another application of the Law of Errors, and that human variation would therefore always follow the Normal distribution. By introducing distributions into the social sciences, Quetelet gave us tools to think about the relationship between empirical data, error, and uncertainty, and thereby made probabilistic arguments possible in the social sciences. We owe him a great debt. However, the normal distribution has also done us a good bit of harm as we think about the relationships between social norms and statistical

58 It is not theoretically impossible to get direct data about sex, for example with body cameras. But the level of invasion necessary seems entirely out of proportion with the scientific value of the data.
59 Stephens-Davidowitz, 2013.
60 Stephens-Davidowitz, 2015.
61 Data on searches from Google Trends; data on events from the CDC (http://www.cdc.gov/DataStatistics/).

ones. It is now clear that most human variation is not Normally distributed, and part of the reason is that the myriad little forces that underlie variation are not unbiased and independent, but rather structured and oriented by social processes. Social norms both influence and are influenced by statistical norms. What we can observe in a survey, or a census, or in data from Google search, Twitter, or Apple Watches, is partially the product of who is trying to do what when, but also partially the product of social opportunity and old-fashioned social morphology. When we think about new data sources and new methodological opportunities, we must remember the lessons of the last data revolution.
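The claim that socially structured forces undermine the Normal distribution can be illustrated with a short simulation. The setup below is entirely hypothetical: it contrasts the sum of many small independent, unbiased shocks, which the Central Limit Theorem drives toward a Normal distribution, with shocks that tend to imitate the previous one, a crude stand-in for forces that are "structured and oriented by social processes":

```python
# Sketch: independent unbiased shocks versus socially correlated shocks.
# All parameters here (200 shocks, persistence of 0.9) are illustrative.
import random
import statistics

random.seed(42)

def independent_sum(n_shocks=200):
    # Unbiased, independent +/-1 shocks: the CLT applies, and the sums
    # cluster Normally around zero.
    return sum(random.choice([-1, 1]) for _ in range(n_shocks))

def correlated_sum(n_shocks=200, persistence=0.9):
    # Each shock usually repeats the previous one, mimicking conformity
    # to a shared norm rather than independent "errors."
    total, prev = 0, random.choice([-1, 1])
    for _ in range(n_shocks):
        if random.random() < persistence:
            shock = prev                      # conform to the last shock
        else:
            shock = random.choice([-1, 1])    # occasionally break away
        total += shock
        prev = shock
    return total

indep = [independent_sum() for _ in range(2000)]
corr = [correlated_sum() for _ in range(2000)]

# Each individual shock still averages to zero in both processes, but the
# correlated sums are far more dispersed and pile up near the extremes.
print(statistics.pstdev(indep))
print(statistics.pstdev(corr))
```

Even though every shock in both processes is individually unbiased, the correlated process produces a spread several times wider than the independent one, and its distribution is no longer bell-shaped; this is the statistical core of the argument that socially oriented forces break the assumptions behind the Law of Errors.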