MAX-PLANCK-INSTITUT FÜR WISSENSCHAFTSGESCHICHTE
Max Planck Institute for the History of Science

2006

PREPRINT 310

Workshop

History and Epistemology of Molecular Biology and Beyond: Problems and Perspectives

Collecting and Experimenting: The Moral Economies of Biological Research, 1960s-1980s.

Bruno J. Strasser

I. Introduction

Experimentation is often singled out as the most distinctive feature of modern science. Our very idea of modern science, which we trace back to the Scientific Revolution, gives a central place to this particular way of producing knowledge. The current epistemic, social and cultural authority of science rests largely on the possibility of experimentation in the laboratory. The history of the sciences over the last four centuries reminds us that experimentation has been only one of many ways in which scientific knowledge has been produced. However, by most accounts, experimentation has progressively come to dominate all the others in most fields of science, from high-energy physics to molecular biology. In the life sciences, the rise of experimentalism has been cast against the natural history tradition, leading to its progressive demise. In this paper, I would like to question this big picture by drawing attention to the role played by natural history practices in the rise of the experimental life sciences during the 20th century and to their current importance for laboratory science. By natural history, I refer to the different practices of collecting, describing, comparing and naming natural objects, practices usually associated, not with the laboratory, but with the wonder cabinet, the botanical garden or the zoological museum.1 I will argue that collections have played, and still play, an essential role in the production of experimental knowledge.2 This paper focuses on one of the most widely used types of collections in contemporary biomedical sciences: molecular sequence databases. It traces their development from the first protein sequence collections, published as a book-format “atlas” in 1965, to their incorporation as modern computerized databases accessible online. Even though this transformation was closely interwoven with the computer revolution, its greatest challenge was not technical, but social.
Indeed, as I will argue, a number of tensions between the collecting and the experimental enterprises resulted from a clash of what E.P. Thompson has called “moral economies”. Conflicts over the collection of data, scientific credit, authorship, and the intellectual value of collections reveal some of the essential features of the moral economies of contemporary life sciences.

II. The rise of experimental biology

Until recently, there was an overwhelming agreement in the literature on the history of biology that natural history had progressively declined from the early 19th century to give way to the experimental approach in the study of life. William Coleman’s classic textbook, Biology in the Nineteenth Century, reminds us that those who promoted the term “biology” in 1802, Gottfried Treviranus and Jean Baptiste de Lamarck, agreed that “natural history” did not have its place in

1 FARBER 2000; GHISELIN et LEVITON 2000; JARDINE et al. 1996. 2 For a similar point, see DE CHADAREVIAN 1998.


the new science and they were “hoping to reorient the interests and investigations of all who studied life”.3 The increasing emphasis on the study of function made physiology one of the key disciplines of the 19th century. The study of form, which was so central to the natural history tradition, continued in the 19th century, but became subservient to the understanding of function, such as individual development, metabolism or disease. Coleman thus concludes his book in the following terms: “In its name – experiment – was set in motion a campaign to revolutionize the goals and methods of biology”.4 Garland Allen’s textbook, Life Sciences in the Twentieth Century, picks up the story where Coleman had left it and adopts a similar perspective: “It was the twentieth century that saw the fanning out of the experimental method in all areas of biology”,5 and not just in physiology as in the previous century. Opposition to natural history was, once again, a driving force behind these changes. In the early 20th century, it was not so much natural history, in the sense of the description of whole organisms, as morphology, the description of their inner structure, that was the target of the “new biology”. Allen’s narrative is thus cast as a “revolt from morphology”, in the study of development as well as heredity. More recent work has questioned the sharp break that Allen located around 1900, and shown that the natural history tradition, if no longer the core of biology, still played a role in its development at the turn of the century.6 Lynn K. Nyhart, for example, has claimed that natural history was declining relatively and growing absolutely around 1900, due to the general expansion of biology’s territory.7 For Keith Benson, however, “Natural history remained alive and well, primarily within museums”.8
Coleman’s and Allen’s narratives have structured much of the subsequent scholarship.9 In particular, almost all studies of natural history have focused nearly exclusively on the period from the 17th to the 19th century.10 When the 20th century is considered at all, natural history practices are studied in the context of ecology, some areas of evolutionary studies, and obviously systematics, but always far from the laboratory. Thus, in the big picture of 20th century biology, the rise of experimentation is cast against natural history and, by the mid-20th century, has become independent of natural history.

III. The “molecular revolution”

Nowhere is this narrative more pervasive than in the historiography of molecular biology. One of the most profound transformations in the 20th century life sciences was the process that led to the understanding of life in terms of the structure and function of molecules. The “molecular revolution” supposedly illustrates the triumph of experimentation in the biomedical sciences.

3 COLEMAN [1971], p. 2. For precedents, see MCLAUGHLIN 2002. 4 COLEMAN [1971], p. 166. Coleman’s picture has been refined by a number of authors, and the boundaries between his categories of form, function and transformation have been questioned. 5 ALLEN 1978 , p. xvi. 6 RAINGER et al. 1988. 7 NYHART 1996, p. 442. 8 BENSON 1988, p. 77. 9 For example, BOWLER et MORUS 2005, chapter 7. 10 For example, JARDINE et al. 1996.


This transformation has generally been identified with the rise of a particular discipline, “molecular biology”, even though it is better understood as a larger process in which “molecular biology” is only an episode.11 Just as Robert Boyle or Francis Bacon at the time of the Scientific Revolution claimed that they were in the process of making a revolution, a number of the proponents of “molecular biology” insisted on the radical break that their science represented with the past. Several of them were trained in physics or chemistry and positioned themselves against “traditional biology”, by which they meant nothing other than natural history. They often described biologists disdainfully as naturalists who were merely collecting observations and providing unnecessarily complex explanations that would end up in large monographs, such as atlases, or in obscure essays of Naturphilosophie. The molecular biologists, on the other hand, self-fashioned their scientific personae along the lines of the experimental physicist. In 1969, for example, the Swiss physicist-turned-molecular-biologist Eduard Kellenberger commented that Max Delbrück, also a former physicist, had demonstrated “that biology could be studied in the same precise, logical and quantitative way as physics”.12 Forty years later, Kellenberger claimed that the head of the biology department at his home university warned him he “could not get a PhD in biology, because as a physicist, [he] did not know the names of all the different plants and animals”.13 Statements of this sort, expressing contempt for “traditional biology”, qua natural history, were so pervasive that a number of biologists, such as the evolutionary biologist Edward O.
Wilson at Harvard, fought back to defend their professional status in what he perceived as a “molecular war”.14 It shouldn’t be necessary to insist on how much the characterization of “traditional biology” by the molecular biologists was a gross misrepresentation of the practices of biology in the middle of the 20th century, and should rather be understood as an element of the molecular biologists’ discipline building strategies and struggles to create a divide between them, the Moderns, and the others, the Ancients.15 Knowing the pivotal role that experimentation has played for the rise of molecular biology, practically and rhetorically, and the expansion of molecular approaches to most areas of biological research, one might expect natural history practices to have vanished completely from contemporary research. Or at least, that they would be confined to some areas of ecology and evolutionary biology, far from the laboratory. Looking closer at the research practices carried out in almost all biomedical research laboratories around the world reveals a very different picture. Natural history is not dead, it is thriving. Today, paradoxically, the production of biomedical knowledge rests on a “way of knowing”, to borrow John Pickstone’s notion,16 that seems closer to the practices of natural history, than to the experimental tradition that should have been distinctive of the new approaches to biomedical research. This way of knowing is centred on the use of molecular databases, especially protein and DNA sequence databases. Pickstone’s conceptual categories can perhaps help us solve the apparent contradiction resulting from the coexistence of experimental and natural historical approaches in contemporary life sciences. Whereas we have been accustomed to thinking about the development of science in

11 DE CHADAREVIAN et KAMMINGA 1998. 12 University of Geneva Archives, Edouard Kellenberger, “Jean Weigle”, [s.d.], 1969. 13 Interview with Edouard Kellenberger, Lausanne March 15, 2001. 14 WILSON 1994. 15 ABIR-AM 1992. 16 PICKSTONE 2000.


terms of a succession of episodes, such as Kuhnian paradigms replacing each other, Pickstone proposed that new “ways of knowing” have been added to the practice of science over the last four centuries. The “natural historical”, the “analytic”, and the “experimental” ways of knowing constitute different layers in the make-up of contemporary science. By comparison with the experimental way of knowing, the natural historical way of knowing shows distinctive epistemic, material, social and moral dimensions that, in the case of contemporary sequence databases, can be briefly summarized as follows. Sequence databases are computerized collections of data about DNA or protein sequences. Sequences are important because they are the key determinant of the structure and function of molecules. Determining sequences involves a number of experimental steps that could take several years to perform in the 1950s and 1960s, but that have become largely automated today.17 Sequence databases are accessed online by researchers, to deposit or retrieve sequences, or to carry out comparisons between sequences in the database. The results of the Human Genome Project, for example, were continuously made available through GenBank, the largest DNA sequence database in the world, and used to assist the sequencing enterprise. In August 2005, GenBank reached the number of 100 billion bases, only “a bit less than the number of stars in the Milky Way”, explained the NIH in its press release.18 These sequences, representing 165,000 different organisms, were provided by tens of thousands of researchers. Each day, tens of thousands of individuals around the world access GenBank. Databases are not just repositories; they are tools for producing knowledge.
Researchers compare sequences they have determined in their laboratory with those present in the database, using sophisticated software to infer, by analogy, the function of a gene or a protein, or the evolutionary relationships between species. The material culture of this way of knowing rests on computers and computer networks. Today, in silico biology complements in vivo and in vitro approaches to biology, and it is vital to the success of the experimental enterprise. If there is any distinctive feature of the natural history approach, it is certainly its reliance on collections. Cabinets, gardens, museums, herbariums, and atlases, for example, have all played a crucial role in natural history from the early modern period to the late 19th century, when Victorian sensibilities brought these collections great popularity.19 In addition to being tools for display, they were tools for producing knowledge about the taxonomy of living organisms, their anatomy, and their history. Bringing together specimens in a single place, and organizing them in a systematic way, made comparisons between specimens possible and, by analogical reasoning, their identification and inscription into broader theoretical systems. Early modern cabinets of curiosity, royal gardens of the 17th and 18th centuries and the great zoological museums of the 19th century have all faced a common challenge, namely to bring to a central location specimens that were often dispersed all over the world. The proponents of contemporary sequence databases have been confronted with the same challenge. Even though these collections have continued to be enriched in the 20th century, they have generally been considered to represent an increasingly archaic mode of research, one that was

17 On the early history of protein sequencing, DE CHADAREVIAN 1996. 18 NIH press release, August 22, 2005, “Public Collections of DNA and RNA Sequence Reach 100 Gigabases”. 19 JARDINE et al. 1996.


progressively giving way to more experimental approaches. However, a number of new collections were founded in the 20th century, and have played a central role in the rise of the experimental life sciences. The Cambridge Structural Database was founded in 1965 by Olga Kennard, a student of John D. Bernal at the University of Cambridge, and contained structural information about proteins, derived from X-ray diffraction experiments. A year later, Victor McKusick founded Mendelian Inheritance in Man, a database of hereditary diseases, at the Johns Hopkins Medical School. At the same time, Index Medicus, the printed database of biomedical literature, started to become available on a computerized system, Medlars. In this paper, I focus on the first protein sequence database, the Atlas of Protein Sequence and Structure, established by the physical chemist Margaret O. Dayhoff in 1965. I will attempt to characterize this way of knowing in its epistemic, material, social, and moral dimensions. In particular, I examine how data was collected, how the database was managed, what its conditions of access were, how it was used, and how scientific credit and authorship were distributed. I pay particular attention to the moral economy of this way of knowing. I will also argue that some of the early difficulties in the establishment of sequence databases resulted from the fact that they relied on a moral economy that conflicted with the then prevailing moral economy of the experimental life sciences. The notion of moral economy was popularized by the social historian E.P. Thompson as an alternative to economic and mob-psychology explanations of peasant food riots in 18th century England.20 He argued that the riots were driven, not just by unfocussed anger, but by a sentiment of injustice and betrayal of a system of moral norms defining “just price” and exchange, and the distribution of resources.
The moral economy can be defined as the system of values underlying specific exchange practices. The notion of moral economy has been imported into science studies and used in a variety of ways. Robert Kohler, for example, has analysed in these terms the community of researchers working on the genetics of fruit flies in the early 20th century. He underlined the importance of access to research tools, equity in the assignment of credit, and authority in setting research agendas:

In the case of science, three elements of communal life seem especially central to its moral economy: access to research tools of the trade; equity in the assignment of credit for achievements; and authority in setting research agendas and deciding what is intellectually worth doing.21

It is essential to remember that moral economies, unlike Mertonian norms, are local and historically situated and can thus differ between scientific communities, in our case between experimentalist communities and the promoters of sequence databases. I will thus address the question of the relationship between different moral economies.

20 THOMPSON 1971. 21 KOHLER 1999, p. 249.


IV. Natural history in the laboratory?

The first Atlas of Protein Sequence and Structure and its subsequent editions were produced by Margaret O. Dayhoff and her collaborators, biologists Richard V. Eck, Lois T. Hunt, and Winona Barker, at the National Biomedical Research Foundation (NBRF), in Silver Spring, Maryland.22 This private non-profit institution was founded in 1960 by Robert S. Ledley in order to explore the possible uses of electronic computers in biomedical research.23 Ledley, born in 1926, trained as a dentist and took courses in physics at Columbia University, where he earned an MA in theoretical physics in 1950. He is best known today as the inventor of the whole-body computerized tomography machine. Ledley first worked at the National Bureau of Standards in Washington DC, developing computation methods in symbolic logic applied to problems of operations research.24 Following up on an initiative of the US Air Force in 1956, the National Research Council hired Ledley to conduct a survey on the uses of computers in biology and medicine. He produced a 900-page manuscript that was published in 1965 as a monograph entitled Uses of Computers in Biology and Medicine.25 It represented an introduction to the principles and methods of digital computing and an exploration of their possible application in a number of fields of biology and medicine. From the time he started working on this survey, Ledley became one of the strongest advocates of using digital computers in biomedicine, from the automated recognition of chromosome images to computer-assisted medical diagnostics.26 Impressed by Ledley’s expertise in computers and formal logic, the physicist George Gamow asked him in 1954 to become a member of the “RNA tie-club”.27 Gamow had set up this informal group, just after James Watson and Francis Crick’s discovery of the DNA double helix, in order to attempt to decipher the genetic code through a strictly theoretical approach. The group largely failed in its attempt to crack the code, and it was two biochemists, Marshall Nirenberg and J.
Heinrich Matthaei who, by experimental methods, solved the first codon of the genetic code in 1961. By 1966, the entire genetic code was solved in a similar way.28 Even though Ledley’s participation in the RNA tie-club did not help solve the code, it brought him for the first time in contact with experimental research on proteins and DNA. At the same time, a growing number of biochemists were sequencing proteins, following Frederick Sanger’s first success with insulin, in 1955.29 By 1968, the editor of Science declared that the determination of protein sequences was “one of the most important research activities today”.30 The general approach for sequencing consisted of separating the protein into smaller overlapping fragments and determining their sequence biochemically. It was then necessary to reassemble the individual sequences in the right order to resolve the complete sequence of the

22 The NBRF eventually moved to Georgetown University Medical Center, Washington DC. 23 NBRF Archives, Robert S Ledley to Harvey E Saveley, June 29, 1960. 24 On operations (or operational) research, see FORTUN et SCHWEBER 1993; RAU 2005. 25 LEDLEY 1965. 26 On the introduction of computers in biology and medicine, Joe November, 2006, Rise of the Digital Organism: How Computers Changed Biology, doctoral thesis in progress, Princeton University; HAGEN 2000. 27 George Gamow to James Watson, December 6, 1954, reproduced in WATSON 2001, annex 12. 28 On the history of the genetic code, see KAY 2000. 29 SANGER 1988. 30 ABELSON 1968.


original protein. This problem of combinatorial logic was particularly well suited to a treatment by computer methods. From 1960, Ledley worked with Margaret O. Dayhoff, whom he had just hired, developing algorithms to assist biochemists in reconstructing complete sequences, using digital computers.31 Dayhoff (1925-1983) had obtained a PhD in quantum chemistry in 1949, under George Kimball at Columbia University.32 She used punch card machines to calculate resonance energies in small molecules. George Kimball had been a member of the Operations Research Group during the war and continued afterwards to expand the methods of operations research to other areas.33 At the Rockefeller Institute and then the University of Maryland, Dayhoff worked on the early chemical evolution of the Earth’s atmosphere. She joined the National Biomedical Research Foundation (NBRF) in 1960, and eventually became professor of physiology and biophysics at Georgetown University and president of the Biophysical Society (1980-1981). Given her joint interest in evolution and in protein sequences, her attention must have been caught by two papers published in 1962 and 1965 by the biologist Emil Zuckerkandl and the physical chemist Linus Pauling.34 The authors described how protein sequences could be considered “documents of evolutionary history”. The comparison between sequences of the same protein from different organisms could reveal differences due to mutations that had occurred since the two species had diverged. The sequences could thus be interpreted as “molecular clocks” and allow the determination of phylogenetic relationships. These papers led a number of researchers to approach the evolution of species, not only through the morphology of organisms, as had been done at least since Darwin, but through the comparison of their protein sequences.
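The kind of comparison described here can be illustrated with a minimal sketch in Python. It simply counts the positions at which two already aligned sequences of equal length differ. This is a deliberate simplification, not the method used by Zuckerkandl and Pauling or by Dayhoff; the species names and the short sequences are invented for illustration and are not real data.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def distance_table(sequences: dict) -> dict:
    """Return the number of differences for every pair of named sequences."""
    names = sorted(sequences)
    return {
        (first, second): count_differences(sequences[first], sequences[second])
        for i, first in enumerate(names)
        for second in names[i + 1:]
    }


# Hypothetical fragments in one-letter notation (made up, not real proteins).
example = {
    "species_1": "GDVEKGKKIF",
    "species_2": "GDVAKGKKTF",
    "species_3": "GNPEAGAKIF",
}

if __name__ == "__main__":
    for pair, differences in distance_table(example).items():
        print(pair, differences)
```

On the molecular clock reasoning, pairs of species with fewer differences would be read as having diverged more recently; a real analysis would also have to deal with alignment gaps and repeated substitutions at the same position.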
Comparisons between sequences would also allow the investigation of the mechanisms of evolution, a question intensely debated, especially after the neutral theory of evolution was proposed in 1968.35 The alignment of sequences, their comparison, and the construction of possible evolutionary trees seemed, once again, a task particularly suited for a computer, since it required heavy calculations.36 Dayhoff thus began, with her collaborators, to collect all the known protein sequences from the published literature and entered them on punch cards, the storage medium of early computers. As she explained to a colleague: “There is a tremendous amount of information regarding evolutionary history and biochemical function implicit in each sequence and the number of known sequences is growing explosively. We feel it is important to collect this significant information, correlate it into a unified whole and interpret it.”37 Simply collecting the sequence data represented a considerable effort, since the published sequences were dispersed in a number of different journals and the word “sequence” was not even indexed in bibliographic databases. However, possessing a sequence collection would not only be scientifically rewarding but could also serve a broader agenda, as Dayhoff explained:

31 DAYHOFF et LEDLEY 1962; DAYHOFF 1964. 32 HUNT 1984. 33 MORSE 1973. 34 ZUCKERKANDL et PAULING 1962; ZUCKERKANDL et PAULING 1965. 35 DIETRICH 1994; DIETRICH 1998; SUÁREZ et BARAHONA 1996. 36 On the uses of computers in molecular evolution, and molecular systematics, HAGEN 1999; HAGEN 2001. 37 NBRF Archives, Margaret O. Dayhoff to Carl Berkley, February 27, 1967.


I realized that the answers people were giving to social problems were very shallow and naive – often only palliative in nature. [...] I like to think that the Atlas and related research are going to help in the gigantic endeavour to solve these vexing problems. Species differences, race differences, sex differences, and individual differences, are largely controlled by protein differences. Motivation and mental capacity, goals and satisfactions, as well as diseases may be linked to proteins. We sift over our fingers the first grains of this great outpouring of information and say to ourselves that the world be helped by it. The Atlas is one small link in the chain from biochemistry and mathematics to sociology and medicine.38

The result of this collecting effort, about 70 protein sequences, was published in 1965 as a book entitled Atlas of Protein Sequence and Structure. Each page contained the sequence of a protein, its composition, and at least one reference to the literature where the sequence was first described. The data was presented using the conventional three-letter notation, “ala” for the amino acid alanine for example, but also in a one-letter notation that Dayhoff had proposed in order to facilitate the work with the computer.39 The system promoted by Dayhoff eventually became common, but never replaced the older notation, which was easier to remember. In addition to the sequences, the subsequent editions of the Atlas contained results of comparisons between sequences obtained with a computer (an IBM 7090) and the phylogenetic trees that these comparisons suggested. These inferences were based on original computational tools that Dayhoff had developed, such as the “Dayhoff matrix” as it came to be known.40 In the Atlas, Dayhoff also suggested how the different sequences could have evolved, and proposed molecular taxonomies, such as the concept of “protein superfamily” that reflected common ancestry, and was used in turn to organise the Atlas. The computer served two purposes in the Atlas project. First, it was used to store the sequence data and print new editions of the Atlas, thus avoiding the necessity of typing the sequences again at each printing. Indeed, the unavoidable typographic errors that would be added while entering the sequences manually would render sequence comparisons meaningless. Second, the computer was used to perform comparisons between a large number of sequences and build the phylogenetic trees. This required a complete set of the sequences on punch cards that could be fed into the computer. The computer was not yet used to distribute the data.
Indeed, in the 1960s, not only were wide computer networks nonexistent, but most biologists were not familiar with computers, which were mostly found in centralized university computing facilities used mainly by physical scientists, engineers and administrators.41 In 1972, the fifth edition of the Atlas sold 1765 paper copies, and just 7 on magnetic tapes.42 Only in the early 1980s, with the advent of micro-computers and computer networks, did Dayhoff’s sequence collection become broadly distributed in electronic format.43 The second edition of the Atlas was published only one year after its initial edition, and its size had doubled. The principal obstacle that Dayhoff and her collaborators confronted was

38 NBRF Archives, Margaret O. Dayhoff to Susan Tideman, October 18, 1968. 39 IUPAC-IUB COMMISSION ON BIOCHEMICAL NOMENCLATURE 1968. 40 DAYHOFF 1969. 41 On the early uses of computers in crystallography, see DE CHADAREVIAN 2002, chapter 4; and in systematics, HAGEN 2001, and in biology more generally, NOVEMBER 2004. 42 NBRF Archives, LM 01206, “Comprehensive progress report”, August 23, 1973. 43 On the microcomputer revolution, see CAMPBELL-KELLY et ASPRAY 2004.


the exponentially growing number of available sequences. The first edition of the Atlas contained fewer than 100 references to published sequences; seven years later it included more than one thousand. The number of authors cited rose from 160 to 2,500.44 This overflow of information represented a serious challenge for Dayhoff and her team.
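The relationship between the three-letter and the one-letter notations discussed in this section can be made concrete with a short sketch in Python. The table below follows the one-letter codes for the twenty standard amino acids as they are used today; it is an assumption that this matches the table printed in the 1965 Atlas in every detail. The function name and the hyphen-separated input format are hypothetical choices made for illustration.

```python
# Modern one-letter codes for the twenty standard amino acids.
THREE_TO_ONE = {
    "ala": "A", "arg": "R", "asn": "N", "asp": "D", "cys": "C",
    "gln": "Q", "glu": "E", "gly": "G", "his": "H", "ile": "I",
    "leu": "L", "lys": "K", "met": "M", "phe": "F", "pro": "P",
    "ser": "S", "thr": "T", "trp": "W", "tyr": "Y", "val": "V",
}


def to_one_letter(three_letter_sequence: str) -> str:
    """Convert a hyphen- or space-separated three-letter sequence to one-letter form."""
    residues = three_letter_sequence.lower().replace("-", " ").split()
    # A KeyError here signals a residue name that is not in the table.
    return "".join(THREE_TO_ONE[residue] for residue in residues)


if __name__ == "__main__":
    print(to_one_letter("Gly-Ile-Val-Glu-Gln"))
```

The compact form takes one character per residue instead of three plus a separator, which suggests why it suited punch cards and position-by-position comparison by computer.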

V. Collecting data, collecting people

As in all natural history endeavours, collecting the vast amount of data necessary for building and expanding a collection requires specific forms of organisation. As Paula Findlen has argued, in the early modern cabinets of Ferrante Imperato and Ulisse Aldrovandi, for example, the networks of object exchange were largely based on patronage relationships.45 In 18th century French gardens, it was also a gift exchange network, a “system of polite indebtedness”, as Emma Spary has put it, between botanists, that filled the Royal botanical gardens with new specimens.46 The great natural history museums of the 19th century, such as the American Museum of Natural History or the British Museum, relied on commissioned expeditions, but also on the growing market for rare natural history specimens to assemble their collections. The British or the French Empire could also rely on their power over the colonies to supply specimens. As Emma Spary has summarized, “natural history is a science of networks”. Margaret O. Dayhoff first organised her collecting enterprise around a Maussian system of gift and counter-gift.47 The first Atlas was offered as a gift to all those whose sequencing work had been included. In her cover letter, Dayhoff insisted that the authors submit any new sequence they had obtained in their laboratory, even before publication, and attempted to discipline the researchers to use her one-letter notation system. The act of giving the Atlas was supposed to create a social bond with an obligation to reciprocate, thus bringing new sequences into Dayhoff’s collection. The Atlas was also given to personalities who, if they weren’t directly involved in sequencing work, could lend legitimacy to the project. Melvin Calvin, John Kendrew, Max Perutz, and Richard Synge, for example, all Nobel prize winners, received a copy of the Atlas.48 More than 500 copies were given out in this way.
The following edition however, and all the subsequent ones, were sold for a modest price,49 breaking down the early gift system. The reactions of the scientists who received the Atlas were generally enthusiastic. The Atlas was not only a convenient index to the published literature on sequences, but also a tool for the laboratory researchers involved in sequencing work, in the study of protein function or in molecular evolution. Some researchers did send in unpublished sequences to the editors of the Atlas; however, they remained a small minority. Dayhoff’s call was thus not as successful as she might have expected, and the work required to collect sequences grew faster than the human resources available for her project. Given the almost unanimously positive reactions to the Atlas one might wonder why it was so difficult to ensure the collaboration of the scientists involved in

44 NBRF Archives, LM 01206, “Comprehensive progress report”, August 23, 1973. 45 FINDLEN 1994. 46 SPARY 2000, chapter 2. 47 On gift economy, see MAUSS 1923-1924 and BOURDIEU 2000 [1972], and in the case of science, HAGSTROM 1982; BIAGIOLI 1993; FINDLEN 1994. 48 NBRF Archives, Correspondence. 49 NBRF Archives, Margaret O. Dayhoff to Gordon B. Ward, April 30, 1978.

sequencing work. I would like to suggest that the answer lies, at least in part, in the fact that Dayhoff’s system conflicted with ideas about credit, authorship and the property of knowledge in the experimental sciences, all essential elements of its moral economy. The Atlas gave proper reference to the authors of the published sequences. However, in the preface of the Atlas, the editors warned that they did not want to “become involved in question of history or priority”.50 This decision probably had dramatic consequences for the collecting enterprise. Indeed, by refusing to establish priority, the editors of the Atlas cut themselves off from the main mechanism that led scientists to submit ideas and experimental results for publication, namely the establishment of authorship. Authorship, in turn, brought recognition and scientific credit, which was the main reward for producing knowledge in science.51 As the developmental biologist Lewis Wolpert reflected much later:

J.B.S. Haldane is reported to have said that his great pleasure was to see his ideas widely used even though he was not credited with their discovery. That may have been fine for someone as famous and perhaps noble as Haldane, but for most scientists recognition is the reward in science.52

Not only would unpublished sequences printed in the Atlas not secure priority for their authors, but they could give important hints to competing groups working on the same protein. In other words, Dayhoff was asking experimentalists to share knowledge that was considered highly proprietary, and that required several months or even years of research to obtain, without offering the possibility of receiving credit for it. Thus, the system proposed by Dayhoff ran against one of the essential values on which the moral economy of the experimental life sciences was based, namely that the production of knowledge deserved credit. For the community of protein researchers in the 1960s, many of whom were biochemists and molecular biologists, unpublished knowledge was considered the property of those who had produced it, and published knowledge was considered a common good. Other moral economies, however, were compatible with the experimental enterprise. In the 1910s and 1920s, for example, the “fly group”, the community of researchers around Thomas H. Morgan,53 and in the 1940s and 1950s the “phage group”, the community of researchers around Max Delbrück, both of which studied genetics experimentally,54 relied on a very different moral economy. It was based on the sharing of unpublished data and ideas within the group and a strong collaborative ethos. In the field of biochemistry and molecular biology research, where obtaining experimental knowledge required a much greater investment than in genetics, and where competition had become extremely intense in the 1960s, the moral economy rested on a different set of values. One can get a sense of this system of values by examining some of the reactions to James Watson’s tell-all autobiography, The Double Helix, published in 1968.55 Watson reveals that he and Francis Crick had used unpublished experimental data by Rosalind Franklin, obtained through a

50 DAYHOFF et al. 1965, p. xiv. 51 HAGSTROM 1982. 52 WOLPERT 1993, p. 89. 53 KOHLER 1994. 54 KAY 1993. 55 WATSON 1980 [1968].

confidential report, to determine the structure of DNA.56 A number of reviewers of the book lamented this unruly behaviour, and Watson’s selfish and reward-hungry attitude. A phage geneticist, Grete Kellenberger-Gujer, for example, wrote in a letter to Max Delbrück: “This book made me sad, it is like pornography: everything is true, but how tasteless, how ugly.”57 Even though such behaviour was known to exist in the experimentalist community, it was considered immoral, and its perpetrators were not supposed to publicize their transgressions shamelessly, as Watson did. Evolutionary biologist Richard Lewontin, in his review of the book, recognised that Watson’s selfish behaviour was not that unusual:

What every scientist knows, but few will admit, is that the requirement for great success is great ambition. Moreover, the ambition is for personal triumph over other men, not merely nature. Science is a form of competitive and aggressive activity, a contest of man against man that provides knowledge as a side product.58

In this competition, sharing and withholding information was of the essence. As Lewis Wolpert noted:

In order to promote the success of their ideas, and hence themselves, scientists must thus adopt a strategy of both competition and collaboration, of altruism and selfishness. Each must balance his or her behaviour, in relation for example to sharing information, in these terms.59

The fact that the knowledge included in the Atlas was copyrighted and sold was another source of potential tension with those who were providing the sequence data. Such tension arose because the gratuity of knowledge was an essential value in the moral economy of the experimental sciences. Indeed, previously published knowledge was generally considered a common good that could be freely redistributed. The Atlas, however, was copyrighted and sold, even to the authors who had contributed to it. Some researchers were uneasy about this, as one molecular biologist remarked in a letter to Dayhoff: “you are in somewhat the position of a folksong collector who copyrights his published material; do I have to pay him if I sing John Henry?”60 Years later, Dayhoff was competing with the physicist Walter Goad from Los Alamos for an NIH contract that would support a national DNA database. Goad commented:

It is important that we be perceived by the molecular biology community as offering completely free and open access to the information and programs we will be collecting. Indeed we seem to be developing an edge on that score as our principal competitors [Margaret O. Dayhoff] become increasingly enmeshed in proprietary arrangements.61

Even though the content of Dayhoff’s Atlas was copyrighted, a number of researchers still considered that it belonged to the public domain. Indeed, when the database became available on magnetic tapes, several researchers imported its content into their own databases and made it publicly available.62

56 On this episode, MADDOX 2003; DE CHADAREVIAN 2002; OLBY 1994 [1974]. 57 Caltech Archives, Max Delbrück papers, Grete Kellenberger-Gujer to Max Delbrück, March 17, 1969. 58 R. C. LEWONTIN, Chicago Sunday, February 25, 1968, p. 1-2, reprinted in WATSON 1980 [1968]. 59 WOLPERT 1993, p. 88. 60 NBRF Archives, Burton S. Guttman to Margaret O. Dayhoff, June 10, 1968. 61 APS Archives, W. Goad to P. Carruthers, November 3, 1981.

Dayhoff and her team felt uncomfortable about this appropriation of data,63 but did not voice an objection. If the gratuity of published knowledge was a value so essential to the moral economy of science, why did Margaret O. Dayhoff decide to sell the Atlas? The initial work of collecting the sequences, setting up the Atlas, and analysing the data was supported by the NIH under a research project aiming at the development of algorithms for sequence analysis.64 However, the rapid increase in the resources needed merely to keep up with the growing number of published sequences made the NIH more and more hesitant. After all, did the collection of existing data qualify as scientific research or was it just administrative or editorial work? The NIH, trying to find a more appropriate place for Dayhoff’s project, transferred it to the National Library of Medicine, before reintegrating it. Along the way, it threatened a number of times to terminate the grant unless NBRF could find additional resources to cover the data gathering costs. As a result, Dayhoff was led to sell the Atlas, and later the magnetic tapes, in order to demonstrate her willingness to make this part of her project self-supporting.65 In addition to the revenues from sales, Dayhoff and her team relied on additional support from the NSF, the AEC and NASA, which was particularly interested in molecular evolution in the framework of its growing exobiology project, and was ready to fund almost any project even remotely related to that subject.66 The scientifically ambiguous status of the Atlas project also had consequences for Dayhoff’s personal career. Indeed, even though the Atlas was recognised as being of immense value for scientific research, it was not clear that Dayhoff’s work should count as a scientific contribution for which she could receive credit. When she applied to become a member of the American Society of Biological Chemists, the biochemist John T. Edsall answered, somewhat embarrassed:

Personally I believe that you are the kind of person who should become a member of the American Society of Biological Chemists [...] but knowing the general policies that guide the work of the Membership Committee I must add that I can not feel at all sure about your prospects for election. Election is almost invariably based on the research contributions of the candidate in the field of biochemistry, and the nomination papers must include [...] recent work published by the candidate, to demonstrate that he or she has done research which is clearly his own. The compilation of the Atlas of Protein Sequence and Structure scarcely fits into this pattern.67

The difficulty of obtaining personal scientific credit for the Atlas can explain some features of its distribution.

62 Russell Doolittle in the Newat (“new atlas”) database, and Amos Bairoch in the Swiss-Prot database for example. DOOLITTLE 1997; BAIROCH 2000. 63 Interview with Lois T. Hunt, Silver Spring, January 10, 2006, interview with Ruth Dayhoff, Bethesda, January 9, 2006. 64 DAYHOFF et LEDLEY 1962. 65 Margaret Dayhoff felt uncomfortable about this situation and often explained to colleagues that it was the lack of support and insistence of the NIH and NSF that had forced her to adopt this solution. NBRF Archives, Robert Ledley to Marvin Cassman, April 25, 1983 and Margaret O. Dayhoff to George Jacobs, December 5, 1967. 66 On exobiology, see WOLFE 2002; STRICK 2004. 67 NBRF Archives, John T. Edsall to Margaret O. Dayhoff, November 4, 1969.

For most uses of the Atlas a printed version of the sequences was sufficient. However, to establish phylogenetic reconstructions, it was essential to have the data on a medium that could be fed into a computer, either punch cards or magnetic tapes. Understandably, the editors of the Atlas were somewhat reluctant to distribute the Atlas in these formats, and resisted doing so on several occasions. Indeed, they would have lost the only way in which they could gain scientific credit, namely by publishing phylogenies and other inferences drawn from sequences included in the Atlas. Commenting on a competing database, a researcher wrote:

An important strength of your program is its openness. Sequences in the system will be quickly made available to everyone and will not be the private hunting grounds of the Centre’s staff. I feel that this is in refreshing contrast to my perception of the way in which the protein sequence library is maintained by Dayhoff and her staff.68

Not only the Atlas itself, but also the phylogenies determined by Dayhoff were of a somewhat ambiguous scientific stature. Other authors, such as Russell F. Doolittle or Emanuel Margoliash, were basing their phylogenies on comparisons of their own protein sequencing work and that of other researchers. Dayhoff’s results, on the other hand, were exclusively based on the work of others, and thus essentially of a theoretical nature. Dayhoff saw an epistemic virtue in this feature and explained that “Since we have no experimental research of our own, we are in a particularly good position to evaluate impartially the work of competing laboratories”.69 While theoretical work enjoyed a respectable status in physics and in some areas of evolutionary biology, it was not as valued in the field of biochemistry, and Dayhoff complained, for example, about the “great hostility of journal reviewers” to her presentation of theoretical methods to solve sequencing problems.70 Indeed, as biochemist Frederick Sanger commented much later: “‘Doing’ for a scientist implies doing experiments”.71 Understandably, the editors of the Atlas tried to inscribe their project in the experimental enterprise, and each preface began with: “This Atlas voluminously illustrates the triumph of experimental technique over the secretiveness of nature.”72 But there was another reason why Dayhoff’s work posed a challenge to the experimental sciences. The reward system in science was largely based on the figure of the individual author. More generally, until very recently in Western philosophy, knowledge has been considered a property of individuals, not groups.73 Dayhoff’s work, on the other hand, was so evidently the result of collective work that, if credit had to be attributed, it was not at all clear to whom. A similar problem had been faced in high energy physics, where experiments were conducted by hundreds of physicists. 
As Peter Galison has shown, an original notion of collective author was eventually worked out in order to attribute credit fairly to all participants.74 In the biomedical sciences, until very recently, no equivalent system was available to distribute credit between authors.75 As if they

68 APS Archives, Walter Goad papers, Winston Salser to Walter Goad, December 31, 1979. 69 NBRF Archives, Margaret O. Dayhoff, “NIH-LM Grant Application”, May 31, 1973. 70 NBRF Archives, Margaret O. Dayhoff to Joshua Lederberg, draft, March 1964. 71 SANGER 1988, p. 1. 72 DAYHOFF et al. 1965, p. 1. 73 KUSCH 2002. 74 GALISON 2003. 75 BIAGIOLI 2003; BIAGIOLI 1999.

wanted to forestall the objection that they were capitalizing on the work of others, the editors of the Atlas warned in their preface that “Some of the insights which have been developed cannot be attributed to any particular worker or school”.76 The difficulties with which Dayhoff was confronted were typical of the natural historical enterprise. The experimental tradition was built around the figure of the solitary researcher who would produce knowledge in a single location: the laboratory. Natural facts produced there would then move outwards to other laboratories where experiments could be replicated and knowledge validated.77 In the natural history tradition, facts moved the other way around, from the periphery to a centre (garden, museum or database) where they were assembled into new knowledge. This required natural historians to coordinate and discipline a large network of investigators, who would hardly be rewarded for their work. To complete this task, Dayhoff had little means other than to rely on the moral ideal of a scientific community that would work generously for the common good. The fierce competition in the experimental sciences, resulting from the increasing number of scientists involved in similar projects, but also her relative marginality to the field of protein research and, to be sure, the status of women in science, prevented her from completely fulfilling her goal.

VI. Moral realignments

If so many reasons can explain why Dayhoff’s project was wrongheaded, why did sequence databases eventually become such a successful and fast-growing scientific enterprise by the end of the century? Did the experimental scientific community finally adopt the moral economy that Dayhoff and her collaborators were encouraging? Or did the promoters of the various sequence databases adapt their project to the existing moral economy in the experimental life sciences? A brief overview of the development of GenBank, today’s largest DNA sequence database, points to the latter. By the late 1970s, the main focus of attention had shifted from protein sequences to DNA sequences.78 In 1980, several researchers convinced the NIH to fund a national DNA database. Two groups made an offer to the NIH for a 3 million dollar contract: Dayhoff and the physicist Walter Goad from the Los Alamos National Laboratory. The judgment of the NIH experts on both proposals reveals some of the tensions I have outlined earlier. In particular, they were worried about how the data could be collected and distributed most efficiently. One of the advantages of Goad’s proposal was that it could offer access to the database through the Department of Defense’s computer network, ARPANET, which would facilitate contact with the scientific community. However, one of the disadvantages of Goad’s proposal was that the project would be hosted in what was considered by the biologists as a military institution, with its unpleasant professional culture of secrecy. In addition, Los Alamos, and Walter Goad in

76 DAYHOFF et al. 1965, p. 1. 77 On Newton as a solitary genius, FARA 2002, chapter 6, on solitude as a condition of knowledge, SHAPIN 1990. On the circulation of knowledge, see LATOUR 1987. 78 The development of new DNA sequencing methods in 1977 by Frederick Sanger and, independently, by Walter Gilbert and Allan M. Maxam led to a rapid increase in the number of known sequences. MORANGE 1998, chapter 16.

particular, had almost no contacts with the biomedical community to offer. Indeed, Goad was a long-time member of the theoretical division, which had been involved in the development of thermonuclear weapons, before he became interested in theoretical issues in biology. Dayhoff, on the other hand, could claim long experience in the field, as well as prolonged contacts with the biomedical community. However, the private status of the National Biomedical Research Foundation raised the question of whether access to the database would be entirely free. And in case the NIH experts had not yet noticed this point, the Los Alamos group made it clear to them:

We can assure NIH that [we do not intend] to assert any proprietary interest whatsoever in any data [...]. We believe it is essential to the function of this national resource that data obtained from the data bank have no restriction whatever placed on its further dissemination. The NBRF at Georgetown, on the other hand, has sought revenues from sales of their database to support their work and have therefore placed restrictions on further dissemination of data acquired from the NBRF.79

Instead of relying on the disinterested and collaborative ethos of individual researchers, Goad managed to embed a data collecting system within the existing moral economy and reward system of the experimental life sciences. To this end, Goad negotiated with twenty major journals publishing DNA sequences that they adopt a policy of mandatory submission of the sequences to the database before an article was published.80 Journal editors could save printing space and the authors would be rewarded for their submission by getting the scientific credit associated with a publication in the journal. An editorial of the Proceedings of the National Academy of Sciences explained the new policy, but passed over the more profound reason that would bring scientists to comply: “scientists who generate sequences [...] are also users of sequence information. Self-interest should therefore dictate compliance”.81 In addition, a system of confidential submission would ensure that data deposited in the database remained outside the public domain, at least until the paper was published, so as not to give away crucial information to a competing group.82 Finally, Goad offered free and complete access to all researchers via computer networks. In August 1982, Los Alamos was awarded the NIH contract, and the system proposed to collect and manage the database has proved successful to the present day.83

VII. Conclusion

Precisely when the experimental tradition was becoming ever more powerful in the life sciences, the use of sequence databases developed from a marginal specialty into an indispensable tool for almost all laboratory researchers around the world. One could, as many have, consider these research practices as part of an entirely new research tradition, resulting from the introduction of computers into biology and medicine. I believe, however, it is more fruitful to understand these developments as the continuation of a long tradition of natural history, and to

79 APS Archives, Walter Goad papers, Box 3, GenBank Correspondence with BBN 2 (1982), BBN to NIH, May 7, 1982. 80 APS Archives, Walter Goad to Richard Roberts, March 17, 1983. 81 DAWID 1989. 82 On this policy during the Human Genome Project, HILGARTNER 1998; HILGARTNER 2004. 83 LEWIN 1982.

remember that the practices of natural history have been much more diverse than the caricature that served as a straw man for a number of reforms in experimental biology over the last two centuries. Ursula Klein has recently reminded us that, in the early modern period, the proponents of the experimental method did not think of natural history and experimentation in binary terms, but recognised the importance of different styles of experimentation, one of which was “experimental history”.84 What may sound like an oxymoron to today’s epistemologist was an essential practice in chemistry well into the 19th century. Contemporary sequence databases can perhaps be understood in the framework of this particular experimental tradition. I have relied on the notion of moral economy to explain the conflicts between different ways of doing science, but have hardly reflected on the origins of these different moral economies. Gender might well hold part of the answer. From 1965 to the present day, all the principal investigators and most of the staff of the protein sequence database have been women, in striking contrast with laboratories devoted to experimental science. As Evelyn Fox Keller, Sandra Harding, and others have argued, the experimental enterprise can be understood as a gendered activity, especially in its focus on mastering nature.85 Also, the increasingly competitive research environment in the experimental sciences had strengthened winner-take-all strategies, at the expense of more communitarian ways of doing science. The collecting enterprise, on the other hand, seeking to describe a larger picture, and driven by the ideal of completeness, where individual reward could hardly be achieved, might have provided an alternative more appealing to women. As Dayhoff wrote to a young female colleague who was considering a career in science: “only a minority of women have the ability or interest to participate in the ‘masculine’ scientific world. 
Only a minority of these choose mathematical subjects. Those who do can carry to this desert a range of feminine concerns that have been completely overlooked.”86 The historiography of early modern science could also shed some light on the rise of collections and databases in the post-war period. Some have argued that around 1600, the numerous new and strange facts brought back from expeditions to distant worlds, the exploration of nature’s diversity beyond Aristotelian categories, and more generally the increase in published scholarship had resulted in an “information overload”. As a result, scholars were led to adopt new strategies to deal with the flood of information, and the rise of natural history in that period can be understood as part of this reaction.87 Similarly, it could be argued that the dramatic increase in the scientific workforce during the post-war period, and the resulting explosion in new scientific facts, highlighted by Derek de Solla Price in his Little Science, Big Science published in 1963,88 led to a number of initiatives to organise, standardize, and assimilate this vast amount of knowledge. Sequence databases were one of these initiatives, as one researcher commented in 1966: “[The Atlas] will clearly become a most valuable compilation, particularly as this sort of information accumulates and one’s memory begins to be overburdened”.89

84 KLEIN 2005. 85 KELLER 1995; HARDING 1998. 86 NBRF Archives, Margaret O. Dayhoff to Susan Tideman, October 18, 1968. 87 ROSENBERG 2003; OGILVIE 2003. 88 PRICE 1986 [1963]. 89 NBRF Archives, E Margoliash to Richard V Eck, February 2, 1966.

One of the many legacies of the fact that history and philosophy of science as a discipline developed in a century when physics was culturally dominant is our focus on the relationships between theory and experiment, at the expense of other analytic perspectives, such as the relationships between natural history and experimentalism, which were perhaps unimportant in physics, but crucial in other fields such as biology, chemistry, and medicine. And perhaps historians of medicine have been more sensitive to the different meanings that the word “experimental” has taken on in the past.90 As the example of sequence databases suggests, experimentalism has a rich texture, to some essential aspects of which we have so far paid only scant attention.

Acknowledgments

I would like to thank Angela Creager, Michael Gordin, Joe November, and the participants and organizers of the conference “History and Epistemology of Molecular Biology and Beyond: Problems and Perspectives”, at the Max-Planck-Institut für Wissenschaftsgeschichte, October 13-15, 2005, for useful comments. I also thank Robert S. Ledley, Winona Barker, Lois T. Hunt and Ruth Dayhoff, for granting interviews and access to their archives, as well as the archivists of the American Philosophical Society (APS) and the California Institute of Technology (Caltech). The work was supported by the Fondation du 450e anniversaire de l’Université de Lausanne and the Swiss National Science Foundation (grant no 105311-109973).

Bibliography

ABELSON, P.H. 1968. Amino acid sequence in proteins. Science 160:951. ABIR-AM, P. 1992. The politics of macromolecules: molecular biologists, biochemists, and rhetoric. Osiris 7:164-191. ALLEN, G.E. 1978. Life Science in the Twentieth Century. Cambridge; London: Cambridge University Press. BAIROCH, A. 2000. Serendipity in bioinformatics, the tribulations of a Swiss bioinformatician through exciting times! Bioinformatics 16 (1):48-64. BENSON, K.R. 1988. From museum research to laboratory research: the transformation of natural history into academic biology. In The American Development of Biology, edited by R. RAINGER, K. R. BENSON et J. MAIENSCHEIN. : University of Press. BIAGIOLI, M. 1993. Galileo, Courtier. The Practice of Science in the Culture of Absolutism. Chicago: University of Chicago Press. BIAGIOLI, M. 1999. Aporias of scientific authorship: credit and responsibility in contemporary biomedicine. In The Science Studies Reader, edited by M. BIAGIOLI. New York: Routledge. BIAGIOLI, M. 2003. Rights or rewards? Changing frameworks of scientific authorship. In Scientific Authorship: Credit and Intellectual Property in Science, edited by M. BIAGIOLI et P. GALISON. New York: Routledge. BOURDIEU, P. 2000 [1972]. Esquisse d’une théorie de la pratique. Paris: Seuil. BOWLER, P.J., et MORUS, I.R. 2005. Making Modern Science: A Historical Survey. Chicago: University of Chicago Press. CAMPBELL-KELLY, M., et ASPRAY, W. 2004. Computer: A History of the Information Machine. Boulder: Westview Press. COLEMAN, W. 1977 [1971]. Biology in the Nineteenth Century: Problems of Form, Function and Transformation. Cambridge: Cambridge University Press. DAWID, I.B. 1989. Editorial submission of sequences. PNAS 86:407.

90 WARNER 1991.

DAYHOFF, M.O. 1964. Computer aids to protein sequence determination. Journal of Theoretical Biology 8:97- 112. DAYHOFF, M.O. 1969. Computer analysis of protein evolution. Scientific American 221:86-95. DAYHOFF, M.O., ECK, R.V., CHANG, M.A., et SOCHARD, M.R. 1965. Atlas of Protein Sequence and Structure. Silver Spring: National Biomedical Research Foundation. DAYHOFF, M.O., et LEDLEY, R.S. 1962. Comprotein: A computer program to aid primary protein structure determination. In Proceedings of the Fall Joint Computer Conference. Santa Monica: American Federation of Information Processing Societies. DE CHADAREVIAN, S. 1996. Sequences, conformation, information: biochemists and molecular biologists in the 1950s. Journal of the History of Biology 29 (3):361-386. DE CHADAREVIAN, S. 1998. Following molecules: haemoglobin between the clinic and the laboratory. In Molecularizing Biology and Medicine: New Practices and Alliances, 1910s-1970s, edited by S. DE CHADAREVIAN et H. KAMMINGA. Amsterdam: Harwood Academic Publishers. DE CHADAREVIAN, S. 2002. Designs for Life: Molecular Biology after World War II. Cambridge: Cambridge University Press. DE CHADAREVIAN, S., et KAMMINGA, H., eds. 1998. Molecularizing Biology and Medicine: New Practices and Alliances, 1910s-1970s. Amsterdam: Harwood Academic Publishers. DIETRICH, M.R. 1994. The origins of the neutral theory of molecular evolution. Journal of the History of Biology 27:21-59. DIETRICH, M.R. 1998. Paradox and persuasion: negotiating the place of molecular evolution within evolutionary biology. Journal of the History of Biology 31:85-111. DOOLITTLE, R.F. 1997. Some reflections on the early days of sequence searching. Journal of Molecular Medicine 75:239-241. FARA, P. 2002. Newton: The Making of a Genius. London: Macmillan. FARBER, P.L. 2000. Finding Order in Nature: The Naturalist Tradition from Linnaeus to E. O. Wilson. Baltimore; London: The Johns Hopkins University Press. FINDLEN, P. 1994. 
Possessing Nature: Museums, Collecting, and Scientific Culture in Early Modern Italy. Berkeley etc.: Univ. of California Press. FORTUN, M., et SCHWEBER, S.S. 1993. Scientists and the legacy of World War II: the case of operations research. Social Studies of Science 23:595-642. GALISON, P. 2003. The collective author. In Scientific Authorship. Credit and Intellectual Property in Science, edited by M. BIAGIOLI et P. GALISON. New York: Routledge. GHISELIN, M.T., et LEVITON, A.E. 2000. Cultures and Institutions of Natural History Essays in the History and Philosophy of Science. Los Angeles: California Academy of Sciences. HAGEN, J.B. 1999. Naturalist, molecular biology, and the challenge of molecular evolution. Journal of the History of Biology 32:321-341. HAGEN, J.B. 2000. The origins of bioinformatics. Nature Reviews 1:231-236. HAGEN, J.B. 2001. The introduction of computers into systematic research in the United States during the 1960s. Studies in the History and Philosophy of Biological and Biomedical Sciences 32 (2):291-314. HAGSTROM, W. 1982. Gift giving as an organizing principle in science. In Science in Context: Readings in the Sociology of Science, edited by B. BARNES et D. EDGE. Cambridge: MIT Press. HARDING, S. 1998. Is Science Multicultural? Postcolonialisms, Feminisms, and Epistemologies. Bloomington etc.: Indiana Univ. Press. HILGARTNER, S. 1998. Data access policy in genome research. In Private Science. Biotechnology and the Rise of the Molecular Sciences, edited by A. THACKRAY. Philadelphia: University of Pennsylvania Press. HILGARTNER, S. 2004. Making maps and making social order: governing American genome centers, 1988- 93. In From Molecular Genetics to Genomics: The Mapping Cultures of Twentieth-Century Genetics, edited by J.-P. GAUDILLIÈRE et H.-J. RHEINBERGER. London etc.: Routledge. HUNT, L. 1984. Margaret Oakley Dayhoff 1925-1983. Bulletin of Mathematical Biology 46 (4):467-472. IUPAC-IUB COMMISSION ON BIOCHEMICAL NOMENCLATURE. 1968. 
A one-letter notation for amino acid sequences. The Journal of Biological Chemistry 234 (13):3557-3559. JARDINE, N., SECORD, J.A., et SPARY, E.C. 1996. Cultures of Natural History. London; New York etc.: Cambridge University Press. KAY, L.E. 1993. The Molecular Vision of Life. Caltech, the Rockefeller Foundation and the Rise of the New Biology. New York: Oxford University Press. KAY, L.E. 2000. Who Wrote the Book of Life. A History of the Genetic Code. Stanford: Stanford University Press. KELLER, E.F. 1995. Refiguring Life, Metaphors of Twentieth-Century Biology. New York: Columbia University Press.


KLEIN, U. 2005. Experiments at the intersection of experimental history, technological inquiry, and conceptually driven analysis: a case study from early nineteenth-century France. Perspectives on Science 13 (1):1-48.
KOHLER, R.E. 1994. Lords of the Fly: Drosophila Genetics and the Experimental Life. Chicago; London: The Univ. of Chicago Press.
KOHLER, R.E. 1999. Moral economy, material culture and community in Drosophila genetics. In The Science Studies Reader, edited by M. BIAGIOLI. New York: Routledge.
KUSCH, M. 2002. Testimony in communitarian epistemology. Studies in History and Philosophy of Science 33:335-354.
LATOUR, B. 1987. Science in Action. Cambridge: Harvard University Press.
LEDLEY, R.S. 1965. Use of Computers in Biology and Medicine. New York; Saint Louis etc.: McGraw-Hill.
LEWIN, R. 1982. Long-awaited decision on DNA database. Science 217:817-818.
MADDOX, B. 2003. Rosalind Franklin: The Dark Lady of DNA. Repr. ed. New York: Perennial.
MAUSS, M. 1923-1924. Essai sur le don: forme et raison de l’échange dans les sociétés archaïques. Année sociologique, nouvelle série 1:30-186.
MCLAUGHLIN, P. 2002. Naming biology. Journal of the History of Biology 35:1-4.
MORANGE, M. 1998. A History of Molecular Biology. Cambridge: Harvard University Press.
MORSE, P.M. 1973. George Elbert Kimball. Biographical Memoirs of the National Academy of Sciences of the United States of America 43:128-146.
NOVEMBER, J. 2004. LINC: biology’s revolutionary little computer. Endeavour 28 (3):125-131.
NYHART, L.K. 1996. Natural history and the ‘new’ biology. In Cultures of Natural History, edited by N. JARDINE, J.A. SECORD and E.C. SPARY. London: Cambridge University Press.
OGILVIE, B.W. 2003. The many books of nature: Renaissance naturalists and information overload. Journal of the History of Ideas 64 (1):29-40.
OLBY, R. 1994 [1974]. The Path to the Double Helix. New York: Dover.
PICKSTONE, J.V. 2000. Ways of Knowing: A New History of Science, Technology and Medicine. Manchester: Manchester University Press.
PRICE, D.D.S. 1986 [1963]. Little Science, Big Science ... and Beyond. New York: Columbia University Press.
RAINGER, R., BENSON, K.R., and MAIENSCHEIN, J., eds. 1988. The American Development of Biology. Philadelphia: University of Pennsylvania Press.
RAU, E. 2005. Combat science: the emergence of operational research in World War II. Endeavour 29 (4):156-161.
ROSENBERG, D. 2003. Early modern information overload. Journal of the History of Ideas 64 (1):1-9.
SANGER, F. 1988. Sequences, sequences, and sequences. Annual Review of Biochemistry 57:1-28.
SHAPIN, S. 1990. ‘The mind in its own place’: science and solitude in seventeenth-century England. Science in Context 4 (1):191-218.
SPARY, E.C. 2000. Utopia’s Garden: French Natural History from Old Regime to Revolution. Chicago: University of Chicago Press.
STRICK, J.E. 2004. Creating a cosmic discipline: the crystallization and consolidation of exobiology, 1957-1973. Journal of the History of Biology 37 (1):131-180.
SUÁREZ, E., and BARAHONA, A. 1996. The experimental roots of the neutral theory of molecular evolution. History and Philosophy of the Life Sciences 18:55-81.
THOMPSON, E.P. 1971. The moral economy of the English crowd in the eighteenth century. Past and Present 50:76-136.
WARNER, J.H. 1991. Ideals of science and their discontents in late nineteenth-century American medicine. Isis 82:454-478.
WATSON, J.D. 1980 [1968]. The Double Helix: A Personal Account of the Discovery of the Structure of DNA: Text, Commentary, Reviews, Original Papers. New York; London: W.W. Norton.
WATSON, J.D. 2001. Genes, Girls and Gamow. Oxford: Oxford University Press.
WILSON, E.O. 1994. Naturalist. Washington, DC: Island Press.
WOLFE, A.J. 2002. Germs in space: Joshua Lederberg, exobiology, and the public imagination, 1958-1964. Isis 93:183-205.
WOLPERT, L. 1993. The Unnatural Nature of Science. London etc.: Faber and Faber.
ZUCKERKANDL, E., and PAULING, L. 1962. Molecular disease, evolution, and genic heterogeneity. In Horizons in Biochemistry, edited by M. KASHA and B. PULLMAN. New York: Academic Press.
ZUCKERKANDL, E., and PAULING, L. 1965. Molecules as documents of evolutionary history. Journal of Theoretical Biology 8:357-366.
