The Human Genome Revealed
Downloaded from genome.cshlp.org on October 3, 2021 - Published by Cold Spring Harbor Laboratory Press

Commentary

James D. Watson
President, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA

Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.211601.
Genome Research 11:1803–1804 ©2001 by Cold Spring Harbor Laboratory Press ISSN 1088-9051/01 $5.00; www.genome.org

Seeing the International Sequencing Consortium’s draft of the human genome is highly satisfying. The way in which its 3 billion bases have been determined closely followed the course outlined more than a decade ago by the National Academy of Sciences (NAS) Committee on “Mapping and Sequencing the Human Genome.” Bruce Alberts, now the President of the NAS, was its chairman and I one of its 14 other members. The predictions in our 1988 report, that the human genome could be sequenced over a 15-year period for a cost of three billion dollars, were more accurate than we dared guess. Two more years of work, to fill in gaps and correct mistakes, will result in an almost errorless genetic script for human existence.

That the human script would become available within our lifetimes never passed through my mind or that of Francis Crick when we found the double helix in 1953. At that time, just learning how cells read the genetic instructions within DNA seemed a tall order. Happily, progress was faster than expected, and by 1966 we knew how the genetic code utilizes groups of three DNA bases to specify the amino acid constituents of proteins, the main “actors” in the plays of life. Things speeded up even more after the recombinant DNA procedures of Stanley Cohen and Herb Boyer burst upon the scene in 1973. Gene cloning and manipulation metamorphosed from being dreams to becoming facts of life. Simultaneously, Fred Sanger and Walter Gilbert each developed a powerful way to determine the order of bases along DNA molecules. This meant that humans, like cells, could read the messages of genes. The way was open to ascertain the complete genetic instructions, i.e., to sequence the genome, of any organism (subject to the usual constraints of money, personnel, and technology).

The first genomes tackled were those of viruses, with the first sequenced viral genomes containing only several thousand bases. By the early 1980s, viral genomes containing more than 100,000 bases had been completed, and bacterial genomes containing more than a million bases became realistic objectives. Completion of such genomes would at last tell us the number of different proteins necessary for bacterial existence. Back then I thought that the human genome, at several billion bases long, was much, much too large to take on. Soon, however, I became a strong proponent of an internationally based Human Genome Project (HGP), believing that the large-scale mapping and sequencing resources that it would command would greatly hasten our discovery of the genetic underpinnings of many important human diseases.

Our NAS committee wasted little time on whether we needed an HGP; instead we focused on how it should be organized and financed. It seemed best to begin modestly and end with a sequencing crescendo, hopefully fueled by much lower sequencing costs. We agreed unanimously that the first big sequencing efforts should not focus on human DNA but on DNA from model organisms of genetics, such as baker’s yeast and the fruit fly, Drosophila. We knew that many human genes were likely to be homologous to those of model organisms, and these provided good systems for studying gene function.

That we proposed a 15-year effort reflected our belief that those starting the project should also be part of the finishing team. Richard Gibbs, Eric Lander, Maynard Olson, John Sulston, Bob Waterston, and Jean Weissenbach all have stayed the course, running increasingly large megabase sequencing labs. Only one of our original NAS committee is no longer in science. Sadly, Dan Nathans died of leukemia three years ago, at the age of 70. During our committee deliberations, no one proposed a shorter time frame; technology had to improve too much. Later, I learned that Congress likes big projects to be finished within 10 years so that key initial backers are still in Washington when the achievement is celebrated. Luckily, Tom Harkin recently became that Congress rarity: a three-term Democratic senator from Iowa. So, like New Mexico’s Republican Pete Domenici, he will see the HGP from its beginnings to its finish as a senator.

The improvements in technology that the HGP would need for its success materialized almost on schedule. They largely involved modifications in pre-existing methods, as opposed to great leaps forward that generate Nobel Prize-like rewards. The current DNA sequencing machines, the workhorses of our big sequencing labs, are 1000-fold-improved descendants of the original sequencing machine put together by Mike Hunkapiller and Lloyd Smith in Lee Hood’s Caltech lab. The computers and software that now compare new raw DNA sequences to pre-existing ones also do their tasks 1000 times faster than was possible when the HGP began.

A major obstacle to the correct assembly of the human genome was the vast amount of repetitive DNA (∼50%). So the HGP labs decided early on to sequence DNA coming from known chromosomal locations. Their map-based strategy, however, was suddenly challenged in May 1998 by the new private company Celera Genomics, led by Craig Venter. Celera proposed an alternative strategy whereby the genome was randomly shredded into pieces that were sequenced and then reassembled in a single process without the construction of a map, a strategy known as “whole-genome shotgun sequencing.” The key to their approach was to be the 200 new, high-capacity capillary DNA sequencers that were about to be launched in the market, as well as new proprietary shotgun assembly software for use on high-powered computers. So armed, Celera promised a first draft of the human genome in only two years.

I first heard of Celera in a telephone call from my former associate, Richard Roberts, who organized the first (1988) Cold Spring Harbor meeting on Genome Mapping and Sequencing. Rich told me that Celera would blow the international consortium out of the water and asked me to consider joining him on its scientific advisory board. Expecting to learn more about Celera’s game plan at our soon-to-be-held spring 1998 Genome Meeting, I quickly phoned the National Institutes of Health (NIH) Genome Office and the Wellcome Trust to report that Celera had marked them out for obsolescence. Later that week, Craig Venter visited the NIH to tell Harold Varmus and Francis Collins that the HGP’s future effort might best be devoted to sequencing the mouse.

From the moment of Rich Roberts’s call, I found it unthinkable that a private company should effectively control much of the human genome through key patents. This was a gene power-play that, at all costs, must be contained. To my relief, the Wellcome Trust’s immediate response was to double the budget for human genome sequencing at the Sanger Centre. Although the merits of each approach were yet to be tested, Celera’s “super shotgun” method quickly caught the fancy of the serious press, who reported that the HGP was off-course. In fact, two years earlier at its spring 1996 Bermuda meeting, HGP leaders had seriously discussed Jim Weber’s proposal for a low-resolution, whole-genome shotgun effort to complement the high-resolution map-based thrust. There, Phil Green’s off-the-cuff calculations, later redone and published (Green 1997), indicated that human DNA is too repetitive for a pure shotgun approach to assemble the genome correctly.

In September 1998, I returned to Wash-

gene numbers. So, I and virtually all of my scientific peers were surprised last year when the number of genes of the fruit fly, Drosophila melanogaster, was found to be much lower than that of a less complex animal, the roundworm Caenorhabditis elegans (13,500 vs. 18,500). More shocking still was the recent finding that the small mustard plant, Arabidopsis thaliana, contains many thousand more genes (∼28,000) than does C. elegans. Now we are jolted again by the conclusion that the number of human genes may not be much more than 30,000. Until a year ago, I anticipated that human existence would require 70,000–100,000 genes.

Why organismal complexity fails to correlate with gene numbers is not fully clear.

Of the many new facts emerging from the human genome draft, I am most excited by the finding that repetitive sequences are almost absent from the four clusters of homeobox genes. Unlike most functionally related human genes, the chromosomal order of homeobox genes reflects their temporal expression patterns during embryonic development. In this respect, they resemble the genes of bacterial operons that are transcribed into single messenger RNA molecules: genes located at the start of bacterial operons are transcribed first by RNA polymerase molecules moving along their respective regions of DNA. Conceivably, much of early developmental timing in humans may be a