Review the Human Genome Project Maynard V
Total Page:16
File Type:pdf, Size:1020Kb
Proc. Natl. Acad. Sci. USA Vol. 90, pp. 4338-4344, May 1993 Review The human genome project Maynard V. Olson Department of Molecular Biotechnology, University of Washington, Seattle, WA 98195 ABSTRACT The Human Genome germ cells), which contain 3 x 109 base synthesize a mammalian protein provides Project in the United States is now well pairs (bp) of DNA. Given the four-letter a route to large amounts of the pure underway. Its programmatic direction was alphabet of DNA-customarily symbol- protein. The ability to alter the structure largely set by a National Research Council ized with the letters G, A, T, and C-the of the protein through site-directed mu- report issued in 1988. The broad frame- sequence of 3 x 109 bp corresponds to tagenesis lends genuine novelty to the work supplied by this report has survived 750 megabytes of information. If the se- resultant biosynthetic opportunities. almost unchanged despite an upheaval in quence of the human genome could be However, the importance of recombi- the technology of genome analysis. This determined, it would be possible to store nant-DNA technology in making the Hu- upheaval has primarily affected physical and manipulate it on a desktop computer. man Genome Project feasible stems from and genetic mapping, the two dominant However, even the dream of acquiring its analytical dimensions. Cloning pro- activities in the present phase ofthe project. DNA sequence on this scale is of recent vides a means to purify individual recom- Advances in mapping techniques have al- origin. Dramatic progress was made dur- binant-DNA molecules from complex lowed good progress toward the specific ing the 1950s and 1960s in understanding mixtures and then to prepare biochemi- goals of the project and are also providing the mechanisms by which genetic infor- cally useful amounts of the molecules by strong corollary benefits throughout bio- mation specifies biological structure and culturing the microbial strains into which medical research. Actual DNA sequencing function. However, during this era, the they have been introduced. of the genomes of the human and model information itself remained nearly inac- A less obvious consequence of the organisms is still at an early stage. There cessible. discovery of restriction enzymes was the has been little progress in the intrinsic A landmark event in DNA analysis development ofthe first practical method efficiency of DNA-sequence determination. came in 1970 with the discovery of site- of genetic mapping in humans (ref. 4, pp. However, refinements in experimental pro- 519-522; ref. 6). Most human cells con- tocols, and man- specific restriction enzymes (refs. 2 and instrumentation, project 3; ref. 4, p. 64). These remarkable en- tain two copies of each DNA sequence, agement have made it practical to acquire one of maternal and the other of paternal sequence data on an enlarged scale. It is zymes have the ability to scan any source of DNA for every occurrence of a par- origin. When a new germ cell is pro- also increasingly apparent that DNA- duced, it contains only one copy of the sequence data provide a potent means of ticular string of bases-for example, the enzyme EcoRI recognizes the genome, a copy that is a unique mosaic of relating knowledge gained from the study string the two genomes from which it was de- of model organisms to human GAATTC. Restriction enzymes cleave biology. both strands of the double helix at their rived. Genetic mapping involves measur- There is as yet little indication that the ing, through actual inheritance studies in infusion of technology from outside biology recognition sites. Since the cleavage events are directed by the DNA se- families, the probability that two closely into the Human Genome Project has been spaced segments of the genome will stay effectively stimulated. Opportunities in this quence, they always occur at the same positions in different samples of genomic together during germ-cell formation. The area remain large, posing substantial tech- mapping requires an ability to distinguish nical and policy challenges. DNA extracted from any genetically ho- mogeneous source (e.g., different tissues between the two copies of the genome of the same individual human or different present in the somatic cells from which In the United States, the Human Genome individuals sampled from an inbred strain the germ cells are derived. Subtle differ- Project first took clear form in February of mice). ences in the base sequence of different 1988, with the release of the National Restriction enzymes provide a means instances of the human genome some- Research Council (NRC) report Mapping to develop precise physical maps ofDNA times alter restriction sites and, hence, and Sequencing the Human Genome (1). simply by determining the coordinates in restriction-fragment sizes. These alter- To a degree remarkable in Federal sci- base pairs of the sites at which particular ations are detectable even in complex ence policy, this report has had a clear enzymes cleave (ref. 4, pp. 66-67; ref. 5). genomes by a method known as gel- effect on subsequent programmatic ac- Like topographic maps, physical maps of transfer hybridization, which was devel- tivity. With a budget in the current fiscal DNA derive their utility through annota- oped in 1975 (ref. 4, pp. 127-130; ref. 7). year of $170 million, jointly administered tion: mapped landmarks provide refer- In 1987, the first global human genetic by the- National Institutes of Health ence points relative to which functional map, based on "restriction-fragment- (NIH) and the Department of Energy DNA sequences such as genes can be length polymorphisms," was published (DOE), a program is underway that con- localized. Restriction enzymes also facil- (8). forms closely to the recommendations of itate a key step in the cut-and-splice As to the actual determination ofDNA the NRC committee. After a 5-year real- procedures by which recombinant-DNA sequence, reasonably efficient methods ity check, it is of both scientific and molecules (i.e., DNA clones) are con- first appeared in 1977 (ref. 4, pp. 67-69; policy interest to examine how the com- structed (ref. 4, pp. 73-74 and pp. 99- refs. 9 and 10). A technique known as mittee's view toward this project has 124). fared in the field. The importance of recombinant-DNA Abbreviations: DOE, Department of Energy; technology is often attributed primarily FISH, fluorescence in situ hybridization; Background NIH, National Institutes of Health; NRC, to its synthetic dimension. For example, National Research Council; STS, sequence- The human genome is the genetic mate- the ability to design and construct a DNA tagged site; YAC, yeast artificial chromo- rial in human egg and sperm cells (i.e., molecule that programs a bacterium to some. 4338 Downloaded by guest on September 28, 2021 Review: Olson Proc. Natl. Acad. Sci. USA 90 (1993) 4339 chain-termination sequencing came to 188-191) and as the starting points for methods used to construct them. This dominate standard practice: it is based on sequence analysis or functional studies. capability required a new choice of land- enzymatic DNA synthesis, carried out in Human genetics is uniquely dependent marks called sequence-tagged sites vitro in the presence of artificial chain- on strong research infrastructure. While (STSs; ref. 19). An STS is simply a short, terminating variants of the normal DNA- model organisms have been extensively unique sequence of DNA that can be precursor molecules. By the early 1980s bred for the specific purpose of facilitat- amplified via the PCR. STSs are ideal individual sequences exceeding 105 bp ing genetic analysis (13), human genetics landmarks during map construction be- had been determined (11); however, a is limited to the examination of individ- cause of the ease with which they can be more common scale of analysis was 103- uals, families, and populations as they detected by PCR assays. Equally impor- 104 bp. Most physical mapping was car- are found in contemporary society. tant is their role in map representation ried out on a similar scale. Hence, the NRC committee set ambi- and map use. Given the gap between the ability to tious goals for the construction of de- Complex physical maps based on re- determine 105 bp of DNA sequence in a tailed physical and genetic maps of the striction sites are of little value as exper- state-of-the-art laboratory and the 109-bp human genome, as well as organized col- imental tools unless they are supported size of the human genome, the NRC lections of cloned human DNA. By de- by a collection of clones that can be used committee confronted an enormous sign, the goals were too ambitious for the to detect particular segments of the problem of scale. Partly because of the technology of 1988. In retrospect, they mapped DNA via DNADNA hybridiza- obvious need for improved technology were so ambitious that they probably tion assays. Comprehensive maps of the would have overwhelmed the basic meth- and also because of a desire to maximize human genome would have to be sup- odologies on which the NRC report was tens of synergy between genome analysis and ported by thousands of clones, based. Fortunately, technical advances each of which would have to be main- studies of biological function, the com- since 1988 have exceeded all reasonable tained as a separate microbial strain. In mittee recommended against early em- expectations. contrast, STSs can be described in an phasis on large-scale sequencing of hu- Much of this progress was made pos- electronic data base in a form that makes man DNA. Instead, it advocated com- sible by the development of the polymer- them prehensive physical and genetic experimentally accessible in any mapping ase chain reaction (PCR).