
ISSN: 1050–6101, Issue No. 44 Vol. 10, Nos. 1–2, February 1999

In This Issue

Genome Project
U.S. HGP on Fast Track ...... 1
DOE Joint Genome Institute ...... 1
New 5-Year Goals ...... 3
Faster Sequencing with BACs ...... 4
SNP Markers ...... 6
Who’s Sequencing the Human Genome? ...... 6
1999 Oakland Workshop Proceedings ...... 6
Genomics Progress in Science ...... 6
Private-Sector Sequencing ...... 7
GeneMap ’98 ...... 7

In the News
EMSL Remote Access ...... 7
C. elegans Sequence ...... 8
DOE BER Research ...... 9
Hollaender Fellows ...... 9
SBIR 1998 Awards ...... 9
Mouse Resources ...... 10
Mouse Consortium ...... 10
Chlamydia Genome Analysis ...... 11
HUGO Office Merger ...... 11

Microbial Genomics
Superbug Deinococcus radiodurans ...... 12
Microbial Genome News ...... 12

ELSI
Eric Lander, HGP Science ...... 13
Mark Rothstein, Genetic Privacy ...... 14
James Wilson, Gene Therapy ...... 15
LeRoy Walters, Gene Therapy ...... 16
Radio and TV Programs ...... 17
Courses and Curriculum ...... 17

Proteomics
Looking at ...... 18
Proteomics News ...... 19

Genetics in Medicine
Organization for Rare Disorders ...... 19
Genetics in Medicine News ...... 15, 16, 19, 20

Informatics
Software Programs ...... 11, 20
GDB and Other Databases ...... 21
Publications ...... 21

Web, Other Resources, Publications
1999 Oakland Workshop Web Site ...... 6
Gene Logic License ...... 11
Calendars ...... 22
Funding ...... 23
Subscriptions, Acronyms ...... 24

U.S. HGP on Fast Track for Early Completion

In September 1998, advisory committees at DOE and NIH approved new 5-year goals aimed at completing the Human Genome Project (HGP) 2 years earlier than originally planned in 1990. The target date of 2003 also will mark the 50th anniversary of Watson and Crick’s description of DNA’s fundamental structure.

The new plan was published in the October 23, 1998, issue of Science, which also cited the contributions of international partners. These partners include the Sanger Centre in the United Kingdom and research centers in Germany, Japan, and France.

The U.S. HGP began officially in 1990 as a $3-billion, 15-year program to find the estimated 80,000 human genes and determine the sequence of the 3 billion DNA building blocks that underlie all of human biology and its diversity. The early phase of the HGP was characterized by efforts to create the biological, instrumentation, and computing resources necessary for efficient production-scale DNA sequencing. The first 5-year plan was revised in 1993 due to remarkable technological progress, and the second plan projected goals (see Five-Year Plan, p. 2)

DOE Joint Genome Institute Exceeds DNA Sequencing Goal

The DOE Joint Genome Institute (JGI) surpassed its sequencing goal of 20 Mb of human DNA for FY 1998, marking almost a tenfold increase in production over the previous year. The JGI sequencing effort is targeting chromosomes 5, 16, and 19.

“With this milestone, JGI rises to third position worldwide in terms of its total contribution of human DNA sequence to public databases and great promise for completion of the entire [Human Genome] project in 5 years,” noted Martha Krebs, Director of the DOE Office of Science.

Further dramatic increases are expected as JGI’s main sequencing efforts move to its new facility in Walnut Creek, California, half of which has just been occupied. When the second half is completed in March 2000, about 200 staff members will maintain around-the-clock operations.

JGI’s sequencing goal for the current fiscal year is 70 Mb, including 30 million “finished” bases and 40 million “draft” bases. [“Finished” sequence has been checked for accuracy, with gaps filled in to form a continuous stretch of DNA across a chromosomal region.]

“We are seeking to break the 100-Mb barrier in the year 2000,” said JGI Director Elbert Branscomb. With 1998 worldwide sequencing capacity at about 200 Mb per year, all major sequencing laboratories are ramping up production. At least 600 Mb of sequence is expected for 1999.

JGI, established at the end of 1996, is a consortium of scientists, engineers, and support staff from the Lawrence Berkeley, Lawrence Livermore, and Los Alamos national laboratories [HGN 8(2), 1; www.ornl.gov/hgmis/publicat/hgn/v8n2/01doe.html; JGI sequencing goals and progress: www.jgi.doe.gov].◊

Human Genome News issues become available on the Web soon after going to press. Archived issues are also available. 2 Human Genome News 10(1–2) February 1999

Genome Project

Five-Year Plan (from p. 1)

through FY 1998. The latest plan was developed during a series of individual and joint DOE and NIH workshops held over the past 2 years (see box, p. 3).

Observers have predicted that the 21st century will be the “biology century.” The analytical power arising from the reference DNA sequences of several entire genomes and other genomic resources is anticipated to help jump start the new millennium.

To find out “Who’s Sequencing the Human Genome,” see p. 6.

Human DNA Sequencing

The HGP’s continued emphasis is on obtaining a complete and highly accurate reference sequence (1 error in 10,000 bases) that is largely continuous across each human chromosome. Scientists believe that knowing this sequence is critically important for understanding human biology and for applications to other fields.

The plan calls for generating a “working draft” of the human genome sequence by 2001. The working draft will comprise shotgun sequence data from mapped clones, with gaps and ambiguities unresolved. If these data sets can be merged with those from the private sector, they may increase the depth of the mapped draft, which scientists expect will contain about half the genes. Draft sequence will provide a foundation for obtaining the high-quality finished sequence and also will be a valuable tool for researchers hunting disease genes.

According to Ari Patrinos, DOE Associate Director for Biological and Environmental Research, “Although we have as our primary goal the finished ‘Book of Life’ by the end of 2003, we also want the working draft to be as useful as possible.”

NIH and DOE sequencing centers expect their facilities to generate about 60% to 70% of the human DNA sequence, which will be made available broadly and rapidly via the Web to stimulate further research.

Sequencing Technology

Although current sequencing capacity is far greater than at the inception of the HGP, achieving the new sequencing goals will require a two- to threefold improvement. Further incremental advances in sequencing technologies, efficiency, and cost will be needed. For future sequencing applications, planners emphasize the importance of supporting novel technologies that may be 5 to 10 years in development.

DNA Sequence Variation

A new goal focuses on identifying individual variations in the human genome. Although more than 99% of human DNA sequences are the same across the population, variations in DNA sequence can have a major impact on how humans respond to disease; environmental insults such as bacteria, viruses, toxins, and chemicals; and drugs and other therapies.

Methods are being developed to detect different types of variation, particularly the most common type called single-nucleotide polymorphisms (SNPs), which occur about once every 100 to 300 bases. Scientists believe SNP maps will help them identify the multiple genes associated with such complex diseases as cancer, diabetes, vascular disease, and some forms of mental illness. These associations are difficult to establish with conventional gene-hunting methods because a single altered gene may make only a small contribution to disease risk.

Functional Genomics

Efficient interpretation of the functions of human genes and other DNA sequences requires that resources and strategies be developed to enable large-scale investigations across whole genomes. A technically challenging first priority is to generate complete sets of full-length cDNA clones and sequences for human and model-organism genes. Other functional-genomics goals include studies into gene expression and control, creation of mutations that cause loss or alteration of function in nonhuman organisms, and development of experimental and computational methods for protein analyses.

Comparative Genomics

The functions of human genes and other DNA regions often are revealed by studying their parallels in nonhumans. To enable such comparisons, HGP researchers have obtained complete genomic sequences for the bacterium Escherichia coli, the yeast Saccharomyces cerevisiae, and the roundworm Caenorhabditis elegans. Sequencing continues on Drosophila melanogaster and the laboratory mouse. The availability of complete genome sequences generated both inside and outside the HGP is driving a major breakthrough in fundamental biology as scientists compare entire genomes to gain new insights into evolutionary, biochemical, genetic, metabolic, and physiological pathways. HGP planners stress the need for a sustainable sequencing capacity to facilitate future comparisons.

Ethical, Legal, and Social Implications (ELSI)

Rapid advances in the science of genetics and its applications present new and complex ethical and policy issues for individuals and society. ELSI programs that identify and address these implications have been an integral part of the U.S. HGP since its inception. These programs have resulted in a body of work that promotes education and helps guide the conduct of genetic research and the development of related medical and public policies.

A continuing challenge is to safeguard the privacy of individuals and groups who contribute DNA samples for large-scale sequence-variation studies. Other concerns are to anticipate how the resulting data may affect concepts of race and ethnicity; identify potential uses (or misuses) of genetic data in workplaces, schools, and courts; identify commercial uses; and foresee impacts of genetic advances on the concepts of humanity and personal responsibility.

Bioinformatics and Computational Biology

Continued investment in current and new databases and analytical tools is critical to the success of the HGP and to the future usefulness of the data it produces. Databases must adapt to the evolving needs of the scientific community and must allow queries to be answered easily. Planners suggest developing a human genome database, analogous to model organism databases, that will link to phenotypic information. Also needed are databases and analytical tools for studying the expanding body of gene-expression and functional data, for modeling complex biological networks and interactions, and for collecting and analyzing sequence-variation data.

Training

Planners note that future genomic scientists will require training in interdisciplinary areas that include biology, computer science, engineering, mathematics, physics, and chemistry. Additionally, scientists with management skills will be needed for leading large data-production efforts.

The HGP already has revolutionized biology by providing tools and resources for basic research and has catalyzed the growth of the life sciences industry. Current and potential applications of genome research address national needs in molecular medicine, waste control and environmental cleanup, agriculture and animal husbandry, biotechnology, energy sources, and risk assessment.◊

Resource Success Story
A Critical Resource in Public, Private Sequencing

For sequencing the human genome, scientists prefer the larger and more stable BAC clones first developed with DOE support. BACs will be a critical component of the much-publicized private-sector sequencing efforts of such companies as Celera Genomics and Incyte [see p. 7 and HGN 9(3), 1 (www.ornl.gov/hgmis/publicat/hgn/v9n3/01venter.html)]. As with all HGP resources and data, BACs are freely available to the entire research community. For more details on BACs, see p. 4.◊

Five-Year Research Goals of the U.S. Human Genome Project
October 1, 1998, to September 30, 2003 (www.ornl.gov/hgmis/hg5yp)

Human DNA Sequence
• Achieve coverage of at least 90% of the genome in a working draft based on mapped clones by the end of 2001.
• Finish one-third of the human DNA sequence by the end of 2001.
• Finish the complete human DNA sequence by the end of 2003.
• Make the sequence totally and freely accessible.

Sequencing Technology
• Continue to increase the throughput and reduce the cost of current sequencing technology.
• Support research on novel technologies that can lead to significant improvements in sequencing technology.
• Develop effective methods for the advanced development of sequencing technologies and the introduction of new approaches.

Human Genome Sequence Variation
• Develop technologies for rapid, large-scale identification and scoring of single-nucleotide polymorphisms (SNPs) and other DNA sequence variants.
• Identify common variants in the coding regions of the majority of identified genes.
• Create a SNP map of at least 100,000 markers.
• Develop the intellectual foundations for studies of sequence variation.
• Create public resources of DNA samples and cell lines.

Functional Genomics Technology
• Generate sets of full-length cDNA clones and sequences that represent human genes and model organisms.
• Support research on methods for studying functions of nonprotein-coding sequences.
• Develop technology for comprehensive analysis of gene expression.
• Improve methods for genome-wide mutagenesis.
• Develop technology for large-scale protein analyses.

Comparative Genomics
• Complete the sequence of the roundworm Caenorhabditis elegans genome by 1998.
• Complete the sequence of the fruit fly Drosophila genome by 2002.
• Develop an integrated physical and genetic map for the mouse, generate additional mouse cDNA resources, and complete the sequence of the mouse genome by 2008.
• Identify other useful model organisms and support appropriate genomic studies.

Ethical, Legal, and Social Issues
• Examine issues surrounding the completion of the human DNA sequence and the study of human genetic variation.
• Examine issues raised by the integration of genetic technologies and information into healthcare and public-health activities.
• Examine issues raised by the integration of knowledge about genomics and gene-environment interactions in nonclinical settings.
• Explore how new genetic knowledge may interact with a variety of philosophical, theological, and ethical perspectives.
• Explore how racial, ethnic, and socioeconomic factors affect the use, understanding, and interpretation of genetic information; the use of genetic services; and the development of policy.

Bioinformatics and Computational Biology
• Improve content and usefulness of databases.
• Develop better tools for data generation, capture, and annotation.
• Develop and improve tools and databases for comprehensive functional studies.
• Develop and improve tools for representing and analyzing sequence similarity and variation.
• Create mechanisms to support effective approaches for producing robust, exportable software that can be shared widely.

Training and Manpower
• Nurture the training of scientists skilled in genomic research.
• Encourage the establishment of academic career paths for genomic scientists.
• Increase the number of scholars who are knowledgeable both in genomic and genetic sciences and in ethics, law, or the social sciences.◊

Launchpad to Human Chromosomes

A new Web site, designed by HGMIS as a launchpad to information about all the human chromosomes, is online (www.ornl.gov/hgmis/launchpad). The page for each chromosome contains links to gene maps, sequences, associated genetic disorders, nonhuman genetic models, identified genes, and research efforts and laboratories. Suggestions for additions and corrections are welcome ([email protected]).◊

U.S. HGP Timeline: www.ornl.gov/hgmis/project/timeline.html

BAC End Sequencing Speeds Large and Small Projects

➤ Acronyms
A list of acronyms is printed on the back page of this newsletter.

Ultimate goals of the Human Genome Project (HGP) are to determine the sequence of the 3 billion DNA bases that make up the human genome and to increase understanding of gene function. In search of the best route to these ends, researchers have generated several different types of useful chromosomal maps. Eventually, the human genome will be represented by DNA chromosome sequences with various levels of annotation.

Interim maps have proven useful for biomedical research, but the most valuable map resources for production DNA sequencing are megabase-scale assemblies of overlapping DNA clones (contigs). Building long contigs, however, has proven a difficult task. Although contig maps of chromosomes 16 and 19 (developed at Los Alamos and Lawrence Livermore national laboratories, respectively) were largely complete in 1995, comparable contig maps of other chromosomes are less ready to support high-throughput sequencing. To help alleviate this impending bottleneck, in 1998 DOE sponsored projects to enrich the BAC clone resources preferred for high-throughput sequencing systems.

BACs and STCs

BACs, which typically contain 100- to 200-kb inserts of human DNA, were designed as larger, more stable recombinant DNA clones that would represent the human genome more uniformly than previous systems. BAC development was pioneered with DOE support by Melvin Simon’s team at the California Institute of Technology, with Pieter de Jong of Roswell Park Cancer Institute contributing to subsequent improvements.

Recent DOE-sponsored projects are producing sequence tag connectors (STCs) on BACs to help extend the human chromosome sequence already acquired (see figure, p. 5). STCs are DNA sequence reads at both ends of the BACs. In 1995 and 1996 investigators began to advocate that the STC concept, which had proven useful in smaller-scale sequencing projects, be applied to large-scale human genome sequencing (Venter et al., Nature 381, 364–66, orcas.htsc.washington.edu/Papers/STC_Papers_NatureCommentary.html). DOE accepted related applications in 1996 and implemented a fast-track, special review process involving a panel composed of international experts in human and mouse genetics, mapping, sequencing, informatics, and management. Following the panel’s recommendations, in September 1996 DOE initiated pilot projects at six laboratories to refine protocols and clarify cost and quality factors.

Several months later in 1997, a workshop and review was held to assess progress. Attendees recommended that DOE maintain its level of support at about $5 million a year. They also suggested concentrating STC production at sites that achieve the highest-quality sequence reads to allow the design of valuable STSs (see “Mapping with STCs and STSs,” below).

High-throughput STC production is now being carried out at The Institute for Genomic Research (TIGR) under Bill Nierman and at the University of Washington Department of Molecular Biology (UWMB) by Gregory Mahairas of Leroy Hood’s team. These sequencing projects are slated for completion in late 1999, with STC data sets on some 450,000 BACs. As of February 1999, more than 378,000 STCs had been acquired at the two sites (see BAC Projects in the Web sites box below).

➤ Mapping with STCs and STSs

STCs
An STC is a short stretch of sequence read from one end of the human DNA insert in a clone. BAC clone STCs can be useful in a number of ways. First, STCs help researchers expand contigs, as outlined in the article. Second, when the insert length is determined, the STC spacing helps verify the contiguous sequence created by assembly software. Third, BACs with STCs serve to physically define and thus “capture” gaps that occur when sequencing biochemistry is stalled by occasional difficult-to-read stretches of DNA sequence. Finally, STC reads can be used for the design of STSs.

STSs
An STS is a DNA segment that can be copied repeatedly by PCR without amplifying unwanted DNA regions from the source genome. STS markers have been used by members of the International RH Mapping Consortium to construct the RH maps that complement contig building. STSs generated from BAC STC reads are helping to enrich RH maps. Conversely, a mapped STS can be used to isolate a BAC (or any DNA clone type) from a library of clones representing a genome.◊

➤ Web Sites with Related Information

Arabidopsis Genome Initiative: genome-www.stanford.edu/Arabidopsis/agi.html
BAC Projects (progress, articles, resources, and related Web sites): www.ornl.gov/bac
CalTech BAC Projects and Protocols: www.tree.caltech.edu
Genome Systems Inc.: www.genomesystems.com
German Resource Centre: www.rzpd.de
NCBI Resources and Databases (UniGene, dbEST, dbGSS): www.ncbi.nlm.nih.gov
Research Genetics: www.resgen.com
RH Consortium: www.ncbi.nlm.nih.gov/genemap
Roswell Park Cancer Institute BacPac Resource Center: bacpac.med.buffalo.edu
TIGR Human BAC Ends: www.tigr.org/tdb/humgen/bac_end_search/bac_end_intro.html
U.K. Human Genome Mapping Project Resource Centre: www.hgmp.mrc.ac.uk
University of Washington, Seattle, Human STC Project and Databases: orcas.htsc.washington.edu
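A back-of-the-envelope check of the STC resource described here can be sketched in a few lines of Python. The input figures (3 billion bases, some 450,000 BACs, a read at each end, 100- to 200-kb inserts) come from the article; the function names and the assumption that STCs and clones are spread evenly across the genome are illustrative simplifications, not part of the STC projects themselves.

```python
# Figures quoted in the article; the even-spacing model is an assumption.
GENOME_SIZE_BP = 3_000_000_000        # "3 billion DNA bases"
BAC_COUNT = 450_000                   # STC data sets on "some 450,000 BACs"
READS_PER_BAC = 2                     # STCs are reads "at both ends of the BACs"
INSERT_RANGE_BP = (100_000, 200_000)  # "100- to 200-kb inserts"


def average_stc_spacing(genome_bp: int, bac_count: int, reads_per_bac: int = 2) -> float:
    """Mean distance in bases between neighboring STCs if they were evenly spread."""
    return genome_bp / (bac_count * reads_per_bac)


def clone_coverage(genome_bp: int, bac_count: int, insert_bp: int) -> float:
    """Fold coverage: how many times over the BAC inserts would span the genome."""
    return bac_count * insert_bp / genome_bp


spacing = average_stc_spacing(GENOME_SIZE_BP, BAC_COUNT, READS_PER_BAC)
coverage = [clone_coverage(GENOME_SIZE_BP, BAC_COUNT, size) for size in INSERT_RANGE_BP]

print(f"Average STC spacing: about {spacing:,.0f} bases")
print(f"Clone coverage: {coverage[0]:.0f}x to {coverage[1]:.0f}x")
```

Under these assumptions the completed data sets would place an STC roughly every 3,333 bases, and the BAC inserts would cover the genome about 15 to 30 times over. Real libraries are not perfectly uniform, so these numbers are order-of-magnitude guides only.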

STC data will provide researchers with an STC marker spaced an average of every 3000 to 4000 bases across the entire human genome, a 100-fold improvement over other current human genome maps.

The availability of STC data sets encourages more participation by smaller laboratories. Their contig building has been hindered previously by the prohibitive cost of maintaining and processing libraries on the human genome scale. With the number of STC data sets now expanding, BACs to extend chromosomal sequence can be screened computationally over the Internet. Scientists need to order only those BACs identified as candidates for contig extension (see “Availability of BAC Clones and STC Data,” below).

Enriching STC Data

Teams at UWMB and CalTech are generating additional enrichments to core BAC-STC data sets. Restriction fingerprints, which are useful for validating candidate contig extensions, will be available from UWMB for most BACs processed there. At CalTech, a team led by Ung-Jin Kim is correlating BACs with cDNAs from the NCBI UniGene* listing. These correlations will allow concurrent sequencing of chromosomal regions and their derivative cDNAs, thus promoting the interpretation of sequence function. If cDNAs already have been mapped via expressed sequence tags (ESTs) to particular chromosomal regions, their correlated BACs also will be assigned candidate positions on the chromosomes. In addition, some STCs are being used to design STSs that are useful in other mapping methods (see “Mapping with STCs and STSs,” p. 4).

STCs Useful in Non-HGP Efforts on Human, Other Genomes

Several smaller-scale mapping and sequencing projects have adapted STC concepts since the HGP began. STC data sets either are in use or planned for genome projects in other species, including those for the flowering plant Arabidopsis thaliana and the laboratory mouse. Other examples include microbial genome sequencing strategies using STCs developed at TIGR with DOE support. The private company Celera Genomics is planning to use a similar strategy to sequence the human genome [HGN 9(3), 1; www.ornl.gov/hgmis/publicat/hgn/v9n3/01venter.html].◊

*UniGene lists entries for nonredundant EST sequences read from the ends of cDNA clones generated and arrayed for wide distribution by the international I.M.A.G.E. Consortium [HGN 6(6), 3; www.ornl.gov/hgmis/publicat/hgn/v6n6/3image.html]. I.M.A.G.E. clone libraries are an outgrowth of a 1991 DOE initiative to enrich the developing human genome physical maps with gene loci and open broad access to the resulting data and resources.

☛ Availability of BAC Clones and STC Data

Major sequencing centers may request BAC libraries directly from CalTech and Roswell Park Cancer Institute. Facilities requiring fewer BACs can obtain them through commercial suppliers after identifying needed BACs by searching the STC database against their own seed sequences. In the United States, Genome Systems Inc. and Research Genetics distribute clonal resources and provide screening services. In Europe, similar services are provided by the Sanger and German resource centers.

STC data sets are available at the NCBI database dbGSS, with more detailed information and protocols on the TIGR and UWMB Web sites. Web sites for these and other resources are listed in the box on p. 4.◊

BAC End Sequencing Extends Contigs. Software tools are helping to position STCs. One tool, provided by the Genome Channel (compbio.ornl.gov/tools/index.html), allows investigators to view the contig positions of more than 15,000 BAC end sequences and their relationships to other clones and predicted genes and exons (gene-coding regions). In the figure, the black bar represents 250 kb of a much longer contig. Below the bar, the long horizontal lines denote BAC clones, of which 1, 5, and 6 are candidates for extending the seed contig to the left. Above the bar, vertical tick marks indicate exons as predicted by GRAIL software. Exons connected by short horizontal lines represent putative gene models for the contig’s forward DNA strand. [Figure contributed by Richard Mural, Morey Parang, and Manesh Shah (ORNL)]

[Figure labels: Putative genes (kb); Previously assembled seed sequence (contig); BAC clones (kb), numbered 1 to 6]
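The computational screening described on this page, in which STC reads are searched against a laboratory's own seed sequence to pick BACs for contig extension, can be illustrated with a toy Python sketch. Everything here is hypothetical: the function `screen_bacs`, the `Candidate` record, the BAC names, and the short sequences are invented for illustration. The sketch uses exact substring matching on one strand, whereas a real screen would rely on a sequence-alignment search against dbGSS and would account for read orientation and sequencing errors.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Candidate:
    bac_id: str
    seed_position: int    # 0-based offset of the matching end read in the seed
    distance_to_end: int  # bases between the match and the seed end being extended


def screen_bacs(seed: str, stcs: Dict[str, Tuple[str, str]],
                end: str = "left", window: int = 10) -> List[Candidate]:
    """List BACs with exactly one end read found within `window` bases of a seed end."""
    if end not in ("left", "right"):
        raise ValueError("end must be 'left' or 'right'")
    candidates = []
    for bac_id, reads in stcs.items():
        positions = [(seed.find(read), len(read)) for read in reads]
        matches = [(pos, length) for pos, length in positions if pos >= 0]
        if len(matches) != 1:
            # No match: unrelated clone. Two matches: clone lies inside the seed.
            continue
        pos, length = matches[0]
        distance = pos if end == "left" else len(seed) - (pos + length)
        if distance <= window:
            candidates.append(Candidate(bac_id, pos, distance))
    # Smallest distance first: least overlap with sequence already in hand.
    return sorted(candidates, key=lambda c: c.distance_to_end)


# Hypothetical seed contig and end reads (one pair of STCs per BAC).
seed = "GATTACAGGCTTAACCGGTTAGCATCGATTGCAAGT"
stcs = {
    "BAC-1": ("TTACAGG", "CCCCCCC"),  # one end anchored near the left edge
    "BAC-2": ("AGCATCG", "TTGCAAG"),  # both ends inside the seed
    "BAC-3": ("AAAAAAA", "GGGGGGG"),  # no match
    "BAC-4": ("CCGGTTA", "TTTTTTT"),  # one end anchored farther in
}

for hit in screen_bacs(seed, stcs, end="left", window=20):
    print(hit.bac_id, hit.seed_position, hit.distance_to_end)
```

With a window of 20 bases the sketch reports BAC-1 and then BAC-4 as leftward candidates; narrowing the window to 10 leaves only BAC-1. Whether a candidate truly extends the contig would still need confirmation, for example with the restriction fingerprints mentioned in the article.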

Scientists Hunt SNPs to Uncover Variation, Disease

Why does one man live to celebrate his hundredth birthday with a glass of wine in one hand and a cigar in the other while another succumbs in midlife to cancer or heart disease? And why may one woman’s breast cancer be effectively eradicated while another’s shows no significant response to the same treatment?

The explanations may reside in the cumulative effect of a small number of differences in DNA base sequence called single-nucleotide polymorphisms (SNPs), which underlie individual responses to environment, disease, and medical treatments.

SNPs are the most common type of sequence variation. Other variations include the number of base insertions and deletions and sequence repeats (called mini- and microsatellites). Some disease-causing mutations are SNPs, for example, the single base change in the gene associated with sickle cell anemia. SNPs occur inside and outside genes, about once every 100 to 300 bases throughout the human genome.

DNA variations are important in understanding the genetic basis for disease and individual responses to environmental factors, as well as for such normal variations in biological processes as development and aging. For this reason, scientists in the public and private sectors are beginning to focus on methodically searching for SNPs throughout the human genome. [See articles on new HGP goals (p. 1) and private human genome sequencing projects (p. 7).]

In 1997 the NIH National Cancer Institute launched a Genetic Annotation Initiative to gather SNPs in regions of thousands of cancer-associated genes (www.ncbi.nlm.nih.gov/ncicgap). In another NIH program, a 1998 RFA involves 18 institutes interested in developing genomic-scale technologies or in implementing projects to catalogue and detect SNPs in different DNA samples (www.nhgri.nih.gov/Grant_info/Funding/RFA/rfa-hg-98-001.html).

SNPs generated in these public projects will be freely available from dbSNP, a new database at the NIH National Center for Biotechnology Information, which serves as a central repository for SNPs and for short-deletion and insertion polymorphisms (www.ncbi.nlm.nih.gov/SNP). [Denise Casey, HGMIS, [email protected]] ◊

Who’s Sequencing the Human Genome?

Listed below are the major large-scale sequencing facilities in the U.S. Human Genome Project* as of February 1999. Washington University and the DOE Joint Genome Institute led in total human DNA sequence contributed to public databases in 1998. To access the Web sites of the centers listed below, see www.ornl.gov/hgmis/CENTERS.HTML.

DOE-Funded
• Joint Genome Institute
• University of Washington, Seattle (BAC end sequencing, p. 4)
• The Institute for Genomic Research (BAC end sequencing, p. 4)

NIH-Funded
• Washington University, St. Louis
• Whitehead Institute/Massachusetts Institute of Technology
• Baylor College of Medicine
• University of Washington, Seattle**
• University of Texas Southwestern Medical Center
• Stanford University
• University of Oklahoma
______
*In addition to sequencers in the U.S. project, centers in the United Kingdom, Germany, and Japan are making major contributions toward sequencing the human genome. See URL above.
**DOE and NIH co-funded.◊

Science Highlights Progress in Genomics

The annual genome issue of Science (October 23, 1998) highlights progress in genomics, including the analysis and use of genomic data from a variety of organisms. Articles report on new plant genome initiatives, provide an overview of 10 years of plant comparative genetics, and assess the conceptual organization and approaches of some current genome-related databases. Other features include the latest plan for the U.S. Human Genome Project (p. 1) and a report on the newest physical map of human gene–based markers.

In the “Report” and “Perspective” sections, papers on the complete sequence of Chlamydia trachomatis summarize major findings of the sequencing project for this bacterium. C. trachomatis causes trachoma, a major cause of blindness in Asia and Africa, as well as the most common sexually transmitted bacterial disease in the United States (p. 11).

A fold-out chart of Arabidopsis thaliana’s genome illustrates advances in characterizing the flowering plant, a popular model for studying plant biology. Genome data generated by this project hold the potential for improved crops and plant factories that generate products such as biodegradable plastics. Another potential research outcome, which has relevance to human health, is an increased understanding of basic cellular processes extending across species. Researchers expect to finish this 120-Mb genome’s DNA sequence by 2000.◊

1999 DOE Human Genome Program Workshop Proceedings

Proceedings of the 1999 DOE Human Genome Program Contractor-Grantee Workshop, held January 12–16 in Oakland, California, are on the Web (www.ornl.gov/hgmis/publicat/99santa/index.html). The searchable abstracts, which represent DOE’s latest human and microbial genome research, are categorized under Sequencing; Sequencing Technologies; Mapping; Informatics; Functional Genomics; Microbial Genome Program; Ethical, Legal, and Social Issues; and Infrastructure. An author index is included.◊

¶ Nature Genetics Supplement

A special supplement to Nature Genetics [21(1)] is devoted to nucleic acid microarrays in various formats. Published in January 1999 with the sponsorship of NIH NHGRI, the issue is also available on the Web (genetics.nature.com).◊
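To put the SNP frequencies quoted in this issue in perspective, the short Python sketch below works through the arithmetic. The inputs (a 3-billion-base genome, one SNP about every 100 to 300 bases, and the 5-year goal of a map with at least 100,000 markers) are taken from the newsletter; the helper name `expected_snps` and the assumption of evenly spaced markers are illustrative only.

```python
# Figures quoted in this issue; even spacing is a simplifying assumption.
GENOME_SIZE_BP = 3_000_000_000  # "3 billion DNA bases"
SNP_INTERVAL_BP = (100, 300)    # SNPs occur "about once every 100 to 300 bases"
SNP_MAP_GOAL = 100_000          # goal: "a SNP map of at least 100,000 markers"


def expected_snps(genome_bp: int, bases_per_snp: int) -> int:
    """Number of SNP sites implied by one SNP every `bases_per_snp` bases."""
    return genome_bp // bases_per_snp


most = expected_snps(GENOME_SIZE_BP, SNP_INTERVAL_BP[0])
fewest = expected_snps(GENOME_SIZE_BP, SNP_INTERVAL_BP[1])
map_spacing = GENOME_SIZE_BP // SNP_MAP_GOAL  # bases between markers on the goal map

print(f"Implied SNP sites: {fewest:,} to {most:,}")
print(f"A {SNP_MAP_GOAL:,}-marker map averages one marker per {map_spacing:,} bases")
```

On these figures the genome would hold on the order of 10 million to 30 million SNP sites, so a 100,000-marker map with one marker about every 30,000 bases would sample at most about 1% of them.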

GeneMap ’98 Doubles Density of 1996 Map

In the special genome issue of Science (October 23, 1998), researchers reported the release of GeneMap ’98, an updated human gene map that provides an early look at some of the most important regions of the human genome. Two to three times more detailed than the 1996 version, the new map contains some 30,000 human gene–based markers. It doubles the gene density of the previous release and represents perhaps half of all human genes.

The map highlights important chromosomal landmarks that (1) provide a valuable resource for studying complex (polygenic) genetic traits and (2) offer a framework and focus for constructing complete physical maps of chromosomes for genome sequencing. An important tool for aiding design and construction of large-scale gene-expression arrays, the map also can be used for comparative analysis of mammalian chromosome structure and evolution. GeneMap ’98 is available on a redesigned Web site that includes mapping information and associated data and annotations (www.ncbi.nlm.nih.gov/genemap).◊

Second Private Human Genome Sequencing Project Under Way

In August 1998, Incyte Pharmaceuticals Inc. of Palo Alto, California, announced plans to spend $200 million over the next 2 years to sequence human genes in its new unit, Incyte Genetics. Incyte also stated that it would acquire Hexagen Inc. (Cambridge, U.K.), which has developed a proprietary technique for identifying genetic variations in mice.

Incyte Genetics will concentrate on cataloguing SNPs (p. 6). Using this knowledge to design drugs (an application of genetic data known as pharmacogenetics) may help companies produce more effective therapeutics. The pace of development may be accelerated by genetically prescreening for appropriate participants in clinical trials.

The announcement came 3 months after another private company was formed for human genome sequencing. Celera Genomics, established by researcher J. Craig Venter (formerly president of The Institute for Genomic Research) and Perkin-Elmer’s Applied Biosystems Division, also is expected to focus on genes and their sequence variations [HGN 9(3), 1; www.ornl.gov/hgmis/publicat/hgn/v9n3/01venter.html].

In contrast to the emphasis on identifying genes by these private companies, the sequence produced by the government-backed Human Genome Project (HGP) will reflect the entire, 3-billion–base human genome. Obtaining the complete reference human genome sequence will enable scientists to begin exploring the function of genes as well as important extragenic regions and their roles in human health and disease. The HGP also is funding the creation of clone resources for mapping and sequencing, bioinformatics and comparative genomics infrastructure, and next-generation sequencing technologies (see box, p. 3, for new goals).

All HGP data are freely accessible over the Internet and released daily for immediate use by scientists throughout the world. Celera plans to release data freely to the research community on a quarterly basis, and Incyte data will be available for a fee.◊

In the News

EMSL User Facility Promotes Remote Access to Instrumentation

On October 1, 1998, the William R. Wiley Environmental Molecular Sciences Laboratory (EMSL), DOE’s newest National Scientific User Facility, celebrated the first anniversary of its opening in Richland, Washington (www.emsl.pnl.gov). The mission of a user facility is to provide unique research resources to scientists from DOE and government laboratories, universities, and industry. EMSL is operated by Pacific Northwest National Laboratory, and its goals are to (1) attain a molecular-level understanding of the physical, chemical, and biological processes needed to solve DOE’s most critical environmental problems and (2) advance molecular science in support of DOE’s long-term environmental missions.

EMSL is recognized as a leader in using collaboratories to facilitate the remote use of nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry [see Analytical Chemistry (November 1, 1998), pubs.acs.org/hotartcl/ac/98/nov/long.html]. Through the Internet, EMSL is making these very expensive, cutting-edge technologies available to researchers and students who might otherwise find the instruments difficult or impossible to access (www.emsl.pnl.gov:2080/docs/collab).

NMR Spectrometry

Over a year ago, the EMSL Collaboratory team, in cooperation with researchers in the Macromolecular Structure and Dynamics directorate, began to develop the Virtual Nuclear Magnetic Resonance Facility (VNMRF). VNMRF is now “open for business” to allow NMR spectrometer users to conduct videoconferences with EMSL researchers, run the spectrometers remotely, collaboratively analyze data, and share their notes in a Web-based electronic notebook. A modern computer, video camera, microphone, and Internet connection are all that is needed.

During the last year, researchers from several universities and a national laboratory have saved considerable time and money by using VNMRF to participate in experiments remotely and to virtually extend or enhance visits to the facility. NMR spectrometry users desiring to run all or part of their experiments remotely can get started by simply noting their plans in their applications to EMSL.

Education

Internet access to instrumentation and researchers is bringing cutting-edge technology to the classroom. For example, undergraduate chemistry students at a small college in Washington state recently used EMSL’s technology to remotely control a mass spectrometer, run spectra on unknown samples, and calculate distribution of isotopes and fragmentation patterns. Members of a sixth-grade science class, still in their Illinois classroom, manipulated a high-powered Argonne electron microscope for a close-up look at computer chips.

Other User Facilities

Other DOE user facilities are at Stanford University and at Argonne, Brookhaven, Lawrence Berkeley, Lawrence Livermore, Los Alamos, and Oak Ridge national laboratories (www.ornl.gov/hgmis/publicat/97pr/06g_usma.html).◊


International Team Delivers C. elegans Sequence

Major HGP Milestone Offers First Whole-Genome View of a Multicellular Animal

For the first time, scientists have the nearly complete genetic instructions for an animal that, like humans, has a nervous system, digests food, and reproduces sexually. The 97-million–base genome of the tiny roundworm Caenorhabditis elegans was deciphered by an international team led by Robert Waterston (Washington University School of Medicine, St. Louis) and John Sulston (Sanger Centre, Cambridge, England). The work was reported in a special issue of the journal Science (December 11, 1998) that featured six articles describing the history and significance of the accomplishment and some early sequence-analysis results.

Although sequencing has been almost completed, investigators pointed out that analysis and annotation will continue for years, facilitated by more information and better technologies. “We have provided biologists with a powerful new tool to experiment with and learn how genomes function,” said Waterston. Obtaining genomic sequence, they noted, is more a beginning than an end.

C. elegans and the 12-Mb genome of the budding yeast Saccharomyces cerevisiae (completed in 1996) represent the only eukaryotes completely sequenced thus far. The two genomes are being compared in an attempt to identify elements essential for eukaryotic life and the genetic requirements for progression from a unicellular to multicellular existence. Eukaryotes, which include plants and animals, are the most complex of the three major branches of life on earth. The other branches are the least complex prokaryotes (bacteria) and the moderately complex Archaea, which share features with both other branches.

During its 2- to 3-week life span in the dirt of temperate regions, the benign C. elegans carries out many of the same processes as humans. Unlike the much smaller microbes sequenced so far, it begins life as a single fertilized cell that undergoes a series of divisions as it grows into an adult animal, forming complex tissues and organ systems. Researchers have found it particularly useful for studying early development, neurobiology, and aging, processes that have parallels in human biology.

The 9-year sequencing project required 2 million individual “reads” performed on DNA sequencing instrumentation to spell out the worm DNA sequence, 500 bases at a time. It began with the development of a clone-based physical map to facilitate gene analysis and grew into a collaboration among C. elegans Sequencing Consortium members and the entire international community of C. elegans researchers. In addition to the nuclear genome-sequencing effort, other researchers sequenced its 15-kb mitochondrial genome and carried out extensive cDNA analyses that facilitated gene identification. Free data exchange and immediate data release have been hallmarks of the project, which has been a model for cooperation and sharing among Human Genome Project researchers.

The first of a two-part sequencing process used to parse the C. elegans genome was the “shotgun” sequencing of randomly chosen subclones (each only a small piece of a much larger cloned DNA molecule). The finishing phase used a more ordered (directed) sequencing strategy to close specific remaining gaps and resolve ambiguities. Members of the Sequencing Consortium noted that, were they to begin the project again today, they would use the same combination strategy but with larger bacterial clones such as BACs. This is the strategy currently being used for large-scale human genome sequencing in the HGP (p. 4). Although tools for both sequencing phases have improved greatly over the years, finishing remains labor intensive.

The magnitude of this effort underscores the challenge of sequencing the human genome, which is 30 times larger than that of C. elegans. Methods and data from the work are helping researchers sequence and interpret the human genome. In fact, a significant amount of production sequencing occurs at Washington University and the Sanger Centre.

Early analysis highlights the importance of sequencing entire genomes for finding all genes and understanding the function of nonprotein-coding DNA regions in the genomes of such eukaryotic organisms as humans and roundworms (see sidebar, “Why Sequence Entire Genomes?”). The C. elegans genome is packaged into 6 chromosomes containing about 19,000 genes, several times the number originally predicted by classical genetics experiments. About 40% of identified genes match those of other organisms, including humans. Like the human genome, C. elegans contains large amounts of repeated DNA that does not encode proteins but probably plays a role in chromosome function, gene organization, or regulation of gene activity. The C. elegans project was funded by NIH and the Medical Research Council (U.K.). [Denise Casey, HGMIS] ◊

Why Sequence Entire Genomes? A Worm’s-Eye View

In the Science special issue on C. elegans, international researchers explain the rationale for sequencing farther than the protein-coding (gene) regions of a genome. They note that whole-genome data provide a basis for discovery of every gene, show long-range relationships among genes, provide structural and control elements for each gene, and offer a complete archive of genetic information. The data also provide a set of tools for future research into an organism’s biology from fertilization to death, most of which is not yet understood.◊

C. elegans Data

Notes associated with the Science paper and links to data resources are on the sites below:
• genome.wustl.edu/gsc/index.shtml
• www.sanger.ac.uk
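As a rough check on the scale of the effort described in the C. elegans article above, the figures it quotes (about 2 million reads of roughly 500 bases each for a 97-million-base genome) can be turned into an average fold coverage. The sketch below is illustrative only: the function names are hypothetical, every read is assumed to be the same length and randomly placed, and the estimate of unsampled bases uses an idealized Poisson approximation rather than anything reported by the Sequencing Consortium.

```python
import math


def fold_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base of the genome is sampled."""
    return num_reads * read_length / genome_size


def expected_unsampled_bases(genome_size: int, coverage: float) -> float:
    """Idealized Poisson estimate of bases that no read happens to cover."""
    return genome_size * math.exp(-coverage)


# Figures quoted in the article; the uniform read length is an assumption.
NUM_READS = 2_000_000
READ_LENGTH = 500
GENOME_SIZE = 97_000_000

coverage = fold_coverage(NUM_READS, READ_LENGTH, GENOME_SIZE)
unsampled = expected_unsampled_bases(GENOME_SIZE, coverage)

print(f"Average coverage: about {coverage:.1f}-fold")
print(f"Bases missed under the idealized model: about {unsampled:.0f}")
```

Under these assumptions the average coverage comes out near 10-fold. Real projects leave more gaps than the idealized figure suggests, because repeated and hard-to-clone regions are not sampled at random, which fits the article's remark that finishing remains labor intensive.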


DOE Biological and Environmental Research Helps Fuel “Biology Century”

Taking advantage of the wealth of information generated by the “new biology” of the Human Genome Project, DOE’s Life Sciences Division is funding $16 million in projects that focus on high-throughput approaches to solving complex biological problems related to DOE’s diverse missions. The research, which is taking place at 5 DOE national laboratories and 13 universities and research institutions, will address unresolved issues in the following 4 major areas.

Biochemical Potential of Microbes ($2 million). Researchers seek to develop methods to decode the complete genomes of microbes more rapidly, identify potentially useful microbes, and explore their potential for energy production and use and for environmental cleanup.

Health Risks from Low-Level Exposures to Radiation and Other Energy-Related By-Products ($6 million). New information will be useful in the ongoing development of federal health-risk policies that protect workers and the public from radiation and environmental pollutants, including those at DOE sites.

“Engineered” Biomolecules for Use in Energy Production, Environmental Cleanup, Drug Design, and Industry ($2 million). Developing methods to rapidly determine the structure of large numbers of proteins will contribute to capabilities for designing biomolecules such as enzymes, antibodies, and other proteins for these applications.

New Genetic Information from Mice, Yeast, and Fruit Flies for Understanding Human Gene Functions More Quickly ($6 million). This research will contribute to more accurate disease prediction and diagnosis and the design of drug therapies tailored to an individual’s genetic makeup.

DOE’s life sciences research program began more than 50 years ago to study the health effects of radiation, initially focusing on epidemiological studies of exposed people and genetic studies in animals. Nearly 15 years ago, DOE started planning its Human Genome Program to obtain DNA sequencing and analysis technologies and information at the genetic level regarding the effects of radiation and energy production on biological systems. In seeking to translate genomics for applications in diverse fields, DOE is helping to usher in what has been called the “biology century.”

The research projects were funded following extensive peer review of proposals. [List of principal investigators and projects: www.er.doe.gov/production/ober/projlist.html; OBER: www.er.doe.gov/production/ober/ober_top.html] ◊

SBIR 1998 Human Genome Awards Announced

The DOE Office of Biological and Environmental Research has announced four Phase I and three Phase II awards for 1998 in human genome topics of the Small Business Innovation Research (SBIR) program. The highly competitive SBIR awards are designed to stimulate commercialization of federally funded research and development for the benefit of both the private and public sectors. SBIR emphasizes cutting-edge, high-risk research with potential for high payoff in hundreds of areas, including human genome research (contacts: box, p. 23).

SBIR Awards in Genome, Structural Biology, Related Technologies

Searchable abstracts: www.ornl.gov/hgmis/publicat/99santa

Phase I
Atom Sciences, Inc. (Oak Ridge, Tennessee): A Quantitative Analytical Tool for Producing DNA-Based Diagnostic Arrays
Fidelity Systems, Inc. (Gaithersburg, Maryland): D-Strap DNA Sequencing Chemistry
MacConnell Research Corporation (San Diego, California): Automated Purification of Blood or Bacterial Genomic DNA
TPL, Inc. (Albuquerque, New Mexico): Micromachined Silicon Sensor for DNA Sequencing by Hybridization

Phase II
Cimarron Software, Inc. (Salt Lake City, Utah): (1) A Simulation Extension of a Workflow-Based LIMS and (2) A Workflow-Based LIMS for High-Throughput Sequencing, Genotyping, and Genetic Diagnostic Environments
SpectruMedix Corp. (State College, Pennsylvania): A Fully Automated 384-Capillary Array DNA Sequencer ◊

Hollaender Fellows Named

DOE announced the award of five FY 1998 Alexander Hollaender Distinguished Postdoctoral Fellowships for up to 2 years of research at DOE laboratories having substantial programs supportive of the Office of Biological and Environmental Research’s mission. The mission is to understand health and environmental effects associated with energy technologies and to develop and sustain research programs in life, biomedical, and environmental science.

Fellowship winners were chosen from a field of 40 applicants who received their doctoral degrees after April 30, 1996. Listed below are each fellow’s name, university and subject of doctoral degree, host laboratory and research mentor, and proposed research topic.
• David Boisvert (Yale University, Genetics): University of California at Berkeley, Sung-Hou Kim. Structural approaches to understanding ribosome biogenesis and rRNA methylation at extreme temperatures.
• Carl Friddle (Stanford University, Genetics): Lawrence Berkeley National Laboratory, Edward Rubin. High-throughput functional analysis of expressed sequences in the mouse.
• Thomas Kirchstetter (University of California at Berkeley, Environmental Engineering): Lawrence Berkeley National Laboratory, Tica Novakov. Hygroscopic growth and optical properties of carbonaceous aerosols.
• Timothy Onasch (University of Colorado, Chemistry): Brookhaven National Laboratory, Dan Imre. Studies of cloud particle formation mechanisms.
• James Randerson (Stanford University, Biology): Lawrence Berkeley National Laboratory, Inez Fung. Impact of disturbance in high-latitude terrestrial ecosystems on atmospheric measurements of CO₂, ¹³CO₂, and ¹⁴CO₂.

Past winners are listed on the Web site (www.orau.gov/ober/proglist.htm). A complete description of the program, including history and application forms, is at www.orau.gov/ober/hollaend.htm. See contact information on p. 23 for the Hollaender Fellowships.◊

Mouse Resources Critical to Understanding Human Genome

Some 60 scientists met for 3 days in March 1998 in Bethesda, Maryland, to define priorities for producing resources to make the mouse a more valuable tool for understanding mammalian biology. The recommendations of the Mouse Genomics and Genetics Resources Working Group, which was convened by NIH Director Harold Varmus, are outlined below as summarized by cochairs William Dove (University of Wisconsin) and David Cox (Stanford University). Total direct costs for the first year are estimated at $49.3 million.

The first follow-up meeting was held in October 1998 to discuss implementation of the March recommendations. Representatives from DOE and the U.K.’s Medical Research Council were present to develop a coordinated strategy and share expertise in this international effort.

➤ More Information
• March meeting: www.nih.gov/welcome/director/reports/mgenome.htm
• NIH action plan: genetics.nature.com
• October meeting: www.nih.gov/welcome/director/reports/mgenom3.htm
• Mouse Genome Sequencing Network RFA: p. 23. ◊

Recommendations

Recommendations for structural analysis, functional analysis, and resources include the following:

Structural Analysis
• Generate an additional 60,000 new markers, identified as crucial for scientists who are cloning genes.
• Genotype inbred mouse strains and generate a low-resolution (5-cM) single-nucleotide polymorphism map to determine its value for mouse research.
• Sequence and map 3′ ends of partial cDNAs and improve methods for isolating missing and full-length cDNAs.
• Generate 12 Mb of sequence for the first year and ramp up to 400 Mb within 5 years, obtaining a completed reference mouse genomic sequence by 2008.

Functional Analysis
• Develop standardized genome-wide mutagenesis protocols and improved tools and assays for characterizing phenotypes within new, specialized centers using the supermutagen ENU (ethyl nitrosourea, developed at Oak Ridge National Laboratory by William Russell).
• Develop phenotyping protocols in ENU centers and by individual investigators.
• Set up targeted mutagenesis programs to validate embryonic stem lines from different mouse strains for specialized uses.
• Couple molecular genotyping with the construction of congenic mouse strains.

Resources
• Develop cryopreservation methods and facilities for maintaining mutant mouse sperm and ovaries, thus reducing the cost of maintaining live animals.
• Build a new repository for live mouse strains.
• Evaluate and expand some existing databases.
• Train researchers in cryopreservation technology and animal pathology.◊

Mouse Consortium for Functional Genomics

Six Tennessee research organizations located from Memphis to Knoxville signed a Memorandum of Cooperation on December 4, 1998, to form the Tennessee Mouse Consortium for Functional Genomics. The consortium’s purpose is to induce gene mutations in mice as models for human genetic diseases and as subjects for studying gene function. Consortium members are Oak Ridge National Laboratory (ORNL), University of Tennessee (Knoxville and Memphis), Vanderbilt University Medical Center, Meharry Medical College, and St. Jude Children’s Research Hospital. The collaboration will combine ORNL’s experience in mouse genetics and functional genomics with the other institutions’ biological and clinical expertise. Vanderbilt, for example, will contribute proficiency in behavioral neurosciences, while Meharry is especially interested in mutations in the sensory systems. Each institution will play a crucial role in screening mutagenized mice for induced changes in behavior, physiology, biochemistry, and morphology and will choose mutations of interest for detailed study.

The ORNL Laboratory for Comparative and Functional Genomics, with its large collection of mutant mouse stocks and large-scale mutagenesis and phenotype screening program, is the center facility of the consortium. The six sites will be linked by the Internet, and the consortium will be managed by Darla Miller (ORNL, [email protected]).◊

Human Genome News

This newsletter is intended to facilitate communication and collaboration, help prevent duplication of research effort, and inform persons interested in genome research. Views expressed are not necessarily those of the Department of Energy Office of Biological and Environmental Research. Suggestions are invited.

Human Genome Management Information System (HGMIS)
Oak Ridge National Laboratory
1060 Commerce Park, MS 6480
Oak Ridge, TN 37830
423/576-6669, Fax: /574-9888
www.ornl.gov/hgmis

Managing Editor: Betty K. Mansfield, [email protected]
Editors/Writers/Designers: Anne E. Adamson, Denise K. Casey, Sheryl A. Martin, Judy M. Wyrick
Production Assistants: Marissa D. Mills, Laura N. Yust

U.S. Department of Energy Office of Biological and Environmental Research
Ari Patrinos, Associate Director
www.er.doe.gov/production/ober/ober_top.html

Life Sciences Division, OBER
Marvin E. Frazier, Director
www.er.doe.gov/production/ober/hug_top.html
Contact: Daniel W. Drell, 301/903-6488, Fax: -8521, [email protected] or genome@science.doe.gov

This newsletter is prepared at the request of the DOE Office of Biological and Environmental Research by the Toxicology and Risk Analysis Section of the Life Sciences Division at Oak Ridge National Laboratory, which is managed by Lockheed Martin Energy Research Corp. for the U.S. Department of Energy, under Contract DE-AC05-96OR22464.◊

Chlamydia Genome Offers Surprises

Stimulates New Research into Treatment of Major STD, Prevention of Blindness

Analysis of the 1-Mb genome of Chlamydia trachomatis has revealed some unexpected biology for the tiny organism. C. trachomatis is responsible for causing the most common bacterial sexually transmitted disease (STD) in the United States as well as trachoma, a major cause of blindness in Asia and Africa. A collaboration among scientists at the University of California at Berkeley and Stanford University, the study was reported in the genome issue of Science (October 23, 1998).

Of 18 fully sequenced bacterial genomes, Chlamydia is the only obligate intracellular parasite, growing exclusively within eukaryotic cells and requiring host enzymes and cellular machinery for several necessary functions. Researchers were surprised to learn that it harbors genes that could allow it to generate its own energy-storage molecule, ATP (adenosine triphosphate).

Another new finding explained why Chlamydia is vulnerable to penicillin. Although Chlamydia was thought to lack peptidoglycan, a vital bacterial cell-wall component and the antibiotic’s major target, scientists have identified the genes for synthesizing this molecule. Other genes found for new surface proteins may be important for future vaccine development, possibly by using the gene sequence itself instead of the protein to stimulate an immune response. Data are available on the Chlamydia Genome Project Web site (chlamydia-www.berkeley.edu:4231).

The project is focusing now on sequencing the genome of the organism C. pneumoniae, which causes a mild pneumonia and also may contribute to the development of atherosclerotic lesions. This project is funded by the genome data company Incyte Pharmaceuticals (Palo Alto, California), which also is sequencing human genes (p. 7). [Denise Casey, HGMIS, [email protected]] ◊

Embnet.news on Web

The latest issue of embnet.news, the newsletter of EMBnet, is available in html format on the Web (www.ie.embnet.org/embnet.news) and in printable Postscript and Adobe Acrobat formats via ftp (ftp.ie.embnet.org/pub/embnet.news). The newsletter contains information, articles, reviews, comments, and announcements of interest to the European and global bioinformatics communities.◊

European Biotech Program

The European Union’s Biotechnology Program and funded projects are at www.cordis.lu/biotech/home.html. Areas of research include cell factories, genome analysis, plant and animal biotechnology, cell , immunology, and structural biology.◊

☛ System Identifies Polymorphisms

POMPOUS is a computational system for predicting polymorphic loci directly and efficiently from human genomic sequence (pompous.swmed.edu). The suite of programs detects tandem repeats ranging from dinucleotides to 250-mers, scores them according to predicted level of polymorphism, and designs appropriate flanking primers for PCR amplification.

In the verification process, the computer accurately predicted markers in genomic sequence 67% of the time. According to senior author Harold Garner (University of Texas Southwestern Medical Center), the system significantly enhances the discovery of gene function and reduces the cost of finding markers.

PANORAMA, a genetic features computation and visualization server, aims to give researchers maximum information about sequences of interest by revealing all their properties at a glance (pompous.swmed.edu/panorama.htm). Detailed output is provided as five major files: Features, EST Hits, Non-EST Hits, GenScan, and POMPOUS Results.◊

HUGO Consolidates Offices, Web Sites

The Human Genome Organisation (HUGO), whose purpose is to promote international collaboration within the Human Genome Project, has merged its HUGO Americas office with the London entity. The London Web site lists regional HUGO contacts and links to the HUGO Pacific office, publications and reports, and information on HGM ’99 and other genome meetings (www.gene.ucl.ac.uk/hugo; E-mail, [email protected]; Pacific Office: Web, hugo-pacific.genome.ad.jp; E-mail, tito@ims.u-tokyo.ac.jp or [email protected]. ac.jp).◊

☛ SmithKline Licenses Software

SmithKline Beecham (SB) licensed Gene Logic’s bioinformatics system and software tools based on the Object Protocol Model (OPM). OPM was developed by Victor Markowitz and his team while funded by the DOE HGP at Lawrence Berkeley National Laboratory. The program enables the rapid development of relational databases, integration of relational and flat-file databases, and building of cross-database query systems.

Gene Logic and SB also will use OPM to develop a series of customized databases and servers for integrating a wide range of public and proprietary genomic and biological data sources into SB’s data-mining process. Under the agreement, Gene Logic will receive software licensing fees and funding while retaining the right to license software and products developed under the collaborative program to third-party customers.

Michael J. Brennan, president and chief executive officer of Gene Logic, said, “This relationship with SB is a validation of the power of the OPM system to manage and integrate large volumes of genomic and biological data from disparate sources into a seamless data-mining process.” ◊

Microbial Genomics

Superbug Survives Radiation, Eats Wastes

“Conan the Bacterium”

A can of spoiled meat and nuclear waste may appear to have little in common, but the microbe Deinococcus radiodurans finds both environments rather cozy. Scientists hope this organism’s ability to withstand massive doses of radiation will make it a useful tool for toxic-site remediation.

Although scientists now find it in many different soil and water sites around the world, D. radiodurans was not identified until 1956. It was isolated from a can of ground beef that had been radiation sterilized but had spoiled nonetheless. Perhaps because it can efficiently repair radiation breakage of its own DNA, D. radiodurans can endure 1.5 million rads of radiation, a dose 3000 times higher than would kill organisms from microbes to humans. Scientists are unsure how this resistance evolved, although they suspect it may be a side effect of the microbe’s ability to survive periods of severe dehydration, which also fragments DNA.

Recognition of D. radiodurans’ resistance to radiation led DOE Microbial Genome Program (MGP) managers to believe the microbe could be useful in cleaning up mixed-waste sites contaminated with toxic chemicals as well as radiation. They began to fund projects to decipher the microbe’s genome and alter it to detoxify the most common chemical contaminants at these sites. Such detoxification functions might include concentrating heavy metals and breaking down organic solvents such as trichloroethylene.

Some results are reported below.

Complete Genome Sequence

The complete sequence of the 3-Mb D. radiodurans genome is now in hand, and researchers led by Owen White at The Institute for Genomic Research (TIGR) in Rockville, Maryland, expect to publish their findings shortly (www.tigr.org). The genome consists of three chromosomes and a single extrachromosomal plasmid, with repeats highly abundant on each chromosome. Circularization of chromosomal regions, occurring across repeats distributed at least every 50 kb, may be part of the homologous recombination system that is the major form of repair for DNA double-strand breaks. Researchers have not yet determined if circularization occurs more frequently after irradiation. No evidence, however, exists for a causal link between circularization and radiation resistance; the bacterium Escherichia coli’s genome, in fact, also circularizes and yet is radiation sensitive. Plausible explanations for the extraordinary DNA-repair capability of D. radiodurans remain elusive in the early analyses of DNA repair genes.

In the sequencing effort, assembly problems were encountered in repeated regions over 500 bases long and more than 95% identical. To help verify the assemblies, TIGR scientists turned to a special type of “optical” chromosome map of D. radiodurans constructed by David Schwartz and colleagues [New York University (NYU)].

To create this type of map, the NYU team uses optical light microscopy to directly image individual DNA molecules bound to specially coated surfaces, which are then cut with restriction enzymes. When a cut is made, the linear DNA contracts and reveals a break. Scientists create a landmark map of the DNA sequence by determining where the cut sites lie and then measuring the distances between them. This type of high-resolution restriction enzyme map provides a useful scaffold for aligning and verifying the maps predicted by standard shotgun-sequencing procedures. Optical mapping of D. radiodurans, which is providing insight into this organism’s biology with a picture of the entire genome’s basic organization, also may help scientists understand aspects of the microbe’s radiation-resistant nature.

Genetic Enhancements

Cleanup of toxic sites created by improper disposal of nuclear wastes presents a massive global challenge requiring innovative remediation approaches. In Nature Biotechnology (Vol. 16, October 1998), DOE grantees Michael Daly and Kenneth Minton (Uniformed Services University of the Health Sciences in Bethesda, Maryland) described a first step toward enhancing the D. radiodurans genome to make it valuable for toxic-site cleanup. The work also was featured in a four-page “Conan the Bacterium” article in the July–August 1998 issue of The Sciences, the magazine of the New York Academy of Sciences.

In the Nature Biotechnology article, Daly and Minton reported successfully altering the microbe’s genome. This was accomplished by first fusing a gene encoding toluene dioxygenase (an enzyme that degrades the organic contaminant toluene) to a D. radiodurans promoter (a site that activates the gene). This DNA was then inserted into one of the bacterium’s chromosomes. The resulting recombinant bacterium is capable of degrading toluene and other organic compounds in a high-radiation environment. It also is tolerant of toluene and trichloroethylene’s solvent effects at levels exceeding those of many radioactive waste sites. [Denise Casey, HGMIS, [email protected]] ◊

➤ More Information

The DOE Microbial Genome Program report, in preparation, will include information on this and other microbes. Abstracts of microbial research presented at the 1999 DOE Contractor-Grantee Workshop are on the Web (www.ornl.gov/hgmis/publicat/99santa/microb.html).

Unfinished Microbial Genomes Searchable

The National Center for Biotechnology Information (NCBI) Web site links to sequences from unfinished microbial genomes for BLAST searching (www.ncbi.nlm.nih.gov/BLAST/unfin_databases.html). These unfinished sequences, which are not yet in GenBank or accessible via Entrez, also can be retrieved from their associated sequencing centers by ftp or Web. The 18 finished microbial genomes are searchable by Entrez via the NCBI site (www.ncbi.nlm.nih.gov/Entrez/Genome/org.html).◊

(see Microbials, p. 13)

ELSI: Cambridge Symposium

Cambridge Symposium: The Human Genome Project: Science, Law, and Social Change in the 21st Century

The highly successful symposium, “The Human Genome Project: Science, Law, and Social Change in the 21st Century,” was held in Cambridge, Massachusetts, on April 23–24, 1998. It was sponsored by the Whitehead Institute for Biomedical Research and the American Society of Law, Medicine, and Ethics and supported in part by the Ethical, Legal, and Social Issues component of the DOE Human Genome Program. This largest ELSI meeting ever was attended by more than 840 lawyers, judges, physicians, state legislators, journalists, educators, students, consumer advocates, and religious leaders. Topics at plenary sessions and breakout groups included genetic privacy, DNA databanks, genetic discrimination, doctor-patient relationships, gene therapy, newborn screening, and gene alteration.

☛ Free CD-ROM

Meeting syllabus, plenary talk transcripts, and Web site links available from Gus Cervini ([email protected])

Highlights of several selected plenary talks are given below. Eric Lander set the stage, describing the science behind the Human Genome Project. Mark Rothstein spoke on protecting genetic privacy, which is increasingly important as genetic tests become available. The final two speakers, James Wilson and LeRoy Walters, discussed gene therapy, a class of disease prevention or treatment expected to become more available as technologies unravel the genetic factors involved in disease.

“Genetics in the 21st Century”
Eric Lander (Whitehead Institute)

According to Eric Lander, “People today are now living through the most stunning information revolution, unlike anything before in the history of science.” He compared its importance to the chemist Mendeleev’s critical observation around 1869 that all the elements of matter could be organized in a very simple table. With this discovery, Mendeleev laid the foundations for the chemical industry and for much of chemistry in the 20th century. The biological sciences and industry are now experiencing the same thing, Lander stated. Instead of a periodic table, the 100,000 human genes constitute a finite list that will be complete in the near future. This list will help biologists and scientists understand the tremendous diversity of the human race and determine the causes of disease.

People are variable, Lander said, and every possible DNA sequence and DNA change that can exist probably does exist somewhere in the world. On the other hand, he continued, there are only two or three common variants of most human genes. If two people were selected at random from the audience and a particular gene were sequenced from each, the odds are one in two or one in three that the two sequences of the coding regions would be identical. This reflects the fact that the human race descended from a small population in Africa only 10,000 generations ago or about 200,000 years. Small populations have relatively few variants, and the mutation rate of one in a billion bases is so low that 95% of all the genes in the audience have not undergone a single mutation in all those years.

Even though any two human chromosomes are nearly identical, the little differences in DNA sequence can be used to trace the inheritance pattern of chromosomes and localize particular genes to particular subregions. Finding genes in this manner requires good genetic, physical, and sequence maps. The Human Genome Project has been making very good progress in these three tasks, Lander said; the genetic maps are essentially finished, and more than 97% of the genome is well covered in physical maps that can be used to isolate disease genes. Sequencing is heating up, with about 10% of the sequence expected to be finished by the end of 1998.

The process of producing 3 billion letters of information (the DNA base sequences) requires extraordinary automation and cooperation around the world. Bizarre machines are being built, Lander said as he showed a picture of a machine at Whitehead nicknamed the Genomatron, which can set up 100,000 PCR reactions in an hour. This reflects a 1000- to 10,000-fold increase in capabilities over only 4 or 5 years ago, when a student might set up 10 to 100 reactions in an hour.

What are we making of this information revolution? he asked. How far have we come toward understanding the remarkable differences among humans, the basis of different traits? Finding gene associations for rare Mendelian disorders like cystic fibrosis or Huntington’s disease is a piece of cake these days, Lander stated. Over 1000 relatively rare disorders already have been mapped to specific chromosomal regions, almost all of them within the last 10 years, and all within the last 14 years. About 140 have been specifically isolated and cloned.

For common diseases, the challenge has been to tease apart the contributions of multiple genes associated with complex conditions. The most progress has been made by looking for rare Mendelian subtypes, but there are as yet no good published subtypes for asthma, schizophrenia, and bipolar disease, for example.

Human genetics eventually may come down to just one very large table of variants or traits common to the population. People already are talking about collecting all the roughly 300,000 variants (3 for each of the 100,000 genes) and genotyping everybody. This is what genetics may look like in the 21st century, Lander continued.

He showed some examples of extreme claims, particularly those in supermarket tabloids, regarding genes and how they determine what kind of work a person may do, whom he will marry, or how much money she will earn. As the audience laughed, Lander pointed out that if the subject were Alzheimer’s disease or thrill seeking, it’s not clear where the public would draw the line regarding behavior or other traits that might be explained by genes.

“We have to make the advantages of this genetic revolution available for biomedical research and yet still fight what I think is the danger of a naive biological determinism and the consequences that could have for society. We need a different model. The right model, for me, is captured on a poster [showing two people] I’m very fond of from the Musée de L’Homme in Paris, from an exhibit they had some years ago: ‘Tous parents, tous differents.’ It can be translated two ways: ‘All the same, all different,’ or ‘all related, all different.’ ”

Genetic variations influence our lives, he concluded, but they don’t constrain us, nor do they shape us in the choices we can make as a society. What has happened so far in the information revolution will seem like nothing when compared to what will flow from the sluice gates of human genetics projects around the world over the next decade or so. We must explore “how to manage the information,” Lander said, “and the choices and consequences of what science has to offer.”◊

([email protected])

Microbials (from p. 12)

TIGR Releases Chlorobium tepidum Sequence

In September 1998, The Institute for Genomic Research (TIGR) announced the release of more than 1.9 Mb of genome sequence from Chlorobium tepidum, a photosynthetic gram-negative bacterium (www.tigr.org/cgi-bin/BlastSearch/blast.cgi?). The TIGR program, supported by DOE, has reached 3× coverage in the random-sequencing phase. The photosynthetic C. tepidum may play an important role in the earth’s overall cycle of carbon use.◊

“Protecting Genetic Privacy: Why It is So Hard to Do”
Mark A. Rothstein (Health Law and Policy Institute, University of Houston Law Center)

Mark Rothstein began his presentation by assuring the audience, “Although it will be more complicated than most people imagine, protecting genetic privacy and confidentiality is a worthy goal.” Steps taken toward this goal so far, however, he characterized as misguided and simplistic. Before explaining this position further, he gave the audience useful background information on relevant issues.

Rothstein defined “privacy” as the limited access to a person, the right to be let alone, and the right to keep certain information from disclosure to other individuals. “Confidentiality,” he said, is the right of an individual to prevent the redisclosure of certain sensitive information that was originally disclosed in the confines of a confidential relationship. Protecting confidentiality can be difficult because others think they should have the right to see an individual’s information.

Rothstein listed eight nonmedical uses of genetic information: insurance, employment, criminal law, personal-injury litigation, domestic relations, forensics, education, and commerce. Data also are being used for identification and in such contexts as immigration, paternity, settlement of estates, kinship, and schools.

In criminal law, defendants already are attempting to use as a defense unproved theories of genetic predisposition to violent behavior. When such a defense fails, defendants invoke similar claims as a method of mitigating their punishment at the sentencing stage.

Should defendants in personal-injury cases be allowed to compel victims to undergo genetic testing to determine what their life expectancy might have been before the accident? In child-custody cases, should the risk of an inherited disease keep either parent from gaining custody? How much genetic testing should be authorized before children are placed for adoption? Should a mortgage company be allowed to require a genetic test to assess an applicant’s life expectancy?

Because of the financial incentives involved, Rothstein said, confidentiality is particularly difficult to maintain in health insurance and employment. Laws have been enacted in 16 states to prohibit insurance companies from using genetic information to deny coverage or raise health insurance rates. When these laws were passed, Rothstein pointed out, people thought they were wonderful. Now, however, it has become clear that they protect only individuals who are asymptomatic. Once symptoms become apparent, the laws don’t apply. Rothstein suggested that a comprehensive law would need to say that no insurance company may deny coverage or raise rates based on an individual’s past, present, or predicted health status.

On the federal level, Rothstein cited the Health Insurance Portability and Accountability Act (HIPAA), which applies to employer-based and commercially issued group health insurance. Although HIPAA is a step forward, he said, it does not apply to the unemployed, and employers are not required to provide any health insurance or specific benefits. Rothstein said, “I think we are living in a rather hopeful, or naive, world, where we may temporarily have been able to contain double-digit increases in medical costs and where we’ve been able to put our finger in the dike on the issue of the uninsured. But there are problems lurking, and I don’t know that we are doing enough to address those issues.”

Moving to employment discrimination, he cited data showing that 85% of people surveyed said they should be protected from having employers obtain their health records. Some employers, on the other hand, feel that they can save a great deal of money by eliminating prospective employees and dependents whose medical expenses are likely to be high.

In March 1995, the Equal Employment Opportunity Commission issued an interpretation that is helpful but not the final word, Rothstein continued. Basically, it says covered entities that discriminate on the basis of genetic predisposition are regarding the individuals as having impairments, and such individuals are covered by the Americans with Disabilities Act. The problem here, Rothstein said, is that this interpretation is not binding on the courts and does not apply to unaffected carriers of recessive and X-linked disorders. It also does not prohibit employers from requiring access to employees’ clinical records, which could include genetic information. The consequences of this interpretation, Rothstein said, are that it permits disclosure of sensitive information within companies and discourages at-risk people from being tested.

Some 13 states have enacted laws that prohibit employers from requiring genetic testing or from using genetic test results to discriminate in employment. Unfortunately, Rothstein said, these laws are either too narrow or too broad. They don’t protect genetic information in medical records or prevent employers from gaining access through health-insurance claims.

Rothstein then raised the questions: Is genetic information unique? Should it be protected separately from other forms of information? He listed six arguments for considering genetic information different from other kinds of medical data: It reveals the health of family members; it reveals parentage, reproductive options, and future health risks; it goes to the essence of who and what an individual is; and it’s regarded as unique by individuals and third parties, who often overuse it. Rothstein said that, even if we are satisfied that genetic information is unique, it should not necessarily be protected separately. First, people don’t know exactly what genetic information is. Second, it probably is impossible to segregate it from other information in a clinical record; and third, enacting genetic-specific legislation may be self-defeating because it further stigmatizes people with genetic conditions.

The problem of genetic discrimination cannot be solved by a single law, Rothstein concluded, and resolution of the issue raises fundamental concerns of equality in the system. Due to the complexity and difficulty of the challenge, we should start to address these problems in depth.◊

([email protected])

“Human Gene Therapy: Present and Future”
James M. Wilson (Institute for Human Gene Therapy, University of Pennsylvania)

In his presentation at the 1998 Cambridge meeting, James Wilson characterized gene therapy as a novel approach in its very early stages. Its purpose, he said, is to change the expression of some genes in an attempt to treat, cure, or ultimately prevent disease. Current gene therapy is primarily experiment based, with a few early human clinical trials under way. Theoretically, he continued, gene therapy can be targeted to somatic (body) or germ (egg and sperm) cells. In somatic gene therapy the recipient’s genome is changed, but the change is not passed along to the next generation. This form of gene therapy is contrasted with germline gene therapy, in which a goal is to pass the change on to offspring. Germline gene therapy is not being actively investigated, at least in larger animals and humans, although a lot of discussion is being conducted about its value and desirability.

Gene therapy should not be confused with cloning, which has been in the news so much in the past year, Wilson continued. Cloning, which is creating another individual with essentially the same genetic makeup, is very different from gene therapy.

Listing three scientific hurdles in gene therapy, Wilson emphasized the concept of vehicles called vectors (gene carriers) to deliver therapeutic genes to the patients’ cells. Once the gene is in the cell, it needs to operate correctly. Patients’ bodies may reject treatments, and, finally, there is the need to regulate gene expression. Wilson expressed optimism that many groups are making headway and cooperating to overcome all these obstacles.

Viruses have evolved a way of encapsulating and delivering their genes to human cells in a pathogenic manner. Scientists have tried to take advantage of the virus’s biology and manipulate its genome to remove the disease-causing genes and insert therapeutic genes. These gene-delivery vehicles will make this field a reality, he said.

In the mid-1980s, the focus of gene therapy was entirely on treating diseases caused by such single-gene defects as hemophilia, Duchenne’s muscular dystrophy, and sickle cell anemia. In the late 1980s and early 1990s, the concept of gene therapy expanded into a number of acquired diseases. When human testing of first-generation vectors began in 1990, scientists learned that the vectors didn’t transfer genes efficiently and that they were not sufficiently weakened. Expression and use of the therapeutic genes did not last very long.

In 1995, Wilson continued, a public debate led to the consensus that gene therapy has value although many unanswered questions require continued basic research. As the field has matured over the last decade, it has caught the attention of the biopharmaceutical industry, which has begun to sort out its own role in gene therapy. This is critical because ultimately this industry will bring gene therapies to large patient populations.

Wilson reviewed several specific gene-therapy cases involving high cholesterol, hemophilia, and cystic fibrosis. He emphasized that the response to any therapy in a heterogeneous patient population will be quite variable.

He asked the audience to think about gene therapy, not necessarily to treat (see Wilson, p. 16)

Wilson (from p. 15)

genetic disease but as an alternative way to deliver proteins. Protein therapeutics currently are manufactured by placing genes in laboratory-cultured organisms that produce the proteins coded by those genes. Examples of such manufactured proteins include insulin, growth hormone, and erythropoietin, all of which must be injected frequently into the patient.

Recent gene therapy approaches promise to avoid these repeated injections, which can be painful, impractical, and extremely expensive. One method uses a new vector called adeno-associated virus, an organism that causes no known disease and doesn’t trigger patient immune response. The vector takes up residence in the cells, which then express the corrected gene to manufacture the protein. In hemophilia treatments, for example, a gene-carrying vector could be injected into a muscle, prompting the muscle cells to produce Factor IX and thus prevent bleeding. This method would end the need for injections of Factor IX, a derivative of pooled blood products and a potential source of HIV and hepatitis infection. In studies by Wilson and Kathy High (University of Pennsylvania), patients have not needed Factor IX injections for more than a year.

In gene therapies such as those described above, the introduced gene is always “on” so the protein is always being expressed, possibly even in instances when it isn’t needed. Wilson described a newer permutation in which the vector contains both the protein-producing gene and a type of molecular rheostat that would react to a pill to regulate gene expression. This may prove to be one of gene therapy’s most useful applications as scientists begin to consider it in many other contexts, he said. Wilson’s group is conducting experiments with ARIAD Pharmaceuticals to study the modulation of gene expression.

Wilson stated that only so much can be done in academia and that the biopharmaceutical industry has to embrace gene therapy and handle issues of patents, regulatory affairs, and the optimum business model. An example of a dilemma that society may be facing can be seen in the treatment of hemophilia. Infusing a patient with the replacement protein, which stops bleeding episodes but doesn’t prevent them, currently costs about $80,000 a year. Why would a vector to prevent bleeding for 5 to 10 years be commercialized when it would displace such a lucrative treatment, and how would this gene therapy be delivered to the public?

Wilson concluded his presentation by outlining future milestones in the field: proof of concept in the next few years in model inherited diseases, followed by cancer and cardiovascular diseases; continued explosive activity in technological development; development of regulatory policy (with the Food and Drug Administration); and commercial development.◊

([email protected])

To find out more about gene therapy, see www.ornl.gov/hgmis/resource/medicine.html (select “Disease Intervention”).

“Ethical Issues in Human Gene Therapy”
LeRoy Walters (Kennedy Institute of Ethics, Georgetown University)

LeRoy Walters provided a valuable perspective on some of the lessons learned by scientists and ethicists over the 18 years since the first human gene therapy protocol was approved. He also offered his predictions for future gene-therapy interventions and discussed some associated ethical dilemmas that society may be facing.

Walters began his talk with two case studies. The first was about David, known as “the boy in the bubble.” He was born in 1971 with X-linked severe combined immune deficiency and died 12 years later after receiving a bone marrow transplant that, unknown to doctors, carried a silent Epstein-Barr virus.

In contrast to David’s story, Walters continued, is the story of Ashanti, who was born in 1986 with an autosomal recessive form of severe combined immune deficiency. In Ashanti’s early years, every environmental microbe attacked her body and made her sick. She was treated with a synthetic enzyme called PEG-ADA, which gradually decreased in efficacy, and in 1990 she became the first patient to receive gene therapy in an approved protocol. She is now almost 13 years old and living a normal life.

In reviewing the history of gene therapy in the United States, Walters referred to a document prepared by an interdisciplinary group in 1984 and 1985. Called “The Points to Consider,” it contained 110 questions that investigators were asked to answer as they thought about performing gene therapy on human patients. The questions covered such topics as gene therapy’s potential benefits and harms, fairness in selection of recipients, procedures to be followed, recipients’ privacy and confidentiality, and possible alternative therapies. The same questions could constitute a checklist for gene therapy today, Walters said.

The review process in the early days was transparent and public, a fact that was important to gene therapy’s acceptance. Policymakers knew exactly what was happening, and any member of the public could attend a meeting, see the investigators, hear the questions, and have access to a public list of approved gene therapy protocols.

Walters stated that as of February 1998, 200 therapeutic protocols had been formally reviewed: 23 dealing with HIV infection or AIDS; 33 with single-gene diseases, especially cystic fibrosis; 138 with cancer; and 6 with other diseases. Reviewing what has been learned from the past 18 years, he listed the following points:

• Somatic cell gene therapy has been successfully distinguished from more ambitious plans for human genetic engineering.

ELSI

• The more neutral term “human gene transfer” might have been used, rather than “human gene therapy.” “Therapy” seems to promise benefits to the patient; “gene transfer” covers even the Phase I studies that test a product’s toxicity and are unlikely to be therapeutic to the subjects.

• The success of human gene therapy has been quite modest in the first 8 years; unfortunately, some researchers and companies have overstated the early results.

• An optimum location will be needed for a national public review body to examine new biomedical technologies.

Looking to the future, Walters said he thinks we will see prenatal interventions to prevent severe and irreversible damage to fetuses and gene transfer to prevent or treat neurological disease. In studies affecting the brain, the question of what is enhancement and what is cure, treatment, or prevention of disease will arise in an acute form, he said. For example, is it remediation or enhancement to intervene so that a child would have an IQ of 100 instead of 60 or 70?

Walters predicted that, in the next 18 years, proposals will emerge for germline genetic intervention, which will require a great deal of preliminary technical work. Instead of the current technologies of adding genes, something analogous to the “search and replace” function on a word processor will be needed to find the malfunctioning gene, splice it out, and replace it with the properly functioning gene.

He pointed out that there are some good moral arguments in favor of germline genetic intervention, whose goal is to prevent or alleviate disease or disability. Such intervention is more efficient than repeating gene therapy generation after generation, and even in utero gene therapy is too late for some diseases. The one case that could justify nuclear transfer in the early embryonic stage, Walters thought, is that in which a woman is likely to pass on a mitochondrial disease to her offspring. In such a situation, he said, after in vitro fertilization it would be justified at perhaps the four-cell stage to remove all the cells’ nuclei and fuse them with enucleated egg cells from a donor. Because mitochondria are in the cytoplasm and would be derived from the donor, the resulting embryos would be free from mitochondrial disease. This type of case would involve simultaneous germline intervention and cloning in the technical sense.

Walters ended with a warning against repeating mistakes made in the time of the eugenics movement and the Third Reich. “We can applaud the war on disease that genetic research is waging. It will be a great day when a child is definitively cured of cystic fibrosis or when a particular family line is liberated from the burden of fragile X syndrome. But we will be humane warriors only if, in the midst of the battle, we also show respect for those who courageously cope with disability and for those who cannot yet be cured.” ◊

([email protected])

ELSI News

On Radio: The DNA Files

On November 2, 1998, an interactive Web site was launched for The DNA Files, a series of nine 1-hour documentaries hosted by John Hockenberry and distributed by National Public Radio (www.dnafiles.org). Supported in part by DOE, the series covers such topics as DNA and behavior, prenatal and predictive genetic testing, gene therapy, genetics of human evolution, genetics and biotechnology, and genetics and the law. The Web site, which lists radio stations that will broadcast The DNA Files around the country, provides information about each program, additional resources, and an opportunity for listeners to interact about some ethical issues introduced in the series. [Contact: [email protected] or [email protected]] ◊

Innovative Biotechnology Curriculum

An innovative curriculum to boost student enthusiasm and interest in biotechnology has been launched through a partnership involving the National 4-H Council, Monsanto Corporation, and Pioneer Hi-Bred International. Called Fields of Genes: Making Sense of Biotechnology, the curriculum is designed to help teachers provide students in grades 4–12 with a basic understanding of scientific principles that form the foundation of biotechnology. Curriculum activities for elementary students focus on understanding the living and nonliving parts of their world. Middle school students continue to explore and understand genetics, biotechnology, and genetic engineering, while high schoolers are encouraged to plan environmental stewardship activities. [Order leaders’ guide (96 pages, Product No. ES0046) from National 4-H Supply Service: 301/961-2934, Fax: -2937] ◊

Microbial TV Series

Intimate Strangers: Unseen Life on Earth, a four-part series for prime-time public television, will be shown by PBS and distributed for international broadcast this fall. Funded in part by the DOE Human Genome, Microbial Genome, and Natural and Accelerated Bioremediation Research programs, the series is designed to increase public science literacy by using lessons from the microbial world to teach about more complex systems of life.

The series, produced by independent filmmakers Baker & Simon Associates with the American Society for Microbiology, is an initiative of the Microbial Literacy Collaborative (MLC). An organizational partnership headed by Cynthia Needham, MLC seeks to emphasize how basic research advances society’s well-being, to improve decision-making on microbial issues, and to create more effective curricula for science teachers and students. In addition to Intimate Strangers, other MLC products include an interactive Web site (operational in May: www.microbeworld.org), a set of hands-on activities designed to introduce youth to the microbial world, and week-long leadership programs targeted to young people from challenging environments. Unseen Life on Earth, a 12-part telecourse for undergraduates, will support distance learning and provide teaching resources to college and precollege teachers.◊

Short Courses for Biology Teachers

Outreach to K–12 teachers and students is an aim of the new molecular biology teaching laboratory at Pennsylvania State University’s Biotechnology Institute. Short courses including lectures and laboratory experience can be scheduled for area teachers on the principles and techniques used in genetic and molecular biology research, especially as they relate to the Human Genome Project. High school biology teachers are particularly encouraged to take advantage of this opportunity. [Contact: Loida Escote-Carlson (814/863-5751, [email protected])] ◊

Proteomics

From Sequence to Systems: Looking at Proteins to Understand Genome Expression

The availability of entire genomic sequences for some 18 microbes (and many more to come) now offers investigators the opportunity to perform comparative analysis from an evolutionary perspective, identify conserved genes and metabolic capabilities based on protein sequence homology, and predict protein structures. Understanding how gene products (proteins) work together to create and maintain complex biological systems, however, requires data about the entire spectrum of protein production in the complex ecosystem of a cell. In the account below, DOE Microbial Genome Program grantee Carol Giometti of Argonne National Laboratory (ANL) describes such studies on two microbial genomes, the heat-loving Methanococcus jannaschii and Pyrococcus furiosus, both subjects of the DOE program. [Introduction by Dan Drell, DOE Microbial and Human Genome programs]

In 1995, V. Wasinger and coworkers (University of Sydney, Australia) coined the term “proteome” to describe all the proteins encoded within a genome. Proteomics is the study of protein expression by biological systems, including relative abundance, post-translational modifications, stability within the cell, and fluctuations as a response to environment and altered cellular needs.

In contrast to genomic sequence, which captures DNA information that is stable throughout the lifetime of an organism, proteomics summarizes protein-expression patterns of a biological system at different times. Biochemical pathways and regulatory mechanisms can be deduced by manipulating the cellular environment or DNA sequence and observing coregulation of specific proteins or sets of proteins. Proteomics tools include high-resolution protein separation, detection, and quantitation methods and techniques for linking proteins to their corresponding gene sequences. These tools can be used to further annotate and validate completed genomes, reveal biochemical pathways and regulatory networks, and define targets for protein-structure determination.

In the context of the DOE Microbial Genome Program, analyzing the proteomes of organisms for which complete genomes are available offers the potential for rapid identification of the organisms’ major gene products. Although M. jannaschii’s complete genome sequence is publicly available and annotated according to sequence homology with other known proteins, the actual proteins synthesized by M. jannaschii and regulation of their synthesis have not been studied until now. Correlation of protein abundance, shifts in abundance in response to environmental changes, and post-translational modifications with the genome sequence will provide new information regarding gene expression and regulation in this member of the Archaea. In addition, proteome studies will serve to confirm or refute protein identifications based on sequence homologies alone.

The genome sequence of P. furiosus is virtually complete, and numerous P. furiosus enzyme activities have been well characterized. The regulation of specific gene expression (e.g., inducibility of enzyme activities of interest) is not characterized in P. furiosus, however, nor has the influence of post-translational modification been explored. Characterization of the P. furiosus proteome will bridge the gap between gene sequence and protein function by providing data on the regulation of protein synthesis. In addition, studies are in progress to determine the subcellular localization (soluble vs membrane fractions) of each P. furiosus protein.

Strategies rooted in 2-DGE (see sidebar) are being developed to link the proteome information with existing genome sequence databases for these two Archaea. Evolving approaches to characterizing small-genome proteomes and linking proteome and genome databases will be the foundation for developing protocols for similar investigations of large mammalian proteomes. [Carol S. Giometti, ANL, [email protected]] ◊

2-DGE: A Technique for Visualizing Protein Expression and Modification

One current proteomic tool for visualizing and quantitating all proteins expressed in a biological system at a given time is two-dimensional gel electrophoresis (2-DGE). As originally described by Patrick O’Farrell for analyzing Escherichia coli proteins in 1975, 2-DGE combines the electrophoretic separation of denatured proteins by isoelectric point (charge differences) in the first dimension with separation based on molecular size differences in the second dimension. The proteins, which can be detected using protein-specific stains, appear as constellations of spots in the 2-D space of the gel. Over the 20-year history of 2-DGE, numerous tools have been developed for comparing 2-DGE patterns and quantitatively analyzing protein abundance.

The recent addition of mass spectrometry to methods available for identifying proteins detected by 2-DGE has provided the needed capability for rapid identification. Proteins can be digested in the 2-DGE gel using a specific protease (e.g., trypsin or amino acid–specific endopeptidases), the resulting peptides eluted, and their masses determined using mass spectrometry. [Matrix-assisted laser desorption ionization (MALDI-MS) and electrospray currently are the preferred methods.] The peptide masses are then used to search protein and DNA sequence databases for amino acid sequences predicted to produce the same peptide masses when cleaved with the same protease. When a complete genome sequence is available and the peptide mass search is limited to just that sequence database, the protein identification process is highly reliable and efficient.

The work of John Yates’s group at the University of Washington, in which 260 Haemophilus influenzae proteins separated by 2-DGE were identified in about a month with MALDI-MS, demonstrates this approach’s potential for identifying the hundreds of proteins revealed in the 2-DGE patterns of cell lysates. The protein-expression information can then be compared to the cell’s proteome and other proteomes to provide a better understanding of cell function.◊
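The peptide-mass search described in the 2-DGE sidebar can be illustrated with a minimal Python sketch. It is not the software used by any group named in this issue. It assumes a simplified trypsin rule (cleave after K or R unless the next residue is P), standard monoisotopic residue masses, uncharged peptides with no missed cleavages or modifications, and a small in-memory protein set whose names and sequences (`protA`, `protB`) are hypothetical.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(protein):
    """Split a protein after each K or R that is not followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        at_end = i == len(protein) - 1
        if aa in "KR" and not at_end and protein[i + 1] != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def rank_candidates(observed_masses, database, tolerance=0.5):
    """Rank proteins by how many observed masses their digest explains."""
    scores = []
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
        hits = sum(
            1 for m in observed_masses
            if any(abs(m - t) <= tolerance for t in theoretical)
        )
        scores.append((hits, name))
    # Most matched masses first; ties broken alphabetically by name.
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Hypothetical two-protein "genome" and a simulated spectrum for protA.
DATABASE = {"protA": "MKTAYIAKQRQISFVK", "protB": "MSDNGPRLLEKAGWTR"}
observed = [peptide_mass(p) + 0.1 for p in tryptic_digest(DATABASE["protA"])]
ranking = rank_candidates(observed, DATABASE)
print(ranking[0][1])
```

Restricting `DATABASE` to the proteins predicted from a single completed genome is what keeps the number of chance matches low, which is the point the sidebar makes about reliability.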

Proteomics News

Tool for Protein Analysis
PEDANT is a software system for completely automatic and exhaustive analysis of protein sequence sets, from individual sequences to complete genomes (pedant.mips.biochem.mpg.de). This server now contains 20 complete genomic sequences and 1 plasmid, as well as 21 experimental and unfinished genomic sequences.
Entries for completed genomes include three sections:
• General Information such as genome summary, open reading frames, links, and search mechanism;
• Protein Function such as closest homologues, functional categories, Protein Information Resource keywords and superfamilies, and PROSITE patterns; and
• Protein Structure such as known 3-D, transmembrane, signal-peptide, low-complexity, coiled-coil, and structural classes. ◊

TREMBL Release 6
Release 6 of TREMBL, a protein sequence database that supplements SWISS-PROT, has been announced. TREMBL contains the translations of all EMBL Nucleotide Sequence Database coding sequences not yet integrated into SWISS-PROT. Weekly TREMBL updates are available by anonymous ftp (ftp.ebi.ac.uk/pub/databases/trembl) and from the Sequence Retrieval System server of the European Bioinformatics Institute (srs.ebi.ac.uk). [Contact: apweiler@ebi.ac.uk] ◊

R&D 100 Award to LANL’s SOLVE
One of the four R&D 100 awards won by Los Alamos National Laboratory in 1998 was for SOLVE, a system that produces 3-D pictures of protein structure. SOLVE automatically carries out all the steps necessary to fill in missing information in X-ray crystallography, a process that uses X rays to determine the structure of atoms, ions, or molecules in chemical substances. SOLVE’s speed and ease of operation make it suitable for the rapid analysis of protein molecule shapes, and accurate protein pictures can be produced in hours rather than days. In addition, the automated system can evaluate hundreds of solutions and can be operated by a novice. SOLVE shows promise in helping researchers design new and improved drugs, enzymes for rapidly breaking down toxic waste or synthesizing useful chemicals, and heat-tolerant enzymes useful in chemical manufacturing processes. Technologies funded by DOE accounted for 34 of the 100 R&D awards in 1998. [SOLVE Contact: Thomas Terwilliger (505/667-0072, [email protected])] ◊

NIH Proteomics Grant to Axys
Axys Pharmaceuticals Inc. of South San Francisco, California, has been awarded a Phase I Small Business Innovation Research grant from the NIH National Institute of General Medical Sciences to conduct a 6-month research study of proteomics. Proteomics is the global search for and identification and prediction of protein function. The Axys goal is to build the ProteomeBank, a software system and proprietary database of protein families for high-throughput, accurate prediction of protein function. ◊

Completing the E. coli Proteome
A database of genes characterized since completion of the Escherichia coli genome sequence lists new and old gene names, SWISS-PROT entry, gene location, genetic structure, and identified function (sun1.bham.ac.uk/bcm4ght6/genome.html).◊

Genetics in Medicine

Organization for Rare Disorders
The National Organization for Rare Disorders (NORD) is a federation of more than 140 nonprofit voluntary health organizations dedicated to helping people with rare “orphan” diseases and to assisting the groups that serve them. Orphan diseases, most of which are genetic in origin, are those affecting fewer than 200,000 people in the United States. More than 5000 rare disorders affect about 20 million Americans.

Responding to over 1 million inquiries each year, NORD attempts to educate the public and the medical community by distributing understandable information through its newsletters, publications, and databases; providing referrals to additional resources; and maintaining an extensive Web site (www.rarediseases.org). Through the Web, users can access NORD’s Rare Disease Database (RDB), containing more than 1100 abstracts, as well as the Organization Database of support groups and the Orphan Drug Database. Complete RDB entries are available online at low cost, and printouts can be ordered from the NORD office.

NORD also maintains confidential patient networking for individuals and families. Since 1987, NORD has administered medication assistance programs for pharmaceutical companies, providing free prescription drugs from nine companies to thousands of uninsured, needy patients. In addition, the NORD grant program provides financial support to academic scientists for clinical research. [NORD; P.O. Box 8923; New Fairfield, CT 06812-8923 (800/999-6673 or 203/746-6518, Fax: -6481)] ◊

➤ NORD Publications
The third edition of the 675-page NORD Resource Guide lists more than 900 organizations that can benefit individuals with rare disorders and their families. The 1000-page Physicians’ Guide to Rare Diseases contains information on over 900 such disorders, including symptoms and visual diagnostic signs. [Orders: www.rarediseases.org, click on “Services/Products.” The physicians’ guide may also be ordered from Dowden Publishing Co. (800/707-7040 or 201/391-9100, Fax: -2778).]

Cancer Web Site
Northwestern University researchers have developed a Web site to teach health professionals and the public about the genetic basis of cancer and new discoveries in the field of cancer genetics. Designed as a comprehensive educational program, the site provides a fundamental understanding of genetics, genetic testing and diagnosis, genetic counseling, and cancer risk assessment (www.cancergenetics.org).◊

(More on Genetics in Medicine, p. 20)

Informatics

☛ Software Programs Provide Useful Resources

BioToolKit
BioToolKit now provides 750 annotated links to Web tools for the study of nucleic acid, genome, and protein structure (www.biosupplynet.com/cfdocs/btk/btk.cfm).◊

Gene-Finding Programs at Sanger
Updated versions of gene-finding programs (including FGENES, FGENESH, and FGENES-m variant for mammalian sequences) are available for use through the Sanger Web site (genomic.sanger.ac.uk). Also, the Gapped BLASTP program from the National Center for Biotechnology Information allows users to check a gene’s protein structure in the INFOGENEP database of finished and unfinished human sequences and receive the clone’s name and sequence (genomic.sanger.ac.uk/db.html). See the Web site for more information.◊

New Sequin Version
The National Center for Biotechnology Information has released Version 2.80 of Sequin, the sequence-submission and editing tool, for all platforms (www.ncbi.nlm.nih.gov/Sequin). This version is expected to be particularly useful for genome centers that annotate large records.◊

Tandem Repeat Tool
Gary Benson (Mount Sinai School of Medicine) has developed a program to find tandem repeats in DNA sequence data without prior knowledge of pattern repeat, pattern size, or number of copies (c3.biomath.mssm.edu/trf.html). The current version finds pattern repeats ranging from 1 to 500 bases. Users submitting a sequence (up to 2 Mb) in FASTA format will receive a summary table of repeats, including their location, size, number of copies, and nucleotide content. Clicking on an entry shows alignment against a consensus pattern, allowing the user to see the repeat pattern and mutation location.◊

Sequence Viewer
Sequence Viewer, a free public software tool for viewing and analyzing DNA sequences, is available on the National Center for Genome Resources (NCGR) Web site (www.ncgr.org/gsdb/sv). The NCGR tool was developed to fill the need for graphical representations of nucleotide sequences in the Genome Sequence DataBase and for detailed descriptions of sequence annotation. Sequence Viewer allows users to quickly find a sequence region that integrates with a gene rather than searching through a lengthy, complex flat-file report. It also can be used as a quality-control tool for readily locating mistakes in feature position.◊
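For readers unfamiliar with the kind of output the tandem-repeat program on this page reports (location, unit size, number of copies), the Python sketch below shows the idea on a FASTA-formatted input. It is only an illustration and is not Benson's algorithm: it finds exact, uninterrupted repeats with a brute-force scan, whereas the program described above also reports mutated copies aligned to a consensus. The function names and the `max_unit` and `min_copies` parameters are assumptions made for this example.

```python
def read_fasta(text):
    """Parse FASTA text into a {header: sequence} dictionary."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line.upper()
    return records


def find_tandem_repeats(seq, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for exact tandem repeats.

    Positions are zero-based. At each position the longest repeated
    span wins; on ties the shortest repeat unit is preferred.
    """
    hits, i, n = [], 0, len(seq)
    while i < n:
        best = None
        for unit_len in range(1, max_unit + 1):
            unit = seq[i:i + unit_len]
            if len(unit) < unit_len:
                break  # ran off the end of the sequence
            copies = 1
            while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
                copies += 1
            span = copies * unit_len
            if copies >= min_copies and (best is None or span > best[2] * len(best[1])):
                best = (i, unit, copies)
        if best:
            hits.append(best)
            i += best[2] * len(best[1])  # skip past the reported repeat
        else:
            i += 1
    return hits


# Hypothetical input record; the header and sequence are made up.
example = ">demo\nGGCACACACATT\n"
records = read_fasta(example)
repeats = find_tandem_repeats(records["demo"])
print(repeats)
```

The brute-force scan is adequate for short sequences only; a 2-Mb submission of the size mentioned above would call for a more efficient detection method.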

Genetics in Medicine (from p. 19)

New HGMIS Site: Translation of Genetics to Medicine
At the request of medical professionals eager for translation of genomics to medical practice, the Human Genome Project Information suite of Web sites has added a new page called “Medicine and the New Genetics” (www.ornl.gov/hgmis/resource/medicine.html). This site covers topics of specific interest to physicians, nurses, genetic counselors, and allied health professionals. It contains information and links about disease prevention, diagnosis, and intervention; genetic-disease databases and support groups; gene testing; gene therapy; pharmacogenomics; genetic counseling; ethical, legal, and social issues associated with genetics; continuing medical education courses in genetics; publications; multimedia; professional societies; and other resources. Medical professionals are asked to review the site and send comments and suggestions to HGMIS. ◊

HuGEM Web Site
The Human Genome Education Model Project (HuGEM) offers education in the new genetics to specific groups of health professionals who provide services for individuals and families with genetic conditions (www.dml.georgetown.edu/hugem/elsi.htm).◊

¶ Calculation of Genetic Risks
The Calculation of Genetic Risks: Worked Examples in DNA Diagnostics (second edition) by Peter Bridge (Alberta Children’s Hospital, Canada) explains how to calculate an individual’s genetic risk based on information from genetic testing and family pedigrees. Worked examples are included. 272 pp., 1997. Order through bookstores or from Johns Hopkins University Press (800/537-5487, Fax: 410/516-6998).◊

¶ Genetics Manual
Genetics Manual: Current Theory, Concepts, Terms by George P. Redei (University of Missouri, Columbia) explains over 18,000 life science terms and concepts arranged alphabetically. Cross-references connect to a network within the book for comprehensive information on any covered topic. Further sources are given for many entries, and most biometrical procedures have worked examples. 1152 pp., 1998. [Orders: World Scientific Publishing Co. (800/227-7562, Fax: 888/977-2665, [email protected], www.wspc.com)] ◊

Mutation Journal
Devoted to the union between genomics and mutation research, the fourth issue of Mutation Research Genomics Online, a section of Mutation Research Online, is at www1.elsevier.com/journals/genomics/menu.htm. Although full online access is restricted to subscribers, informative snapshots are available for selected articles in each issue.◊

☛ DNA Polymorphism Discovery Resource
A resource for detecting DNA sequence polymorphisms has been developed by the NIH National Human Genome Research Institute in collaboration with the NIH National Institute of General Medical Sciences and its Human Genetic Mutant Cell Repository. Designed to reflect the diversity of the human population, the resource is composed of cell lines and DNA samples from 450 unrelated individuals, both male and female. In addition to the complete set, predefined nested subsets with 8, 24, 44, and 90 samples will encompass the same range of diversity. Individuals sampled include Americans of European, African, Mexican, and Asian extraction as well as Native Americans [F.S. Collins et al., Genome Research 8(12), 1229–31, 1998]. (Orders: 800/752-3805 or 609/757-4848; [email protected]; locus.umdnj.edu/nigms) ◊

Informatics

Databases

GDB Mapping Database Operations Restored
Canadian Institution Takes Over Collection, Curation

The Bioinformatics Centre at the Hospital for Sick Children (HSC) in Toronto, Canada, has received funding from an anonymous source to continue data acquisition and curation activities of the Genome Database (GDB). Work is under way to obtain additional support for future software development.

GDB, which provides human gene mapping data to genetics researchers, was based at Johns Hopkins University (JHU) School of Medicine in Baltimore until July 1998. At that time, DOE withdrew major funding to focus its informatics resources on the sequencing phase of the Human Genome Project.

The new income will enable HSC to send GDB data from the central editable node to international nodes. The U.S. node will be maintained by the Computational Biosciences Section at Oak Ridge National Laboratory (ORNL) in conjunction with the Genome Annotation Consortium (GAC). GAC is a multi-institutional group established to help build a shared infrastructure for integrating diverse biological information [HGN 9(3), 13; www.ornl.gov/hgmis/publicat/hgn/v9n3/13anno.html].

The ORNL node has been established (genome.ornl.gov), and the primary node is being transferred to HSC. To retain key members of the GDB staff and a presence in the United States, HSC is supporting a curatorial center at JHU. In addition to continuing GDB operations and access, researchers at participating institutions are exploring further collaborations in acquiring, analyzing, and exchanging data to benefit the genome community. [Contacts: HSC, Jamie Cuticchia ([email protected]); JHU, Christopher Porter ([email protected]) and Conover Talbot, Jr. ([email protected]); ORNL and GAC, Edward Uberbacher ([email protected]) and Jay Snoddy ([email protected])] ◊

Influenza Database at LANL
Los Alamos National Laboratory (LANL) introduced its annotated Influenza Sequence Database in July 1998 (www-flu.lanl.gov). The database currently holds all the influenza sequences published in GenBank and, after verification and annotation, will add unpublished sequences collected around the world. LANL is working with the University of California and the Centers for Disease Control and Prevention to expand the database.◊

TRANSFAC Database
The TRANSFAC database compiles data about gene regulatory DNA sequences and protein factors binding to them (transfac.gbf.de). Programs help identify putative promoter or enhancer structures and suggest their features.

TRANSFAC consists of six cross-linked tables: SITE, CELL, FACTOR, CLASS, MATRIX, and GENE. FACTOR entries also are cross-linked with a proposed classification system for transcription factors (transfac.gbf.de/TRANSFAC/cl/cl.html). TRANSFAC is linked to a number of other databases. Among the most recent additions are enhanced internal hyperlinking between individual tables, improved linking of references to PubMed, and insertion of most training sequence sets used for matrix construction, including the corresponding site-matrix links.

The TRANSFAC server also provides access to such sequence-analysis tools as PatternSearch, which uses sequence information contained in the SITE table for analysis of submitted sequences; and MatInspector, which uses a library of matrices selected from the TRANSFAC MATRIX table. Another sequence-analysis program, FastM, developed by the group of Thomas Werner (National Research Centre for Environment and Health, Neuherberg), is included on the TRANSFAC server. Using the MatInspector library, FastM analyzes sequences for user-defined combinations of transcription factor–binding sites. The Structural Analysis with Genetic Algorithms (SaGa) program can identify structural characteristics in the environment of aligned functional sites.

TRANSFAC tools are freely accessible for users from noncommercial organizations. Users from profit-oriented organizations are requested to obtain licensing from BIOBASE Ltd. ([email protected]). [TRANSFAC contact: Edgar Wingender (+49-531/6181-427, Fax: -266, [email protected])] ◊

p53 Mutation Database
The p53 mutation database contains information on all p53 missense mutations and small deletions in human tumors and cell lines as reported in peer-reviewed literature (www.iarc.fr/p53/whatsdb1.htm).◊

TBASE at Jackson Laboratory
TBASE, the database of transgenic animals and targeted mutations, is at the Jackson Laboratory in Bar Harbor, Maine ([email protected]; www.jax.org/tbase). ◊

Intein Database on Web
The Intein Database Web site contains a registry of all submitted experimental and theoretical inteins (inframe protein introns) as well as information on protein splicing and intein structure (www.neb.com/neb/inteins.html).◊

Publications

¶ Bioinformatics Journal
In Silico Biology, a peer-reviewed online journal, attempts to bridge the gap between experimental scientists and computational biologists by focusing on biologically significant computational methods and results (www.bioinfo.de/isb). A print version is also available. [Subscribe via Web site or by e-mail: [email protected]] ◊

¶ Computational Methods
Computational Methods in Molecular Biology, edited by Steven Salzberg (The Institute for Genomic Research), David Searls (SmithKline Beecham Pharmaceuticals), and Simon Kasif (University of Illinois at Chicago), was published by Elsevier Science in 1998 (www.cs.jhu.edu/~salzberg/compbio-book.html). Leading researchers from the computational biology community are included among the authors. Biologists who rely on computers are the primary audience, with a secondary audience of computer scientists who are developing techniques with biology applications. A list of Web resources in the book will be kept updated on the Web (www.cs.jhu.edu/~salzberg/appendixa.html). 398 pp., hardbound. [Orders: Web, www.elsevier.com, search on “Salzberg”; 888/437-4636 or 212/633-3730; usinfof@elsevier.com] ◊

(see Informatics, p. 23)

Calendar of Genome and Biotechnology Meetings*

More comprehensive lists of genome-related meetings and organizations offering training are available on the Web (www.ornl.gov/hgmis) or from HGMIS (see p. 10 for contact information).

April 1999
8–9. Functional Proteomics: Integrating Technologies for Target Discovery and Disease Therapy; Boston [IBC, 508/481-6400, Fax: -7911; [email protected]; www.ibcusa.com]
8–10. Human Genetics: Principles and Applications in Medical Practice; Monterrey, Mexico [A. Morales, +11-528/348-6982; [email protected]]
11–14. RECOMB ’99; Lyon, France [INRIA, +33-139/635-053, Fax: -638; symposia@inria.fr; www.inria.fr/RECOMB99]
14. Genes Synapses and Long-Term Memory. TIGR/NRC/DOE Distinguished Speaker Series: Eric Kandel (Columbia Univ. College of Physicians and Surgeons); Washington, DC [D. Hawkins, 301/838-3501, Fax: -0209; [email protected]; www.tigr.org]
15. Using the Yeast Genome Sequence to Learn About Biol. NHGRI Lecture Series: Robert Waterson (CSHL); Washington, DC [L. Brooks, 301/496-7531; lisa_brooks@nih.gov]
17–21. Experimental Biol. ’99; Washington, DC [Meeting office, 301/530-7010, Fax: -7014; [email protected]; www.faseb.org/meetings/eb99/index.htm]
19–20. Protein Expression; Washington, DC [CHI, 617/630-1300, Fax: -1325; chi@healthtech.com; www.healthtech.com]
23–24. Protein Sequence Structure Function; San Francisco [K. Clarke, 415/476-1913, Fax: /502-4690; [email protected]; mdi.ucsf.edu/PSSF_Mtg.html]
27–28. 2nd Intl. Workshop on Advanced Genomics: Genomics and Drug Discovery; Tokyo [Secretariat, +81-3/5563-4342, Fax: -4887; [email protected]]

May 1999
8. Clinical Cancer Genetics: A Practical Approach; Baltimore [Johns Hopkins Univ. School of Medicine CME Office, 410/955-2959, Fax: -0807; [email protected]]
13–14. Gene Quantification-Europe; Munich, Germany [see contact: April 19–20]
16–20. BIO ’99; Seattle, WA [Meetings Dept., 202/857-0244, Fax: /331-8132; www.bio.org/meetings]
16–20. New World Science for the Next Millennium. 1999 ASBMB Joint Meeting; San Francisco [ASBMB, 301/530-7010, Fax: -7014; [email protected]; www.faseb.org/meetings/asbmb/asbmb99]
17. Genomic Partnering-Europe; Munich, Germany [see contact: April 19–20]
17–18. NIH NHGRI Advisory Council Meeting; Bethesda, MD [K. Malone, 301/402-2205, Fax: -0837; [email protected]]
18–19. 3rd Annu. Human Genome-Europe; Munich, Germany [see contact: April 19–20]
19–23. Genome Sequencing and Biol.; Cold Spring Harbor, NY [CSHL, 516/367-8346, Fax: -8845; [email protected]; www.cshl.org]
20. Involving Children in Genetic Susceptibility Research: Implications for Informed Consent. NHGRI Lecture Series: Gail Geller (Johns Hopkins University); Washington, DC [see contact: April 15]
25–30. 19th Intl. Conf. on Yeast Genetics and Molecular Biol.; Rimini, Italy [L. Frontali, +39-06/445-3950, Fax: /446-1980; frontali@axcasp.caspur.it; www.icgeb.trieste.it]
29–June 1. 31st Annu. Meeting of the European Society of Human Genetics; Geneva [J. van Goyenkade, +31-20/679-3411, Fax: /673-7306; [email protected]; eurocongres.com/eshg]
30–June 3. ASM General Meeting; Atlanta [ASM, 202/942-9248, Fax: -9340; MeetingsInfo@asmusa.org; www.asmusa.org/jnlsrc/news/mtgconf.htm]

June 1999
9–10. Proteome; San Francisco [see contact: April 19–20]
10–13. American Society of Gene Therapy Annu. Meeting; Washington, DC [M. Stallings, 609/848-1000 ext. 264, Fax: -5274; mstallings@slackinc.com; www.asgt.org/natmtg.html]
14–15. Bioinformatics; San Francisco [see contact: April 19–20]
14–15. DNA Forensics; McLean, VA [see contact: April 19–20]
16. Protein Structure; San Francisco [see contact: April 19–20]
17. Whole Genome Analysis by Optical Mapping. NHGRI Lecture Series: David Schwartz (New York University); Washington, DC [see contact: April 15]
19–24. 26th FEBS Meeting; Nice, France [G. Dirheimer, +33-388/417-055, Fax: /602-218; [email protected]; coli.polytechnique.fr/febs99]
24–25. Genomics, Functional Genomics, Proteomics, and Beyond: New Horizons for the 21st Century Academia-Pharmaceutical Collaborations; Paris [L. Drye, Fax: +33-140/613-405; [email protected]; www.pasteur.fr/Conf/euroconf.html]

July 1999
4–7. 1999 Behavior Genetics Assoc. Meeting; Vancouver, B.C., Canada [R. Rose, 812/855-8770, Fax: -4691; [email protected]; www.bga.org/1999]
11–15. 9th European Congress on Biotechnol.; Brussels [Secretariat, +32-2/706-8174, Fax: -8170; [email protected]; www.ecb9.be]

August 1999
5–9. Microbial Biodiversity; Chicago [see contact: May 30–June 3]
8–13. Human Molecular Genetics; Newport, RI [GRC, 401/783-4011, Fax: -7644; [email protected]; www.grc.uri.edu]
17–22. Yeast Cell Biol.; Cold Spring Harbor, NY [see contact: May 19–23]

September 1999
13–14. NIH NHGRI Advisory Council Meeting; Bethesda, MD [K. Malone, 301/402-2205, Fax: -0837; [email protected]]
18–21. 11th Intl. Genome Sequencing and Analysis Conf.; Miami [TIGR, 301/838-3515, Fax: -0229; [email protected]; www.tigr.org]

October 1999
5–10. 10th Intl. Congress on Genes, Gene Families, and Isozymes: Advances in Genome Research and Their Implications for Biol. in the 21st Century; Beijing [N. Wang, +8610/6255-1158, Fax: -1951; [email protected]]
6–10. Neurobiology of Drosophila; Cold Spring Harbor, NY [see contact: May 19–23]
16–19. NSGC 18th Annu. Educ. Conf.; Oakland, CA [NSGC, 610/872-7608; [email protected]; www.nsgc.org]
19–23. ASHG; San Francisco [C. Galkin, 301/571-1825, Fax: /530-7079; cgalkin@genetics.faseb.org; www.faseb.org/genetics/ashg/ashgmenu.htm]

February 2000
27–Mar. 2. Eighth DOE Human Genome Program Contractor-Grantee Workshop; Santa Fe, NM [Sylvia Spengler, 510/486-4879, Fax: -5717, [email protected]] ◊

Training Events*

April 1999
23–25. 1999 Genetics Review Course; Schaumburg, IL [M. Greenfield, 301/571-1887, Fax: -1895; [email protected]; www.faseb.org/genetics/acmg]
30–May 2. Medical Genetics and Genetic Counseling Review Course; Pittsburgh (May 14–16, Oakland, CA) [NSGC, 610/872-7608; [email protected]; www.nsgc.org]

June 1999
9–29. Advanced Bacterial Genetics; Cold Spring Harbor, NY (app. deadline Mar. 15) [CSHL, 516/367-8346, Fax: -8845; meetings@cshl.org; www.cshl.org]
12–17. Contemporary Challenges in Health Care Ethics. Intensive Bioethics Course; Washington, DC [Kennedy Inst. of Ethics, 202/687-5477; [email protected]; guweb.georgetown.edu/kennedy/courses/ibc98.htm]

October 1999
4–14. Gene Expression Analysis. Theoretical and Practical Course; Monterrey, Mexico [H. Barrera-Saldaña, +52-8/329-4173, Fax: /333-7747; [email protected]; www.icgeb.trieste.it]
13–26. Genome Informatics; Cold Spring Harbor, NY (app. deadline July 15) [see contact: June 9–29]
13–26. Macromolecular Crystallography; Cold Spring Harbor, NY (app. deadline July 15) [see contact: June 9–29] ◊

*Dates and meeting status may change; courses may also be offered at other times and places; check with contact person. Attendance may be either limited or restricted.

For Your Information

U.S. Genome Research Funding
Investigators wishing to apply for funding are urged to discuss projects with agency staff before submitting proposals.

DOE Office of Biological and Environmental Research Human Genome Program
• Funding information, inquiries: [email protected] or 301/903-6488
• Relevant documents: www.er.doe.gov/production/ober/hug_top.html

Alexander Hollaender Distinguished Postdoctoral Fellowships
Research opportunities in energy-related life, biomedical, and environmental sciences, including human and microbial genomes, global change, and supporting disciplines.
• Next deadline: January 2000
• Contact: Barbara Dorsey, Oak Ridge Institute for Science and Education (423/576-9975, Fax: /241-5220, dorseyb@orau.gov, www.orau.gov/ober/hollaend.htm)

Computational Molecular Biology Postdoctoral Fellowships
Topic: Support career transitions into computational molecular biology from other scientific fields. Funded by DOE and the Alfred P. Sloan Foundation to give young scientists an intensive 2-year postdoctoral opportunity in an appropriate molecular biology facility.
Contact: Christine Trance; Alfred P. Sloan Foundation; 630 Fifth Ave., Ste. 2550; New York, NY 10111 (212/649-1649, Fax: /757-5117, [email protected])

NIH National Human Genome Research Institute
• NHGRI program: 301/496-7531, Fax: /480-2770, www.nhgri.nih.gov/About_NHGRI
• Program announcements: www.nhgri.nih.gov/Grant_info
• ELSI: 301/402-4997

Small Business Innovation Research Grants
DOE and NIH invite small business firms (under 500 employees) to submit grant applications addressing the human genome topic. The two agencies also support the Small Business Technology Transfer (STTR) program to foster transfers between research institutions and small businesses.
Contacts:
• DOE SBIR/STTR Office: 301/903-1414 or -0569, Fax: -5488, [email protected]. SBIR applications due March 2, 1999. STTR due April 8, 1999. SBIR: sbir.er.doe.gov/sbir; STTR: sttr.er.doe.gov/sttr
• Bettie Graham (see contact, NHGRI). NIH SBIR due April 15, August 15, and December 15. STTR, April 1, August 1, and December 1
National SBIR/STTR conference: April 9–11, 1999, Washington, D.C. (www.zyn.com/sbir; [email protected]). For regional conferences, see Web site. ◊

DOE: Office of Science Grants and Contracts
www.er.doe.gov/production/grants/grants.html
Comprehensive ELSI Program Notice: Notice will be issued in early summer, 1999.
99-03: Environmental Meteorology Program: Vertical Transport and Mixing
99-06: Environmental Management Science Program: Research Related to Subsurface Contamination/Vadose Zone Issues
99-07: Financial Assistance Program, Energy Biosciences
99-08: Next-Generation Internet Research in Basic Technologies
99-09: Next-Generation Internet Applications, Network Technology, and Network Testbed Partnerships
99-10: Next-Generation Internet University Network Technology Testbeds
99-11: Fundamental Research in Carbon Management
99-14: Low-Dose Radiation Research Program ◊

NHGRI: National Research Service Award Fellowships
• Topic: To engage in research relevant to the Human Genome Project. Fellowships for postdoctoral, senior postdoctoral, and predoctoral minorities or persons with disabilities are available to U.S. citizens or permanent residents; research in ethical, legal, and social issues (ELSI) is not open to predoctoral students through this program.
• Applications for postdoctoral and senior postdoctoral due: December 5, April 5, and August 5.
• Applications for predoctoral minorities or persons with disabilities due: May 1 and November 15.
• Contacts: ELSI topics, Elizabeth Thomson (301/402-4997, [email protected]); all other topics, Bettie Graham (see NHGRI contact information in box) ◊

NCI: Technologies for Molecular Analysis
PAR-98-066
www.nih.gov/grants/guide/pa-files/PAR-98-066.html
The National Cancer Institute (NCI) invites small business applications for basic, clinical, and epidemiological research to develop novel technologies for molecular analysis of cancers and their host environment. Funding mechanisms will be SBIR, STTR, and NCI’s recently developed Phased Innovation Award (PAR-98-067).
• Letter of Intent due: March 5, 1999
• Application due: April 9, 1999
• Contact: Carol Dahl (301/496-1550, [email protected]) ◊

NIH: Network for Large-Scale Mouse Sequencing
RFA HG-99-001 (released 12/98)
www.nih.gov/grants/guide/rfa-files/RFA-HG-99-001.html
Topic: Establish a Mouse Genome Sequencing Network that will support mouse genome mapping and sequencing. Network goals are to generate necessary mapping resources and begin a working draft of the mouse genome DNA sequence. Applications are encouraged for pilot sequencing projects in new groups and the capacity expansion of existing sequencing centers.
• Letter of Intent due: March 1, 1999
• Application due: April 29, 1999
• Contacts: See contact, NHGRI. Sequencing: Jane Peterson ([email protected]); Mapping Resources: Bettie Graham ([email protected]) ◊

NHGRI: Genomic Technology Development
PAR-99-047
www.nhgri.nih.gov/Grant_info/Funding/Research/advtech.html
Topic: Advanced development of high-throughput methods, hardware, and software in support of genomic research. Initial emphasis will be on DNA sequencing technologies.
• Letter of Intent due: February 15 and July 15
• Application due: April 26 and September 21
• Contact: Jeffery Schloss (see contact, NHGRI) ◊

Informatics (from p. 21)

¶ Bioinformatics Guide
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins is intended to help the molecular biologist design and implement a successful sequence-analysis strategy using the overwhelming array of tools available, including Internet resources. Edited by Andreas D. Baxevanis (NIH NHGRI) and B.F. Francis Ouellette (then at NIH National Center for Biotechnology Information), the book is a collection of chapters on relevant topics from 16 contributors. Paperback, 362 pp. 1998.◊

Human Genome Management Information System Subscription/Document Request* (Vol. 10, Nos. 1–2)

Name______(First) (MI) (Last) Affiliation ______

Department/Division ______

Street/P.O. Box/Building ______

City/State/Zip Code ______

Country ______Area of Interest______

Phone ______Fax______

E-Mail Address (important to list if you have one) ______

*Please type, print carefully, or enclose a business card to ensure efficient shipping. To change name/address/affiliation or drop your subscription to Human Genome News, enclose your current HGN address label. Send to HGMIS address shown below and on p. 10.

Request for Print Subscription and Documents (all online documents listed at www.ornl.gov/hgmis/publicat/publications.html)

1. Human Genome News
   ___New Subscriber
   ___Change of Name/Affiliation/Address (circle all that apply)
   ___Drop Subscription
2. ___Human Genome Program Report, Parts 1 and 2 (www.ornl.gov/hgmis/publicat/97pr)
3. ___To Know Ourselves (www.ornl.gov/hgmis/tko/index.html)
4. ___Reprint of “What Can the New Gene Tests Tell Us?” by Denise Casey, HGMIS (www.ornl.gov/hgmis/publicat/judges/judgetoc.html), Judges’ Journal 36(3), Summer 1997

Documents Available Online Only
Primer on Molecular Genetics (www.ornl.gov/hgmis/publicat/primer/intro.html)
Five-Year Plan (www.ornl.gov/hgmis/hg5yp)
Medicine and the New Genetics (www.ornl.gov/hgmis/resource/medicine.html)

SELECTED ACRONYMS

ASBMB Am. Soc. for Biochem. and Mol. Biol.
ASHG Am. Soc. for Hum. Genet.
ASM Am. Soc. for Microbiology
BAC bacterial artificial chromosome
BIO Biotechnology Industry Organization
CHI Cambridge Healthtech Inst.
cM centimorgan
CME continuing medical education
CSHL Cold Spring Harbor Lab.
DOE Dept. of Energy
ELSI ethical, legal, and social issues
EST expressed sequence tag
FEBS Fed. of Eur. Biochem. Soc.
FISH fluorescence in situ hybridization
GRC Gordon Res. Conf.
HGMIS Human Genome Management Information System
HGP Human Genome Project or DOE Human Genome Program
IBC Intl. Business
I.M.A.G.E. Integrated Molecular Analysis of Gene Expression
INRIA French Natl. Inst. for Research in Computer Science and Control
kb kilobase
Mb megabase
NCBI Natl. Center for Biotechnology Information
NHGRI Natl. Human Genome Research Institute
NIH Natl. Institutes of Health
NRC Natl. Research Council
NSGC Natl. Soc. of Genetic Counselors
OBER Office of Biological and Environmental Research
PAC P1 artificial chromosome
PCR polymerase chain reaction
RFA Request for Applications
RECOMB Conference on Computational Molecular Biology
RH radiation hybrid
SNP single-nucleotide polymorphism
STC sequence tag connector
STS sequence tagged site
TIGR The Inst. for Genomic Res.
WWW World Wide Web
YAC yeast artificial chromosome

OAK RIDGE NATIONAL LABORATORY • MANAGED BY LOCKHEED MARTIN ENERGY RESEARCH CORP. • FOR THE U.S. DEPARTMENT OF ENERGY

Betty K. Mansfield
HGMIS
Oak Ridge National Laboratory
1060 Commerce Park, MS 6480
Oak Ridge, TN 37830, U.S.A.

Postmaster: Do Not Forward. Address Correction Requested. Return Postage Guaranteed.

BULK RATE U.S. POSTAGE PAID, OAK RIDGE, TN, PERMIT NO. 3