DOE Joint Genome Institute Exceeds DNA Sequencing Goal U.S. HGP

Human Genome News
ISSN: 1050–6101, Issue No. 44, Vol. 10, Nos. 1–2, February 1999

Human Genome News issues become available on the Web soon after going to press. Archived issues are also available.

In This Issue

Genome Project
  U.S. HGP on Fast Track, p. 1
  DOE Joint Genome Institute, p. 1
  New 5-Year Goals, p. 3
  Faster Sequencing with BACs, p. 4
  SNP Markers, p. 6
  Who’s Sequencing the Human Genome?, p. 6
  1999 Oakland Workshop Proceedings, p. 6
  Genomics Progress in Science, p. 6
  Private-Sector Sequencing, p. 7
  GeneMap ’98, p. 7

In the News
  EMSL Remote Access, p. 7
  C. elegans Sequence, p. 8
  DOE BER Research, p. 9
  Hollaender Fellows, p. 9
  SBIR 1998 Awards, p. 9
  Mouse Resources, p. 10
  Mouse Consortium, p. 10
  Chlamydia Genome Analysis, p. 11
  HUGO Office Merger, p. 11

Microbial Genomics
  Superbug Deinococcus radiodurans, p. 12
  Microbial Genome News, p. 12

ELSI
  Eric Lander, HGP Science, p. 13
  Mark Rothstein, Genetic Privacy, p. 14
  James Wilson, Gene Therapy, p. 15
  LeRoy Walters, Gene Therapy, p. 16
  Radio and TV Programs, p. 17
  Courses and Curriculum, p. 17

Proteomics
  Looking at Proteins, p. 18
  Proteomics News, p. 19

Genetics in Medicine
  Organization for Rare Disorders, p. 19
  Genetics in Medicine News, pp. 15, 16, 19, 20

Informatics
  Software Programs, pp. 11, 20
  GDB and Other Databases, p. 21
  Publications, p. 21

Web, Other Resources, Publications
  1999 Oakland Workshop Web Site, p. 6
  Gene Logic License, p. 11
  Calendars, p. 22
  Funding, p. 23
  Subscriptions, Acronyms, p. 24

U.S. HGP on Fast Track for Early Completion

In September 1998, advisory committees at DOE and NIH approved new 5-year goals aimed at completing the Human Genome Project (HGP) 2 years earlier than originally planned in 1990. The target date of 2003 also will mark the 50th anniversary of Watson and Crick’s description of DNA’s fundamental structure.

The new plan was published in the October 23, 1998, issue of Science, which also cited the contributions of international partners. These partners include the Sanger Centre in the United Kingdom and research centers in Germany, Japan, and France.

The U.S. HGP began officially in 1990 as a $3-billion, 15-year program to find the estimated 80,000 human genes and determine the sequence of the 3 billion DNA building blocks that underlie all of human biology and its diversity. The early phase of the HGP was characterized by efforts to create the biological, instrumentation, and computing resources necessary for efficient production-scale DNA sequencing. The first 5-year plan was revised in 1993 due to remarkable technological progress, and the second plan projected goals through FY 1998. The latest plan was developed during a series of individual and joint DOE and NIH workshops held over the past 2 years (see box, p. 3).

Observers have predicted that the 21st century will be the “biology century.” The analytical power arising from the reference DNA sequences of several entire genomes and other genomic resources is anticipated to help jump start the new millennium.

To find out “Who’s Sequencing the Human Genome,” see p. 6.

Human DNA Sequencing

The HGP’s continued emphasis is on obtaining a complete and highly accurate reference sequence (1 error in 10,000 bases) that is largely continuous across each human chromosome. Scientists believe that knowing this sequence is critically important for understanding human biology and for applications to other fields.

The plan calls for generating a “working draft” of the human genome sequence by 2001. The working draft will comprise shotgun sequence data from mapped clones, with gaps and ambiguities unresolved. If these data sets can be merged with those from the private sector, they may increase the depth of the mapped draft, which scientists expect will contain about half the genes. Draft sequence will provide a foundation for obtaining the high-quality finished sequence and also will be a valuable tool for researchers hunting disease genes.

According to Ari Patrinos, DOE Associate Director for Biological and Environmental Research, “Although we have as our primary goal the finished ‘Book of Life’ by the end of 2003, we also want the working draft to be as useful as possible.”

NIH and DOE sequencing centers expect their facilities to generate about 60% to 70% of the human DNA sequence, which will be made available broadly and rapidly via the Web to stimulate further research.

Sequencing Technology

Although current sequencing capacity is far greater than at the inception of the HGP, achieving the new sequencing goals will require a two- to threefold improvement. Further incremental advances in sequencing technologies, efficiency, and cost will be needed. For future sequencing applications, planners emphasize the importance of supporting novel technologies that may be 5 to 10 years in development.

DNA Sequence Variation

A new goal focuses on identifying individual variations in the human genome. Although more than 99% of human DNA sequences are the same across the population, variations in DNA sequence can have a major impact on how humans respond to disease; environmental insults such as bacteria, viruses, toxins, and chemicals; and drugs and other therapies.

Methods are being developed to detect different types of variation, particularly the most common type called single-nucleotide polymorphisms (SNPs), which occur about once every 100 to 300 bases. Scientists believe SNP maps will help them identify the multiple genes associated with such complex diseases as cancer, diabetes, vascular disease, and some forms of mental illness. These associations are difficult to establish with conventional gene-hunting methods because a single altered gene may make only a small contribution to disease risk.

Functional Genomics

Efficient interpretation of the functions of human genes and other DNA [...] sets of full-length cDNA clones and sequences for human and model-organism genes. Other functional-genomics goals include studies into gene expression and control, creation of mutations that cause loss or alteration of function in nonhuman organisms, and development of experimental and computational methods for protein analyses.

Comparative Genomics

The functions of human genes and other DNA regions often are revealed by studying their parallels in nonhumans. To enable such comparisons, HGP researchers have obtained complete genomic sequences for the bacterium Escherichia coli, the yeast Saccharomyces cerevisiae, and the roundworm Caenorhabditis elegans. Sequencing continues on Drosophila melanogaster and the laboratory mouse. The availability of complete genome sequences generated both inside and outside the HGP is driving a major breakthrough in fundamental biology as scientists compare entire genomes to gain new insights into evolutionary, biochemical, genetic, metabolic, and physiological pathways. HGP planners stress the need for a sustainable sequencing capacity to facilitate future comparisons.

Ethical, Legal, and Social Implications (ELSI)

Rapid advances in the science of genetics and its applications present new and complex ethical and policy issues for individuals and society. ELSI programs that identify and address these implications have been an integral part of the U.S. HGP since its inception. These programs have resulted in a body of work that promotes education and helps guide the conduct of genetic research and the development of related medical and public policies.

A continuing challenge is to safeguard the privacy of individuals and groups who contribute DNA samples for large-scale sequence-variation studies.

DOE Joint Genome Institute Exceeds DNA Sequencing Goal

The DOE Joint Genome Institute (JGI) surpassed its sequencing goal of 20 Mb of human DNA for FY 1998, marking almost a tenfold increase in production over the previous year.

“With this milestone, JGI rises to third position worldwide in terms of its total contribution of human DNA sequence to public databases and signals great promise for completion of the entire [Human Genome] project in 5 years,” noted Martha Krebs, Director of the DOE Office of Science.

Further dramatic increases are expected as JGI’s main sequencing efforts move to its new facility in Walnut Creek, California, half of which has just been occupied. When the second half is completed in March 2000, about 200 staff members will maintain around-the-clock operations.

JGI’s sequencing goal for the current fiscal year is 70 Mb, including 30 million “finished” bases and 40 million “draft” bases. [“Finished” sequence has been checked for accuracy, with gaps filled in to form a continuous stretch of DNA across a chromosomal region.] The JGI sequencing effort is targeting chromosomes 5, 16, and 19.

“We are seeking to break the 100-Mb barrier in the year 2000,” said JGI Director Elbert Branscomb. With 1998 worldwide sequencing capacity at about 200 Mb per year, all major sequencing laboratories are ramping up production. At least 600 Mb of sequence is expected for 1999.

JGI, established at the end of 1996, is a consortium of scientists, engineers, and support staff from the Lawrence Berkeley, Lawrence Livermore, and Los Alamos national laboratories [HGN 8(2), 1; www.ornl.gov/hgmis/publicat/hgn/v8n2/01doe.html; JGI sequencing goals and progress: www.jgi.doe.gov].◊

Resource Success Story: A Critical Resource in Public, Private Sequencing

For sequencing the human genome, scientists prefer the larger and more stable BAC clones first developed with DOE support. BACs will be a critical component of the much-publicized private-sector sequencing efforts of such companies as Celera Genomics and Incyte [see p. 7 and HGN 9(1–2), 1 (www.ornl.gov/hgmis/publicat/hgn/v9n3/ [...]