Genome Sequence of the Brown Norway Rat Yields Insights Into Mammalian Evolution

Total Page:16

File Type:pdf, Size:1020Kb

Genome Sequence of the Brown Norway Rat Yields Insights Into Mammalian Evolution articles Genome sequence of the Brown Norway rat yields insights into mammalian evolution Rat Genome Sequencing Project Consortium* *Lists of participants and affiliations appear at the end of the paper ........................................................................................................................................................................................................................... The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality ‘draft’ covering over 90% of the genome. The BN rat sequence is the third complete mammalian genome to be deciphered, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution. This first comprehensive analysis includes genes and proteins and their relation to human disease, repeated sequences, comparative genome-wide studies of mammalian orthologous chromosomal regions and rearrangement breakpoints, reconstruc- tion of ancestral karyotypes and the events leading to existing species, rates of variation, and lineage-specific and lineage- independent evolutionary events such as expansion of gene families, orthology relations and protein evolution. Darwin believed that “natural selection will always act very slowly, primate: (1) These rodent genomic changes include approximately often only at long intervals of time”1. The consequences of evolution 250 large rearrangements between a hypothetical murid ancestor over timescales of approximately 1,000 millions of years (Myr) and and human, approximately 50 from the murid ancestor to rat, and 75 Myr were investigated in publications comparing the human about the same from the murid ancestor to mouse. (2) A threefold- with invertebrate and mouse genomes, respectively2,3. Here we higher rate of base substitution in neutral DNA is found along the describe changes in mammalian genomes that occurred in a shorter rodent lineage when compared with the human lineage, with the time interval, approximately 12–24 Myr (refs 4, 5) since the com- rate on the rat branch 5–10% higher than along the mouse branch. mon ancestor of rat and mouse. (3) Microdeletions occur at an approximately twofold-higher rate The comparison of these genomes has produced a number of than microinsertions in both rat and mouse branches. insights: † A strong correlation exists between local rates of microinsertions † The rat genome (2.75 gigabases, Gb) is smaller than the human and microdeletions, transposable element insertion, and nucleotide (2.9 Gb) but appears larger than the mouse (initially 2.5 Gb (ref. 3) substitutions since divergence of rat and mouse, even though these but given as 2.6 Gb in NCBI build 32, see http://www.ncbi.nlm. events occurred independently in the two lineages. nih.gov/genome/seq/NCBIContigInfo.html). † The rat, mouse and human genomes encode similar numbers of Background genes. The majority have persisted without deletion or duplication since the last common ancestor. Intronic structures are well History of the rat conserved. The rat, hated and loved at once, is both scourge and servant to † Some genes found in rat, but not mouse, arose through expansion mankind. The “Devil’s Lapdog” is the first sign in the Chinese of gene families. These include genes producing pheromones, or zodiac and traditionally carries the Hindu god Ganesh6. Rats are a involved in immunity, chemosensation, detoxification or reservoir of pathogens, known to carry over 70 diseases. They are proteolysis. involved in the transmission of infectious diseases to man, including † Almost all human genes known to be associated with disease have cholera, bubonic plague, typhus, leptospirosis, cowpox and hanta- orthologues in the rat genome but their rates of synonymous virus infections. The rat remains a major pest, contributing to substitution are significantly different from the remaining genes. famine with other rodents by eating around one-fifth of the world’s † About 3% of the rat genome is in large segmental duplications, a food harvest. fraction intermediate between mouse (1–2%) and human (5–6%). Paradoxically, the rat’s contribution to human health cannot be These occur predominantly in pericentromeric regions. Recent overestimated, from testing new drugs, to understanding essential expansions of major gene families are due to these genomic nutrients, to increasing knowledge of the pathobiology of human duplications. disease. In many parts of the world the rat remains a source of † The eutherian core of the rat genome—that is, bases that align meat. orthologously to mouse and human—comprises a billion nucleo- The laboratory rat (R. norvegicus) originated in central Asia and tides (,40% of the euchromatic rat genome) and contains the vast its success at spreading throughout the world can be directly majority of exons and known regulatory elements (1–2% of the attributed to its relationship with humans7.J.Berkenhout,in genome). A portion of this core constituting 5–6% of the genome his 1769 treatise Outline of the Natural History of Great Britain, appears to be under selective constraint in rodents and primates, mistakenly took it to be from Norway and used R. norvegicus while the remainder appears to be evolving neutrally. Berkenhout in the first formal Linnaean description of the species. † Approximately 30% of the rat genome aligns only with mouse, a Whereas the black rat (Rattus rattus) was part of the European considerable portion of which is rodent-specific repeats. Of the landscape from at least the third century AD and is the species non-aligning portion, at least half is rat-specific repeats. associated with the spread of bubonic plague, R. norvegicus probably † More genomic changes occurred in the rodent lineages than the originated in northern China and migrated to Europe somewhere 493 NATURE | VOL 428 | 1 APRIL 2004 | www.nature.com/nature © 2004 Nature Publishing Group articles around the eighteenth century8. They may have entered Europe BCM-HGSC, analysis was performed by an international team, after an earthquake in 1727 by swimming the Volga river. representing over 20 groups in six countries and relying largely on gene and protein predictions produced by Ensembl. The rat in research R. norvegicus was the first mammalian species to be domesticated Determination of the genome sequence for scientific research, with work dating to before 1828 (ref. 9). The first recorded breeding colony for rats was established in 1856 Atlas and the ‘combined’ sequencing strategy (ref. 9). Rat genetics had a surprisingly early start. The first studies Despite progress in assembling draft sequences2,3,39–44 the question by Crampe from 1877 to 1885 focused on the inheritance of coat of which method produces the highest-quality products is unre- colour10. Following the rediscovery of Mendel’s laws at the turn of solved. A significant issue is the choice between logistically simpler the century, Bateson used these concepts in 1903 to demonstrate WGS approaches versus more complex strategies employing bac- that rat coat colour is a mendelian trait10. The first inbred rat terial artificial chromosome (BAC) clones45–48. In the Public Human strain, PA, was established by King in 1909, the same year that Genome Project2 a BAC by BAC hierarchical approach was used and systematic inbreeding began for the mouse10. Despite this, the provided advantages in assembling difficult parts of the genome. mouse became the dominant model for mammalian geneticists, The draft mouse sequence was a pure WGS approach using the while the rat became the model of choice for physiologists, nutri- ARACHNE assembler3,49,50 but underrepresented duplicated tionists and other biomedical researchers. Nevertheless, there are regions owing to ‘collapses’ in the assembly3,51–53. This limitation over 234 inbred strains of R. norvegicus developed by selective of the mouse draft sequence was tolerable owing to the planned full breeding, which ‘fixes’ natural disease alleles in particular strains use of BAC clones in constructing the final finished sequence. or colonies11. The RGSPC opted to develop a ‘combined’ approach using both Over the past century, the role of the rat in medicine has WGS and BAC sequencing (Fig. 1). In the combined approach, transformed from carrier of contagious diseases to indispensable WGS data are progressively melded with light sequence coverage of tool in experimental medicine and drug development. Current individual BACs (BAC skims) to yield intermediate products called examples of use of the rat in human medical research include ‘enriched BACs’ (eBACs). eBACs covering the whole genome are surgery12, transplantation13–15, cancer16,17, diabetes18,19, psychiatric then joined into longer structures (bactigs). Bactigs are joined to disorders20 including behavioural intervention21 and addiction22, form larger structures: superbactigs, then ultrabactigs. During this neural regeneration23,24,wound25,26 and bone healing27,space process other data are introduced, including BAC end sequences, motion sickness28, and cardiovascular disease29–31. In drug develop- DNA fingerprints and other long-range information (genetic mar- ment, the rat is routinely employed both to demonstrate therapeutic kers, syntenic information), but the process is constrained by eBAC efficacy15,32,33 and to assess toxicity of novel therapeutic compounds structures. before
Recommended publications
  • The Rat Genome Database at 20: a Multi-Species Knowledgebase and Analysis Platform Jennifer R
    Published online 12 November 2019 Nucleic Acids Research, 2020, Vol. 48, Database issue D731–D742 doi: 10.1093/nar/gkz1041 The Year of the Rat: The Rat Genome Database at 20: a multi-species knowledgebase and analysis platform Jennifer R. Smith 1,*, G. Thomas Hayman1, Shur-Jen Wang1, Stanley J.F. Laulederkind 1, Matthew J. Hoffman1,2, Mary L. Kaldunski 1, Monika Tutaj1, Jyothi Thota1,Harika S. Nalabolu1, Santoshi L.R. Ellanki1, Marek A. Tutaj1, Jeffrey L. De Pons1, Anne E. Kwitek2, Melinda R. Dwinell2 and Mary E. Shimoyama1 1Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA and 2Genomic Sciences and Precision Medicine Center and Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA Received September 15, 2019; Revised October 21, 2019; Editorial Decision October 22, 2019; Accepted October 24, 2019 ABSTRACT In contrast to the current wealth of data and tools, the ‘bare bones’ first iteration (Figure 1,Table1) focused on just a few Formed in late 1999, the Rat Genome Database (RGD, data types––most notably a handful of known genes, plus https://rgd.mcw.edu) will be 20 in 2020, the Year of the EST and SSLP markers––localized only on genetic and RH Rat. Because the laboratory rat, Rattus norvegicus, maps, and offered a minimal number of tools for searching has been used as a model for complex human dis- and analyzing that data. eases such as cardiovascular disease, diabetes, can- Even before the first public release of the rat reference cer, neurological disorders and arthritis, among oth- genome in 2002 (1), RGD was focused on mapping disease- ers, for >150 years, RGD has always been disease- related regions and on comparative genomics between rat, focused and committed to providing data and tools human and mouse to support the use of rat as a model for for researchers doing comparative genomics and human disease.
    [Show full text]
  • Biocuration 2016 - Posters
    Biocuration 2016 - Posters Source: http://www.sib.swiss/events/biocuration2016/posters 1 RAM: A standards-based database for extracting and analyzing disease-specified concepts from the multitude of biomedical resources Jinmeng Jia and Tieliu Shi Each year, millions of people around world suffer from the consequence of the misdiagnosis and ineffective treatment of various disease, especially those intractable diseases and rare diseases. Integration of various data related to human diseases help us not only for identifying drug targets, connecting genetic variations of phenotypes and understanding molecular pathways relevant to novel treatment, but also for coupling clinical care and biomedical researches. To this end, we built the Rare disease Annotation & Medicine (RAM) standards-based database which can provide reference to map and extract disease-specified information from multitude of biomedical resources such as free text articles in MEDLINE and Electronic Medical Records (EMRs). RAM integrates disease-specified concepts from ICD-9, ICD-10, SNOMED-CT and MeSH (http://www.nlm.nih.gov/mesh/MBrowser.html) extracted from the Unified Medical Language System (UMLS) based on the UMLS Concept Unique Identifiers for each Disease Term. We also integrated phenotypes from OMIM for each disease term, which link underlying mechanisms and clinical observation. Moreover, we used disease-manifestation (D-M) pairs from existing biomedical ontologies as prior knowledge to automatically recognize D-M-specific syntactic patterns from full text articles in MEDLINE. Considering that most of the record-based disease information in public databases are textual format, we extracted disease terms and their related biomedical descriptive phrases from Online Mendelian Inheritance in Man (OMIM), National Organization for Rare Disorders (NORD) and Orphanet using UMLS Thesaurus.
    [Show full text]
  • The Candida Genome Database: the New Homology Information Page Highlights Protein Similarity and Phylogeny Jonathan Binkley, Martha B
    Published online 31 October 2013 Nucleic Acids Research, 2014, Vol. 42, Database issue D711–D716 doi:10.1093/nar/gkt1046 The Candida Genome Database: The new homology information page highlights protein similarity and phylogeny Jonathan Binkley, Martha B. Arnaud*, Diane O. Inglis, Marek S. Skrzypek, Prachi Shah, Farrell Wymore, Gail Binkley, Stuart R. Miyasato, Matt Simison and Gavin Sherlock Department of Genetics, Stanford University Medical School, Stanford, CA 94305-5120, USA Received September 12, 2013; Revised October 9, 2013; Accepted October 10, 2013 ABSTRACT mammalian hosts as well as a pathogen that causes painful opportunistic mucosal infections in otherwise The Candida Genome Database (CGD, http://www. healthy individuals and causes severe and deadly blood- candidagenome.org/) is a freely available online stream infections in the susceptible severely ill and/or im- resource that provides gene, protein and sequence munocompromised patient population (2). This fungus information for multiple Candida species, along with exhibits a number of properties associated with the web-based tools for accessing, analyzing and ability to invade host tissue, to resist the effects of exploring these data. The goal of CGD is to facilitate antifungal therapeutic drugs and the human immune and accelerate research into Candida pathogenesis system and to alternately cause disease or coexist with and biology. The CGD Web site is organized around the host as a commensal, including the ability to grow in Locus pages, which display information collected multiple morphological forms and to switch between about individual genes. Locus pages have multiple them, and the ability to grow as drug-resistant biofilms (3–7).
    [Show full text]
  • 1 Proposal for Discovering Snps from Eight Commonly Used Inbred Rat
    Proposal for Discovering SNPs from Eight Commonly Used Inbred Rat Strains Tim Aitman1, Richard Gibbs2, George Weinstock2, Norbert Huebner3, Michael Jensen- Seaman4, Daniel Maloney5 and Howard J. Jacob5 1Physiological Genomics and Medicine Group, MRC Clinical Sciences Centre, Imperial College Faculty of Medicine, Hammersmith Hospital, Ducane Road, London W12 0NN, UK. 2Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA. 3Max-Delbruck-Center for Molecular Medicine (MDC), Robert-Rossle-Str. 10, 13092 Berlin-Buch, Germany. 4Dept. of Biological Sciences, Mellon Hall, Duquesne University, 600 Forbes Ave., Pittsburgh PA 15282. 5Human and Molecular Genetics Center, Medical College of Wisconsin, 8710 Watertown Plank Road, Milwaukee WI 53236. For Correspondence: Howard Jacob, Ph. D. Director, Human and Molecular Genetics Center and Warren Knowles Professor Department of Physiology Medical College of Wisconsin 8710 Watertown Plank Road Milwaukee WI 53236 [email protected] Ph. (414) 456-4887 Fax (414) 456-6516 1 Table of Contents Importance of the organism 3 Research community 4 Current SNP resources and discovery efforts 5 A SNP HapMap 6 Required number of SNPs 6 Strain Selection 7 Literature cited 11 2 Importance of the organism: The importance of the laboratory rat in biomedical research is well established. Since 1966, there have been on average over 28,000 publications per year using rat (PubMed search, key word: rat); in the last eight years (1996-2003) there have been on average almost 37,000 publications annually. The initiation of the rat genome project has yielded a tremendous wealth of genomic resources including genetic maps; radiation hybrid (RH) cell lines and the associated RH maps (over 6,000 genetic markers and 16,000 genes and ESTs mapped); cDNA libraries generating more than 593,880 ESTs (with more being generated) clustered into over 63,000 UniGenes; over 10,033 genetic markers; and a published draft (~6.8 X) sequence of the genome based on the inbred BN (Brown Norway) strain.
    [Show full text]
  • Integrated Data and Tools for the Unification of Model Organism SUBJECT AREAS: DATA MINING Research DATABASES Julie Sullivan1,2, Kalpana Karra3, Sierra A
    InterMOD: integrated data and tools for the unification of model organism SUBJECT AREAS: DATA MINING research DATABASES Julie Sullivan1,2, Kalpana Karra3, Sierra A. T. Moxon6, Andrew Vallejos7, Howie Motenko5, J. D. Wong4, SOFTWARE Jelena Aleksic1,2, Rama Balakrishnan3, Gail Binkley3, Todd Harris4, Benjamin Hitz3, Pushkala Jayaraman8, COMPUTATIONAL PLATFORMS 1,2 5 6 1,2 4 3 AND ENVIRONMENTS Rachel Lyne , Steven Neuhauser , Christian Pich , Richard N. Smith , Quang Trinh , J. Michael Cherry , Joel Richardson5, Lincoln Stein4, Simon Twigger8, Monte Westerfield6,9, Elizabeth Worthey8 & Gos Micklem1,2 Received 14 November 2012 1Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom, 2Department of Genetics, Accepted University of Cambridge, Cambridge CB2 3EH, United Kingdom, 3Department of Genetics, Stanford University, Stanford, CA, 5 April 2013 94305, USA, 4Ontario Institute for Cancer Research, Toronto, ON, M5G0A3, Canada, 5The Jackson Laboratory, Bar Harbor, 6 7 Published Maine, 04609, USA, ZFIN, University of Oregon, Eugene, OR, 97405, USA, Biotechnology and Bioengineering Center, Medical College of Wisconsin, Milwaukee, WI, 53226, USA, 8Human and Molecular Genetics Center, Medical College of Wisconsin, 8 May 2013 Milwaukee, WI, 53226, USA, 9Institute of Neuroscience, University of Oregon, Eugene, OR, 97405, USA. Model organisms are widely used for understanding basic biology, and have significantly contributed to the Correspondence and study of human disease. In recent years, genomic analysis has provided extensive evidence of widespread requests for materials conservation of gene sequence and function amongst eukaryotes, allowing insights from model organisms should be addressed to to help decipher gene function in a wider range of species. The InterMOD consortium is developing an G.M.
    [Show full text]
  • The Disgenet Knowledge Platform for Disease
    Published online 4 November 2019 Nucleic Acids Research, 2020, Vol. 48, Database issue D845–D855 doi: 10.1093/nar/gkz1021 The DisGeNET knowledge platform for disease genomics: 2019 update Janet Pinero˜ , Juan Manuel Ram´ırez-Anguita, Josep Sauch-Pitarch,¨ Francesco Ronzano, Emilio Centeno, Ferran Sanz and Laura I. Furlong * Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Pompeu Fabra University (UPF), Barcelona, Spain Downloaded from https://academic.oup.com/nar/article-abstract/48/D1/D845/5611674 by guest on 20 April 2020 Received September 14, 2019; Revised October 14, 2019; Editorial Decision October 15, 2019; Accepted October 18, 2019 ABSTRACT sults of genomic analysis and the identification of variant of clinical relevance remain a significant challenge (1). Vari- One of the most pressing challenges in genomic ant assessment still involves manual exploration of multi- medicine is to understand the role played by ge- ple sources of data, which requires a significant amount of netic variation in health and disease. Thanks to the time and experts in the domain. In this context, new bioin- exploration of genomic variants at large scale, hun- formatic tools and resources that enable the automation of dreds of thousands of disease-associated loci have every possible step in this process are crucial. In this regard, been uncovered. However, the identification of vari- resources such as ClinVar (2), ClinGen (3), the Genomics ants of clinical relevance is a significant challenge England PanelApp (https://panelapp.genomicsengland.co. that requires comprehensive interrogation of previ- uk/), Orphanet (4) and OMIM (5), among others, have ous knowledge and linkage to new experimental re- demonstrated their utility to support variant interpretation.
    [Show full text]
  • Maps for the Rat Genome Integrated and Sequence-Ordered
    Downloaded from genome.cshlp.org on April 2, 2014 - Published by Cold Spring Harbor Laboratory Press View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by MDC Repository Integrated and Sequence-Ordered BAC- and YAC-Based Physical Maps for the Rat Genome Martin Krzywinski, John Wallis, Claudia Gösele, et al. Genome Res. 2004 14: 766-779 Access the most recent version at doi:10.1101/gr.2336604 Supplemental http://genome.cshlp.org/content/suppl/2004/03/18/14.4.766.DC1.html Material References This article cites 50 articles, 18 of which can be accessed free at: http://genome.cshlp.org/content/14/4/766.full.html#ref-list-1 Creative This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the Commons first six months after the full-issue publication date (see License http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/. Email Alerting Receive free email alerts when new articles cite this article - sign up in the box at the Service top right corner of the article or click here. To subscribe to Genome Research go to: http://genome.cshlp.org/subscriptions Cold Spring Harbor Laboratory Press Downloaded from genome.cshlp.org on April 2, 2014 - Published by Cold Spring Harbor Laboratory Press Resource Integrated and Sequence-Ordered BAC- and YAC-Based Physical Maps for the Rat Genome Martin Krzywinski,1 John Wallis,2
    [Show full text]
  • Genomic Landscape of Rat Strain and Substrain Variation Hermsen Et Al
    Genomic landscape of rat strain and substrain variation Hermsen et al. Hermsen et al. BMC Genomics (2015) 16:357 DOI 10.1186/s12864-015-1594-1 Hermsen et al. BMC Genomics (2015) 16:357 DOI 10.1186/s12864-015-1594-1 RESEARCH ARTICLE Open Access Genomic landscape of rat strain and substrain variation Roel Hermsen1, Joep de Ligt1, Wim Spee1, Francis Blokzijl1, Sebastian Schäfer2, Eleonora Adami2, Sander Boymans1, Stephen Flink3, Ruben van Boxtel1, Robin H van der Weide1, Tim Aitman4, Norbert Hübner2, Marieke Simonis1, Boris Tabakoff3, Victor Guryev5 and Edwin Cuppen1* Abstract Background: Since the completion of the rat reference genome in 2003, whole-genome sequencing data from more than 40 rat strains have become available. These data represent the broad range of strains that are used in rat research including commonly used substrains. Currently, this wealth of information cannot be used to its full extent, because the variety of different variant calling algorithms employed by different groups impairs comparison between strains. In addition, all rat whole genome sequencing studies to date used an outdated reference genome for analysis (RGSC3.4 released in 2004). Results: Here we present a comprehensive, multi-sample and uniformly called set of genetic variants in 40 rat strains, including 19 substrains. We reanalyzed all primary data using a recent version of the rat reference assembly (RGSC5.0 released in 2012) and identified over 12 million genomic variants (SNVs, indels and structural variants) among the 40 strains. 28,318 SNVs are specific to individual substrains, which may be explained by introgression from other unsequenced strains and ongoing evolution by genetic drift.
    [Show full text]
  • The Genome Sequence of the Norway Rat, Rattus Norvegicus Berkenhout
    Wellcome Open Research 2021, 6:118 Last updated: 28 SEP 2021 DATA NOTE The genome sequence of the Norway rat, Rattus norvegicus Berkenhout 1769 [version 1; peer review: 1 approved] Kerstin Howe 1, Melinda Dwinell2, Mary Shimoyama2+, Craig Corton1, Emma Betteridge1, Alexander Dove1, Michael A. Quail1, Michelle Smith1, Laura Saba 3, Robert W. Williams 4, Hao Chen5, Anne E. Kwitek 2, Shane A. McCarthy1,6, Marcela Uliano-Silva1, William Chow1, Alan Tracey 1, James Torrance 1, Ying Sims1, Richard Challis 1, Jonathan Threlfall 1, Mark Blaxter 1 1Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK 2Medical College of Wisconsin, Milwaukee, Wisconsin, 53226, USA 3Skaggs School of Pharmacy and Pharmaceutical Sciences,, University of Colorado Anschutz Medical Center, Aurora, Colorado, 80045, USA 4Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, 38103, USA 5Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Memphis, Tennessee, 38103, USA 6Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK + Deceased author v1 First published: 18 May 2021, 6:118 Open Peer Review https://doi.org/10.12688/wellcomeopenres.16854.1 Latest published: 18 May 2021, 6:118 https://doi.org/10.12688/wellcomeopenres.16854.1 Reviewer Status Invited Reviewers Abstract We present a genome assembly from an individual male Rattus 1 norvegicus (the Norway rat; Chordata; Mammalia; Rodentia; Muridae). The genome sequence is 2.44 gigabases in span. The majority of the version 1 assembly is scaffolded into 20 chromosomal pseudomolecules, with 18 May 2021 report both X and Y sex chromosomes assembled.
    [Show full text]
  • What's New in Flybase
    What’s new in FlyBase (in its 25th year) Steven Marygold • JBrowse • GAL4 search • Gene Snapshots • New weBsite: FlyBase 2.0 • Gene Groups • Gene2Function • Orthologs • Alliance of Genome Resources • Human Disease Models Outline 1. Recent additions to FlyBase 2. New features in FlyBase 2.0 3. Multi-species dataBases 4. Further information/feedBack Outline 1. Recent additions to FlyBase 2. New features in FlyBase 2.0 3. Multi-species dataBases 4. Further information/feedBack JBrowse Gene Snapshots Epidermal growth factor receptor (Egfr) is the transmembrane tyrosine kinase receptor for signaling ligands in the TGFα family (grk, spi, vn, and Krn), which utilises the intracellular MAP kinase pathway. Egfr roles include growth regulation, cell survival and developmental patterning. Gene Groups • gene products sharing molecular function (kinases, tRNAs…) • gene families (actins, odorant receptors…) • suBunits of complexes (riBosome, spliceosome…) Gene Group Reports Manually written description of group, with notes List of members, with export to: 1. Hit-list 2. Batch Download 3. Orthologs tool Links to external resources Orthology search tool Orthology search tool Orthology search results Human Disease Model data Human Disease Model Reports Manually written summary of disease and model Links to related diseases/fly models List of fly stocks used in model ‘GAL4 etc.’ search tool ‘GAL4 etc.’ search results Community Interactions Outline 1. Recent additions to FlyBase 2. New features in FlyBase 2.0 3. Multi-species dataBases 4. Further information/feedBack New features include: • New/Improved tools – New hit-list management – Sequence Downloader – Revised Jump-to-Gene/Search Box • Enhanced Report pages – Navigation panel – GO summary riBBons (Gene Report) – Protein domain graphics (Gene/Polypeptide Report) – Reference filtering • Mobile device friendly New hit-lists New hit-lists New hit-lists New hit-lists New hit-lists New hit-lists Revised Gene Reports Revised Gene Reports Revised Gene Reports Revised Gene Reports Revised Jump to Gene/Search Outline 1.
    [Show full text]
  • Fall 2009 Or Phenotype Data • Transcripts • Pdfs Please
    ZFIN NEWS In This Issue: The Zebrafish Information Network • Community Wiki http://zfin.org • Expand Your View of Gene Function: The Reference Genome Project • Submitting Your Unpublished Expression Volume 6, Number 2 Fall 2009 or Phenotype Data • Transcripts • PDFs Please Community Wiki We are very pleased to introduce the new ZFIN Community Wiki (http://wiki.zfin.org). Wikis allow easy creation, editing, commenting, and searching of interlinked web pages. The Community Wiki will begin with two sections – Experimental Protocols and Antibodies – more sections may be added in the future. I. Experimental Protocols The Protocols Wiki is divided into sections that mirror chapters of The Zebrafish Book e.g. Cellular Methods, Genetic Methods, etc. The full contents of The Zebrafish Book 5th edition have been copied to the wiki. Researchers can submit new protocols to an appropriate chapter or add comments to existing protocols. II. Antibodies Each antibody page in the Antibody Wiki provides a concise summary of antibody properties including host organism, recognized gene products, labeled anatomy and suppliers. You can add comments with technical tips to antibody pages or link to full protocols in the Protocols Wiki. The Antibody Wiki has been pre-populated with all the antibodies in the ZFIN antibody database. New antibodies can be added by researchers as needed. Working in collaboration with a test group of researchers, we have customized the ZFIN Community Wiki to make it easy to use. We have chosen Confluence (http://www.atlassian.com/software/confluence/) for our wiki software, which has many attractive features including advanced search, RSS feeds, HTML/PDF/Word-format page export, and a fairly gentle learning curve for beginners.
    [Show full text]
  • Introduction to Genome Informatics
    GenomeGenome InformaticsInformatics Systems Biology and the Omics Cascade (Course 2143) Day 3, June 11th, 2008 Kiyoko F. Aoki-Kinoshita IntroductionIntroduction GenomeGenome informaticsinformatics coverscovers thethe computercomputer-- basedbased modelingmodeling andand datadata processingprocessing ofof genomegenome--basedbased data.data. ThisThis includesincludes databasesdatabases andand resourcesresources forfor genomicgenomic analysis.analysis. YouYou werewere introducedintroduced toto KEGGKEGG onon DayDay 2.2. SomeSome otherother usefuluseful databasesdatabases andand resourcesresources willwill bebe coveredcovered today.today. ButBut first!first! DataData formatsformats ItIt isis usuallyusually notnot enoughenough toto simplysimply looklook atat thethe datadata providedprovided byby databasesdatabases ToTo actuallyactually useuse thethe datadata forfor analysis,analysis, oneone oftenoften needsneeds toto savesave thethe retrievedretrieved datadata ThisThis requiresrequires knowledgeknowledge aboutabout thethe datadata formatsformats usedused byby eacheach databasedatabase SoSo wewe willwill covercover thethe majormajor datadata formatsformats usedused inin bioinformaticsbioinformatics DataData FormatsFormats MajorMajor datadata formats:formats: –– GenBankGenBank –– EMBLEMBL/UniProt/UniProt –– FASTAFASTA –– PDBPDB FormatsFormats suitedsuited forfor programming:programming: –– ASN.1ASN.1 (Abstract(Abstract SyntaxSyntax NotationNotation One)One) –– XMLXML (eXtensible(eXtensible MarkupMarkup Language)Language) GenBankGenBank formatformat
    [Show full text]