FlyBase: enhancing Drosophila Gene Ontology annotations


Published online 23 October 2008. Nucleic Acids Research, 2009, Vol. 37, Database issue D555–D559. doi:10.1093/nar/gkn788

Susan Tweedie1,*, Michael Ashburner1, Kathleen Falls2, Paul Leyland1, Peter McQuilton1, Steven Marygold1, Gillian Millburn1, David Osumi-Sutherland1, Andrew Schroeder2, Ruth Seal1, Haiyan Zhang2 and The FlyBase Consortium†

1Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK and 2The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA

Received September 19, 2008; Revised October 8, 2008; Accepted October 9, 2008

*To whom correspondence should be addressed. Tel: +44 1223 333 963; Fax: +44 1223 333 992; Email: [email protected]

Present address: Ruth Seal, EBI, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK

†The FlyBase Consortium comprises: FlyBase-Harvard: W. Gelbart, L. Bitsoi, M. Crosby, A. Dirkmaat, D. Emmert, L. S. Gramates, K. Falls, R. Kulathinal, B. Matthews, M. Roark, S. Russo, A. Schroeder, S. St Pierre, H. Zhang, P. Zhou and M. Zytkovicz; FlyBase-Cambridge: M. Ashburner, N. Brown, P. Leyland, P. McQuilton, S. Marygold, G. Millburn, D. Osumi-Sutherland, R. Stefancsik, S. Tweedie and M. Williams; and FlyBase-Indiana: T. Kaufman, K. Matthews, J. Goodman, G. Grumbling, V. Strelets and R. Wilson.

© 2008 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

FlyBase (http://flybase.org) is a database of Drosophila genetic and genomic information. Gene Ontology (GO) terms are used to describe three attributes of wild-type gene products: their molecular function, the biological processes in which they play a role, and their subcellular location. This article describes recent changes to the FlyBase GO annotation strategy that are improving the quality of the GO annotation data. Many of these changes stem from our participation in the GO Reference Genome Annotation Project, a multi-database collaboration producing comprehensive GO annotation sets for 12 diverse species.

INTRODUCTION TO GENE ONTOLOGY ANNOTATION

What gene products do and where they do it are fundamental questions for biologists no matter what organism is being studied. The Gene Ontology (GO; www.geneontology.org) was established 10 years ago as a means of summarizing this information consistently across different databases by using a common set of defined controlled vocabulary terms. FlyBase was one of three founding members of the Gene Ontology Consortium [GOC; (1)] but the GO has since been adopted by many model organism databases, making comparison of gene function between diverse species feasible. The GO also encodes relationships between terms, which allows for efficient searching and computational reasoning. For example, a search for all gene products with ‘kinase activity’ will automatically gather those products labelled with the more specific types of kinase activity.

GO annotation comprises at least three components: a GO term that describes molecular function, biological role or subcellular location; an ‘evidence code’ that describes the type of analysis used to support the GO term (Table 1); and an attribution to a specific reference. There may also be supporting evidence for the choice of GO term in the form of database cross-references; for instance, a gene function may be ‘inferred from genetic interaction’, in which case an identifier for the interacting gene will be included. Qualifiers may also be included to modify the annotation; currently these are: ‘colocalizes_with’, ‘contributes_to’ and ‘NOT’. The ‘NOT’ qualifier negates the annotation. FlyBase uses ‘NOT’ sparingly, for cases where there is a prior expectation that a GO term should apply to a gene product but evidence exists to the contrary. Another annotation component, one that FlyBase has neglected until recently, is date. FlyBase now records an accurate date for each GO annotation reflecting when the annotation was originally made or last reviewed by a curator. GO annotations made prior to implementing the date component are dated 20060803. Examples of GO annotations are shown in Table 2.

GO annotation is useful for both small-scale and large-scale analyses. It can provide a first indication of the nature of a gene product and, in conjunction with evidence codes, point directly to papers with pertinent experimental data. However, it is particularly useful in the analysis of large data sets such as the output of a microarray experiment. For instance, the AmiGO GO Term Enrichment Tool (amigo.geneontology.org/cgi-bin/amigo/term_enrichment) will show significant shared GO terms or parents of those GO terms associated with a list of genes. Whatever the scale of use, for the best results it is important that GO annotation be as complete and accurate as possible. This article describes recent changes to GO annotation procedures in FlyBase that aim to improve the quality of our functional annotations.

GO ANNOTATION IN FLYBASE

GO annotations appear on the Gene Report page in FlyBase (Figure 1) and are also available as a downloadable file (gene_association.fb.gz in the genes section of the Precomputed Files page accessed from the Files menu on the FlyBase home page). This file is in the standard GO gene_association file format (www.geneontology.org/GO.format.annotation.shtml) and corresponds to the data that are submitted to the GOC for a given FlyBase release. GO data are searchable in FlyBase using both TermLink (retrieves data for all Drosophila species) and QueryBuilder (species and other criteria can be specified) (2).

Table 3 shows a summary of the current FlyBase GO annotations; 10 131 (72%) of the 14 029 Drosophila melanogaster protein-coding genes have at least one GO annotation and 9403 (67%) have annotations with specific (non-root) GO terms.

The GO is dynamic and its content can change on a daily basis. Most of these changes are the addition of new terms but other alterations, such as making a term obsolete, a term name change or a change to ontology structure, require us to revisit our existing annotations to choose alternative GO terms or check that the annotations are still valid. Rather than attempt to keep completely up-to-date with this moving target, FlyBase loads a new version of the GO every one or two releases of FlyBase. Both new and existing annotations are made consistent with this ‘frozen’ version for a given release. The frozen version of the GO, and all other ontologies used in FlyBase, are available in the Ontology Terms section of the Precomputed Files page.

The GO annotation set is submitted to the GOC at the same time as a new version of FlyBase is released. Users should be aware that there may be a few differences between the data downloadable from FlyBase and the file downloadable from the GO website. Differences arise because the submitted files are screened for errors, and lines that do not meet a series of quality controls are removed. For instance, any annotations with terms that have become obsolete since the ontology was frozen at FlyBase will be stripped out of the files available from the GO site.

GO CURATION PIPELINE

Most GO data in FlyBase are entered via our paper-by-paper literature curation process; curators read papers and choose the appropriate terms based on the data described. For gene products that have not yet been experimentally characterized, GO terms are assigned manually if there is significant sequence similarity to gene products of known function, using the ‘inferred from sequence similarity’ evidence code or one of the new more specific versions of this code (Table 1). When no specific term can be assigned from either published data or sequence homology, a gene is annotated

Table 1. Evidence codes used in GO annotation

Manually assigned evidence codes
  Experimental evidence codes
    Inferred from Direct Assay (IDA)
    Inferred from Physical Interaction (IPI)
    Inferred from Mutant Phenotype (IMP)
    Inferred from Genetic Interaction (IGI)
    Inferred from Expression Pattern (IEP)
  Computational analysis evidence codes
    Inferred from Sequence or Structural Similarity (ISS)
    Inferred from Sequence Orthology (ISO) [a]
    Inferred from Sequence Alignment (ISA) [a]
    Inferred from Sequence Model (ISM) [a]
    Inferred from Genomic Context (IGC)
    Inferred from Reviewed Computational Analysis (RCA)
  Author statement evidence codes
    Traceable Author Statement (TAS)
    Non-traceable Author Statement (NAS)
  Curatorial statement codes
    Inferred by Curator (IC)
    No biological Data available (ND)
Automatically assigned evidence codes
  Inferred from Electronic Annotation (IEA)

[a] Three new subcategories of the ISS evidence code used to assign GO terms based on sequence similarity. ISO is used when the similar sequences are considered to be orthologous. ISA is used where there is extensive sequence alignment but the sequences are not known to be orthologous. ISM is used when a sequence model has been generated from a set of related sequences, e.g. hidden Markov models for transmembrane regions. Full documentation of evidence codes together with how they are used in annotation can be found on the GO website (http://www.geneontology.org/GO.contents.doc.shtml).
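The annotation components described in the text (GO term, evidence code, reference, qualifier, supporting cross-reference and date) correspond to columns of the tab-delimited gene_association file. The following is a minimal Python sketch of how such lines might be read and searched, not FlyBase's own tooling. It assumes the 15-column layout of the GO gene_association format, which should be checked against the format documentation. All gene identifiers, GO identifiers, the reference identifier and the parent-to-child map are hypothetical placeholders invented for illustration. The search function mirrors two behaviours mentioned above: a query for a general term also gathers its more specific descendants, and annotations to obsolete terms are ignored.

```python
from dataclasses import dataclass

# Column order assumed from the GO gene_association format documentation;
# verify against the file you actually download.
COLUMNS = (
    "db", "db_object_id", "symbol", "qualifier", "go_id", "reference",
    "evidence", "with_from", "aspect", "name", "synonym", "object_type",
    "taxon", "date", "assigned_by",
)


@dataclass(frozen=True)
class Annotation:
    gene_id: str
    go_id: str
    evidence: str          # evidence code such as IDA, IMP, ISS (Table 1)
    reference: str
    qualifiers: frozenset  # e.g. {"NOT"}, {"contributes_to"} or empty
    with_from: str         # supporting cross-reference, may be empty
    date: str              # YYYYMMDD


def parse_gaf_line(line):
    """Return an Annotation, or None for comment ('!') and blank lines."""
    if not line.strip() or line.startswith("!"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    return Annotation(
        gene_id=row["db_object_id"],
        go_id=row["go_id"],
        evidence=row["evidence"],
        reference=row["reference"],
        qualifiers=frozenset(q for q in row["qualifier"].split("|") if q),
        with_from=row["with_from"],
        date=row["date"],
    )


def term_and_descendants(term, children):
    """Collect a term plus every more specific term beneath it."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen


def genes_with_term(annotations, term, children, obsolete=frozenset()):
    """Genes positively annotated to `term` or any of its descendants."""
    wanted = term_and_descendants(term, children) - set(obsolete)
    return {
        a.gene_id
        for a in annotations
        if a.go_id in wanted and "NOT" not in a.qualifiers
    }


def make_row(gene, qualifier, go_id, evidence):
    """Build one hypothetical 15-column line for the demonstration below."""
    return "\t".join([
        "FB", gene, gene.lower(), qualifier, go_id, "FB:FBrf_hypothetical",
        evidence, "", "F", "", "", "gene", "taxon:7227", "20060803", "FlyBase",
    ])


# Hypothetical ontology fragment: a general term and one more specific child.
CHILDREN = {"GO:hyp_kinase": ["GO:hyp_protein_kinase"]}

lines = [
    "!hypothetical header comment",
    make_row("FBgn_A", "", "GO:hyp_protein_kinase", "IDA"),
    make_row("FBgn_B", "NOT", "GO:hyp_kinase", "IMP"),
    make_row("FBgn_C", "", "GO:hyp_other", "ISS"),
]
annotations = [a for a in map(parse_gaf_line, lines) if a is not None]

print(genes_with_term(annotations, "GO:hyp_kinase", CHILDREN))  # {'FBgn_A'}
```

In this sketch FBgn_A is returned because its term is a descendant of the queried term, FBgn_B is excluded because its annotation carries the ‘NOT’ qualifier, and FBgn_C is excluded because its term is unrelated. A real analysis would build the parent-to-child map from the frozen ontology file rather than from a hand-written dictionary.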