FlyBase at 25: Looking to the Future
Citation: Gramates, L. S., S. J. Marygold, G. dos Santos, J. Urbano, G. Antonazzo, B. B. Matthews, A. J. Rey, et al. 2017. “FlyBase at 25: looking to the future.” Nucleic Acids Research 45 (Database issue): D663-D671. doi:10.1093/nar/gkw1016. http://dx.doi.org/10.1093/nar/gkw1016.
Published Version: doi:10.1093/nar/gkw1016
Citable link: http://nrs.harvard.edu/urn-3:HUL.InstRepos:30370952
Terms of Use: This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA

Published online 30 October 2016. Nucleic Acids Research, 2017, Vol. 45, Database issue D663–D671. doi: 10.1093/nar/gkw1016

L. Sian Gramates1,*, Steven J. Marygold2, Gilberto dos Santos1, Jose-Maria Urbano2, Giulia Antonazzo2, Beverley B. Matthews1, Alix J. Rey2, Christopher J. Tabone1, Madeline A. Crosby1, David B. Emmert1, Kathleen Falls1, Joshua L. Goodman3, Yanhui Hu4, Laura Ponting2, Andrew J. Schroeder1, Victor B. Strelets3, Jim Thurmond3, Pinglei Zhou1 and the FlyBase Consortium†

1The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA, 2Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK, 3Department of Biology, Indiana University, Bloomington, IN 47405, USA and 4Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA

Received September 30, 2016; Revised October 14, 2016; Editorial Decision October 15, 2016; Accepted October 18, 2016

*To whom correspondence should be addressed. Tel: +1 617 495 9925; Fax: +1 617 495 1354; Email: [email protected]
†The members of the FlyBase Consortium are listed in the Acknowledgments.
Present addresses: Laura Ponting, Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK. Andrew J. Schroeder, Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA.

© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

ABSTRACT

Since 1992, FlyBase (flybase.org) has been an essential online resource for the Drosophila research community. Concentrating on the most extensively studied species, Drosophila melanogaster, FlyBase includes information on genes (molecular and genetic), transgenic constructs, phenotypes, genetic and physical interactions, and reagents such as stocks and cDNAs. Access to data is provided through a number of tools, reports, and bulk-data downloads. Looking to the future, FlyBase is expanding its focus to serve a broader scientific community. In this update, we describe new features, datasets, reagent collections, and data presentations that address this goal, including enhanced orthology data, Human Disease Model Reports, protein domain search and visualization, concise gene summaries, a portal for external resources, video tutorials and the FlyBase Community Advisory Group.

INTRODUCTION

Now in its 25th year, FlyBase continues to serve as the leading database and web portal for genetic, genomic, and functional information about the model organism Drosophila melanogaster (1,2). During this tenure, the primary literature has been and continues to be the major source of data in FlyBase (3). Over the last decade, there has been an additional focus on other types of data, with high-throughput data projects featuring prominently. Recent years have seen the completion of Release 6 of the Drosophila melanogaster reference genome (4,5); a complete reannotation of the reference genome on the basis of RNA-Seq coverage data, RNA-Seq junction data, and transcription start site profiles from the modENCODE project (6–10); and the generation of large public collections of insertion mutants by the Gene Disruption Project (11–14).

As biological knowledge continues to expand and new techniques are developed, FlyBase strives to provide our expanding user community with improved access to new data and data types. Recent years have seen the addition of Gene Group Reports (1), the design of multiple tools to access RNA-Seq data, improvements to FlyBase ontologies (15,16), the incorporation of the Disease Ontology (17,18), the addition of the user-curation tool Fast-Track Your Paper (19) and enhancements to our major search interface, QuickSearch.

In the past year, we have continued to add new types of data, features, tools and user-help options. Cooperation across the model organism communities has become an increasingly important issue, especially as it impacts research into human disease. In the interest of enhancing the accessibility of FlyBase to users outside our traditional community, we have incorporated new orthology data for Homo sapiens and eight model organisms, added a robust orthology search tool, and improved the display of orthology calls on FlyBase Gene Reports. We have also highlighted the utility of flies in human disease research through the addition of Human Disease Model Reports.

In addition, we have made improvements to existing data types and tools. Many FlyBase Gene Reports now include a Gene Snapshot, which is a short pithy overview of a gene’s function, written by an expert researcher in language friendly to the non-specialist. Our genome browser GBrowse now includes a track for Pfam protein domains, with domain mappings to multiple protein isoforms for genes with multiple transcripts (20). We have incorporated several new high-throughput datasets and two major reagent collections, the P[acman] collection of BAC clones and the Vienna Tiles (VT) GAL4 collection of fly stocks into GBrowse (21,22).

There is a wealth of resources housed in external websites that are vitally important to FlyBase users. We have updated, expanded and reorganized the Resources page where we maintain lists of hundreds of third party websites, adding a portal page for the most popular resource categories, and a prominent new access button on the FlyBase front page.

FlyBase has also been working to improve engagement with our user community. We are now producing video tutorials, to help our users navigate FlyBase and FlyBase tools. Finally, we have recruited a FlyBase Community Advisory Group, with representatives from fly labs in over 40 countries, which we periodically survey about potential changes and improvements to FlyBase.

Enhanced orthology data

FlyBase has recently incorporated orthology data from the DRSC Integrative Ortholog Prediction Tool (DIOPT) (23). This dataset, added in the FB2016_02 release of FlyBase, integrates ortholog predictions between several species (currently H. sapiens, D. melanogaster and seven other model organisms) from multiple individual tools (currently Compara, HGNC, Homologene, Inparanoid, Isobase, OMA, OrthoDB, orthoMCL, Panther, Phylome, RoundUp, TreeFam and ZFIN). The DIOPT approach thereby provides a streamlined comparison of orthology predictions originating from different algorithms based on sequence homology,

(Online Mendelian Inheritance in Man). For DIOPT-based searches, the number of individual algorithms supporting a given orthologous gene-pair relationship, compared to the total number of relevant algorithms for that call, is expressed as the ‘Score’; the explicit list of supporting algorithms is also shown. The ‘Best score’ and ‘Best Rev Score’ columns indicate whether the given ortholog has the highest score for the query gene, and whether the reciprocal relationship is also true. Links are also provided to an alignment of amino acid sequences between predicted orthologs on the DIOPT site, and to FlyBase Gene Reports where a non-Drosophila gene has been expressed transgenically in flies. If desired, the results table may be downloaded in TSV format by clicking on the link at the top of the table. The results page for OrthoDB-based searches is similar, except that the DIOPT-specific columns are absent and the OrthoDB ‘orthology group’ to which the species belongs is given.

A similar presentation is used to show DIOPT and OrthoDB-derived orthology data within the ‘Orthologs’ section of D. melanogaster Gene Reports (not shown). Orthology relationships between D. melanogaster and humans are highlighted in a separate ‘Human Orthologs’ subsection, which also includes links to the human gene reports and any associated diseases at the OMIM website (omim.org) (25). Furthermore, Gene Group Reports, which serve to describe and list sets of functionally related D. melanogaster genes or gene products (1), have been enhanced with a ‘View orthologs’ option: clicking this button automatically runs all listed genes through the QuickSearch Orthologs tool to generate the results table described above.

Human disease model reports

The Human Disease Model Report (Figure 2), first introduced in September 2015 (FB2015_04), is designed as an