Flybase: Updates to the Drosophila Melanogaster Knowledge Base Aoife Larkin 1,*, Steven J

Published online 21 November 2020 Nucleic Acids Research, 2021, Vol. 49, Database issue D899–D907 doi: 10.1093/nar/gkaa1026 FlyBase: updates to the Drosophila melanogaster knowledge base Aoife Larkin 1,*, Steven J. Marygold 1, Giulia Antonazzo1, Helen Attrill 1, Gilberto dos Santos2, Phani V. Garapati1, Joshua L. Goodman3,L.SianGramates 2, Gillian Millburn1, Victor B. Strelets3, Christopher J. Tabone2, Jim Thurmond3 and FlyBase Consortium† 1 Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge Downloaded from https://academic.oup.com/nar/article/49/D1/D899/5997437 by guest on 15 January 2021 CB2 3DY, UK, 2The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA and 3Department of Biology, Indiana University, Bloomington, IN 47405, USA Received September 15, 2020; Revised October 13, 2020; Editorial Decision October 14, 2020; Accepted October 22, 2020 ABSTRACT In this paper, we highlight a variety of new features and improvements that have been integrated into FlyBase since FlyBase (ﬂybase.org) is an essential online database our last review (3). We have introduced a number of new fea- for researchers using Drosophila melanogaster as a tures that allow users to more easily find and use the data in model organism, facilitating access to a diverse ar- which they are interested; these include customizable tables, ray of information that includes genetic, molecular, improved searching using expanded Experimental Tool Re- genomic and reagent resources. Here, we describe ports, more links to and from other resources, and bet- the introduction of several new features at FlyBase, ter provision of precomputed bulk files. We have upgraded including Pathway Reports, paralog information, dis- tools such as Fast-Track Your Paper and ID validator. We ease models based on orthology, customizable ta- have also introduced more high-level groupings to improve bles within reports and overview displays (‘ribbons’) data accessibility; these include our curated Pathway Re- of expression and disease data. We also describe a ports that facilitate understanding and exploration of a number of major signaling pathways, as well as new sum- variety of recent important updates, including incor- mary ‘ribbons’ representing expression and disease model poration of a developmental proteome, upgrades to data. In order to further promote and simplify user-led in- the GAL4 search tab, additional Experimental Tool vestigation of genes, we have integrated new paralog infor- Reports, migration to JBrowse for genome browsing mation, reviewed and revised functional data for enzymes and improvements to batch queries/downloads and and introduced a novel orthology-based pipeline to iden- the Fast-Track Your Paper tool. tify D. melanogaster genes potentially relevant to human disease. Finally, we give details about our genome browser change from GBrowse to JBrowse (4). INTRODUCTION FlyBase (flybase.org) is the leading database and web por- tal for data related to the fruit fly, Drosophila melanogaster. IMPROVEMENTS FOR ACCESSING AND SORTING FlyBase contains a wide variety of data types curated from REAGENTS published scientific literature and incorporated from other New responsive tables databases, and strives to continually improve display and functionality for users. Gene Report pages have a number We have introduced responsive tables into our reports, al- of sections that may be of interest to the user; where ap- lowing users to customize the display and easily access the plicable, these include: genomic information, Gene Ontol- subset of the data that is important to them. These tables ogy (GO, 1) annotations, transcripts, protein domains, ex- are currently implemented in the ‘Alleles, Insertions, and pression, alleles and transgenic constructs, phenotypes, or- Transgenic Constructs’ of Gene Reports, the ‘Members’ thologs, human disease models, physical and genetic inter- section of Pathway Reports and the hitlist of results of the actions, stocks and reagents, and associated references. New QuickSearch GAL4 etc. tab (Figure 1). We plan to expand users of FlyBase may find it useful to refer to Marygold et them to other tables on the website. Key features include al (2). the ability to Show/Hide columns, column reordering, row *To whom correspondence should be addressed. Tel: +44 1223333865; Email: [email protected] †The members of the FlyBase Consortium are listed in the Acknowledgements. C The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. D900 Nucleic Acids Research, 2021, Vol. 49, Database issue Downloaded from https://academic.oup.com/nar/article/49/D1/D899/5997437 by guest on 15 January 2021 Figure 1. Responsive tables can be filtered, sorted and rearranged by users to easily browse or search genetic reagents. FlyBase has incorporated responsive tables into (A) the ‘Alleles, Insertions and Transgenic Constructs’ section of Gene Reports, (B) the ‘GAL4 etc’ QuickSearch hitlist and (C) the Frequently Used GAL4 Drivers table. filtering and sorting. As well as providing direct links to Tool Reports have been generated for genetically encoded the selected data subsets, the user-specified table can be sensors, allowing researchers to find transgenic reagents downloaded in a number of formats and the filtered results that can be used to monitor changes in the levels of small can be exported to a FlyBase hitlist for further filtering or molecules (e.g. GCaMP) or to monitor biophysical prop- analysis. erties such as pH (e.g. pHluorinE) or membrane potential (e.g. Voltron). Secondly, Experimental Tool Reports have been made for reagents that can be used to modify the activ- Additional Experimental Tool Reports ity of an excitable cell; these are divided into neuron activa- We continue to link newly generated transgenic alleles and tion tools (e.g. Chrimson) and neuron inhibition tools (e.g. constructs to any relevant experimental tools, to make it Kir2.1). The ‘Alleles, Insertions, and Transgenic Constructs’ easier for researchers to find fly strains and reagents with tables on Gene Reports and the results of the QuickSearch particular characteristics. In addition, we have added two ‘GAL4etc.’ tab have been modified to include experimental new broad classes of experimental tool. First, Experimental tool information where relevant. Nucleic Acids Research, 2021, Vol. 49, Database issue D901 Updates to GAL4 etc QuickSearch tab pathway are displayed in the Members table, in the ‘# Path- way Refs’ column (Figure 2B). This information should The GAL4 etc QuickSearch tab allows FlyBase users to help users differentiate between a novel regulator and a well- search for transgenic drivers or reporters by temporal- documented central pathway component, for instance. As spatial expression pattern, using terms from the FlyBase part of on-going curation, additional members/supporting Developmental Ontology, the Drosophila Anatomy Ontol- evidence are added to these pages to keep them current. ogy and the GO in defined fields. In the updated tool, users Buttons are provided to export member genes to the ‘Batch can alternatively search for drivers/reporters that reflect Download’ tool or to a standard FlyBase hitlist for further the expression pattern of a specific gene. Searches from the refinement or analysis, or to our ortholog tool to obtaina GAL4 etc QuickSearch tab now result in a responsive table- list of predicted orthologs using data from the DRSC Inte- styled hitlist that includes every driver and reporter that grative Ortholog Prediction Tool (DIOPT, 6). matches the searched-for pattern and allows the user to filter Pathway Reports can be accessed via a dedicated ‘Path- by encoded tool (e.g. lexA), tool use (e.g. binary expression ways’ tab on the QuickSearch tool of the FlyBase home- Downloaded from https://academic.oup.com/nar/article/49/D1/D899/5997437 by guest on 15 January 2021 system - driver), tag (e.g. Tag:MYC), or tag use (e.g. epitope page or from links in the ‘Function’ and ‘Pathways’ sec- tag) (Figure 1B). tions of the Gene Report (Figure 2C). The ‘Pathways’ section of a Gene Report also includes a ‘Metabolic Pathways’ Frequently Used GAL4 Drivers table subsection and has linkouts to other metabolic pathway resources––BioCyc (7), KEGG (8) and Reactome (9). The GAL4 etc. tab also includes a link to the Frequently Used GAL4 Drivers table, a resource in which FlyBase has compiled information for more than 250 GAL4 drivers. Enhanced annotation and display of enzyme data This includes the 150 stocks most ordered from the Bloom- Almost a third of protein-coding genes in D. melanogaster ington Drosophila Stock Center, and those drivers that have encode enzymes, reflecting their critical importance for bio- been curated to more than 20 publications. We have con- logical processes. Recently, we have made focused efforts to densed expression information included in this resource review the functional annotations of all enzyme-encoding to emphasize how drivers are being used by the research genes, as well as improve the display of enzyme nomencla- community as experimental tools. This expression informa- ture and the reactions they catalyze. tion includes commonly used

Flybase: Updates to the Drosophila Melanogaster Knowledge Base Aoife Larkin 1,*, Steven J

Maternally Inherited Pirnas Direct Transient Heterochromatin Formation

The Biogrid Interaction Database

Annual Scientific Report 2011 Annual Scientific Report 2011 Designed and Produced by Pickeringhutchins Ltd

The Drosophila Melanogaster Transcriptome by Paired-End RNA Sequencing

Rnacentral: Progress in the Development of the Non-Coding RNA Sequence Database

Basic Bioinformatics

Flybase 2.0: the Next Generation Jim Thurmond1, Joshua L

Phylogenetic Profiling of the Arabidopsis Thaliana Proteome

Arabidopsis Thaliana Functional Genomics Project Annual Report 2008

Bioinformatics Approaches for Functional Predictions in Diverse Informatics Environments. Paula Maria Moolhuijzen, Bsc This Thes

Abstracts Selected for Platform Presentation

The Zebrafish Information Network: New Support for Non-Coding Genes