Published online 4 November 2007
Nucleic Acids Research, 2008, Vol. 36, Database issue D761–D767
doi:10.1093/nar/gkm826

Xenbase: a Xenopus biology and genomics resource

Jeff B. Bowes1, Kevin A. Snyder1, Erik Segerdell2, Ross Gibb1, Chris Jarabek1, Etienne Noumen1, Nicolas Pollet3 and Peter D. Vize1,2,*

1Department of Computer Science, 2Department of Biological Sciences, University of Calgary, 2500 University Drive NW, Calgary, Alberta, Canada and 3Laboratoire Developpement et Evolution, CNRS UMR 8080, Universite Paris-Sud, Orsay 91405, France

Received August 13, 2007; Revised September 18, 2007; Accepted September 20, 2007

*To whom correspondence should be addressed. Tel: +403 220 8502; Fax: +403 289 9311; Email: [email protected]

© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Xenbase (www.xenbase.org) is a model organism database integrating a diverse array of biological and genomic data on the frogs, Xenopus laevis and Xenopus (Silurana) tropicalis. Data is collected from other databases, high-throughput screens and the scientific literature and integrated into a number of database modules covering subjects such as community, literature, gene and genomic analysis. Gene pages are automatically assembled from data piped from the Entrez Gene, Gurdon Institute, JGI, Metazome, MGI, OMIM, PubMed, Unigene, Zfin, commercial suppliers and others. These data are then supplemented with in-house annotation. Xenbase has implemented the Gbrowse genome browser and also provides a BLAST service that allows users to specifically search either laevis or tropicalis DNA or protein targets. A table of Xenopus gene synonyms has been implemented and allows the genome, genes, publications and high-throughput gene expression data to be seamlessly integrated with other Xenopus data and to external database resources, making the wealth of developmental and functional data from the frog available to the broader research community.

INTRODUCTION

Over the past decade the Xenopus experimental system has been supported by static community web sites such as the Xenopus Molecular Marker Resource and by specialized databases such as Axeldb (1) and XDB3 (N. Ueno, Personal communication). The latter databases contain massive amounts of in situ gene expression data but there is often little or no annotation and there is limited integration with other data types or external resources. Xenbase (http://www.xenbase.org) was designed to greatly broaden the range of data stored and to integrate this information both internally between different types of information, e.g. the genome, gene function, gene expression and the literature, and externally to resources such as human congenital disease data from OMIM and mutant data from the mouse and the zebrafish.

Xenopus has a number of unique experimental advantages as a vertebrate model system (2). Paramount among these is the robustness of early embryos and their amenability to microinjection and microsurgery. This makes them a particularly attractive system for testing the ectopic activity of gene products and loss-of-function experiments using antagonizing reagents such as morpholinos (3), dominant-negatives and neomorphic proteins (4). Morpholinos are synthetic oligonucleotides (3) that can be used to inhibit hnRNA splicing or mRNA translation and are the common gene inhibition reagent in Xenopus as neither siRNA nor miRNA has yet been shown to function in frog embryos. Xenopus embryos develop very quickly and form a full set of differentiated tissues within days of fertilization, allowing rapid analysis of the effects of manipulating embryonic gene expression. There have also been a number of high-throughput screens of gene expression patterns by wholemount in situ hybridization in Xenopus (5). Not only are these data critical to ongoing research efforts in Xenopus, they also serve researchers working in other systems by providing gene expression and function data not available in their own model.

Building a Xenopus database offered a variety of data handling challenges. In particular, seamlessly combining data from two species, Xenopus laevis, a tetraploid, and Xenopus tropicalis, a diploid (2), in an integrated environment is complex. There is a wealth of data in the literature on gene function, full-length cDNA sequences and in situ databases on laevis, but no genome sequence or genetics. Xenopus tropicalis on the other hand has a sequenced genome, genetics and strong EST support but little literature or gene expression data. Integrating data from both organisms in a common environment will allow these different strengths to complement each other. As laevis has larger embryos that are better suited to microsurgery and tropicalis has the advantages of a diploid in mutant and experimental gene knockdown experiments, both systems will continue to be used in parallel and need ongoing support. A challenge that is common to all model organism databases is finding a balance between automation and the manual curation of data, yet another is to use tools and lessons from other model organism databases without inheriting flaws and limitations. The public release of Xenbase version 2.0 is a solid step towards achieving these various goals. We utilize elements of the generic model organism database (GMOD) schema CHADO (6), the GMOD genome browser Gbrowse (7), and elements of zfin (8) and MGI (9) database design in a customized manner to provide a comprehensive online resource of Xenopus developmental and genomics data.

GENE PAGES

At the heart of Xenbase are the gene pages, a catalog of over 11 000 genes derived from annotations of Joint Genome Institute (JGI) Xenopus tropicalis gene models (http://genome.jgi-psf.org/Xentr4/). At the moment, Xenopus tropicalis sequence assembly 4.1 has 27 934 associated models, generated by a variety of gene prediction algorithms. Annotation from both a community-driven jamboree and ongoing distributed annotation has resulted in approximately 2700 manually curated gene models. A total of 11 230 gene model annotations have also been generated by the metazome automated pipeline which uses synteny to assign orthology. Of the 2700 manually curated Xenbase gene pages, most also contain a large variety of other data including sequence data on X. laevis and human, mouse and zebrafish orthologs gathered from Entrez Gene (10) or from local BLAST.

Data from the two contributing Xenopus species are arranged in columns, with information in the left column representing tropicalis data and that in the right, laevis data (Figure 1). There are exceptions to this layout, for example data regarding orthologs from mammalian and non-mammalian species and GO terms common to the gene products of both species. At the moment only one of the two paralogs from the tetraploid species, laevis, is displayed and paired with data from the orthologous tropicalis gene. In the future, data on the laevis paralog will also be made available through a tab system. When laevis genomic data becomes available it will also be arranged on the right side of the page, although there has been no firm agreement on when this project will begin.

The Xenopus community voted to adopt HUGO gene nomenclature committee (HGNC) names and symbols (11) in surveys posted shortly before genome annotation was initiated. [...] records is transferred into Xenbase. Predicted protein and mRNA sequences from JGI gene models are then used to identify laevis orthologs locally via BLAST while gene names and symbols are used to identify orthologs from mammals and fish.

As a quality control step, a Xenbase staff annotator manually checks community annotations, ensuring that sufficient criteria were applied to select a specific gene name and that the approved HGNC name and symbol were applied. This is performed by Xenbase staff using the JGI gene annotation toolset on the JGI web site. As new records are added, or old records modified, they are flagged by a Xenbase weekly script, and then manually processed by a Xenbase annotator. The Xenbase annotations from the JGI database are promoted to gene pages, while machine annotated pages are used to build clearly identified temporary gene records.

As HGNC names constantly evolve and change (11), Xenbase runs scripts that ensure the Xenopus data remains synchronized with the values associated with the predicted human ortholog. When conflicts are identified these records are flagged and a curator reannotates the gene record at the JGI and this will then be processed as an updated record in the next download cycle at Xenbase keeping both Xenopus databases in synchrony with HGNC names.

In addition to gene names, gene symbols, mRNA and protein sequence, a variety of additional data is collected from the JGI web portal. This includes genomic scaffold mapping data, appropriate ontology terms from GO, KOG and EC, plus metazome and Unigene IDs. When information for a manually curated gene is not present in the JGI database, for example when a mouse gene name/symbol does not match the HGNC name/symbol, a custom curation tool allows an annotator to add this information manually. This tool also allows the two duplicated paralogs present in the zebrafish genome to be added to a single Xenbase gene page. A screenshot of a representative gene page is illustrated by Figure 1.

GENE NAMING AND SYNONYM MATCHING

Applying HGNC names to Xenopus genes has disconnected genome-generated gene models from the scientific literature, as the names used in these two environments rarely match. The previous gene naming custom in the Xenopus community was quite ad hoc: most genes were only given a symbol, the symbol usually began with an ‘X’, and the symbol could refer to orthology, homology, function, a fictional character, etc. Once new gene names [...]
