The Microbesonline Web Site for Comparative Genomics

Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press Resource The MicrobesOnline Web site for comparative genomics Eric J. Alm,1 Katherine H. Huang,1 Morgan N. Price,1 Richard P. Koche,3 Keith Keller,3 Inna L. Dubchak,1,2 and Adam P. Arkin1,3,4,5 1Physical Biosciences Division and 2Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA; 3Department of Bioengineering, University of California, Berkeley, California 94720, USA; 4Howard Hughes Medical Institute, Berkeley, California 94720, USA At present, hundreds of microbial genomes have been sequenced, and hundreds more are currently in the pipeline. The Virtual Institute for Microbial Stress and Survival has developed a publicly available suite of Web-based comparative genomic tools (http://www.microbesonline.org) designed to facilitate multispecies comparison among prokaryotes. Highlights of the MicrobesOnline Web site include operon and regulon predictions, a multispecies genome browser, a multispecies Gene Ontology browser, a comparative KEGG metabolic pathway viewer, a Bioinformatics Workbench for in-depth sequence analysis, and Gene Carts that allow users to save genes of interest for further study while they browse. In addition, we provide an interface for genome annotation, which like all of the tools reported here, is freely available to the scientific community. Functional genomic studies have generated a large and growing We have compiled a list of genome analysis tools that we database of experimental results for eukaryotic model organisms, believe to be among the most useful of those currently available, such as yeast. For most prokaryotic model organisms, fewer direct and have combined all of these tools on a single Web site (http:// experimental data are available, yet this apparent deficit is com- www.microbesonline.org). To make it as simple as possible for pensated by the wealth of genomic sequence available. Although users to select genes of interest while they are browsing, we have a wide variety of tools are available to study various aspects of implemented a “Gene Cart” feature analogous to the “Shopping genomic sequence, many are not designed to facilitate direct Cart” common to many commercial Web sites. Genes can be comparison of multiple genomes and thus fully exploit this added to the cart from most of the pages on the MicrobesOnline source of information. Furthermore, many tools are imple- Web site, and then saved for further study, downloaded to a local mented in different Web sites, and use different gene nomencla- computer, or analyzed using the Bioinformatics Workbench. Us- ture, which makes it inconvenient to perform a deep analysis on ers can create and save any number of Gene Carts for further a single gene and then return to the original Web site to identify study. The Workbench currently includes basic sequence analysis more genes to analyze. To perform basic analyses such as mul- tools for multiple sequence alignment and for building phyloge- tiple sequence alignment, users often need to download gene netic trees. sequences, manipulate them into a particular file format (e.g., In addition to collecting these tools in one location, the FASTA), and cut and paste them into another Web site. To con- MicrobesOnline Web site offers several new features as well as tinue further analysis, such as constructing a phylogenetic tree, extensions of existing tools to facilitate comparative genomics. users may be required to paste the resulting alignment into a In particular, metabolic pathway maps allow for the comparison third Web site. of two different genomes to highlight differences in their ex- Several other Web sites and software tools have been de- pected physiological capabilities. The Gene Ontology browser we scribed that assist in the annotation and exploration of compara- developed, called VertiGO, differs from similar tools in that it tive genomic data. The Prolinks and STRING databases offer con- allows any number of the genomes to be displayed at the same venient tools for browsing predicted functional associations time. A multispecies genome browser capable of zoom and scroll among proteins, while GenDB and other systems allow for de- actions allows users to simultaneously align any number of the tailed annotation of individual genomes (Snel et al. 2000; Meyer 200+ genomes currently hosted at MicrobesOnline, quickly ac- et al. 2003; Bowers et al. 2004; von Mering et al. 2005). The ERGO cess additional info on displayed genes, and save displayed genes system combines some of these features, but a full version of the to a user’s Gene Cart. system is not publicly available (Overbeek et al. 2003). The TIGR In addition to these comparative genomic features, Comprehensive Microbial Resource (CMR) also offers some use- MicrobesOnline hosts additional genome annotations produced ful tools such as genome alignment and precomputed BLAST and by the VIMSS group including predicted pseudogenes, complete TIGRfam results (Peterson et al. 2001). operon predictions, as well as “regulon” predictions based on both comparative genomic and (for some genomes) experimental gene expression data that may be useful for identifying groups 5Corresponding author. of genes with related functions. Finally, the MicrobesOnline Web E-mail [email protected]; fax (510) 486-6219. site provides an opportunity for microbiologists to annotate in- Article and publication are at http://www.genome.org/cgi/doi/10.1101/ gr.3844805. Freely available online through the Genome Research Immediate dividual genes using a simple Web-based interface, and to share Open Access option. those annotations with the microbial research community at large. 15:1015–1022 ©2005 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/05; www.genome.org Genome Research 1015 www.genome.org Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press Alm et al. Results and Discussion numbers, Gene Ontology (GO) identifiers (Camon et al. 2003), or COGs (Tatusov et al. 1997). Finding genes of interest Often it is desirable to locate genes that have sequence iden- Most browsing sessions begin with a search for a gene of interest. tity to an existing gene for which there is little available anno- The Keyword Search window provides a simple yet powerful in- tation information. For these cases, users can search available terface for identifying genes with a particular function. Users can genomes using a Web-based BLAST interface. Results are dis- search one or more specific genomes, high-level taxonomic played in the traditional BLAST output format, but hits to genes groups (e.g., Firmicutes), or the entire database. Because genome in the MicrobesOnline database are hyperlinked to the “Locus annotations are often incomplete, and because no single protein Info” pages with more detailed descriptions of each gene. family database is best for all situations, the Keyword Search fea- Multispecies Genome Browser ture returns hits to a wide number of annotation types: (1) genes with a matching name or synonym (e.g., from a search for The MicrobesOnline Comparative Genome Browser allows the “trpA”); (2) genes with a matching gene description (e.g., from a analysis of genes based on their physical position on the chro- search for “tryptophan”); (3) genes mapping to a matching de- mosome, and highlights conservation of gene order across dis- scription in the COG database; (4) genes with a matching Inter- tantly related species, which generally indicates functionally re- pro domain description; and (5) genes assigned to a matching lated genes in conserved operon structures (Dandekar et al. 1998; GO category. This comprehensive search feature allows users to Overbeek et al. 1999; Lathe III et al. 2000), as well as large-scale combine searches across individual databases such as Pfam (Bate- synteny between closely related genomes. From the Comparative man et al. 2004), TIGRfam (Haft et al. 2003), or COG (Tatusov et Genome Browser (Fig. 1) users can select any number of genomes al. 1997) to identify genes that might have a particular function. and align them using an “anchor” gene. Users can add or delete It should be noted, however, that the primary databases might selected genomes from the view, zoom in or out, and scan up- have more up-to-date information than the combined database stream or downstream. All genes within a view are color-coded at MicrobesOnline. An advanced search feature is also provided, according to predicted orthology relationships. Each gene on the which allows further text/Boolean searches for genes by KEGG browser is a hyperlink that can perform one of three actions: (1) metabolic pathways (Kanehisa 2002), enzyme commission (EC) load the Locus Info Page for that gene (described in more detail Figure 1. Comparative Genome Browser. In a subgroup of the ␥-Proteobacteria, two ancient operons for adjacent steps in the TCA cycle, sdhCDAB and sucABCD, have been merged into one larger operon. This view from the Comparative Genome Browser shows the sdhA locus across an archaeon (Halobacterium sp.), a Gram-positive bacterium (Mycobacterium tuberculosis), and a diverse range of proteobacteria. Note that the larger operon is present in only a single clade, yet it is highly conserved within this group, suggesting that it may reflect a recent innovation within the ␥-Proteobacteria. 1016 Genome Research www.genome.org Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press The MicrobesOnline Web site for comparative genomics below); (2) realign genomes by selecting a new gene as the an- Info page. The number of unique genes that are linked to a GO chor; or (3) add that gene to the user’s Gene Cart (described term is displayed for each genome. Many GO terms are not ex- below). Figure 1 shows an example of how the browser was used clusive since a single gene can be involved in multiple biological to investigate the evolutionary history of an operon identified in processes and/or molecular functions.

The Microbesonline Web Site for Comparative Genomics

Computer Applications Making Rapid Advances in High Throughput Microbial Proteomics (HTMP) Balakrishna Anandkumar1,§, Steve W

For Immediate Release. Oct. 8Th, 2010 Microbesonline.Org Enhanced for Metagenomics and Metabolism

Comparing Genomes: Databases and Computational Tools for Comparative Analysis of Prokaryotic Genomes DOI: 10.3395/Reciis.V1i2.Sup.105En

OMA Browser—Exploring Orthologous Relations Across 352 Complete Genomes Adrian Schneider*, Christophe Dessimoz and Gaston H

GTL PI Meeting 2006 Abstracts Molecular Interactions

Comparative Genomics and Functional Analysis of Rhamnose Catabolic Pathways and Regulons in Bacteria Irina A

Environmental Conditions Shape the Nature of a Minimal Bacterial Genome

VIMSS Computational Microbiology Core Research on Comparative and Functional Genomics

Pangenome Analysis Toolkit

Myoglobin Primary Structure Reveals Multiple Convergent Transitions To

GTL PI Meeting 2007 Abstracts

The Impact of Long-Distance Horizontal Gene Transfer on Prokaryotic Genome Size