82-86 Nucleic Acids Research, 2001, Vol. 29, No. 1
© 2001 Oxford University Press

WormBase: network access to the genome and biology of Caenorhabditis elegans

Lincoln Stein*, Paul Sternberg1, Richard Durbin2, Jean Thierry-Mieg3 and John Spieth4

Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA, 1Howard Hughes Medical Institute and California Institute of Technology, Pasadena, CA, USA, 2The Sanger Centre, Hinxton, UK, 3National Center for Biotechnology Information, Bethesda, MD, USA and 4Genome Sequencing Center, Washington University, St Louis, MO, USA

Received September 18, 2000; Accepted October 4, 2000

*To whom correspondence should be addressed. Tel: +1 516 367 8380; Fax: +1 516 367 8389; Email: lstein@cshl.org

ABSTRACT

WormBase (http://www.wormbase.org) is a web-based resource for the Caenorhabditis elegans genome and its biology. It builds upon the existing ACeDB database of the C.elegans genome by providing data curation services, a significantly expanded range of subject areas and a user-friendly front end.

DESCRIPTION

Caenorhabditis elegans (informally known as 'the worm') is a small, soil-dwelling nematode that is widely used as a model system for studies of metazoan biology (1). Caenorhabditis elegans' popularity results from the confluence of several factors: its developmental program is understood at the single cell level (2,3), its complete genome is known (4), and it is highly amenable to genetic manipulation, including RNA inhibition (RNAi) intervention (5).

WormBase is a collaborative effort to capture, curate and distribute information about C.elegans biology. It is an outgrowth of ACeDB (http://www.acedb.org), the database used in the course of the C.elegans sequencing project to coordinate the sequencing effort and to integrate the worm sequence with the genetic and physical maps. Unlike the original ACeDB project, however, WormBase is heavily committed to the curation and interpretation of the C.elegans literature, and has moved from a genome-centric perspective to one that more evenly balances the worm genome with other aspects of its biology.

WormBase, like its predecessor, uses a high-level, object-oriented framework to organize and present C.elegans information. The researcher navigates through a series of biologically meaningful object classes such as Locus, Sequence, Cell and Paper. Each of these object classes is associated with at least one WormBase web page that is specialized for its display, and many classes have multiple alternative representations designed to meet different research needs. For example, if a researcher is reviewing a page of information on a particular genetically-mapped locus, he can easily switch between a text display of the mutant phenotype, strains and alleles, a graphical form that shows the position of the locus on the genetic map and a page that shows the locus in the context of the physical map.

WormBase uses HTML linking to represent the relationships between objects. For example, a segment of genomic sequence object is linked with the several predicted gene objects contained within it, and each predicted gene is linked to its conceptual protein translation.

The major components of the resource are described below.

The C.elegans genome

WormBase contains the 'essentially complete' genome of C.elegans, which now stands at 99.3 Mb of finished DNA interrupted by approximately 25 small gaps. In addition, the resource contains a reference set of predicted and confirmed genes curated by the WormBase staff, conceptual translations, alternative splicing patterns inferred from EST overlaps, and the underlying raw data used to make these determinations. WormBase also maintains a regularly updated list of DNA and protein similarity matches between the C.elegans genome and sequences from other species.

The biologist can gain access to the genome in a number of ways: he can search the genome by specifying the name of a well-known marker, such as the name of a clone, a predicted gene or a genetic locus. The researcher may also enter the worm genome via a BLAST search. WormBase has several alternative displays of genomic information, including a purely graphical display, a mixture of graphics and HTML (Fig. 1), and a purely tabular representation. The displays are designed to minimize the amount of extraneous information displayed to the user. For example, the display for a predicted gene does not, by default, show the gene's DNA or conceptual translation. However, these data are easily accessed with a single click on a 'pop down' icon.

Figure 1. The sequence browser displays alternative splicing patterns predicted by EST overlaps.

Cells and their lineages

WormBase contains the complete lineages for the male and hermaphrodite organisms and information that describes each cell and its primary biological function. Biologists can search for cells by their standard nomenclature, or by way of a pedigree browser that displays the complete lineage in the manner of an expandable outline in a word processor (Fig. 2). The lineage display provides flexible search functionality; for example, researchers can use the lineage browser to identify all cells involved in the genesis of the amphid organ. From there they can jump to a page that gives information on the function and spatial location of each cell, and move onward from there to a summary of the genes that are known to be specifically expressed in the amphid organ.

The default display for each cell summarizes its topological position in the worm's anatomy as well as its temporal position in the lineage and provides information on cell function and interactions. For approximately a third of the cells in the adult, WormBase displays schematics that show their position in an idealized worm.

Figure 2. The lineage browser allows users to selectively display portions of the C.elegans cell lineage.

Genetic maps

A series of pages give researchers access to up to date genetic maps, and list known mutants, alleles and phenotypes. Links to the C.elegans Genetics Center allow users to order strains and to submit new genetic mapping information. WormBase provides two versions of the graphical genetic map. The first version is a static HTML image map that will work correctly on all web browsers. The second is an interactive Java applet that provides scrolling and zooming functionality, but requires a plug-in to work properly on some browsers.

Another tool available in WormBase allows the user to move between the genetic map and the genome sequence, providing the means to associate genetically-mapped loci with predicted genes. If the researcher wishes, he can drill down to the raw genetic mapping data, and see the genotypes of individuals produced by mapping crosses.

Expression patterns

WormBase contains the results from reporter gene fusion experiments performed by a number of groups, chiefly those of Ian Hope (6). These experiments relate cells, organs and stages of the life cycle with the expression of tagged genes. The user interface (Fig. 3) allows researchers to search for genes that are expressed in arbitrary combinations of organs, cells and life stages, and displays micrographic data as well as text summaries of the findings.

Bibliographic references

WormBase contains an extensive bibliography of papers published in C.elegans biology going back to the mid-1970s, as well as unpublished abstracts from the biannual Worm Meetings and brief reports contributed to the Worm Breeder's Gazette newsletter. Each paper is cross-referenced with the loci, sequences and cells that it refers to, as well as to the authors, their affiliations and contact information.

at the Sanger Centre, http://wormbase.sanger.ac.uk. The database can be browsed by casual users using a simple keyword search interface that appears on the home page. With this interface biologists can find database objects based on their names, accession numbers or any text that appears within the object itself. Caenorhabditis elegans biologists and other researchers

External links

WormBase is extensively linked to other sources of information, including GenBank/EMBL, SWISS-PROT, and the Worm Proteome Database (WPD).
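
The cross-referencing between object classes described above (papers linked to loci, sequences and cells, and objects linked onward to one another) can be illustrated with a small in-memory object graph. This is only a minimal sketch in Python under stated assumptions: the class names Paper, Locus, Sequence and Cell come from the text, but the `Obj` and `CrossRefIndex` types, the `build_example` helper and all object names are hypothetical, and nothing here reflects the actual ACeDB or WormBase schema or interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    """A database object identified by its class and name (hypothetical model)."""
    cls: str   # e.g. "Paper", "Locus", "Sequence", "Cell"
    name: str  # placeholder identifier, not a real WormBase name


class CrossRefIndex:
    """Stores each link in both directions so either end can be navigated."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, a, b):
        # Record the relationship symmetrically, like a pair of hyperlinks.
        self._links[a].add(b)
        self._links[b].add(a)

    def linked(self, obj, cls=None):
        """Return objects linked to `obj`, optionally restricted to one class."""
        found = self._links.get(obj, set())
        matches = (o for o in found if cls is None or o.cls == cls)
        return sorted(matches, key=lambda o: (o.cls, o.name))


def build_example():
    """Build a tiny illustrative graph; every name below is made up."""
    index = CrossRefIndex()
    paper = Obj("Paper", "example-paper")
    locus = Obj("Locus", "example-locus")
    cell = Obj("Cell", "example-cell")
    seq = Obj("Sequence", "example-sequence")
    index.link(paper, locus)   # the paper refers to a locus
    index.link(paper, cell)    # ...and to a cell
    index.link(locus, seq)     # the locus is associated with a sequence
    return index, paper, locus, cell, seq
```

Starting from the paper, `index.linked(paper)` lists the cell and the locus it refers to, and `index.linked(locus, cls="Paper")` navigates back from the locus to the papers that mention it. Storing links symmetrically is one simple way to let a reader move in either direction between related objects.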