NCBI News
National Center for Biotechnology Information
National Library of Medicine
National Institutes of Health
Department of Health and Human Services
Spring 2004

In this issue
1 Transitioning from LocusLink to Entrez Gene
1 Cancer Chromosomes: a New Entrez Database
2 HomoloGene
4 BLAST Link (BLink)
5 Debut of HCT Database
7 350Kb Sequence Length Limit Removed
7 New Eukaryotic Genomes in Map Viewer
8 Environmental Samples from the Sargasso Sea
8 HIV Protein Interaction Database
9 Perform Reverse ePCR
9 New Organisms in UniGene
9 Rat Gets NP_999999
10 RefSeq Release 6
10 Entrez Tools new “Hotspot”
11 BLAST Lab
12 Entrez Quiz

Transitioning from LocusLink to Entrez Gene

A gene-based view of annotated genomes is essential to capitalize on the increase in the sequencing and analysis of model genomes. The Entrez Gene database has been developed to supply key connections between maps, sequences, expression profiles, structure, function, homology data, and the scientific literature. Unique identifiers are assigned to genes with defining sequence, genes with known map positions, and genes inferred from phenotypic information. These gene identifiers are tracked, and functional information is added when available. Access Entrez Gene from the Entrez Home Page or directly at:

www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene

The Entrez Gene help document provides tips to ease the transition for LocusLink users to the current Entrez Gene database.

The default display format for Entrez Gene is the graphics display shown in Figure 1 for BMP7, which resembles the traditional view of a LocusLink record. The array of colored boxes at the head of LocusLink reports that provide links to gene-related resources is replaced by the “Links” menu in Gene, which includes additional links, such as those to Books, GEO, UniSTS, and Taxonomy.

The Gene Transcripts and Products section is provided when a gene has been annotated on a genomic Reference Sequence

continued on page 6

Figure 1. Entrez Gene display for human BMP7 showing links to over 20 related resources in the “Links” pulldown menu.

Cancer Chromosomes: a New Entrez Database

Three databases, the NCI/NCBI SKY (Spectral Karyotyping)/M-FISH (Multiplex-FISH) and CGH (Comparative Genomic Hybridization) Database, the NCI Mitelman Database of Chromosome Aberrations in Cancer, and the NCI Recurrent Chromosome Aberrations in Cancer databases are now integrated into NCBI’s Entrez system as the “Cancer Chromosomes” database. Cancer Chromosomes supports searches for cytogenetic, clinical, or reference information using the flexible Entrez search and retrieval system. Search tips are provided in the Help document at:

www.ncbi.nlm.nih.gov/entrez/query/SkyCgh/help.html

Search “Cancer Chromosomes” from the database pulldown menu on the NCBI home page or navigate to the “Cancer Chromosomes” page for advanced searches via the link on the Entrez home page at:

www.ncbi.nlm.nih.gov/Entrez

Three search formats are offered on the Entrez Chromosomes home page: a conventional Entrez Query, a Quick/Simple Search, and an Advanced Search. The Entrez Query is performed using the search box at the top of the page, and, as with other Entrez databases, searches may be combined using term limits and Boolean expressions. The Simple Search, available via a link in the sidebar on the [...]

[...] Mitelman and Recurrent databases use a different system. The menus include all ICD-O-3 terms entered into the database to date and all terms used in the Mitelman and Recurrent databases. Descriptions of the sections and terms indexed are given in the Help document.

Searches based on case information, such as diagnosis and disease site, return a “case-based report” that lists all cases matching the query terms. Searches based on underlying cytogenetic features are displayed as a “clone/cell report” in which each clone or cell-line is listed separately.

[...] & CGH Database, the total matches found in the Mitelman database, and the total matches from the Mitelman Recurrent Database.

From the results list, users can access the pull-down menu and display a variety of features, including the corresponding literature from PubMed, the results as a list of UI (unique identifier) numbers, or view related reports based on common cytogenetic or diagnostic features. Users can also view Similarity reports, which show terms common to a group of records within several term categories such as diagnosis/site and cytogenetic abnormalities (including CGH) among the selected cases or clones/cells. Term co-occurrences are listed at several levels: common to all cases, common to 50%-90% of cases, and common to less than 50% of cases. The common term or abnormality is shown in the left column and the number of affected cases is shown in the right column. The cytogenetic abnormalities are shown at all levels of resolution. Select records using the ‘Similarity Report (High [...]

Figure 1. Results of an Entrez Cancer Chromosomes search for query "8q".

NCBI News

NCBI News is distributed four times a year. We welcome communication from users of NCBI databases and software and invite suggestions for articles in future issues. Send correspondence to NCBI News at the address below. To subscribe to NCBI News, send your name and address to either the street or E-mail address below.

NCBI News
National Library of Medicine
Bldg. 38A, Room 3S-308
8600 Rockville Pike
Bethesda, MD 20894
Phone: (301) 496-2475
Fax: (301) 480-9241
E-mail: [email protected]

Editors
Dennis Benson
David Wheeler

Contributors
Susan Dombrowski
Scott McGinnis
Tao Tao

Writers
Vyvy Pham
David Wheeler

Editing and Production
Robert Yates

Graphic Design
Robert Yates

In 1988, Congress established the National Center for Biotechnology Information as part of the National Library of Medicine; its charge is to create information systems for molecular biology and genetics data and perform research in computational molecular biology.

The contents of this newsletter may be reprinted without permission. The mention of trade names, commercial products, or organizations does not imply endorsement by NCBI, NIH, or the U.S. Government.

NIH Publication No. 04-3272
ISSN 1060-8788
ISSN 1098-8408 (Online Version)

HomoloGene: An Entrez Database with a New Look

HomoloGene is a system for automated detection of homologs among the annotated genes of several completely sequenced eukaryotic genomes. The genomes represented in the recent Build 36 of HomoloGene include H. sapiens, M. musculus, R. norvegicus, D. melanogaster, A. gambiae, C. elegans, S. pombe, S. cerevisiae, N. crassa, M. grisea, A. thaliana, and P. falciparum.

NCBI has adopted a new HomoloGene build procedure which is guided by the taxonomic tree, relies on conserved gene order and measures of DNA similarity among closely related species, while making use of protein similarity for more distantly related organisms. The new computational procedure greatly increases the reliability of the computed homologous gene sets and the resulting HomoloGene entries now include paralogs in addition to orthologs. For more details or to search the database, see the HomoloGene home page at:

www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=homologene

New Search Strategies Supported

Because HomoloGene is now an Entrez database, it can be queried using an assortment of fielded terms combined with Boolean operators. Among the fields unique to HomoloGene is the “Ancestor” field which refers to the taxonomic group of the last common ancestor of the species represented in a HomoloGene entry. Using the “Ancestor” field it is possible to limit a search to genes conserved in one of 9 ancestral groups: Sordariomycetes (147,550 entries), Eukaryota (2,759 entries), Fungi/Metazoa (33,154 entries), Bilateria (33,213 entries), Coelomata (33,316 entries), Mammalia (9,172 entries), Ascomycota (1,083 entries), Insecta (1,689 entries), Rodentia (1,587 entries).

New Views of the Data

HomoloGene reports include homology and phenotype information drawn from Online Mendelian Inheritance in Man (OMIM), Mouse Genome Informatics (MGI), Zebrafish Information Network (ZFIN), Saccharomyces Genome Database (SGD), Clusters of Orthologous Groups (COG), and FlyBase. A “Pairwise Scores” display gives a table of pairwise statistics for members of a HomoloGene group that includes percent amino acid and nucleotide identities, the Jukes-Cantor genetic distance parameter, D, the ratio of non-synonymous to synonymous amino acid substitutions (Ka/Ks) for predicted proteins, and the ratio of nucleotide identities within non-coding regions of the transcript to those within coding regions (Knr/Knc).

—DW

New HomoloGene FTP File Formats

The HomoloGene data is available by FTP where the data for each build is contained in two files: "homologene.data" and "homologene.xml.gz". Follow the "FTP site" link in the sidebar on the HomoloGene home page to download the files.

homologene.data
homologene.data is a tab delimited file containing, from left to right:
• HomoloGene group id
• Taxonomy ID
• gene ID
• gene symbol
• geninfo identifier (gi) of the protein product of the gene
• accession number of the protein product of the gene

homologene.xml.gz
homologene.xml.gz is a compressed file that contains a complete XML version of the HomoloGene build and includes the information available on the public webpage. The HomoloGene XML DTD is available in the archive "homologene.dtd.tar" at the top level of the ftp site.

The old HomoloGene FTP files of the formats used in "hmlg.ftp" and "hmlg.trip.ftp" will be discontinued after a transition period. During the transition, a new set of codes, reflecting the new build procedure, will be used in these files to indicate the nature of the evidence for homology:
b - reciprocal best
B - reciprocal best in a self-consistent triplet
m - similarity between sequences that do not give reciprocal best hits
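
The six-column, tab-delimited layout of homologene.data described in the HomoloGene FTP sidebar lends itself to a short script. What follows is only a minimal Python sketch, assuming the column order given there (group id, Taxonomy ID, gene ID, gene symbol, protein gi, protein accession). The names HomologeneRecord, parse_homologene, and group_by_homologene_id are hypothetical and not part of any NCBI tool, and the sample rows are invented placeholders rather than real HomoloGene data.

```python
import io
from collections import defaultdict
from typing import NamedTuple


class HomologeneRecord(NamedTuple):
    """One row of homologene.data, in the left-to-right column order (assumed)."""
    group_id: int           # HomoloGene group id
    tax_id: int             # Taxonomy ID
    gene_id: int            # gene ID
    symbol: str             # gene symbol
    protein_gi: int         # gi of the protein product
    protein_accession: str  # accession of the protein product


def parse_homologene(handle):
    """Yield a HomologeneRecord for each non-empty line of a text handle."""
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) != 6:
            raise ValueError(
                f"line {line_number}: expected 6 tab-separated fields, got {len(fields)}"
            )
        group_id, tax_id, gene_id, symbol, gi, accession = fields
        yield HomologeneRecord(
            int(group_id), int(tax_id), int(gene_id), symbol, int(gi), accession
        )


def group_by_homologene_id(records):
    """Collect records into a dict keyed by HomoloGene group id."""
    groups = defaultdict(list)
    for record in records:
        groups[record.group_id].append(record)
    return dict(groups)


# Invented placeholder rows for illustration only; not real HomoloGene entries.
SAMPLE = (
    "1\t9606\t101\tgeneA\t1001\tNP_000000.1\n"
    "1\t10090\t202\tGenea\t1002\tNP_000000.2\n"
    "2\t9606\t303\tgeneB\t1003\tNP_000000.3\n"
)

if __name__ == "__main__":
    groups = group_by_homologene_id(parse_homologene(io.StringIO(SAMPLE)))
    for group_id, members in sorted(groups.items()):
        symbols = ", ".join(m.symbol for m in members)
        print(f"group {group_id}: {len(members)} gene(s): {symbols}")
```

To use the sketch on a downloaded file, pass an open text handle, for example open("homologene.data"), in place of the in-memory sample. Grouping by the first column gathers the members of each HomoloGene group across species.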