Highlights of the 'Gene Nomenclature Across Species' Meeting

MEETING REPORT

Elspeth A. Bruford*
Project Coordinator, HUGO Gene Nomenclature Committee (HGNC), EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, UK
*Correspondence to: E-mail: [email protected]
Date received (in revised form): 24th February, 2010
© HENRY STEWART PUBLICATIONS 1479–7364. HUMAN GENOMICS. VOL 4. NO 3. 213–217 FEBRUARY 2010

Abstract

The first ‘Gene Nomenclature Across Species’ meeting was held on 12th and 13th October 2009, at the Møller Centre in Cambridge, UK. This meeting, organised and hosted by the HUGO Gene Nomenclature Committee (HGNC), brought together invited experts from the fields of gene nomenclature, phylogenetics and genome assembly and annotation. The central aim of the meeting was to discuss the issues of coordinating gene naming across vertebrates, culminating in the publication of recommendations for assigning nomenclature to genes across multiple species.

Meeting summary

The meeting began with a welcome and outline of the agenda from Elspeth Bruford, one of the meeting organisers and the group coordinator for the HUGO Gene Nomenclature Committee (HGNC). HGNC has been based at the European Bioinformatics Institute (EBI) at Hinxton, UK, since 2007. Since its inception in 1979, the HGNC has been assigning gene symbols and names to all human genes, including pseudogenes and non-coding RNAs.

The first session was chaired by Jennifer Harrow, who leads the Human and Vertebrate Analysis and Annotation (Havana) group from the Wellcome Trust Sanger Institute (WTSI), also located on the Hinxton campus. This session was devoted to introducing the three established gene nomenclature groups for mammals: HGNC; the Mouse Genome Nomenclature Committee (MGNC), based at the Mouse Genome Informatics Database (MGI) at the Jackson Laboratory in Maine, USA; and the Rat Genome and Nomenclature Committee, based at the Rat Genome Database (RGD) in Milwaukee, Wisconsin, USA. Matt Wright from the HGNC, co-organiser of the meeting, kicked off by discussing the current work of the HGNC, ‘An Essential Resource for the Human Genome’. Matt outlined the roles of the HGNC, including a summary of the process of symbol assignment, and its current efforts in coordinating gene naming across vertebrates. He also highlighted instances where the lack of approved gene nomenclature for most mammalian genomes has resulted in valuable published data for these species being absent or confused in the genomic databases.

He was followed by Janan Eppig, principal investigator of the MGI, who, in her talk, ‘What’s in a Name’, told us about current nomenclature issues and activities for the mouse. As well as genes, the group at MGI also name genetic markers, alleles, mutations and strains. Current efforts are focused on creating a unified gene catalogue for the mouse, by comparing gene models from the National Center for Biotechnology Information (NCBI)’s Entrez Gene database, the Ensembl database and the Havana group’s Vega database. The mouse genetics community began naming genes in a standardised way long before the human community, with the first Mouse Nomenclature Guide published in 1940. In 2003, the International Committee on Standardized Genetic Nomenclature for Mice, and the Rat Genome and Nomenclature Committee agreed to unify rules and guidelines for gene, allele and mutation nomenclature in the mouse and rat.

It was therefore apt that Janan was followed by Mary Shimoyama from the RGD. Mary talked about ‘Nomenclature Assignment, Review and Resolution at the Rat Genome Database’, starting with a discussion of the pipelines and software they have established for naming rat genes, quantitative trait loci (QTLs) and strains, and for making nomenclature updates and orthology assignments between rat, mouse and human. The state of the current rat genome assembly can prove problematic, and there is a need to establish a core consensus rat gene set in a manner similar to that of the Consensus CDS (CCDS) projects that are currently in place for the human and mouse genomes (PMID: 19498102). Other issues raised by Mary included problems with synchronising updates between databases, the need for timely adoption of RGD gene nomenclature by some databases, and the lack of requirement for authors to use standardised nomenclature in many journals.

The second session was chaired by Derek Stemple, head of the Vertebrate Development and Genetics group at the WTSI, and focused on the three further vertebrate nomenclature groups, starting with a report from Monte Westerfield, the principal investigator of the Zebrafish Model Organism Database (ZFIN). Zebrafish gene names are based on human names wherever possible, but the symbols are written in lower case to distinguish them from human gene symbols (which are in upper case letters) or mouse/rat symbols (which are lower case except for an initial upper case letter). Monte raised the important point that species-specific mutants can drive the naming of genes, such as the oep one-eyed pinhead gene in zebrafish, which is the orthologue of human teratocarcinoma-derived growth factor 1 (TDGF1) gene. The next speaker was Erik Segerdell from Xenbase, a Xenopus laevis and tropicalis resource based at the University of Calgary in Alberta, Canada. As for zebrafish, Xenopus gene nomenclature is identical to human where possible, and where a gene has been duplicated in Xenopus relative to mammals the gene symbols are appended with a numeral or letter suffix to indicate this.

The newest nomenclature group, the Chicken Gene Nomenclature Committee (CGNC; PMID: 19607656), also aims to name chicken genes based on the names assigned to human genes. Alan Archibald from the Roslin Institute, Edinburgh, UK, updated us on the progress of the CGNC, which has begun its naming efforts by transferring the human gene symbols to 1:1 orthologues in chicken. To date, over 8,000 genes with a confirmed 1:1 orthologue in human have been assigned approved names by the CGNC.

After lunch, the third session turned to look at other mammalian genomes that do not have an established nomenclature group. Elizabeth Murchison from the WTSI spoke first on ‘Gene Annotation and Nomenclature in Marsupials and Monotremes’. While currently they are only represented by three ‘complete’ genomes in the public domain (namely those for the opossum, wallaby and platypus), the important positions of these non-eutherian mammals in the vertebrate phylogeny mean that they should be able to teach us some fascinating lessons about the evolution of the mammalian genome. In most cases, marsupial and monotreme genes do have clear eutherian orthologues, but Elizabeth also discussed the platypus defensin genes, which have shown us that duplication of these immune genes has independently resulted in the convergent evolution of venom in both monotremes and reptiles.

Chris Elsik and Ross Tellam, the analysis leaders of the Bovine Genome Sequencing and Analysis Consortium, then told us about the ‘Annotation of the Bovine Genome - the Easy and the Difficult’. This talk highlighted several common and recurring themes from the meeting: the importance of high coverage and a quality genome assembly; the necessity of producing a consensus gene set that is deposited in a centralised database (in this case the Bovine Genome Database, www.bovinegenome.org); and the need for expert input into specific groups and families of genes. As currently there are no guidelines for assigning bovine gene symbols, of the 5,757 bovine gene models found in both Ensembl and Entrez Gene, over 60 per cent have different symbols assigned to them in each database, so, clearly, there is a need for standardising the nomenclature for this genome.

Jim Reecy, the bioinformatics coordination leader of the USA’s National Animal Genome Research Program, then talked to us about porcine gene annotation. To date, over 17,000 gene models have been annotated in the swine genome, of which nearly 10,000 have been projected from other species. Manual annotation, both from the Havana team at WTSI and from community annotation, is now being used to refine these gene models. Jim also mentioned the International Society of Animal Genetics (ISAG), which is an established forum for the livestock genetics community. Its genome sequence workshops could provide an excellent opportunity for gene nomenclature committees to meet.

The final speaker of this session was Noelle Cockett, the sheep genome coordinator, based at Utah State University, USA, who updated us on the ‘Assembly of the Ovine Whole Genome Reference Sequence’. The sheep genome is still in the early stages of assembly.

[…] of genes and over 400 mouse genes, there are only around 120 sets of 1:1 orthologues, making the direct transfer of gene names between species impossible without extensive manual curation.

The afternoon concluded with a lively discussion on nomenclature guidelines across species, chaired by Alan Archibald. All those present at the meeting agreed that it would be useful to have a common set of nomenclature rules that could be applied to any novel vertebrate genome, and that these would be based on human gene nomenclature but also take into account species-specific characteristics. This should prove an invaluable resource for assigning standardised gene names to newly sequenced genomes.

The next day, the proceedings began with two in-depth talks on complex gene families, following on from Lisa’s presentation the previous afternoon on zinc fingers. This session was chaired by Vasilis Vasiliou from the University of Denver, Colorado, USA, an expert in the aldehyde dehydrogenase family. The first talk came from Jed Goldstone from Woods Hole Oceanographic Institution in Massachusetts, USA, who studies the evolution of the cytochrome P450 (CYP) superfamily. While
