Genetic Nomenclature Zhiliang Hu Iowa State University, [email protected]

Animal Science Publications Animal Science 2012 Genetic Nomenclature Zhiliang Hu Iowa State University, [email protected] James M. Reecy Iowa State University, [email protected] Follow this and additional works at: http://lib.dr.iastate.edu/ans_pubs Part of the Agriculture Commons, Animal Sciences Commons, and the Genetics Commons The ompc lete bibliographic information for this item can be found at http://lib.dr.iastate.edu/ ans_pubs/170. For information on how to cite this item, please visit http://lib.dr.iastate.edu/ howtocite.html. This Book Chapter is brought to you for free and open access by the Animal Science at Iowa State University Digital Repository. It has been accepted for inclusion in Animal Science Publications by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. Genetic Nomenclature Abstract Genetics includes the study of genotypes, phenotypes and the mechanisms of genetic control between them. Genetic terms describe the processes, genes, alleles and traits with which genetic phenomena are described and examined. In this chapter we will concentrate on the discussions of genetic term standardizations and, at the end of the chapter, we will list some terms relevant to genetic processes and concepts in a Genetic Glossary. Disciplines Agriculture | Animal Sciences | Genetics Comments This is a chapter from The Genetics of the Dog, 2nd edition, chapter 23 (2012): 496. Posted with permission. This book chapter is available at Iowa State University Digital Repository: http://lib.dr.iastate.edu/ans_pubs/170 2 Genetic om ncla ure Zhi-Liang Hu and James M. Reecy Department of Animal Science, Iowa State University, Ames, Iowa, USA Introduction 496 Locus and Gene Names and Symbols 497 Locus name and symbol 497 Allele name and symbol 497 Genotype terminology 498 Future Prospects 498 References 499 Genetic Glossary 499 Introduction 8 gene (GDF8) (one can also find inappropriate abbreviations such as GDF-8 in the literature) and Genetics includes the study of genotypes, is referred to as the 'bully whippet' locus in dogs. phenotypes and the mechanisms of genetic While all these names are interchangeably used control between them. Genetic terms describe in the literature, it gets more complicated when the processes, genes, alleles and traits with one considers paralogous gene duplications which genetic phenomena are described and across species, which led Rodgers eta/. (2007) examined. In this chapter we will concentrate to propose MSTN-1 and MSTN-2 as paralogue on the discussions of genetic term standardiza names. Unfortunately, this naming scheme does tions and, at the end of the chapter, we will list not follow Human Gene Nomenclature some terms relevant to genetic processes and Committee (HGNC) guidelines, which would concepts in a Genetic Glossary. indicate that the relevant genes should be named A standardized genetic nomenclature is MSTNl and MSTN2. (Note that work on the vital for unambiguous concept description, effi standardization of human gene nomenclature is cient genetic data management and effective far more advanced than that in other species.) communications among not only scientists, but From the previous example, we see that also among canine veterinarians, breeding there is evidently a need for all researchers to societies and those individuals who are inter follow a standardized genetic nomenclature in ested in the subject. This issue becomes even order for them to communicate to each other more evident in the post-genomics era, owing correctly on the terms they use. When a term to the rapid accumulation of large quantities of is used by someone to describe a genetic phe genetic data, and the use of computer software nomenon, the correct use of the term will help to manage such data, which imposes a chal to quickly and precisely place the subject under lenge for the precise definition and interpreta the unambiguously defined scope. More impor tion of genetic terms. tantly, a standard nomenclature will help to For example, the Myostatin gene (MSTN) is minimize the time one has to spend in differ also known as Growth and Differentiation Factor entiating the instances where two terms may ©CAB International 2012. The Genetics of the Dog, 496 2nd Edn (eds E.A. Ostrander and A. Ruvinsky) actually mean the same or different things, for the gene name. The symbol will consist of which is often a costly process. The need for a upper-case Latin letters and possibly Arabic standardized genetic nomenclature becomes numerals. Gene symbols must be unique. more pressing when ontologies are employed The locus name should be in capitalized in biological research for computers to man Latin letters or a combination of Latin letters age the genetic terms. Ontology provides a and Arabic numerals. If the locus name is two new way to effectively use, standardize and or more words, each word should be in capital manage genetics terms. The Gene Ontology Latin characters. The locus symbol should con (GO) Consortium has provided a good exam sist of as few Latin letters as possible or a com ple (The Gene Ontology Consortium, 2000). bination of Latin letters and Arabic numerals. When genomics information must be trans The characters of a symbol should always be ferred across species to perpetuate genetic dis capital Latin characters, and should begin with coveries, the role of a standardized genetic the initial letter of the name of the locus. If the nomenclature becomes even more important locus name is two or more words, then the ini (Bruford, 2010). tial letters should be used in the locus symbol. The goal of this chapter is to help estab The locus name and symbol should be printed lish guidelines for nomenclature, with the hope in italics wherever possible; otherwise they that it will facilitate comparison of results should be underlined. between experiments and, most importantly, When assigning gene nomenclature, the prevent confusion. gene name and symbol should be assigned based on existing HGNC nomenclature where possible (i.e. 1:1 for canine: human orthologues). Ensembl has used the new EPO (Enredo, Pecan, Locus and Gene Names Ortheus) pipeline (Paten eta/., 2008) for whole and Symbols genome alignment of the dog genome. Initial efforts to provide information about genes pre Locus name and symbol dicted during the canine genome sequencing effort assigned standardized nomenclature based These guidelines for gene nomenclature are on human gene nomenclature for 3613 canine adapted and abbreviated from the Human genes (http://uswest.ensembl.org/Canis_famil Gene Nomenclature Committee Guidelines iaris/lnfo/StatsTable). (http://www.genenames.org/guidelines.html). There are two categories of novel canine A gene is defined as: 'A functional heredi genes: (i) novel genes predicted by bioinfor tary unit that occupies a fixed location on a matic gene prediction programs; and (ii) novel chromosome, has a specific influence on phe canine genes that were studied before the com notype, and is capable of mutation to various pletion of the canine genome. In cases where allelic forms. In the absence of demonstrated no known strict 1: 1 human orthologue exists, function a gene may be characterized by the LOC # or Ensembl ID should be used as a sequence, transcription or homology'. A 'locus', temporary gene symbol. In order to assign a which is not synonymous with a gene, refers to name to a novel gene, it will need to be manu a position in the genome that can be identified ally curated and assigned a unique name fol by a marker. A 'chromosome region' is defined lowing HGNC guidelines. as a genomic region that has been associated with a particular syndrome or phenotype. Gene names and symbols will follow the human gene when 1: 1 orthology is known. Allele name and symbol Gene names should be short and specific and convey the character or function of the gene. These guidelines for allele nomenclature are They will be written using American spelling adapted from Young (1998) and mouse and contain only Latin letters or a combination genome nomenclature guidelines (http://www. of Latin letters and Arabic numerals. The first informatics.jax.org/mgihome/nomen/gene. letter of a gene symbol should be the same as shtml), in accordance with HGNC guidelines. The allele names should be as brief as pos Genotype terminology sible, yet still convey the variation associated with the allele. Alleles do not have to be named, The genotype of an individual should be but should be given symbols. If a new allele is shown by printing the relevant locus, gene similar to one that is already named, it should or allele symbols for the two homologous be named according to the breed, geographical chromosomes concerned, separated by a location or population of origin. If new alleles slash, e.g. MSTNcs1234567-T!rs1234567·C. Unlinked are to be named for a recognized locus, they loci should be separated by semicolons, e.g. should conform to nomenclature established CDll Rsa/·2400/2200; ESRPuull·5700!4200. Linked or for that locus. The first letter of the allele name syntenic loci should be separated by a space should be lower case. However, this does not or dash and listed in linkage order (e.g. apply when the allele is only a symbol. POU1FJNG_STCHCIG-PRSS7Nl), or in alpha An allele symbol should be as brief as pos betical order if the linkage order is not known. sible and consist of Latin letters or a combina For X-linked loci, the hemizygous case should tion of Latin letters and Arabic numerals. Like have a /Y following the locus and allele sym a gene symbol, an allele symbol should be an bol, e.g. AREcoSII·l094 /Y. Likewise, Y-linked abbreviation of the allele name, and should loci should be designated by IX following the start with the same letter.

Genetic Nomenclature Zhiliang Hu Iowa State University, [email protected]

Illegitimate DNA Integration in Mammalian Cells

Gene Nomenclature System for Rice

Systematic Review of the Genetics of Sudden Unexpected Death in Epilepsy: Potential Overlap with Sudden Cardiac Death and Arrhythmia-Related Genes C

The Human Genome Project

A Genetic Screen for Genes That Impact Peroxisomes in Drosophila Identifies

Indicators of the Molecular Pathogenesis of Virulent Newcastle

Identification of Common Gene Networks Responsive To

The Prepattern Transcription Factor Irx3 Directs Nephron Segment Identity

Genetics As Explanation: Limits to the Human Genome Project’ (2006, 2009)

The Genetic Basis of Malformation of Cortical Development Syndromes: Primary Focus on Aicardi Syndrome

Highlights of the 'Gene Nomenclature Across Species' Meeting

Connecting Myelin-Related and Synaptic Dysfunction In