Standard Genetic Nomenclature
Zhi-Liang Hu,1 James M. Reecy,1 Fiona McCarthy2 and Carissa A. Park1
1Iowa State University, Ames, Iowa, USA; 2University of Arizona, Tucson, Arizona, USA

This is a chapter in The Genetics of Cattle, 2nd edition, chapter 24 (2014): 598. Posted with permission. This book chapter is available at Iowa State University Digital Repository: http://lib.dr.iastate.edu/ans_pubs/171

©CAB International 2015. The Genetics of Cattle, 2nd Edn (eds D.J. Garrick and A. Ruvinsky)

Introduction
Locus and Gene Names and Symbols
    Locus name and symbol
    Allele name and symbol
    Genotype terminology
    Gene annotations and the gene ontology (GO)
Trait and Phenotype Terminology
    Traits
    Super-traits
    Trait hierarchy and ontology
    Current status of research
    Trait and phenotype nomenclature
Future Prospects
Acknowledgements
References

Introduction

Genetics includes the study of genotypes and phenotypes, the mechanisms of genetic control between them, and information transfer between generations. Genetic terms describe processes, genes and traits with which genetic phenomena are examined and described. While the genetic terminologies are extensively discussed in this book and elsewhere, the standardization of their names has been an ongoing process. Therefore, this chapter will only concentrate on discussions about the issues involved in the standardization of gene and trait terminologies. Readers may wish to refer to online resources (see Table 24.1 for URLs) for lists of the glossaries currently in use.

A standardized genetic nomenclature is vital for unambiguous concept description, efficient genetic data management and effective communications not only among scientists, but also among those who are involved in cattle production and genetic improvement. This issue has become even more critical in the post-genomics era due to rapid accumulation of large quantities of genetic and phenotypic data, and the requirement for data management and computational analysis, which increases the need for precise definition and interpretation of gene and trait terms.

For example, the Myostatin (MSTN) gene is known as Growth and Differentiation Factor 8 (GDF8 or GDF-8) in some literature and is also referred to as the 'muscle hypertrophy' or 'double-muscling' locus in cattle. While the interchangeable use of all these names in the literature can cause confusion, it gets more complicated when one considers paralogous gene duplications across species, which led Rodgers et al. (2007) to propose MSTN-1 and MSTN-2. Unfortunately, this naming scheme does not follow the Human Genome Organization (HUGO) Gene Nomenclature Committee (HGNC) guidelines, which indicate that these paralogues should be named MSTN1 and MSTN2, respectively.

In terms of traits, an example that would benefit from consistent nomenclature is the longissimus dorsi muscle area, which is also referred to as the loin eye area (LEA), loin muscle area (LMA), meat area (MLD), ribeye area (REA), etc. Each of these is known to certain researchers as their default name for the trait. Complexity is further increased by variation in anatomic locations, physiological stages and methods used to measure a given trait. This may seem manageable at first, but once one starts to compare data across different laboratories, publications or species, it quickly becomes very confusing.

The 'standard genetic nomenclature' recommendations made by the Committee on Genetic Nomenclature of Sheep and Goats (COGNOSAG) in the 1980s and 1990s initially covered sheep and goats and were later extended to cattle (Broad et al., 1999). Dolling (1999) summarized these efforts and abstracted guidelines for practical use. In 2009, an international meeting to discuss coordination of gene names across vertebrate species was held in Cambridge, UK (Bruford, 2010). While we may hesitate to dictate how genetic terms are defined, adopting a standardized genetic nomenclature system enables researchers to more easily manage and compare their data, both within and across species. The emergence of the use of ontologies in biological research has contributed a new way to effectively organize biological data and facilitate analysis of large datasets. Adopting standardized nomenclature will further enable researchers to unambiguously organize and manage their data. When genomic information must be transferred across species to perpetuate genetic discoveries, the role of a standardized genetic nomenclature becomes even more important.

The goal of this chapter is to clearly state guidelines for nomenclature, with the hope that they will facilitate comparison of results between experiments and, most importantly, prevent confusion.

Locus and Gene Names and Symbols

Locus name and symbol

The following guidelines for cattle gene nomenclature are adapted and abbreviated from the HUGO Gene Nomenclature Committee (HGNC; see Table 24.1 for URL).

A gene is defined as 'a DNA segment that contributes to phenotype/function. In the absence of demonstrated function a gene may be characterized by sequence, transcription or homology.' A locus is not synonymous with a gene. It is defined as 'a point in the genome, identified by a marker, which can be mapped by some means. A locus could be an anonymous non-coding DNA segment or a cytogenetic feature.' A single gene may have numerous loci within it (each may be defined by different markers).

A gene name should be short and specific, and convey the character or function of the gene. Gene names should be written using American spelling and contain only Latin letters or a combination of Latin letters and Arabic numerals.

A gene symbol should start with the same letter as the gene name. The gene symbol should consist of upper-case Latin letters and possibly Arabic numerals. Gene symbols must be unique.

A locus name should be in capitalized Latin letters or a combination of Latin letters and Arabic numerals.

A locus symbol should consist of as few Latin letters as possible or a combination of Latin letters and Arabic numerals. The characters of a symbol should always be capital Latin characters and should begin with the initial letter of the name of the locus. If the locus name is two or more words, then the initial letters of each word should be used in the locus symbol.

Gene and locus names and symbols should be printed in italics whenever possible; otherwise they should be underlined.

When assigning cattle gene nomenclature, the gene name and symbol should be assigned based on existing HGNC nomenclature when 1:1 human:bovine orthology is well established. Recognized members of gene families should be named following existing naming schemes. Initial efforts to provide information about genes predicted during the cattle genome sequencing project resulted in the assignment of standardized names for 5757 cattle genes based on human gene nomenclature (Bovine Genome Sequencing and Analysis Consortium, 2009).

There are two categories of novel cattle genes: (i) novel genes predicted by bioinformatic gene prediction programs; and (ii) novel genes that have been studied prior to the completion

serological or nucleotide methods. The HGNC guideline recommends that 'allele designation should be written on the same line as gene symbol separated by an asterisk e.g. PGM1*1, the allele is printed as *1'. The wild-type allele can be denoted with a + (e.g. MSTN+). Neither + nor - symbols should be used in alleles detected by biochemical, serological or nucleotide methods. Null alleles should be designated by the number zero. A single nucleotide polymorphism (SNP) allele should be designated based on its dbSNP_id, followed by a hyphen and the specific nucleotide (e.g. MSTNrs1234567-T). If the SNP occurs outside of an identified gene, the SNP locus can be designated using the dbSNP_id