
Animal Science Publications Animal Science

2014

Standard Genetic Nomenclature

Zhiliang Hu Iowa State University, [email protected]

James M. Reecy Iowa State University, [email protected]

Fiona M. McCarthy University of Arizona

Carissa A. Park Iowa State University, [email protected]

Follow this and additional works at: http://lib.dr.iastate.edu/ans_pubs

Part of the Agriculture Commons, Animal Sciences Commons, and the Genetics Commons

The complete bibliographic information for this item can be found at http://lib.dr.iastate.edu/ans_pubs/171. For information on how to cite this item, please visit http://lib.dr.iastate.edu/howtocite.html.

This Book Chapter is brought to you for free and open access by the Animal Science at Iowa State University Digital Repository. It has been accepted for inclusion in Animal Science Publications by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected].

Standard Genetic Nomenclature

Abstract

Genetics includes the study of genotypes and phenotypes, the mechanisms of genetic control between them, and information transfer between generations. Genetic terms describe processes, genes and traits with which genetic phenomena are examined and described. While the genetic terminologies are extensively discussed in this book and elsewhere, the standardization of their names has been an ongoing process. Therefore, this chapter will only concentrate on discussions about the issues involved in the standardization of gene and trait terminologies.

Disciplines Agriculture | Animal Sciences | Genetics

Comments This is a chapter in The Genetics of Cattle, 2nd edition, chapter 24 (2014): 598. Posted with permission.

This book chapter is available at Iowa State University Digital Repository: http://lib.dr.iastate.edu/ans_pubs/171

24 Standard Genetic Nomenclature

Zhi-Liang Hu,1 James M. Reecy,1 Fiona McCarthy2 and Carissa A. Park1

1 Iowa State University, Ames, Iowa, USA; 2 University of Arizona, Tucson, Arizona, USA

Introduction
Locus and Gene Names and Symbols
    Locus name and symbol
    Allele name and symbol
    Genotype terminology
    Gene annotations and the gene ontology (GO)
Trait and Phenotype Terminology
    Traits
    Super-traits
    Trait hierarchy and ontology
    Current status of research
    Trait and phenotype nomenclature
Future Prospects
Acknowledgements
References

©CAB International 2015. The Genetics of Cattle, 2nd Edn (eds D.J. Garrick and A. Ruvinsky)

Introduction

Genetics includes the study of genotypes and phenotypes, the mechanisms of genetic control between them, and information transfer between generations. Genetic terms describe processes, genes and traits with which genetic phenomena are examined and described. While the genetic terminologies are extensively discussed in this book and elsewhere, the standardization of their names has been an ongoing process. Therefore, this chapter will only concentrate on discussions about the issues involved in the standardization of gene and trait terminologies. Readers may wish to refer to online resources (see Table 24.1 for URLs) for lists of the glossaries currently in use.

A standardized genetic nomenclature is vital for unambiguous concept description, efficient genetic data management and effective communication, not only among scientists but also among those who are involved in cattle production and genetic improvement. This issue has become even more critical in the post-genomics era because of the rapid accumulation of large quantities of genetic and phenotypic data and the requirement for data management and computational analysis, both of which increase the need for precise definition and interpretation of gene and trait terms.

For example, the myostatin (MSTN) gene is known as Growth and Differentiation Factor 8 (GDF8 or GDF-8) in some literature and is also referred to as the 'muscle hypertrophy' or 'double-muscling' locus in cattle. While the interchangeable use of all these names in the literature can cause confusion, it gets more complicated when one considers paralogous gene duplications, which led Rodgers et al. (2007) to propose MSTN-1 and MSTN-2. Unfortunately, this naming scheme does not follow the Human Genome Organization (HUGO) Gene Nomenclature Committee (HGNC) guidelines, which indicate that these paralogues should be named MSTN1 and MSTN2, respectively.

In terms of traits, an example that would benefit from consistent nomenclature is the longissimus dorsi muscle area, which is also referred to as the loin eye area (LEA), loin muscle area (LMA), meat area (MLD), ribeye area (REA), etc. Each of these is known to certain researchers as their default name for the trait. Complexity is further increased by variation in anatomic locations, physiological stages and methods used to measure a given trait. This may seem manageable at first, but once one starts to compare data across different laboratories, publications or species, it quickly becomes very confusing.

The 'standard genetic nomenclature' recommendations made by the Committee on Genetic Nomenclature of Sheep and Goats (COGNOSAG) in the 1980s and 1990s initially covered sheep and goats and were later extended to cattle (Broad et al., 1999). Dolling (1999) summarized these efforts and abstracted guidelines for practical use. In 2009, an international meeting to discuss coordination of gene names across vertebrate species was held in Cambridge, UK (Bruford, 2010). While we may hesitate to dictate how genetic terms are defined, adopting a standardized genetic nomenclature system enables researchers to more easily manage and compare their data, both within and across species. The emergence of the use of ontologies in biological research has contributed a new way to effectively organize biological data and facilitate analysis of large datasets. Adopting standardized nomenclature will further enable researchers to unambiguously organize and manage their data. When genomic information must be transferred across species to perpetuate genetic discoveries, the role of a standardized genetic nomenclature becomes even more important.

The goal of this chapter is to clearly state guidelines for nomenclature, with the hope that they will facilitate comparison of results between experiments and, most importantly, prevent confusion.

Locus and Gene Names and Symbols

Locus name and symbol

The following guidelines for cattle gene nomenclature are adapted and abbreviated from the HUGO Gene Nomenclature Committee (HGNC; see Table 24.1 for URL).

A gene is defined as 'a DNA segment that contributes to phenotype/function. In the absence of demonstrated function a gene may be characterized by sequence, transcription or homology.' A locus is not synonymous with a gene. It is defined as 'a point in the genome, identified by a marker, which can be mapped by some means. A locus could be an anonymous non-coding DNA segment or a cytogenetic feature.' A single gene may have numerous loci within it (each may be defined by different markers).

A gene name should be short and specific, and convey the character or function of the gene. Gene names should be written using American spelling and contain only Latin letters or a combination of Latin letters and Arabic numerals.

A gene symbol should start with the same letter as the gene name. The gene symbol should consist of upper-case Latin letters and possibly Arabic numerals. Gene symbols must be unique.

A locus name should be in capitalized Latin letters or a combination of Latin letters and Arabic numerals.

A locus symbol should consist of as few Latin letters as possible or a combination of Latin letters and Arabic numerals. The characters of a symbol should always be capital Latin characters and should begin with the initial letter of the name of the locus. If the locus name is two or more words, then the initial letters of each word should be used in the locus symbol.

Gene and locus names and symbols should be printed in italics whenever possible; otherwise they should be underlined.

When assigning cattle gene nomenclature, the gene name and symbol should be assigned based on existing HGNC nomenclature when 1:1 human:bovine orthology is well established. Recognized members of gene families should be named following existing naming schemes. Initial efforts to provide information about genes predicted during the cattle genome sequencing project resulted in the assignment of standardized names for 5757 cattle genes based on human gene nomenclature (Bovine Genome Sequencing and Analysis Consortium, 2009).

There are two categories of novel cattle genes: (i) novel genes predicted by bioinformatic gene prediction programs; and (ii) novel genes that have been studied prior to the completion of the cattle genome. In addition, it is anticipated that, in the future, additional novel genes will be identified by RNA-sequencing experiments. In cases where no strict 1:1 human orthologue exists that has been assigned nomenclature, the NCBI LOC# or Ensembl ID should be used as a temporary gene symbol for predicted genes with no known function. In order to assign a symbol/name to novel genes, they will need to be manually curated and assigned a unique symbol/name following these guidelines.

Allele name and symbol

These guidelines for allele nomenclature are adapted from Dolling (1999) and mouse genome nomenclature guidelines (see Table 24.1 for URL), consistent with HGNC guidelines.

Alleles do not have to be named, but should be assigned symbols. An allele symbol should always be written following the locus symbol. It can consist of Latin letters or a combination of Latin letters and Arabic numerals. An allele name should be as brief as possible, and should convey the variation associated with the allele. If a new allele is similar to one that has already been named, it should be named according to the breed, geographic location or population of origin. If new alleles are to be named for a recognized locus, they should conform to nomenclature established for that locus. The first letter of the allele name should be lower case.

The allele name and symbol may be identical for a locus detected by biochemical, serological or nucleotide methods. The HGNC guideline recommends that 'allele designation should be written on the same line as gene symbol separated by an asterisk e.g. PGM1*1, the allele is printed as *1'. The wild-type allele can be denoted with a + (e.g. MSTN+). Neither + nor - symbols should be used in alleles detected by biochemical, serological or nucleotide methods. Null alleles should be designated by the number zero. A single nucleotide polymorphism (SNP) allele should be designated based on its dbSNP_id, followed by a hyphen and the specific nucleotide (e.g. MSTNrs1234567-T). If the SNP occurs outside of an identified gene, the SNP locus can be designated using the dbSNP_id as the locus symbol, followed by a hyphen and the nucleotide allelic variants as in rs1234567-T.

The allele name and symbol should be printed in italics whenever possible; otherwise they should be underlined.

Genotype terminology

The genotype of an individual should be shown by printing the relevant locus and allele symbols for the two homologous chromosomes concerned, separated by a slash, e.g. MSTNrs1234567-T/rs1234567-C. Unlinked loci should be separated by a semicolon, e.g. CD11RsaI-2400/2200; ESRPvuII-5700/4200. Linked loci should be separated by a space or dash and listed in linkage order (e.g. POU1F1A/G-STCHC/G-PRSS7A/T), or in alphabetical order if the linkage order is not known. For X-linked loci, the hemizygous case should have a /Y following the locus and allele symbol, e.g. AR-Eco57I-1094/Y. Likewise, Y-linked loci should be designated by /X following the locus and allele symbol.

Gene annotations and the gene ontology (GO)

Advances in genomic technologies require that researchers be able to functionally analyse large, high-throughput datasets to gain insight into the complex systems they are studying. By using the same nomenclature and procedures to describe gene function, gene components can be consistently linked to function in a way that facilitates effective computational analysis and promotes comparative genomics. In 1998, the GO Consortium was formed to standardize functional annotation in the form of gene ontologies that can be used across all eukaryotes (Gene Ontology Consortium, 2000). This effort not only provided a standard method for functional annotation but also promoted data sharing and enabled modelling of functional genomics datasets. The GO consists of three separate ontologies: Biological Process, Cellular Component, and Molecular Function. Genes or gene products are associated with GO terms that represent gene attributes.

A GO term is defined with a term name, a unique identifier and a definition (preferably indicating which of the three sub-ontologies it belongs to, information about its relationships to other GO terms and cited sources). GO terms may also have synonyms, database cross-references and comments to provide more detailed information. A unique GO identifier consists of the prefix 'GO' followed by a colon and seven numerical digits, e.g. GO:0000016. It serves as a key to reference GO terms in a GO database. An example of a GO term is shown in Fig. 24.1.

id: GO:0000016
name: lactase activity
namespace: molecular_function
def: "Catalysis of the reaction: lactose + H2O = D-glucose + D-galactose." [EC:3.2.1.108]
synonym: "lactase-phlorizin hydrolase activity" BROAD [EC:3.2.1.108]
synonym: "lactose galactohydrolase activity" EXACT [EC:3.2.1.108]
xref: EC:3.2.1.108
xref: MetaCyc:LACTASE-RXN
xref: Reactome:20536
is_a: GO:0004553 ! hydrolase activity, hydrolyzing O-glycosyl compounds

Fig. 24.1. An example of a GO term. (For further information, see Table 24.1 for GO website URL.)

Standard GO annotations are maintained by the GO Consortium (see Table 24.1 for URL), which provides updates of quality-checked data for public access. The GO annotations are used by secondary source databases like Entrez Gene (see Table 24.1 for URL; Sayers et al., 2012) and UniProt (UniProt Consortium, 2010), genome browsers like Ensembl (see Table 24.1 for URL; Flicek et al., 2013), and analysis tools like DAVID (see Table 24.1 for URL; Huang et al., 2009), among other publicly accessible resources and tools. A growing number of model organism and livestock animal species (including bovine) databases and working groups contribute annotation sets to the GO repository (McCarthy, 2007; Reese, 2010).

GO annotations are created by capturing the gene product information (database, database accession, name and symbol, type of gene product and species taxon), its associated GO term, GO sub-ontology and evidence for the assertion with references. The current practice for bovine GO annotation is to provide names and symbols based upon a combination of NCBI Entrez Gene and UniProtKB names. In instances where there is no suitable gene symbol, database accessions are used. Continued efforts are made to improve the accuracy of the bovine GO annotations by transferring GO annotations from better annotated orthologues in human and mouse based on Ensembl orthology. As of September 2012, GO annotation for bovine (McCarthy, 2007) comprises 306,746 annotation entries for 41,637 gene products; 86.7% of these annotations are computationally derived (AgBase: see Table 24.1 for URL).

To contribute annotations to the GO, or for a complete list of bovine GO data, users are encouraged to contact either the GO Consortium or AgBase at their respective websites.

Trait and Phenotype Terminology

Cattle traits are conventionally named based on performance (e.g. body weight), physiological parameters (e.g. blood cholesterol level), anatomic locations/dissections (e.g. loin muscle area), physical-chemical properties (e.g. milk content), livelihood soundness (e.g. immune capacity) and exterior appearance (e.g. coat colour), etc. As such, there is a good chance a trait will be named differently by different people, even within a species community. Furthermore, traits have been studied across many species, which adds further complexity to their naming. The study of traits may also involve the study of underlying genes and markers, environments and management protocols that contribute to the manifestation of a trait. The factors that contribute to the naming of a trait are therefore multi-dimensional. As the amount of trait information associated with a gene or chromosomal region is growing exponentially, we cannot overemphasize the need for a standard nomenclature to be used by researchers to communicate as consistently and unambiguously as possible, with the aid of bioinformatics tools.

Traits

Cattle trait terms can be found ubiquitously throughout journal articles, farm reports and daily communications among scientists and cattle industry personnel. A trait term can be created by anyone, and each person may have a slightly different definition for any given term. As such, hundreds of thousands of terms can be found in the literature with various naming conventions used. Previously, there was no central repository where the uniqueness of a trait term could be maintained and checked, until two relatively recent database development efforts emerged: the Online Mendelian Inheritance in Animals (OMIA) database and the Animal QTL database (QTLdb).

OMIA (see Table 24.1 for URL) was initiated in 1978. To date, it contains more than 400 cattle trait variations and/or abnormalities from cattle genetic research publications (Nicholas, Chapter 5). The Animal QTLdb (see Table 24.1 for URL) has a collection of 470 cattle traits, including measurement method variations (Hu et al., 2013), of which 407 traits have at least one QTL. Curators at both OMIA and Animal QTLdb made efforts to make each database entry unique in terms of the names and their representations. Expanded from the QTLdb development, an Animal Trait Ontology (ATO) project at Iowa State University (see Table 24.1 for URL) has been launched to standardize traits for livestock species including cattle. Its initial purpose was to help with organization and management of trait information through the use of a controlled vocabulary to facilitate comparison of QTL results and standardize trait data annotation and retrieval (Hu et al., 2005, 2007). It was soon introduced to the community (Hughes et al., 2008).

Super-traits

Compared to standard gene nomenclature, trait name standardization is far more complex, not only because the same trait can be named differently (e.g. 'loin eye area' versus 'ribeye area'), but also because many factors contribute to how a trait is defined under various circumstances. For example, Fig. 24.2 shows a list of 10 'backfat thickness' variations, each of which is defined by its specific measurement method, measuring time or anatomic location, which may contribute to trait comparison difficulties and increase the potential for confusion.

One attempt to simplify the comparisons was the introduction of the concept of 'trait types' or 'super-traits'. Hu et al. (2005) described trait type as a general physical or chemical property of, or the processes that lead to, or types of measurements that result in, an observation (phenotype). The 'trait types'

or 'super-traits' were initially used to serve as a general concept for a trait, regardless of possible variations in trait names based on measurement times, locations or methods. As the ATO project progressed, the factors in the methods of trait measurements, such as point in time or time span, anatomic locations, instruments, etc., were classified as 'trait modifiers', because they do not constitute a component of a trait, but only affect the way a trait is described. Therefore, the 'super-trait' may only be employed to categorize variations in how a trait is defined or named. For example, 'rib eye area', 'rib-eye area', 'rib muscle area', 'longissimus dorsi muscle area', 'longissimus muscle area', 'loin eye area', 'loin muscle area', etc. can be unified as 'longissimus dorsi muscle area (LMA)'. 'Backfat', 'backfat depth', 'backfat thickness', 'backfat above muscle dorsi', 'backfat intercept', 'backfat linear', etc. may all simply be referred to as 'subcutaneous fat thickness'.

By methods:
    Backfat thickness (average backfat) by ultrasound
    Backfat thickness (average backfat) by ruler
By locations:
    Backfat thickness at the 7th rib
    Backfat thickness at the 12th rib
    Backfat thickness at the 12th-13th rib
    Backfat thickness at the 13th rib
By time:
    Backfat thickness measured at 1-3 days postpartum
    Backfat thickness measured at 40-42 days postpartum
    Backfat thickness measured at 90-92 days postpartum
    Backfat thickness measured at 130-150 days postpartum

Fig. 24.2. An example of the trait name variations by different modifiers such as measurement methods, time and sampling locations. This variation can easily add difficulties for accurate and unambiguous trait comparisons.

Trait hierarchy and ontology

In order to compare QTL across experiments, the Cattle QTLdb uses a trait hierarchy (Fig. 24.3) to provide a framework for organizing the traits and easily locating them (Hu et al., 2013). This approach simplifies the procedures by which traits are defined, linked and compared. Subsequently, a computer program could be implemented to automatically process the database searches, so that when a user queries for a trait by keywords, the database can gather and retrieve related trait names and their associated QTL, put them together and present them to the user in real time.

However, people of different disciplines may see the need for a different trait hierarchy, which may better capture the subtleties required in their field. For example, for body weight gained over a period of time (e.g. average daily gain, ADG), a farmer considers it a production trait, a nutritionist may see it as an indicator for feed conversion efficiency and a veterinarian may find it a health status parameter. Similarly, blood cholesterol levels may be used to predict meat quality by beef producers, and may also be used as a parameter to predict coronary heart disease by those who use cattle as an animal model for human heart disease research. Therefore, a simple hierarchy may be helpful to reduce the complexity in some cases, although it may not be adequate in all cases. In addition, due to the existence of multiple overlapping hierarchies for cattle traits, the management of such data may introduce one more dimension of complexity to the ontology structure.

Ontologies are controlled vocabularies used to describe objects and relationships between them in a formal manner. In an ontology, the Directed Acyclic Graph (DAG), a mathematical graphic modelling method, is used to solve data management problems with complex hierarchical structures. For example, the trait 'marbling' may belong to the 'meat quality', 'adipose trait' or 'muscular system physiology' hierarchies. Computer tools have been developed and are freely available to manage such ontology data with DAG structures. The two most popular tools that are likely to be useful to the cattle genetics community are AmiGO and OBO-Edit (Gene Ontology Tools, see Table 24.1 for URL). AmiGO is an ontology browser adapted to the ATO database, which allows users to share and view trait data stored in ATO with any web browser on the internet. OBO-Edit is a Java-based ontology data editor that can be used by anyone to edit ontology term definitions and relationships, and to export data in Open Biological/Biomedical Ontologies (OBO) format to share data.

Cattle traits
    Disease susceptibility
    General health parameters
    Mastitis
    Organ disorder
    Parasite load
    Parasite resistance
    Carcass characteristics
    Meat quality
    Milk composition - fat
    Milk composition - other
    Milk composition - protein
    Milk processing trait
    Milk yield
    Production traits
    Energy efficiency
    Feed conversion
    Feed intake
    Growth
    Life history traits
    Lifetime production
    Reproduction traits
    Fertility
    General
    Semen quality
    Behavioural
    Conformation
    Pigmentation

Fig. 24.3. A simple cattle trait class hierarchy used in the Animal QTLdb for users to browse for traits of interest. (See Table 24.1 for URL.)

Current status of research

The ATO has been a successful project since its development from the QTLdb several years ago. Recently, the developers of ATO have begun working with Mouse Genome Informatics, the Rat Genome Database, European Animal Disease Genomics Network of Excellence (EADGENE) and the French National Institute for Agricultural Research (INRA) to incorporate the Mammalian Phenotype Ontology (MPO) and the ATO into a unified Vertebrate Trait (VT) Ontology (Park et al., 2013; see Table 24.1 for URL). To reach a proper granularity level of the trait ontology, Product Trait (PT) Ontology (see Table 24.1 for URL) and Clinical Measurement Ontology (CMO; Shimoyama et al., 2012; see Table 24.1 for URL) were introduced. By reuse of existing ontologies and integration of production-specific livestock traits, researchers at INRA have also launched an Animal Trait Ontology for Livestock (ATOL) site, containing over 1000 traits including those of cattle (Golik et al., 2012).

Current efforts have been aimed at enhancing the ability to standardize trait nomenclature within and across species. For example, a disease such as mastitis in dairy cattle may have been considered a 'trait' in classical animal genetic studies. In terms of concept specifications, however, it is not a characteristic cattle trait observable in the general population, but rather an abnormal manifestation in some cattle (resistance to mastitis, by contrast, is a trait). In addition, a trait name may have variations because it is 'modified' by measurement time or method (Fig. 24.2), but the names actually represent the same trait. The separation of diseases from traits reflects the efforts toward a well-defined and standardized trait nomenclature. Standardization of the trait nomenclature will undoubtedly help the cattle genomics community make meaningful trait comparisons, as well as facilitate the transfer of genomics information from some well-studied species. The challenge of using ontologies to standardize and manage trait nomenclature is not only a technical issue, but a community issue, in the sense that it has to be commonly recognized, mutually agreed upon, and widely shared.
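As a rough illustration of the DAG idea described under 'Trait hierarchy and ontology', the sketch below reads a few stanzas written in an OBO-like layout and follows their is_a links upward, so that 'marbling' is found under all three of its parent hierarchies. It is an assumption-laden example rather than the way AmiGO or OBO-Edit work internally: the EX: identifiers are invented for this sketch and are not real ATO or VT accessions, and the parser handles only the id, name and is_a tags.

```python
# Hypothetical trait stanzas; identifiers are placeholders, not real accessions.
OBO_TEXT = """
[Term]
id: EX:0000001
name: meat quality

[Term]
id: EX:0000002
name: adipose trait

[Term]
id: EX:0000003
name: muscular system physiology

[Term]
id: EX:0000004
name: marbling
is_a: EX:0000001 ! meat quality
is_a: EX:0000002 ! adipose trait
is_a: EX:0000003 ! muscular system physiology
"""


def parse_terms(text):
    """Parse [Term] stanzas into {id: {"name": str, "parents": [ids]}}."""
    terms = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line == "[Term]":
            current = {"name": "", "parents": []}
            continue
        if not line or current is None or ":" not in line:
            continue
        tag, value = line.split(":", 1)
        value = value.strip()
        if tag == "id":
            terms[value] = current
        elif tag == "name":
            current["name"] = value
        elif tag == "is_a":
            # Keep only the parent identifier, dropping the "! comment" part.
            current["parents"].append(value.split("!", 1)[0].strip())
    return terms


def ancestors(terms, term_id):
    """Collect every term reachable by following is_a links upward."""
    seen = set()
    stack = list(terms[term_id]["parents"])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(terms[parent]["parents"])
    return seen


terms = parse_terms(OBO_TEXT)
marbling_parents = sorted(terms[p]["name"] for p in ancestors(terms, "EX:0000004"))
print(marbling_parents)
```

Because a term may list several is_a parents, the structure is a graph rather than a tree, which is what allows one trait to be browsed from more than one hierarchy.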

Trait and phenotype nomenclature example, ApoB1/2, ApoB2/3, etc. (Note the difference from recording genotypes, where Until an international committee issues rules italics or asterisks are required.) for trait and phenotype nomenclature, a good practice with wide acceptability is to follow the 'norm' in published materials. Listed in Table 24.1 Future Prospects are some of the best trait reference resources available to date (see table footnote for details). The Gene Ontology and Mammalian Since this has been an active research area in Phenotype Ontology are already playing a role recent years, it is highly recommended that in robust annotation of mammalian genes and users check multiple databases for the best and phenotypes in the context of , quan­ most up-to-date information. titative trait loci, etc. (Smith et a/., 2005). Phenotype is the actual manifestation of Undoubtedly, a standardized cattle genetic observable traits. A phenotype is a trait nomenclature will more effectively facilitate observed in an individual. It usually consists of efficient cattle genome annotation and transfer a trait with characteristic features (e.g. twin­ of knowledge from information-rich species ning), variations that can be described (e.g. such as and mouse, and make it pos­ black spots on the body) or qualities that can be sible for new bioinformatics tools to easily measured (e.g. birth weight of 30 kg). Since streamline data management and genetic ana­ there are so many variations as to how a pheno­ lysis. Meanwhile, it is noteworthy to mention type can be 'observed' (often such observation that the term 'phene' for 'trait' is being used is made indirectly with instruments or more frequently in the scientific literature in through tests) and obtained, a technical guide recent years. It is interesting that in terms of for recording each trait might be ideal. 
Often etymology lineage, 'phene' is to 'phenotype' a description of comments for a phenotype and 'phenome' as 'gene' is to 'genotype' and record may be necessary to correctly under­ 'genome' (Wikipedia, 2012), where 'phene' is stand and use the data. For example, when an equivalent term for 'trait'. However, blood samples are taken, the number of hours Dr Frank Nicholas from the University of Sydney the animal is fasted might be an important has used the term 'phene' in OMIA in a slightly co-factor for the measurement of blood choles­ different but more concise context, namely terol concentration. 'phene is to gene as phenotype is to geno­ When a phenotype is a reflection of a cer­ type', where 'phene' refers to a set of pheno­ tain genotype, the phenotype symbol should types that correspond to a set of genotypes be the same as the genotype symbol. The dif­ determined by a gene. This is practically very ference is that the characters should not be useful in of the future structured genetic underlined or in italics, and they should be writ­ terminology standardization in the genomics era. ten with a space between locus characters and Several genome databases, such as allele characters, instead of an asterisk. Square ArkDB, Animal QTLdb, Bovine Genome brackets [ ] may also be used. Database, Ensembl and NCB! GeneDB, have In classical genetics, phenotypes were played a role in the usage of commonly accepted sometimes used to denote Mendelian geno­ gene/trait notations. Undoubtedly, existing types. This was done using an abbreviation of and new genome databases and tools will fur­ the trait, post-fixed with a plus(+) or minus(-) ther develop and evolve. As such, a standard­ sign to represent 'presence' or 'absence' of ized genetic nomenclature in cattle will certain trait features. For example, halothane­ definitely become crucial for information shar­ negative was denoted as 'Hal-', and halothane­ ing and comparisons between different research positive as 'Hal+'. 
A phenotype denotation can groups, across experiments and even across also be used to represent genetic haplotypes, species. Recently the Animal Genetics journal such that 'K88ab+, ac+, ad-' are written has updated its Author Guidelines insisting that together as an entire denotation. Likewise, proper gene nomenclature be followed: 'All numbers or letters may be used to denote gene names and symbols should be italicized alleles when polymorphisms are observed, for throughout the text, table and figures'; 'Locus Table 24.1. Internet URL addresses for the web resources used in this chapter and cattle trait glossary information.

Data source URL

AgBASE http://www.agbase.msstate.edu/cgi-bin/information/Cow.pl Animal QTLdb http://www.animalgenome.org/QTLdb Animal Trait Ontology project http://www.animalgenome.org/bioinfo/projects/ATO/ ATOL http://www.atol-ontology.com Cattle trait hierarchy http://www.animalgenome.org/QTLdb/exporVcattle_traits CMO project http://bioportal.bioontology.org/ontologies/CMO (BioPortal) http://www.animalgenome.org/bioinfo/projects/cmo DAVID http://david.abcc.ncifcrf.gov Ensembl http://www.ensembl.org Entrez Gene http://www.ncbi.nlm.nih.gov/gene Gene Ontology Tools http://neurolex.org/wiki/Category: Resource:Gene_Ontology_ Tools Genetic glossaries http://www.animalgenome.org/genetics_glossaries GO Consortium http://www.geneontology.org GO structure http://www.geneontology.org/GO.ontology.structure.shtml HGNC guidelines http://www.genenames.org/guidelines.html Mouse genome nomenclature guidelines http://www.informatics.jax.org/mgihome/nomen/gene.shtml OMIA http://omia.angis.org.au/ PT project http://www.animalgenome.org/bioinfo/projects/pt UniProt http://www.uniprot.org VT project http://bioportal.bioontology.org/ontologies/VT (BioPortal) http://www.animalgenome.org/bioinfo/projects/vt

VT, Vertebrate Trait Ontology is a controlled vocabulary for the description of traits (measurable or observable characteristics) pertaining to the morphology, physiology or development of vertebrate organisms.
CMO, Clinical Measurement Ontology is designed to be used to standardize morphological and physiological measurement records generated from clinical and model organism research and health programmes.
PT, Product Trait Ontology is a controlled vocabulary for the description of traits (measurable or observable characteristics) pertaining to products produced by or obtained from the body of an agricultural animal or bird maintained for use and profit.
QTLdb, Animal QTLdb is a database to house all QTL data for all livestock species.
OMIA, Online Mendelian Inheritance in Animals is a comprehensive collection of phenotypic information on heritable animal traits and genes in a comparative context, relating traits to genes where possible.
ATOL, Animal Trait Ontology for Livestock is aimed at defining livestock traits, with a focus on the main types of animal production in line with societal priorities.
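The phenotype symbol conventions described earlier in this section (a space in place of the asterisk, optional square brackets, and the classical '+' or '-' suffix) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not part of any published guideline: the function names and the locus symbol 'ABC' are hypothetical, and enclosing the whole symbol in square brackets is one possible reading of how the brackets are applied. Italics and underlining are typographic properties and are therefore not represented in plain strings.

```python
def phenotype_symbol(genotype_symbol: str, brackets: bool = False) -> str:
    """Derive a phenotype symbol from a 'LOCUS*ALLELE' genotype symbol.

    The asterisk between locus and allele characters is replaced by a
    space. Wrapping the result in square brackets is an assumption about
    how the optional brackets are used.
    """
    locus, separator, allele = genotype_symbol.partition("*")
    if not separator or not locus or not allele:
        raise ValueError(f"expected 'LOCUS*ALLELE', got {genotype_symbol!r}")
    symbol = f"{locus} {allele}"
    return f"[{symbol}]" if brackets else symbol


def presence_absence(trait_abbreviation: str, present: bool) -> str:
    """Classical notation: trait abbreviation post-fixed with '+' or '-'."""
    return trait_abbreviation + ("+" if present else "-")


if __name__ == "__main__":
    # 'ABC*2' is a hypothetical genotype symbol used only for illustration.
    print(phenotype_symbol("ABC*2"))                 # ABC 2
    print(phenotype_symbol("ABC*2", brackets=True))  # [ABC 2]
    print(presence_absence("Hal", True))             # Hal+
    print(presence_absence("Hal", False))            # Hal-
```

Keeping the conversion in one small function makes it easy to apply the same rule consistently when phenotype records are generated from genotype records.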


symbols used in Animal Genetics publications must be confirmed with HGNC' and 'non-human gene names should be checked against NCBI's Entrez Gene database'. This is a good move towards educating the community on the proper use of standardized genetic nomenclatures. Active development and use of a standardized genetic nomenclature will surely help to improve data quality and reusability, and facilitate data comparisons between experiments, laboratories, even species.

Acknowledgements

The authors wish to thank Dr Frank Nicholas from the University of Sydney for useful discussions, inputs and kind review of the draft.

References

Bovine Genome Sequencing and Analysis Consortium et al. (2009) The genome sequence of taurine cattle: a window to ruminant biology and evolution. Science 324, 522-528.
Broad, T.E., Dolling, C.H.S., Lauvergne, J.J. and Millar, P. (1999) Revised COGNOSAG guidelines for gene nomenclature in ruminants 1998. Genetics, Selection, Evolution 31, 263-268.
Bruford, E.A. (2010) Highlights of the 'Gene Nomenclature Across Species' meeting. Human Genomics 4, 213-217.
Dolling, C.H.S. (1999) Standardized genetic nomenclature for cattle. In: Fries, R. and Ruvinsky, A. (eds) The Genetics of Cattle. CAB International, Wallingford, UK, pp. 657-666.
Flicek, P., Ahmed, I., Amode, M.R., Barrell, D., Beal, K., Brent, S., Carvalho-Silva, D., Clapham, P., Coates, G., Fairley, S. et al. (2013) Ensembl 2013. Nucleic Acids Research 41 (Database issue), D48-D55.
Gene Ontology Consortium (2000) Gene Ontology: tool for the unification of biology. Nature Genetics 25, 25-29.
Golik, W., Dameron, O., Bugeon, J., Fatet, A., Hue, I., Hurtaud, C., Reichstadt, M., Meunier-Salaun, M.C., Vernet, J., Joret, L. et al. (2012) ATOL: the multi-species livestock trait ontology. 6th International Conference on Metadata and Semantic Research (MTSR'12), Cadiz, Spain, 28-30 November.
Hu, Z.-L., Dracheva, S., Jang, W.-H., Maglott, D., Bastiaansen, J., Rothschild, M.F. and Reecy, J.M. (2005) A QTL resource and comparison tool for pigs: PigQTLDB. Mammalian Genome 16, 792-800.
Hu, Z.-L., Fritz, E.R. and Reecy, J.M. (2007) AnimalQTLdb: a livestock QTL database tool set for positional QTL information mining and beyond. Nucleic Acids Research 35 (Database issue), D604-D609.
Hu, Z.-L., Park, C.A., Wu, X.-L. and Reecy, J.M. (2013) Animal QTLdb: an improved database tool for livestock animal QTL/association data dissemination in the post-genome era. Nucleic Acids Research 41, D871-D879.
Huang, D.W., Sherman, B.T. and Lempicki, R.A. (2009) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols 4, 44-57.
Hughes, L.M., Bao, J., Hu, Z.-L., Honavar, V.G. and Reecy, J.M. (2008) Animal Trait Ontology (ATO): The importance and usefulness of a unified trait vocabulary for animal species. Journal of Animal Science 86, 1485-1491.
McCarthy, F.M., Bridges, S.M., Wang, N., Magee, G.B., Williams, W.P., Luthe, D.S. and Burgess, S.C. (2007) AgBase: a unified resource for functional analysis in agriculture. Nucleic Acids Research 35 (Database issue), D599-D603.
Park, C.A., Bello, S.M., Smith, C.L., Hu, Z.-L., Munzenmaier, D.H., Nigam, R., Smith, J.R., Shimoyama, M., Eppig, J.T. and Reecy, J.M. (2013) The Vertebrate Trait Ontology: a controlled vocabulary for the annotation of trait data across species. Journal of Biomedical Semantics 4, 13.
Reese, J.T., Childers, C.P., Sundaram, J.P., Dickens, C.M., Childs, K.L., Vile, D.C. and Elsik, C.G. (2010) Bovine Genome Database: supporting community annotation and analysis of the Bos taurus genome. BMC Genomics 11, 645.
Rodgers, B.D., Roalson, E.H., Weber, G.M., Roberts, S.B. and Goetz, F.W. (2007) A proposed nomenclature consensus for the myostatin gene family. American Journal of Physiology - Endocrinology and Metabolism 292, E371-E372.
Sayers, E.W., Barrett, T., Benson, D.A., Bolton, E., Bryant, S.H., Canese, K., Chetvernin, V., Church, D.M., Dicuccio, M., Federhen, S. et al. (2012) Database resources of the National Center for Biotechnology Information. Nucleic Acids Research 40 (Database issue), D13-D25.
Shimoyama, M., Nigam, R., McIntosh, L.S., Nagarajan, R., Rice, T., Rao, D.C. and Dwinell, M.R. (2012) Three ontologies to define phenotype measurement data. Frontiers in Genetics 3, 87.
Smith, C.L., Goldsmith, C.A. and Eppig, J.T. (2005) The Mammalian Phenotype Ontology as a tool for annotating, analyzing and comparing phenotypic information. Genome Biology 6, R7.
UniProt Consortium (2010) The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Research 38 (Database issue), D142-D148.
Wikipedia (2012) Phene. Available at: http://en.wikipedia.org/wiki/Phene (accessed 30 March 2013).