GAPGOM—An R Package for Gene Annotation Prediction Using GO Metrics Casper Van Mourik1,2, Rezvan Ehsani3,4,5* and Finn Drabløs1*

Total Page:16

File Type:pdf, Size:1020Kb

GAPGOM—An R Package for Gene Annotation Prediction Using GO Metrics Casper Van Mourik1,2, Rezvan Ehsani3,4,5* and Finn Drabløs1* van Mourik et al. BMC Res Notes (2021) 14:162 https://doi.org/10.1186/s13104-021-05580-1 BMC Research Notes RESEARCH NOTE Open Access GAPGOM—an R package for gene annotation prediction using GO metrics Casper van Mourik1,2, Rezvan Ehsani3,4,5* and Finn Drabløs1* Abstract Objective: Properties of gene products can be described or annotated with Gene Ontology (GO) terms. But for many genes we have limited information about their products, for example with respect to function. This is particularly true for long non-coding RNAs (lncRNAs), where the function in most cases is unknown. However, it has been shown that annotation as described by GO terms to some extent can be predicted by enrichment analysis on properties of co- expressed genes. Results: GAPGOM integrates two relevant algorithms, lncRNA2GOA and TopoICSim, into a user-friendly R package. Here lncRNA2GOA does annotation prediction by co-expression, whereas TopoICSim estimates similarity between GO graphs, which can be used for benchmarking of prediction performance, but also for comparison of GO graphs in general. The package provides an improved implementation of the original tools, with substantial improvements in performance and documentation, unifed interfaces, and additional features. Keywords: Gene ontology, Annotation, Prediction, Long non-coding RNAs Introduction processes of gene regulation [3], but any useful annota- Te properties of gene products like proteins or non-pro- tion is in most cases missing. However, it has been shown tein coding RNAs (ncRNAs) can be described with Gene for example by Jiang et al. that annotation by GO terms Ontology (GO) terms as provided by the GO Consor- to some extent can be predicted [4]. A typical approach tium [1]. GO resources describe a gene product by anno- will identify sets of genes with correlated expression pat- tating it with standardized terms organized as a DAG tern across several experiments, based on the assump- (Directed Acyclic Graph). Such annotation should ideally tion that genes with correlated expression pattern may be be based on experimental data. However, the amount of involved in similar processes. It will then use enrichment experimental information that is available varies a lot, analysis of GO terms for the gene set to identify these and for many genes very little is known about their func- processes. Tis approach was recently benchmarked tion. Tis is particularly true for most ncRNAs, includ- using an improved strategy for comparing known and ing long ncRNAs (lncRNAs). Te number of lncRNAs predicted GO terms for genes with known annotation seems to be comparable to the number of protein-coding [5]. Although the benchmarking had to be done on pro- genes [2], and they are known to be important in several tein-coding genes (due to the lack of annotated lncRNA genes), the results indicate that the same approach can be used also on other classes of genes, like lncRNAs. *Correspondence: [email protected]; [email protected] Here we present a user-friendly and well-documented 1 Department of Cancer Research and Molecular Medicine, NTNU- Norwegian University of Science and Technology, 7491 Trondheim, implementation of two tools for annotation predic- Norway tion and benchmarking, lncRNA2GOA (lncRNA to GO 5 Present Address: Department of Informatics, University of Bergen, Annotation) [5] and TopoICSim (Topological Informa 5020 Bergen, Norway - Full list of author information is available at the end of the article tion Content Similarity) [6], integrated into the R package © The Author(s) 2021. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/. The Creative Commons Public Domain Dedication waiver (http:// creat iveco mmons. org/ publi cdoma in/ zero/1. 0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. van Mourik et al. BMC Res Notes (2021) 14:162 Page 2 of 5 GAPGOM (Gene Annotation Prediction using GO Met- (Fisher, Sobolev) measures. Te tool will then do an rics). Tis package provides a workbench for exploring enrichment analysis on GO terms for the most correlated annotation prediction. We also show how these tools can genes, by default using the top 250 genes, which previ- be the basis for alternative approaches, by using other ously has been found to be the optimal cutof [5]. Te types of annotation terms or in combination with GO- enriched terms are used as predicted annotation for the based defnition of gene sets for enrichment analysis. query gene. Te predicted GO annotation can optionally be compared to actual GO annotation on benchmark sets Main text by using TopoICSim, for example in a simple leave-one- Overview out approach. Te overall performance will be indicative GAPGOM integrates the tools lncRNA2GOA and of the level of performance that can be expected also for TopoICSim together with libraries on properties like novel genes. Both alternative similarity measures and gene expression and annotation data into a pipeline as more advanced cross-validations can easily be integrated illustrated in Fig. 1. Te main use of GAPGOM will be with GAPGOM through the R framework. to go from expression data for a query gene to predicted Te lncRNA2GOA tool can also be used in alternative GO annotation using lncRNA2GOA, by frst defning a approaches as indicated in Table 1. Te main approach set of co-expressed genes by correlating the expression of described above is Approach 1. A similar approach can the query gene to other genes across several experiments. be used for predicting “class” rather than GO annotation Te available methods for correlation represent both sta- (Approach 2). Te “class” can be any type of annotation tistical (Pearson, Spearman, Kendall) and geometrical term for which we have good training data, as for exam- ple whether the gene may be associated with specifc types of cancer. Tis is basically the same as the stand- ard approach, except that we need annotation of “class” Query rather than GO for the training data. Here we do not Property Input need TopoICSim for benchmarking as it will not involve t comparisons between complex graph structures of GO Library terms. Instead, we will typically use a 2 × 2 confusion Property matrix where predictions on the benchmark set are clas- sifed as true or false positive or negative predictions. Similarity Te TopoICSim tool can also be used in another alter- Define gene se native approach (Approach 3) by predicting “class” from gene sets defned through similarity of GO terms rather Library than gene expression. Tis means that we can use GO Annotaon annotation to predict “class”, which can be more specifc and informative regarding the property we are inves- Enrichment tigating, compared to a set of GO terms. Again, bench- Find enrichment mark scoring will typically be based on a 2 × 2 confusion Query matrix. Annotaon Te use of lncRNA2GOA and TopoICSim has been integrated into the R package GAPGOM, with improved usability and user-friendliness, function documentation and vignettes, signifcant speed improvements, a more (Benchmark) Similarity unifed input interface, unit tests, and extra features. Te Predicon Performance Annotaon Table 1 Main alternative approaches for annotation prediction Result with GAPGOM tools Fig. 1 The GAPGOM pipeline for annotation prediction. The fowchart shows the main steps of GAPGOM. For a query gene and a Approach Tool Property Annotation Benchmarka certain property type, a gene set is defned based on similarity over a property library. The gene set is then populated with data from an 1 lncRNA2GOA Expression GO TopoICSim annotation library and enriched terms of the gene set are identifed. 2 lncRNA2GOA Expression Class 2 2 × The predicted annotation may optionally be compared to actual 3 TopoICSim GO Class 2 2 annotation if the query gene is part of a benchmark dataset (i.e., with × a 2 2 represents benchmarking using a confusion matrix over true and false known annotation) × positive and negative predictions van Mourik et al. BMC Res Notes (2021) 14:162 Page 3 of 5 main speedup is seen in TopoICSim with more efcient In our frst approach to the challenge (the “direct” use of R-specifc data structures and improvements in approach in [9]) the diferent cancer types were used as the algorithm. Integration through R also makes it easy to annotation terms for the lncRNAs in the training set. For add new features. Te package is available on Bioconduc- benchmarking each lncRNA was then “re-annotated” tor and GitHub. Documentation and vignettes are made with lncRNA2GOA by identifying co-expressed lncRNA available in Rmarkdown and HTML (webpage) form and genes of the training set and doing enrichment analysis includes both an extensive introduction to the package on cancer type for the co-expressed set. Tis gave a sta- (An introduction to GAPGOM) and examples of bench- tistically signifcant enrichment for the expected cancer marking (Benchmarks and other GO similarity methods). type, but the enrichment was not very strong. Tis prob- ably indicates that cancer type may not be strongly asso- Applications ciated with co-expression in normal cells.
Recommended publications
  • Supplementary Table 1: Adhesion Genes Data Set
    Supplementary Table 1: Adhesion genes data set PROBE Entrez Gene ID Celera Gene ID Gene_Symbol Gene_Name 160832 1 hCG201364.3 A1BG alpha-1-B glycoprotein 223658 1 hCG201364.3 A1BG alpha-1-B glycoprotein 212988 102 hCG40040.3 ADAM10 ADAM metallopeptidase domain 10 133411 4185 hCG28232.2 ADAM11 ADAM metallopeptidase domain 11 110695 8038 hCG40937.4 ADAM12 ADAM metallopeptidase domain 12 (meltrin alpha) 195222 8038 hCG40937.4 ADAM12 ADAM metallopeptidase domain 12 (meltrin alpha) 165344 8751 hCG20021.3 ADAM15 ADAM metallopeptidase domain 15 (metargidin) 189065 6868 null ADAM17 ADAM metallopeptidase domain 17 (tumor necrosis factor, alpha, converting enzyme) 108119 8728 hCG15398.4 ADAM19 ADAM metallopeptidase domain 19 (meltrin beta) 117763 8748 hCG20675.3 ADAM20 ADAM metallopeptidase domain 20 126448 8747 hCG1785634.2 ADAM21 ADAM metallopeptidase domain 21 208981 8747 hCG1785634.2|hCG2042897 ADAM21 ADAM metallopeptidase domain 21 180903 53616 hCG17212.4 ADAM22 ADAM metallopeptidase domain 22 177272 8745 hCG1811623.1 ADAM23 ADAM metallopeptidase domain 23 102384 10863 hCG1818505.1 ADAM28 ADAM metallopeptidase domain 28 119968 11086 hCG1786734.2 ADAM29 ADAM metallopeptidase domain 29 205542 11085 hCG1997196.1 ADAM30 ADAM metallopeptidase domain 30 148417 80332 hCG39255.4 ADAM33 ADAM metallopeptidase domain 33 140492 8756 hCG1789002.2 ADAM7 ADAM metallopeptidase domain 7 122603 101 hCG1816947.1 ADAM8 ADAM metallopeptidase domain 8 183965 8754 hCG1996391 ADAM9 ADAM metallopeptidase domain 9 (meltrin gamma) 129974 27299 hCG15447.3 ADAMDEC1 ADAM-like,
    [Show full text]
  • Table S6. Names and Functions of 44 Target Genes and 4 Housekeeping Genes Assessed for Gene Expression Measurements
    Table S6. Names and functions of 44 target genes and 4 housekeeping genes assessed for gene expression measurements. Gene Gene name Function Category Reference CD45 CD45 (leukocyte common antigen) Regulation of T-cell and B-cell antigen receptor signaling Adaptive NCBI, UniProt HIVEP2 Human immunodeficiency virus typeI enhancer2 Transcription factor, V(D)J recombination, MHC enhancer binding Adaptive (Diepeveen et al. 2013) HIVEP3 Human immunodeficiency virus typeI enhancer3 Transcription factor, V(D)J recombination, MHC enhancer binding Adaptive (Diepeveen et al. 2013) IgM-lc Immunoglobulin light chain Recognition of antigen or pathogen Adaptive NCBI, UniProt Integ-Bt Integrin-beta 1 Cell signaling and adhesion of immunoglobulin Adaptive NCBI, UniProt Lymph75 Lymphocyte antigen 75 Directs captured antigens to lymphocytes Adaptive (Birrer et al. 2012) Lympcyt Lymphocyte cytosolic protein 2 Positive role in promoting T-cell development and activation Adaptive NCBI, UniProt TAP Tap-binding protein (Tapasin) Transport of antigenic peptides, peptide loading on MHC I Adaptive NCBI, UniProt AIF Allograft inflammation factor Inflammatory responses, allograft rejection, activation of macrophages Innate (Roth et al. 2012) Calrcul Calreticulin Chaperone, promotes phagocytosis and clearance of apoptotic cells Innate NCBI, UniProt Cf Coagulation factor II Blood clotting and inflammation response Innate (Birrer et al. 2012) IL8 Interleukin 8 Neutrophil chemotactic factor, phagocytosis, inflammatory activity Innate NCBI, UniProt Intf Interferon induced transmembrane protein 3 Negative regulation of viral entry into host cell, antiviral response Innate NCBI, UniProt Kin Kinesin Intracellular transport Innate (Roth et al. 2012) LectptI Lectin protein type I Pathogen recognition receptors (C-type lectin type I) Innate NCBI, UniProt LectpII Lectin protein type II Pathogen recognition receptors (C-type lectin type II) Innate NCBI, UniProt Nramp Natural resistance-assoc macrophage protein Macrophage activation Innate (Roth et al.
    [Show full text]
  • Mygene.Info R Client
    MyGene.info R Client Adam Mark, Ryan Thompson, Chunlei Wu May 19, 2021 Contents 1 Overview ..............................2 2 Gene Annotation Service ...................2 2.1 getGene .............................2 2.2 getGenes ............................2 3 Gene Query Service ......................3 3.1 query ..............................3 3.2 queryMany ...........................4 4 makeTxDbFromMyGene....................5 5 Tutorial, ID mapping .......................6 5.1 Mapping gene symbols to Entrez gene ids ........6 5.2 Mapping gene symbols to Ensembl gene ids .......7 5.3 When an input has no matching gene ...........8 5.4 When input ids are not just symbols ............8 5.5 When an input id has multiple matching genes ......9 5.6 Can I convert a very large list of ids?............ 11 6 References ............................. 11 MyGene.info R Client 1 Overview MyGene.Info provides simple-to-use REST web services to query/retrieve gene annotation data. It’s designed with simplicity and performance emphasized. mygene is an easy-to-use R wrapper to access MyGene.Info services. 2 Gene Annotation Service 2.1 getGene • Use getGene, the wrapper for GET query of "/gene/<geneid>" service, to return the gene object for the given geneid. > gene <- getGene("1017", fields="all") > length(gene) [1] 1 > gene["name"] [[1]] NULL > gene["taxid"] [[1]] NULL > gene["uniprot"] [[1]] NULL > gene["refseq"] [[1]] NULL 2.2 getGenes • Use getGenes, the wrapper for POST query of "/gene" service, to return the list of gene objects for the given character vector of geneids. > getGenes(c("1017","1018","ENSG00000148795")) DataFrame with 3 rows and 7 columns 2 MyGene.info R Client query _id X_version entrezgene name <character> <character> <integer> <character> <character> 1 1017 1017 4 1017 cyclin dependent kin.
    [Show full text]
  • Solar and Ultraviolet Radiation
    SOLAR AND ULTRAVIOLET RADIATION Solar and ultraviolet radiation were considered by a previous IARC Working Group in 1992 (IARC, 1992). Since that time, new data have become available, these have been incorpo- rated into the Monograph, and taken into consideration in the present evaluation. 1. Exposure Data 1.1 Nomenclature and units For the purpose of this Monograph, the Terrestrial life is dependent on radiant energy photobiological designations of the Commission from the sun. Solar radiation is largely optical Internationale de l’Eclairage (CIE, International radiation [radiant energy within a broad region Commission on Illumination) are the most of the electromagnetic spectrum that includes relevant, and are used throughout to define ultraviolet (UV), visible (light) and infrared the approximate spectral regions in which radiation], although both shorter wavelength certain biological absorption properties and (ionizing) and longer wavelength (microwaves biological interaction mechanisms may domi- and radiofrequency) radiation is present. The nate (Commission Internationale de l’Eclairage, wavelength of UV radiation (UVR) lies in the 1987). range of 100–400 nm, and is further subdivided Sources of UVR are characterized in radio- into UVA (315–400 nm), UVB (280–315 nm), metric units. The terms dose (J/m2) and dose rate and UVC (100–280 nm). The UV component (W/m 2) pertain to the energy and power, respec- of terrestrial radiation from the midday sun tively, striking a unit surface area of an irradi- comprises about 95% UVA and 5% UVB; UVC ated object (Jagger, 1985). The radiant energy and most of UVB are removed from extraterres- delivered to a given area in a given time is also trial radiation by stratospheric ozone.
    [Show full text]
  • Kin Discrimination Promotes Horizontal Gene Transfer Between Unrelated Strains in Bacillus Subtilis
    ARTICLE https://doi.org/10.1038/s41467-021-23685-w OPEN Kin discrimination promotes horizontal gene transfer between unrelated strains in Bacillus subtilis ✉ Polonca Stefanic 1,5,6 , Katarina Belcijan1,5, Barbara Kraigher 1, Rok Kostanjšek1, Joseph Nesme2, Jonas Stenløkke Madsen2, Jasna Kovac 3, Søren Johannes Sørensen 2, Michiel Vos 4 & ✉ Ines Mandic-Mulec 1,6 Bacillus subtilis is a soil bacterium that is competent for natural transformation. Genetically 1234567890():,; distinct B. subtilis swarms form a boundary upon encounter, resulting in killing of one of the strains. This process is mediated by a fast-evolving kin discrimination (KD) system consisting of cellular attack and defence mechanisms. Here, we show that these swarm antagonisms promote transformation-mediated horizontal gene transfer between strains of low related- ness. Gene transfer between interacting non-kin strains is largely unidirectional, from killed cells of the donor strain to surviving cells of the recipient strain. It is associated with acti- vation of a stress response mediated by sigma factor SigW in the donor cells, and induction of competence in the recipient strain. More closely related strains, which in theory would experience more efficient recombination due to increased sequence homology, do not upregulate transformation upon encounter. This result indicates that social interactions can override mechanistic barriers to horizontal gene transfer. We hypothesize that KD-mediated competence in response to the encounter of distinct neighbouring strains could maximize the probability of efficient incorporation of novel alleles and genes that have proved to function in a genomically and ecologically similar context. 1 Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia.
    [Show full text]
  • Mygene.Info R Client
    MyGene.info R Client Adam Mark, Ryan Thompson, Chunlei Wu May 19, 2021 Contents 1 Overview ..............................2 2 Gene Annotation Service ...................2 2.1 getGene .............................2 2.2 getGenes ............................2 3 Gene Query Service ......................3 3.1 query ..............................3 3.2 queryMany ...........................4 4 makeTxDbFromMyGene....................5 5 Tutorial, ID mapping .......................6 5.1 Mapping gene symbols to Entrez gene ids ........6 5.2 Mapping gene symbols to Ensembl gene ids .......7 5.3 When an input has no matching gene ...........8 5.4 When input ids are not just symbols ............8 5.5 When an input id has multiple matching genes ......9 5.6 Can I convert a very large list of ids?............ 11 6 References ............................. 11 MyGene.info R Client 1 Overview MyGene.Info provides simple-to-use REST web services to query/retrieve gene annotation data. It’s designed with simplicity and performance emphasized. mygene is an easy-to-use R wrapper to access MyGene.Info services. 2 Gene Annotation Service 2.1 getGene • Use getGene, the wrapper for GET query of "/gene/<geneid>" service, to return the gene object for the given geneid. > gene <- getGene("1017", fields="all") > length(gene) [1] 1 > gene["name"] [[1]] NULL > gene["taxid"] [[1]] NULL > gene["uniprot"] [[1]] NULL > gene["refseq"] [[1]] NULL 2.2 getGenes • Use getGenes, the wrapper for POST query of "/gene" service, to return the list of gene objects for the given character vector of geneids. > getGenes(c("1017","1018","ENSG00000148795")) DataFrame with 3 rows and 7 columns 2 MyGene.info R Client query _id X_version entrezgene name <character> <character> <integer> <character> <character> 1 1017 1017 4 1017 cyclin dependent kin.
    [Show full text]
  • Bioinformatics - 2 - Investigation of C
    Advanced Skills in Biochemistry: Bioinformatics - 2 - Investigation of C. elegans protein kinase polypeptides Exercise 1 The purpose of this workshop is to familarise you with some of the databases and tools that are available for the analysis of proteins. For the purposes of this exercise, you will again be investigating the family of catalytic subunits of the cyclic AMP-dependent protein kinase (PK-A) in the free-living nematode, C. elegans. 1. The C. elegans polypeptide sequences we are interested in are encoded by the F47F2.1 gene on the X chromosome and the kin-1 gene on chromosome 1. In addition, we need the sequence of the mouse PK-A - catalytic subunit. The easiest way of obtaining each of these sequences is by making use of the Sequence Retrieval System (SRS). You should therefore visit the SRS Homepage: http://srs6.ebi.ac.uk/ 2. From the home page, select the Library Page tab. Then, under the UniProt Universal Protein Resource section, select the UniProtKB and UniParc databases. Then select the Standard Query Form. You should then carry out separate searches for the following accession numbers (i.e. select AccessionNumber from the drop-down menu in place of the default AllText): P05132, UPI000002AC77 and Q7JP68. These searches should find records for the polypeptide sequence for the mouse -catalytic subunit, the polypeptide sequence of the catalytic subunit encoded by the kin-1 gene and the F47F2.1b polypeptide sequence. Each of the records should be saved locally (e.g. on the desktop). 3. You should now prepare a simple text
    [Show full text]
  • Karyomegalic Interstitial Nephritis and DNA Damage-Induced Polyploidy in Fan1 Nuclease-Defective Knock-In Mice
    Downloaded from genesdev.cshlp.org on September 24, 2021 - Published by Cold Spring Harbor Laboratory Press RESEARCH COMMUNICATION Fancd2 ubiquitylation (K561) causes defective ICL repair Karyomegalic interstitial (Garcia-Higuera et al. 2001; Knipscheer et al. 2009; Deans nephritis and DNA damage- and West 2011). Fan1 is a 5′ flap endonuclease recruited to sites where induced polyploidy in Fan1 replisomes stall—for example, at ICLs—in a manner that requires Fancd2 ubiquitylation (Kratz et al. 2010; Liu et al. nuclease-defective knock-in 2010; MacKay et al. 2010; Smogorzewska et al. 2010). The mice nuclease activities of Fan1 are mediated by a C-terminal “ ” 1 1 VRR_nuc domain (Iyer et al. 2006), which is conserved Christophe Lachaud, Meghan Slean, in all orthologs as far back as yeasts. Fan1 also has a SAP- Francesco Marchesi,2 Claire Lock,3 Edward Odell,3 type DNA-binding domain and a UBZ4-type ubiquitin- Dennis Castor,1,4 Rachel Toth,1 binding domain, which interacts with ubiquityl-Fancd2, and John Rouse1 thereby recruiting Fan1 to sites of replisome stalling. Fan1 is required for efficient ICL repair, but, surprisingly, 1MRC Protein Phosphorylation and Ubiquitylation Unit, College this does not require interaction with ubiquityl-Fancd2 of Life Sciences, Sir James Black Centre, University of Dundee, (Lachaud et al. 2016). Similarly, genetic analyses in worms Dundee DD1 5EH, United Kingdom 2School of Veterinary and human cells showed that Fan1 and Fancd2 mutations Medicine, College of Medical, Veterinary, and Life Sciences, are not epistatic with respect to hypersensitivity to ICL- University of Glasgow, Glasgow G61 1QH, United Kingdom inducing drugs, suggesting that the two genes are not 3Department of Head and Neck Pathology, Guy’s Hospital, equivalent in function when it comes to ICL repair (Yosh- London SE1 9RT, United Kingdom ikiyo et al.
    [Show full text]
  • Somamer Reagents Generated to Human Proteins Number Somamer Seqid Analyte Name Uniprot ID 1 5227-60
    SOMAmer Reagents Generated to Human Proteins The exact content of any pre-specified menu offered by SomaLogic may be altered on an ongoing basis, including the addition of SOMAmer reagents as they are created, and the removal of others if deemed necessary, as we continue to improve the performance of the SOMAscan assay. However, the client will know the exact content at the time of study contracting. SomaLogic reserves the right to alter the menu at any time in its sole discretion. Number SOMAmer SeqID Analyte Name UniProt ID 1 5227-60 [Pyruvate dehydrogenase (acetyl-transferring)] kinase isozyme 1, mitochondrial Q15118 2 14156-33 14-3-3 protein beta/alpha P31946 3 14157-21 14-3-3 protein epsilon P62258 P31946, P62258, P61981, Q04917, 4 4179-57 14-3-3 protein family P27348, P63104, P31947 5 4829-43 14-3-3 protein sigma P31947 6 7625-27 14-3-3 protein theta P27348 7 5858-6 14-3-3 protein zeta/delta P63104 8 4995-16 15-hydroxyprostaglandin dehydrogenase [NAD(+)] P15428 9 4563-61 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-1 P19174 10 10361-25 2'-5'-oligoadenylate synthase 1 P00973 11 3898-5 26S proteasome non-ATPase regulatory subunit 7 P51665 12 5230-99 3-hydroxy-3-methylglutaryl-coenzyme A reductase P04035 13 4217-49 3-hydroxyacyl-CoA dehydrogenase type-2 Q99714 14 5861-78 3-hydroxyanthranilate 3,4-dioxygenase P46952 15 4693-72 3-hydroxyisobutyrate dehydrogenase, mitochondrial P31937 16 4460-8 3-phosphoinositide-dependent protein kinase 1 O15530 17 5026-66 40S ribosomal protein S3 P23396 18 5484-63 40S ribosomal protein
    [Show full text]
  • Self-Identity Barcodes Encoded by Six Expansive Polymorphic Toxin Families Discriminate Kin in Myxobacteria
    Self-identity barcodes encoded by six expansive polymorphic toxin families discriminate kin in myxobacteria Christopher N. Vassalloa,1 and Daniel Walla,2 aDepartment of Molecular Biology, University of Wyoming, Laramie, WY 82071 Edited by Christine Jacobs-Wagner, Yale University, West Haven, CT, and approved October 24, 2019 (received for review July 20, 2019) Myxobacteria are an example of how single-cell individuals can dynamic OM proteins, and, when 2 compatible cells touch, transition into multicellular life by an aggregation strategy. For these multiple receptor complexes from each cell coalesce into distinct and all organisms that consist of social groups of cells, discrimination foci that bridge the boundary between the 2 cells. This transient against, and exclusion of, nonself is critical. In myxobacteria, TraA is interaction culminates in an apparent membrane fusion and bi- a polymorphic cell surface receptor that identifies kin by homotypic directional transfer of proteins and lipids before cells separate by binding, and in so doing exchanges outer membrane (OM) proteins gliding motility (5–7). This striking and robust behavior is and lipids between cells with compatible receptors. However, TraA thought to help rejuvenate and maintain homeostasis of the cell variability alone is not sufficient to discriminate against all cells, as envelope in a population that ages or encounters insults in traA allele diversity is not necessarily high among local strains. To constantly fluctuating environments (8, 9). increase discrimination ability, myxobacteria include polymorphic In nutrient-rich soils, myxobacteria populations are numerous OM lipoprotein toxins called SitA in their delivered cargo, which and diverse (10, 11). Local strains compete with each other and poison recipient cells that lack the cognate, allele-specific SitI immu- must establish and maintain a group identity by recognizing and nity protein.
    [Show full text]
  • Genome-Wide Association Studies in Alzheimer Disease
    NEUROLOGICAL REVIEW Genome-Wide Association Studies in Alzheimer Disease Stephen C. Waring, DVM, PhD; Roger N. Rosenberg, MD he genetics of Alzheimer disease (AD) to date support an age-dependent dichotomous model whereby earlier age of disease onset (Ͻ60 years) is explained by 3 fully penetrant genes (APP [NCBI Entrez gene 351], PSEN1 [NCBI Entrez gene 5663], and PSEN2 [NCBI Entrez gene 5664]), whereas later age of disease onset (Ն65 years) representing most cases Tof AD has yet to be explained by a purely genetic model. The APOE gene (NCBI Entrez gene 348) is the strongest genetic risk factor for later onset, although it is neither sufficient nor necessary to ex- plain all occurrences of disease. Numerous putative genetic risk alleles and genetic variants have been reported. Although all have relevance to biological mechanisms that may be associated with AD patho- genesis, they await replication in large representative populations. Genome-wide association studies have emerged as an increasingly effective tool for identifying genetic contributions to complex dis- eases and represent the next frontier for furthering our understanding of the underlying etiologic, bio- logical, and pathologic mechanisms associated with chronic complex disorders. There have already been success stories for diseases such as macular degeneration and diabetes mellitus. Whether this will hold true for a genetically complex and heterogeneous disease such as AD is not known, al- though early reports are encouraging. This review considers recent publications from studies that have successfully applied genome-wide association methods to investigations of AD by taking advantage of the currently available high-throughput arrays, bioinformatics, and software advances.
    [Show full text]
  • Mouse Models of Inherited Retinal Degeneration with Photoreceptor Cell Loss
    cells Review Mouse Models of Inherited Retinal Degeneration with Photoreceptor Cell Loss 1, 1, 1 1,2,3 1 Gayle B. Collin y, Navdeep Gogna y, Bo Chang , Nattaya Damkham , Jai Pinkney , Lillian F. Hyde 1, Lisa Stone 1 , Jürgen K. Naggert 1 , Patsy M. Nishina 1,* and Mark P. Krebs 1,* 1 The Jackson Laboratory, Bar Harbor, Maine, ME 04609, USA; [email protected] (G.B.C.); [email protected] (N.G.); [email protected] (B.C.); [email protected] (N.D.); [email protected] (J.P.); [email protected] (L.F.H.); [email protected] (L.S.); [email protected] (J.K.N.) 2 Department of Immunology, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand 3 Siriraj Center of Excellence for Stem Cell Research, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand * Correspondence: [email protected] (P.M.N.); [email protected] (M.P.K.); Tel.: +1-207-2886-383 (P.M.N.); +1-207-2886-000 (M.P.K.) These authors contributed equally to this work. y Received: 29 February 2020; Accepted: 7 April 2020; Published: 10 April 2020 Abstract: Inherited retinal degeneration (RD) leads to the impairment or loss of vision in millions of individuals worldwide, most frequently due to the loss of photoreceptor (PR) cells. Animal models, particularly the laboratory mouse, have been used to understand the pathogenic mechanisms that underlie PR cell loss and to explore therapies that may prevent, delay, or reverse RD. Here, we reviewed entries in the Mouse Genome Informatics and PubMed databases to compile a comprehensive list of monogenic mouse models in which PR cell loss is demonstrated.
    [Show full text]