Network-Based Functional Prediction Augments Genetic Association to Predict Candidate Genes for Histamine Hypersensitivity in Mice
Total Page:16
File Type:pdf, Size:1020Kb
G3: Genes|Genomes|Genetics Early Online, published on October 23, 2019 as doi:10.1534/g3.119.400740 INVESTIGATIONS Network-based functional prediction augments genetic association to predict candidate genes for histamine hypersensitivity in mice Anna L. Tyler1, Abbas Raza2, Dimitry N. Krementsov3, Laure K. Case1, Rui Huang4, Runlin Z. Ma4, Elizabeth P. Blankenhorn5, Cory Teuscher2,6 and J. Matthew Mahoney7,8,∗ 1The Jackson Laboratory, 600 Main St. Bar Harbor, ME, 04609, United States., 2Department of Medicine, University of Vermont, Burlington, VT, United States., 3Department of Biomedical and Health Sciences, University of Vermont, Burlington, VT, United States., 4School of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China, 5Department of Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, PA, United States, 6Department of Pathology, University of Vermont, Burlington, VT, United States, 7Department of Neurological Sciences, University of Vermont Larner College of Medicine, Burlington, VT, United States, 8Department of Computer Science, University of Vermont, Burlington, VT, United States 1 2 ABSTRACT Genetic mapping is a primary tool of genetics in model organisms; however, many quantitative KEYWORDS 3 trait loci (QTL) contain tens or hundreds of positional candidate genes. Prioritizing these genes for validation is Gene prioritiza- 4 often ad hoc and biased by previous findings. Here we present a technique for prioritizing positional candidates tion 5 based on computationally inferred gene function. Our method uses machine learning with functional genomic machine learning 6 networks, whose links encode functional associations among genes, to identify network-based signatures quantitative trait 7 of functional association to a trait of interest. We demonstrate the method by functionally ranking positional locus 8 candidates in a large locus on mouse Chr 6 (45.9 Mb to 127.8 Mb) associated with histamine hypersensitivity histamine hyper- 9 (Histh). Histh is characterized by systemic vascular leakage and edema in response to histamine challenge, sensitivity 10 which can lead to multiple organ failure and death. Although Histh risk is strongly influenced by genetics, little Clarkson’s Dis- 11 is known about its underlying molecular or genetic causes, due to genetic and physiological complexity of the ease 12 trait. To dissect this complexity, we ranked genes in the Histh locus by predicting functional association with 13 multiple Histh-related processes. We integrated these predictions with new single nucleotide polymorphism 14 (SNP) association data derived from a survey of 23 inbred mouse strains and congenic mapping data. The 15 top-ranked genes included Cxcl12, Ret, Cacna1c, and Cntn3, all of which had strong functional associations 16 and were proximal to SNPs segregating with Histh. These results demonstrate the power of network-based 17 computational methods to nominate highly plausible quantitative trait genes even in highly challenging cases 18 involving large QTLs and extreme trait complexity. 19 7 1 INTRODUCTION produces positional candidate genes. Rigorously narrowing a QTL by fine mapping with congenic strains can take years or decades, 8 2 Identifying causal variants within quantitative trait loci (QTLs) is particularly in organisms like mice that have long generation times. 9 3 a central problem of genetics, but genetic linkage often prevents Moreover, high-resolution congenic mapping often reveals that 10 4 narrowing QTLs to less than several megabases (Mb). Thus, QTLs the overall QTL effect is due to multiple linked genes within the 11 5 may contain hundreds of candidate genes. Instead of revealing the QTL rather than a single gene (Parker et al. 2013; Yazbek et al. 2011). 12 6 exact gene (or genes) responsible for trait variation, QTL mapping Thus, positional data alone are generally insufficient to nominate 13 candidate genes for subsequent biological follow up. To overcome 14 Manuscript compiled: Tuesday 22nd October, 2019 the limitations of mapping data, researchers look within a QTL for 15 ∗ Address for correspondence: University Of Vermont Larner College of Medicine, 95 plausible candidate genes. However, these selections are typically 16 Carrigan Drive, Stafford 118, Burlington, VT 05401. Email: [email protected] 1 © The Author(s) 2013. Published by the Genetics Society of America. 1 done by ad hoc criteria using prior knowledge or a literature search. functional predictions of Histh-related genes. By augmenting posi- 63 2 This strategy is strongly biased toward prior knowledge and is tional data with functional predictions, we dramatically reduced 64 3 highly error prone due to missing annotations. There is a need for the candidate gene list to a tractable set of high-quality candidates 65 4 rigorous and systematic strategies to distinguish among positional that are implicated in Histh-related processes. 66 5 candidate genes for mechanistic follow up. 6 We developed a novel approach to rank positional candidates MATERIALS AND METHODS 67 7 based on functional association with a trait. To avoid annotation 68 8 or literature bias, we use functional genomic networks (FGNs), As a supplement to the computational portion of the methods 69 9 which encode predicted functional associations among all genes section, this paper includes an executable workflow (See Data 70 10 in the genome. FGNs such as the Functional Networks of Tissues Availability). An outline of the computational workflow is shown 71 11 in Mouse (FNTM) (Goya et al. 2015) and HumanBase (Greene et al. in Figure 1. The workflow includes all files and parameters re- 72 12 2015), are Bayesian integration networks that combine gene co- quired to recreate the computational portions of this study. 13 expression, protein-protein binding data, ontology annotation and 14 other data to predict functional associations among genes. With Animals 73 15 these networks we can expand on known gene-trait associations to A total of 23 mouse strains (129X1/SvJ, A/J, AKR/J, B10.S- 74 16 identify genes that were not previously associated with the trait. H2s/SgMcdJ (B10.S), BALB/cJ, BPL/1J, BPN/3J, C3H/HeJ, 75 17 Recent studies with functional genomic networks, for example C57BL/6J, CBA/J, CZECHII/EiJ, DBA/1J, DBA/2J, FVB/NJ, 76 18 FNTM, have demonstrated their power to generate novel associ- JF1/MsJ, MOLF/EiJ, MRL/MpJ, MSM/MsJ, NOD/ShiLtJ, NU/J, 77 19 ations between genes and specific phenotype terms (Guan et al. PWD/PhJ, PWK/PhJ, SJL/J and SWR/J were purchased from the 78 20 2010) or biological processes (Ju et al. 2013). For example, Guan et al. Jackson Laboratory (Bar Harbor, ME). All mice, including B10.S- 79 21 (2010) used a support vector machine (SVM) classifier to identify HisthSJL and B10.S-HisthSJL ISRC lines, were generated and main- 80 22 a gene network associated with bone mineralization. They pre- tained under specific pathogen-free conditions in the vivarium of 81 23 dicted and validated novel associations between genes and bone the Given Medical Building at the University of Vermont according 82 24 mineralization phenotypes for genes that lay outside of all pub- to National Institutes of Health guidelines. All animal studies were 83 25 lished QTLs for bone mineralization phenotypes (Guan et al. 2010). approved by the Institutional Animal Care and Use Committee of 84 26 Subsequent studies using similar network-based techniques have the University of Vermont. 85 27 made novel predictions of hypertension- and autism-associated 28 genes (Greene et al. 2015; Krishnan et al. 2016). We have expanded Histh Phenotyping 86 29 these methods to rank genes in a mapped QTL based on multi- On day (D) 0 mice were injected i.p. with complete Freund’s ad- 87 30 ple putative functional terms and to integrate these rankings with juvant (CFA) (Sigma-Aldrich, St. Louis, MO) supplemented with 88 31 genetic association p values from strain surveys. We generated 200 mg of Mycobacterium tuberculosis H37Ra (Difco Laboratories, 89 32 a ranked list for all genes in the QTL that incorporated both the Detroit, MI). On D30 histamine hypersensitivity was determined 90 33 functional and positional scores of each candidate gene. by i.v. injection of histamine (mg/kg dry weight free base) in 91 34 Our strategy first built trait-associated gene lists from struc- phosphate buffered saline (PBS). Deaths were recorded at 30 min 92 35 tured biological ontologies (e.g., the Gene Ontology (Ashburner post injection and the data are reported as the number of animals 93 36 et al. 2000; Gene Ontology Consortium 2018) and the Mammalian dead over the number of animals studied. Significance of observed 94 37 Phenotype Ontology (Smith and Eppig 2012)) and public tran- differences was determined by Chi-square with p-values <0.05 95 38 scriptomic data from the Gene expression Omnibus (GEO) (Edgar significant. 96 39 et al. 2002; Barrett et al. 2012). We then applied machine learning 40 classifiers to the functional networks of tissues in mice (FNTM) DNA extraction and genotyping 97 41 (Goya et al. 2015) to identify network-based signatures of the trait- DNA was isolated from mouse tail clippings as previously de- 98 42 related gene lists. This strategy allowed us to predict gene-trait scribed (Sudweeks et al. 1993). Briefly, individual tail clippings 99 43 associations that were not annotated within a structured ontology, were incubated with 300mL cell lysis buffer (125m g/mL proteinase 100 44 overcoming the missing annotation problem. K, 100 mM NaCl, 10mM Tris-HCl (pH 8.3), 10 mM EDTA, 100mM 101 ◦ 45 We applied our approach to a large QTL associated with his- KCl, 0.50% SDS) overnight at 55 C. The next day, 150mL of 6M 102 ◦ 46 tamine hypersensitivity (Histh) in mice. Histh in mice is a lethal NaCl were added followed by centrifugation for 10 min at 4 C. 103 47 response to a histamine injection. In insensitive mice, a histamine The supernatant layer was transferred to a fresh tube containing 104 48 injection produces an inflammatory response that resolves without 300mL of isopropanol.