Comprehensive Analyses of 723 Transcriptomes Enhance Genetic and Biological Interpretations for Complex Traits in Cattle
Total Page:16
File Type:pdf, Size:1020Kb
Downloaded from genome.cshlp.org on October 3, 2021 - Published by Cold Spring Harbor Laboratory Press Resource Comprehensive analyses of 723 transcriptomes enhance genetic and biological interpretations for complex traits in cattle Lingzhao Fang,1,2,3,4,8 Wentao Cai,2,5,8 Shuli Liu,1,5,8 Oriol Canela-Xandri,3,4,8 Yahui Gao,1,2 Jicai Jiang,2 Konrad Rawlik,3 Bingjie Li,1 Steven G. Schroeder,1 Benjamin D. Rosen,1 Cong-jun Li,1 Tad S. Sonstegard,6 Leeson J. Alexander,7 Curtis P. Van Tassell,1 Paul M. VanRaden,1 John B. Cole,1 Ying Yu,5 Shengli Zhang,5 Albert Tenesa,3,4 Li Ma,2 and George E. Liu1 1Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA; 2Department of Animal and Avian Sciences, University of Maryland, College Park, Maryland 20742, USA; 3The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian EH25 9RG, United Kingdom; 4Medical Research Council Human Genetics Unit at the Medical Research Council Institute of Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh EH4 2XU, United Kingdom; 5College of Animal Science and Technology, China Agricultural University, Beijing 100193, China; 6Acceligen, Eagan, Minnesota 55121, USA; 7Fort Keogh Livestock and Range Research Laboratory, Agricultural Research Service, USDA, Miles City, Montana 59301, USA By uniformly analyzing 723 RNA-seq data from 91 tissues and cell types, we built a comprehensive gene atlas and studied tissue specificity of genes in cattle. We demonstrated that tissue-specific genes significantly reflected the tissue-relevant biol- ogy, showing distinct promoter methylation and evolution patterns (e.g., brain-specific genes evolve slowest, whereas testis- specific genes evolve fastest). Through integrative analyses of those tissue-specific genes with large-scale genome-wide asso- ciation studies, we detected relevant tissues/cell types and candidate genes for 45 economically important traits in cattle, including blood/immune system (e.g., CCDC88C) for male fertility, brain (e.g., TRIM46 and RAB6A) for milk produc- tion, and multiple growth-related tissues (e.g., FGF6 and CCND2) for body conformation. We validated these findings by us- ing epigenomic data across major somatic tissues and sperm. Collectively, our findings provided novel insights into the genetic and biological mechanisms underlying complex traits in cattle, and our transcriptome atlas can serve as a primary source for biological interpretation, functional validation, studies of adaptive evolution, and genomic improvement in livestock. [Supplemental material is available for this article.] Over the last decade, genome-wide association studies (GWAS) unprecedented potential to discover trait-/disease-relevant tissues have been successful at discovering trait-/disease-associated geno- or cell types, which is crucial for understanding the molecular un- mic variants (Visscher et al. 2012, 2017). However, such studies derpinnings of complex traits and diseases (Finucane et al. 2018; provided limited information about novel molecular mechanisms Hormozdiari et al. 2018). For instance, the Roadmap Epigenomics underlying complex traits and diseases, partly due to the lack of Consortium (2015) showed that GWAS hits of many traits and dis- knowledge of in what tissues or cell types those genomic variants eases are significantly enriched in epigenomic marks (e.g., would act. Recently, researchers have been actively pursuing a H3K4me1) of trait-/disease-relevant tissues and cell types in hu- comprehensive map of functional elements, aiming to identify mans. Finucane et al. (2018) recently explored disease-relevant which genes and regulatory factors (e.g., promoters and enhanc- tissues and cell types for various human diseases by examining ers) are functional or active in a large range of tissues and cell their heritability enrichments among diverse tissues and cell types, types—for example, Roadmap Epigenomics (Roadmap Epigenom- such as inhibitory neurons for bipolar disorder (Finucane et al. ics Consortium et al. 2015), GETx (The GTEx Consortium 2017), 2018). and Cell Atlas (Regev et al. 2017) projects in human, as well as In livestock, due to the limited amount of functional genome the Functional Annotation of Animal Genomes (FAANG) project data available (Fang et al. 2019), to our knowledge no previous in livestock (Andersson et al. 2015). Integrative analyses of func- publication has systematically reported the causal tissues or cell tional genome information with large-scale GWAS data provide types for complex traits and diseases of economic importance. A comprehensive map linking complex traits with their specifically 8These authors contributed equally to this work. © 2020 Fang et al. This article is distributed exclusively by Cold Spring Harbor Corresponding authors: [email protected], Laboratory Press for the first six months after the full-issue publication date (see [email protected], [email protected], [email protected] http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is avail- Article published online before print. Article, supplemental material, and publi- able under a Creative Commons License (Attribution-NonCommercial 4.0 Inter- cation date are at http://www.genome.org/cgi/doi/10.1101/gr.250704.119. national), as described at http://creativecommons.org/licenses/by-nc/4.0/. 30:1–12 Published by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/20; www.genome.org Genome Research 1 www.genome.org Downloaded from genome.cshlp.org on October 3, 2021 - Published by Cold Spring Harbor Laboratory Press Fang et al. relevant tissues will offer valuable information for fine-mapping Detection and functional characterization of tissue-/ causal genes/variants, for functionally validating of GWAS hits cell type–specific genes (i.e., selecting the “right” tissues and cell types), and for under- standing of adaptive evolution (Quiver and Lachance 2018), as We calculated a t-statistic to measure the specific expression of a well as for the design of genome editing experiments (Ruan et al. gene in a given tissue/cell type (Methods). We found that tissues 2017). Additionally, a better understanding of the genetic architec- and cell types within the same system highly positively correlated ture underlying complex traits may make a contribution to the ge- based on these t-statistics (Supplemental Fig. S3), indicating the netic improvement programs among livestock species (Goddard high similarity of their tissue-specific expression. Of special inter- and Hayes 2009; Georges et al. 2019). For instance, Fang et al. est, we found that mammary gland highly negatively correlated (2017a,b) reported improved genomic prediction accuracy for with corpus luteum and endometrium (Pearson’s r = −0.88 and mastitis and milk production traits in cattle by incorporating bio- −0.85, respectively) (Supplemental Fig. S3). This may reflect the logical priors and gene expression information relevant to bacte- well-known, antagonistic relationship between milk yield and fer- rial infection into genomic prediction models (Fang et al. 2017a,b). tility in dairy animals (Veerkamp et al. 2001; Berry et al. 2003). Here, we uniformly assembled and analyzed 723 (156 newly Additionally, liver and rumen epithelial cells negatively correlated generated and 567 existing) RNA-seq data sets to build a new with several immune tissues and cell types, including CD4 cells, gene atlas in cattle (Supplemental Code), which included 91 tis- CD8 cells, white blood cells, and thymus (the averaged Pearson’s sues and cell types from 447 individuals (http://cattlegeneatlas r = −0.62) (Supplemental Fig. S3). This may support the observed .roslin.ed.ac.uk). We summarized the global design of this study connections between feed efficiency and immune responses in in Supplemental Figure S1. We first detected genes that were high- cattle (Hou et al. 2012). However, the underlying molecular mech- ly and specifically expressed in each tissue or cell type and then ex- anisms of these negative correlations are largely unknown and re- plored their biological characteristics in terms of biological quire further investigations. function, DNA methylation, and evolution. We detected relevant We detected tissue-specific genes for each tested tissue based tissues/cell types and candidate genes for 45 complex traits of eco- on the rank of t-statistics (i.e., top 5%). We showed the top tissue- nomic importance in cattle, including 18 body conformation, six specific genes in brain (GRM5), liver (SLC22A9), white blood cell milk production, 12 reproduction, eight health, and one feed effi- (FCRL3), uterus (TDGF1), and testis (TRIM69) as examples in ciency traits, by integrating those tissue-specific genes with large- Figure 1B. The functional annotation of tissue-specific genes vali- scale (n = 27,214 bulls) GWAS data. We validated our findings by dated the known tissue-relevant biology (Fig. 1C; Supplemental analyzing whole-genome DNA methylation data across major Table S2). For instance, brain-specific genes significantly enriched − somatic tissues and sperm in cattle. In addition, we tested whether for nervous system development (FDR = 1.67 × 10 48, enrichment the tissue-specificity information of genes can improve genomic fold = 3.24), liver for organic acid metabolism process (FDR = 1.33 × − prediction. Our results, for the first time, systemically establish 10