Identifying Sex-Specific Genetic Effects Across 733 Traits in UK Biobank
James Han, Yale University
Wei Jiang, Department of Biostatistics, Yale School of Public Health
Yixuan Ye, Yale University, https://orcid.org/0000-0002-2643-665X
Hongyu Zhao ([email protected]), Yale University, https://orcid.org/0000-0003-1195-9607

Article
Keywords: sex-specificity, diseases, traits, polygenic risk
Posted Date: July 20th, 2021
DOI: https://doi.org/10.21203/rs.3.rs-701876/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Sex-specificity has been reported in a wide range of diseases and complex traits. While sex-specific genetic effects have been documented for certain traits, the genetic mechanisms underlying sex differences in most traits remain largely unexplored. With its large sample size and wide range of diseases and traits, the UK Biobank (a large, prospective cohort study containing health history, phenotypic measurements, and genetic data for over 500,000 individuals) provides an opportunity to explore sexually dimorphic genetic architectures in a large number of traits and diseases. Here, we present a sex-specific analysis of 733 sex-stratified complex trait GWAS for 361,194 white British individuals in the UK Biobank, and report 16 traits with significant sex-specific differences in heritability. These 16 candidate traits with sex-specific genetic effects belong to 5 distinct groups: body fat mass and distribution, blood pressure, creatinine levels, snoring, and birth weight. Using a systematic sex-specific discovery-replication analysis, we identify 47 (31 novel) loci showing sex-specific effects on the traits related to body fat mass/distribution, blood pressure, and birth weight, and discover 74 potential sex-specific biological pathways from the enrichment analyses based on associated genes from QTL analysis.
In addition, we present further evidence for significant sex-specific genetic effects in 13 traits spanning three trait groups (body fat mass/distribution, blood pressure, and birth weight) by comparing the prediction performance of sex-specific polygenic risk scores.

Introduction

There are significant sex differences for many traits and diseases [1], from cardiovascular diseases, asthma, autoimmune diseases, and mental illnesses to anthropometric traits such as BMI, body fat composition, and blood pressure [2-8]. Sex differences can originate from a wide range of factors, from genetic, hormonal, and other biological factors to environmental and sociological factors [9,10]. Understanding the mechanisms contributing to sex differences in various diseases and traits can aid in our understanding of the biological origins of diseases. These sex differences also present opportunities for improved therapies, where sex-specific disease etiologies may require different treatment strategies, and are an important factor for equitable medical care [1].

Here we focus on the genetic basis of sex differences for a range of common traits. Previous studies have already demonstrated the existence of sexually dimorphic genetic effects for certain traits. For example, Weiss et al. discovered sex-specific genetic architectures in some quantitative traits in Hutterites, such as blood lipid levels, blood pressure, and height [11]. Rawlik et al. identified sex-specific genetic architectures for 14 complex traits such as basal metabolic rate, waist-hip ratio, and blood pressure, among others [9]. Using sex-stratified Genome-Wide Association Studies (GWAS) data collected by the GIANT consortium, Randall et al. discovered 7 sexually dimorphic genetic loci for waist phenotypes [12]. More recently, Rask-Andersen et al. observed genetic sex-heterogeneity specifically in body fat distribution, identifying 37 variants showing stronger effects in females [13].
Sex-specific genetic effects have also been observed in complex diseases, with recent studies showing sex-specific risk alleles for asthma, coronary artery disease, diabetes, and Crohn's disease [14-16].

While sex-specific genetic differences have been discovered for some traits in these studies, a thorough scan of the specific genetic mechanisms conferring sexual dimorphism is still lacking for the majority of common traits and diseases [1]. Even for the traits that have been studied, most studies were designed to search for global evidence of sex-specific genetic architectures, without identifying loci showing sex-specific effects (SSE) [13,17], while others were underpowered to detect sex-specific genetic loci [9,15], many of which may have only small effect differences through varying disease etiology and mechanism.

The UK Biobank is a large, prospective cohort study of over 500,000 individuals, with rich data including questionnaire responses, phenotypic measurements, disease and health information, and genetic data [18]. This provides an opportunity to scan a large number of common traits and diseases for potentially different genetic architectures between males and females and to identify loci having SSE with comparatively higher statistical power. Although published studies have analyzed selected traits within the UK Biobank, these studies either considered a subset of the samples [9] or used methods that are not specifically designed to detect sexually dimorphic loci [13,19]. With recent releases of more participant genomic and phenotypic data, as well as comprehensive GWAS analyses of these participants, the UK Biobank presents a promising resource for a more comprehensive investigation of sex-specific genetic effects [20,21].

In this manuscript, we present a sex-specific analysis of 733 complex traits in the UK Biobank. A total of 16 traits have significant differences in heritability between males and females.
These 16 traits belong to 5 distinct groups: body fat mass/distribution, blood pressure, creatinine levels, snoring, and birth weight. Using sex-stratified GWAS analysis, we initially discovered 360 SSE loci across these traits. To validate these findings, we conduct a replication study using an independent set of individuals within the same genetic population from the UK Biobank, yielding 47 replicated SSE loci for traits related to body fat mass/distribution and blood pressure, of which 31 are novel. We then investigate possible biological interpretations of the SSE loci through pathway enrichment analysis, and report 74 possible sex-specific pathways. We further present evidence for significant sex-specific genetic effects in 13 of the 16 traits through polygenic risk prediction. Using two different polygenic risk prediction methods, namely pruning and thresholding (P+T) [22] and PRS-CS [23], we report significant differences in risk prediction between models trained on male- and female-specific training sets for traits related to fat mass/distribution, blood pressure, and birth weight.

Results

An overview of our analysis procedure, as well as our results, can be found in Figure 1. Due to the large number of phenotypes presented in the UK Biobank, as well as the presence of redundant phenotypes, we first chose 733 traits belonging to 7 categories, including lifestyles, clinical measurements, health and medical status, cognitive functions, biomarkers, and diagnosed diseases. Then we conducted sex-stratified GWAS analyses in an initial discovery set for the 733 traits, and tested GWAS summary statistics for sex-specific heritability difference. For the traits selected based on heritability difference tests, we identified candidate sex-specific loci. Then we used a replication set to validate the findings, and conducted a literature review, QTL analysis, and pathway analysis based on the replicated loci.
Finally, we used the discovery and replication sets to compare sex-specific and sex-agnostic PRS models to demonstrate further evidence for sex-specific genetic effects. The methodology details can be found in the Methods section.

Heritability Differences and Genetic Correlations

Of the 733 traits analyzed, 15 traits showed significant sex differences in heritability: whole body fat mass (WFM, Field No.: 23100), right leg fat percentage (RLFP, Field No.: 23111), left leg fat percentage (LLFP, Field No.: 23115), right leg fat mass (RLFM, Field No.: 23112), right arm fat percentage (RAFP, Field No.: 23119), left arm fat percentage (LAFP, Field No.: 23123), right arm fat mass (RAFM, Field No.: 23120), left arm fat mass (LAFM, Field No.: 23124), trunk fat mass (TFM, Field No.: 23128), diastolic blood pressure (DBP, Field No.: 4079), systolic blood pressure (SBP, Field No.: 4080), high blood pressure (HBP, Field No.: 6150_4), snoring (SNRG, Field No.: 1210), creatinine in urine (CRT, Field No.: 30510), and birth weight (BW, Field No.: 20022). Both sides of a bilaterally symmetrical trait are presumed to have a similar genetic basis. We noticed that left leg fat mass (LLFM, Field No.: 23116), which pairs with right leg fat mass (RLFM, Field No.: 23112), also showed a large sex difference in heritability, with a p-value of 0.002. Although it did not pass the FDR threshold, we still included it in the remaining analyses, since it is the other side of a symmetrical trait, bringing the total to 16 traits. Figure 2 shows the scatter plot of the heritabilities for these traits in males and females. Like previous studies, which have reported comparatively higher heritabilities for many traits in females than in males [6,9,17], our results also showed that the heritability estimates are higher in females than in males for most traits. Genetic correlations between males and females showed high similarity between the male and female summary statistics, with genetic correlation estimates