bioRxiv preprint doi: https://doi.org/10.1101/501049; this version posted December 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Genetic analyses of medication-use and implications for precision medicine

Yeda Wu¹, Enda M. Byrne¹, Zhili Zheng¹,², Kathryn E. Kemper¹, Loic Yengo¹, Andrew J. Mallett¹,³, Jian Yang¹,²,⁴, Peter M. Visscher¹,⁴,*, Naomi R. Wray¹,⁴,*

¹ Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia.
² Institute for Advanced Research, Wenzhou Medical University, Wenzhou, Zhejiang 325027, China.
³ Department of Renal Medicine, Royal Brisbane and Women’s Hospital, Herston, Queensland 4029, Australia.
⁴ Queensland Brain Institute, The University of Queensland, Brisbane, Queensland 4072, Australia.
* These authors jointly supervised this work; corresponding authors: [email protected] and [email protected]

Abstract
It is common that one medication is prescribed for several indications, and conversely that several medications are prescribed for the same indication, suggesting a complex biological network for disease risk and its relationship with pharmacological function. Genome-wide association studies (GWASs) of medication-use may contribute to understanding of disease etiology, generate new leads relevant for drug discovery and quantify prospects for precision medicine. We conducted GWAS to profile self-reported medication-use from 23 categories in approximately 320,000 individuals from the UK Biobank. A total of 505 independent genetic loci that met stringent criteria for statistical significance were identified. We investigated the implications of these GWAS findings in relation to biological mechanism, drug target identification and genetic risk stratification of disease. Amongst the medication-associated genes were 16 known therapeutic-effect target genes for medications from 9 categories.

Introduction
Susceptibility to most common human diseases is complex and multifactorial, involving genetic, environmental and stochastic factors¹.
During the last decade, large-scale genome-wide association studies (GWASs) have identified thousands of single nucleotide polymorphisms (SNPs) associated with diseases and related traits, consistent with a polygenic genetic architecture of common disease. These results add useful human-relevant information to drug development, drug repurposing and clinical trial pipelines². Here, we have turned the tables, aiming to identify genetic loci associated with medication-taking. In the context of electronic health record data, medication-use may be an easy route to identify disease-case subjects. However, in clinical practice, it is common that one medication is prescribed for several indications and, conversely, that several medications are prescribed for the same indication. It is likely that medication-use reflects not only similarity between different clinical manifestations³ and/or comorbidity⁴ of diseases but also heterogeneity of clinical manifestation (symptoms and signs) and of intervention response (for example, from lifestyle change to the combination of treatments).

We hypothesise that genetic variants associated with taking medications categorised based on anatomical and therapeutic classifications may add relevant information to understanding the underlying biological mechanism of diseases and drug development approaches. Here, we study genetic variation in current medication-use. We report 505 loci independently associated with medication categories. We explore these GWAS findings for biological mechanisms and as drug targets. We estimate the genetic correlation between the 23 medication traits, and with other diseases and traits using published GWAS results. We use Mendelian Randomization to investigate putative causal relationships among diseases and traits. We show that genetic predisposition to common disease predicts likelihood of taking relevant medications, a significant finding in relation to future practice of precision medicine for common disease.

Results

Case-control GWAS of medication-use
Medications taken by UK Biobank (UKB) participants were classified using the Anatomical Therapeutic Chemical Classification System⁵ and are shown in Figure 1, Figure S1 and Table S1. Figure S2 shows the demographics of participants with medication records. The full phenotype extraction pipeline for UKB participants is summarised in Figure S3. An overview of analyses is provided in Figure S4. The medication-use case-control GWASs identified 910 within-trait independent SNPs significantly associated (P < 5 × 10⁻⁸) across 23 medication traits (Figure 2 and S5). After applying a more stringent multiple testing threshold (P < 1 × 10⁻⁸/23)⁶, a total of 505 SNPs remain (Table S2 and S3), with per-trait associations ranging from 0 (C02: antihypertensives, N02A: opioids, N06A: antidepressants) to 103 (C09: agents acting on renin-angiotensin system) SNPs. Many of the associated SNPs may simply be a reflection of the primary indication for which the medication is prescribed (Table S4). For example, C09 medications have therapeutic effect on hypertension; of the 103 independent SNPs associated with C09 medications (P < 1 × 10⁻⁸/23), we identified SNPs previously linked to hypertension (7 SNPs)⁷, systolic blood pressure (32 SNPs)⁸, diastolic blood pressure (5 SNPs)⁹ and pulse pressure (2 SNPs)⁹. Of the 55 independent SNPs associated with C10AA (HMG CoA reductase inhibitors) medications (P < 1 × 10⁻⁸/23), 19 SNPs have been reported to be significantly associated with low-density lipoprotein cholesterol (LDLC)¹⁰, supporting the known biological mechanism that statins are effective in lowering LDLC.
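As a quick arithmetic check of the two significance thresholds quoted above, the following minimal Python sketch computes the stricter per-test threshold obtained by dividing 1 × 10⁻⁸ by the 23 medication-taking traits. The constant and function names are illustrative assumptions and are not taken from the authors' analysis pipeline.

```python
# Illustrative check of the significance thresholds quoted in the text.
# All names below are hypothetical; only the numbers come from the paper.
N_TRAITS = 23            # number of medication-taking traits analysed
GENOME_WIDE = 5e-8       # conventional genome-wide significance threshold
STRINGENT_BASE = 1e-8    # stricter base threshold stated in the text

# Stricter threshold after dividing the base by the number of traits.
STRINGENT = STRINGENT_BASE / N_TRAITS


def passes(p_value: float, threshold: float = STRINGENT) -> bool:
    """Return True when a SNP's P value falls below the given threshold."""
    return p_value < threshold


if __name__ == "__main__":
    print(f"stringent threshold = {STRINGENT:.3e}")
```

A SNP with P = 1 × 10⁻⁹ would pass the conventional threshold but not the stricter one, which is why the count of associated SNPs drops from 910 to 505.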
However, for 3 medication-taking traits either small or no GWAS have been conducted for the medication-relevant indications, including A02B (drugs for peptic ulcer and gastro-oesophageal reflux disease), H03A (thyroid preparations) and N02BE (anilides).

Genetic predisposition to common disease predicts medication-taking
We undertook polygenic risk prediction analyses using GWAS summary statistics from 8 published disease/traits (Table S5) as discovery data to predict disease risk in 9 medication-taking phenotypes as target data. Participants in the UK Biobank with a high genetic risk score (GRS) for different diseases/traits have higher odds of taking corresponding medications than those with a low GRS (Figure 3; Table S6). The top decile of individuals ranked on risk prediction for depression had an odds ratio (OR) of 1.7 for taking anti-depressants compared to the bottom decile. Similarly comparing top and bottom deciles, we find an OR of 3.1 for taking anti-diabetic medication (A10) for individuals ranked on genetic risk for type 2 diabetes and of 3.3 for taking immunosuppressants (L04) for individuals ranked on their genetic risk for rheumatoid arthritis (RA). The OR increased to 5.2 for taking L04 medications specific to RA (Table S1).

GWAS results and biological mechanisms
First, we estimated SNP-heritability of the 23 traits using linkage disequilibrium (LD) score regression¹¹ (Figure S6; Table S7); all traits showed SNP-heritability (proportion of variance attributed to genome-wide SNPs) significantly different from zero, to a maximum of 0.15 (s.e. 0.008) for N02A (opioid medications) on the estimated scale. Second, to identify medication-relevant tissue/cell types, we partitioned the SNP-heritability¹² based on annotations of SNPs to genes, and genes to differential expression between tissues.
Among the 23 medication-taking traits, 8 traits showed significantly enriched association with genes expressed in at least one tissue at a false discovery rate (FDR) < 5% (Figure S7). GWAS associations for thyroid preparations (H03A), immunosuppressants (L04), adrenergics, inhalants (R03A), glucocorticoids (R03BA) and antihistamines for systemic use (R06A) were enriched in immune cell types. Those of opioid analgesics (N02A) were enriched in central nervous system tissues, such as the limbic system, those of antimigraine preparations (N02C) were enriched in cardiovascular tissue, and those of drugs affecting bone structure and mineralization (M05B) were enriched in a digestive cell type (Table S8).

Third, we investigated whether associations between SNPs and medication-taking traits were consistent with mediation through gene expression, based on associations between SNPs and gene expression (eQTLs). We identified 177 unique genes for which expression is significantly associated with 19 medication-taking categories (Table S9) using summary data-based Mendelian Randomization (SMR) analysis¹³. Gene-based association tests were conducted using MAGMA¹⁴ from the GWAS SNP results for each of the 23 medication-taking traits and a total of 1,841 significantly associated unique genes were identified (Table S10). To provide biological insights from the GWAS associated loci, we used the gene-based association test summary statistics to test for enrichment in 10,891 gene sets from MSigDB (v5.2)¹⁵,¹⁶. All 23 medication-taking traits were enriched in at least one gene set at FDR < 5% (Table S11). Several of the results showed plausible relevant biological mechanisms.
For example, the genetic associations for taking A10 (drugs used in diabetes) were enriched for the glucose homeostasis gene set, those for taking C10AA (statins) were enriched in the cholesterol homeostasis gene set, those for C09 (agents acting on renin-angiotensin system) in cardiovascular-related gene sets, those for M05B (drugs affecting bone structure and mineralization) in skeletal system development and chondrocyte differentiation gene sets, those for N02A in gene sets of behavioural response to cocaine and neurogenesis, and lastly those for H03A, L04, R03A and R03BA medications in immune-related gene sets. Interestingly, genes associated with taking A02B (drugs for peptic ulcer and gastro-oesophageal reflux disease) are enriched in gene sets of central nervous system neuron differentiation and of neurogenesis, highlighting the connection between gut and brain¹⁷.

Linking genes associated with medication-taking to drug targets
Secondary analyses of GWAS results not only provide insights into the biological complexity of common diseases, but also offer opportunities relevant to drug development and repurposing²,¹⁸,¹⁹. To determine whether genes associated with medication-taking could provide clues relevant to drug target identification, we performed analyses using drug-target lists from Santos et al.⁵, ChEMBL²⁰ and the ClinicalTrials.gov (https://www.clinicaltrials.gov/) database as reference. First, for each UKB medication category, we investigated whether the associated genes include therapeutic-effect target genes for medications classified in that medication category; a total of 9 genes were identified (Table S12). For example, we find that HMGCR (Entrez ID: 3156) is, as expected²¹, associated with taking C10AA medications (statins); it encodes the HMGCR protein, which is targeted by medications from the C10AA category.
Second, we tested whether the associated genes include therapeutic-effect target genes for treating indications relevant to taking medications of each category; a total of 7 genes were identified (Table S12). PCSK9 (Entrez ID: 255738) in our analyses is also associated with taking C10AA medications, and encodes the protein mediating the cholesterol-lowering effect of evolocumab (ATC code: C10AX13) and alirocumab (ATC code: C10AX14). Third, we looked at whether the associated genes include therapeutic-effect target genes (ever or currently in clinical trial and not yet approved by the FDA) for treating indications relevant to medications of each category; a total of 8 genes were identified (Table S13). For example, TSLP (Entrez ID: 85480) is associated with R03A (adrenergics), R03BA (glucocorticoids) and R06A (antihistamines) and also mediates the effect of tezepelumab for the treatment of uncontrolled asthma²². Hence, among our associated genes are 24 genes with some known evidence of therapeutic effect. Therefore, we anticipate that novel genes that are associated with medication-taking may help to prioritise other putative therapies²³. In Table S14 we provide additional analyses for two genes, IDE and AGT, that we believe merit further study for type 2 diabetes and C07/C09 related disorders, respectively.

Shared genetic architecture between medication-taking traits and relevant complex traits
The genetic correlations (rg) between the 23 medication-taking traits and 21 related traits/diseases (Table S5) were calculated using bivariate LD score regression²⁴. Many rg estimates were significantly different from zero. For example, body mass index, educational attainment (EA), former/current smoker and coronary artery disease were significantly correlated with most of the medication categories in expected directions. Major depression and neuroticism showed positive rg with A02B (gastro-oesophageal reflux drugs), suggesting a link between the brain and the digestive system. Type 2 diabetes showed correlations with taking medications C02, C03, C07 to C09 and C10AA, implying a shared genetic architecture of type 2 diabetes, hypertension and hypercholesterolemia. The patterns of rg between B01A medications and other diseases/traits are similar to those for N02BA medications because the original medication aspirin (code number: 1140868226, 59,150 individuals in our analysis) has multiple ATC codes (A01AD05, B01AC06 and N02BA01). Full results are presented in Figure 4 and Table S15.

Putative causal relationship of diseases for using medication
It is reasonable to assume that having a disease is causal for taking the associated medication (rather than reverse causation).
Therefore, we used Mendelian Randomisation (MR) in a proof-of-principle analysis to quantify causality. Independent SNPs (P < 5 × 10⁻⁸) associated with 15 selected diseases/traits (Table S5) were used as instruments to evaluate putative causal relationships²⁵ among these 15 diseases/traits and the 23 medication-taking traits (Table S16 and Figure 5). Increasing BMI increases the likelihood of taking A10, B01A, C01D, C02, C03, C07, C08, C09, C10AA and R03A medications, consistent with the role of BMI across diseases related to these medications²⁵. The effect of obesity on bone health is controversial²⁶. However, results from our analysis clearly show that increasing BMI decreases the likelihood of taking M05B (bone-associated) medications (OR 0.68 per SD of BMI). Major depression (MD) increases the likelihood of taking A02B medication (drugs for peptic ulcer and gastro-oesophageal reflux disease; 1.23-fold increase per SD in liability to MD), capturing a link between the brain and the digestive system. In addition, MD increases the likelihood of taking N02BE medication (1.23-fold increase per SD in liability to MD), which is consistent with comorbidity of pain in some MD patients²⁷.

Discussion
To our knowledge, this is the first paper profiling genetic contributions to medication-use. Traditional GWAS identify DNA variants associated with disease, with a goal that these discoveries ultimately may open the door to new drug treatments. Here, we have taken the reverse approach, aiming to identify DNA variants associated with medication-taking, in recognition that underlying biology may contribute to the same medication being prescribed for several indications, and conversely that only some of those with a given diagnosis may take a particular medication.
As expected, some of our results for medication-taking recapitulate GWAS results of the disease traits for which the medication is prescribed. However, we have also identified some novel associations that may be worthy of follow-up. We identified 505 linkage disequilibrium independent SNPs associated (P < 1 × 10⁻⁸/23) with different medication-taking traits. For some of our traits, large GWAS for the medication-relevant indications have not been conducted, such as A02B (drugs for peptic ulcer and gastro-oesophageal reflux disease, 2 SNPs) and N02BE (anilides, 4 SNPs). Notably, 76 SNPs were associated with H03A (thyroid preparations; the main indication is hypothyroidism), but only 11 of these loci have been previously reported to be associated with hypothyroidism. Conditional (mtCOJO) analysis suggested that these 76 SNPs associated with taking H03A medication are indeed associated with hypothyroidism. We showed that individuals with higher genetic risk of disease have higher likelihood of taking relevant medications; for example, individuals with higher GRS for RA have an OR of 3.3 for taking immunosuppressants compared with lower GRS individuals (Figure 3). This provides a proof-of-principle validation of precision medicine based upon risk prediction of common diseases, since individuals with high genetic risk of disease can be identified well before the onset of symptoms and the time of medication prescription.

To provide biological insight to the SNP associations for medication-taking²⁸, we linked GWAS findings to relevant biological gene sets and drug target efficacy. These analyses generated a series of expected or plausible results, such as genes associated with taking A10 (drugs used in diabetes) being enriched in gene sets for glucose homeostasis. Our analyses also generate new hypotheses; genes associated with taking N06A (antidepressants) showed enrichment in the gene set for the synthesis, secretion and deacylation of ghrelin, a gut-derived hormone²⁹. Previous studies have described an antidepressant-like role of ghrelin³⁰,³¹.
This line of evidence suggests that testing a pharmacological effect of ghrelin on depression may be worthwhile. Although medication-associated genes overlapped with only a small proportion of current drug target genes, the framework of genetic association studies provides a potentially valuable resource for new drug target identification and prediction of unfavourable side effects¹⁸.

Comorbidity, the presence of additional diseases in relation to an index disease³², is commonly observed in clinical practice. Results from genetic correlation and disease-medication (exposure-outcome) MR highlight potential shared aetiology, and may help explain medication use in clinical practice. Our analysis showed that major depression increased the likelihood of taking A02B (drugs for peptic ulcer and gastro-oesophageal reflux disease) and N02BE (anilides), the latter consistent with reports that antidepressant prescriptions are not only indicated for depression, but also for pain³³.

There are a number of limitations in our study. First, although the medication-use data were obtained by trained nurses during interviews, the self-reported nature may limit the accuracy of information. Second, the ambiguous names of medications may limit the accurate classification of medications. The reasons (e.g. disease diagnosis) for taking medication were not recorded and hence not available for further analysis, nor were duration and dosage of medications. Third, our findings are specific to the UK Biobank participants, who are recognized to be a non-random sample of the UK population. Fourth, the medication-taking in UK Biobank participants may be more representative of medication-taking in the UK and may not translate to other populations and different health systems.
In summary, we identified 505 independent loci associated with different medication-use traits in 318,177 individuals from UKB, with implications for biological mechanisms, drug target identification and precision medicine for common disease.

Methods

Medication data
We used self-report data of regular medication (prescription and over-the-counter) and health supplements taken weekly, monthly or three-monthly from participants in the United Kingdom Biobank (UKB) study (http://www.ukbiobank.ac.uk)³⁴, mainly aged 37-73 years when recruited between 2006 and 2010. Medication and health supplements data were coded and manually mapped to their corresponding active ingredients and then to their Anatomical Therapeutic Chemical (ATC) Classification System⁵ codes (Table S1). In total, medications were classified into 1,752 categories, collapsing to 184 subgroups according to the first three ATC levels (Figure 1, S1). Of these medication subgroups, 23 were selected for analysis based on participant numbers. Detailed methods are provided in the Supplementary Appendix.

Genome-wide association study design and statistical analysis
Analyses used genome-wide genotypes for 318,177 participants of white European descent (Figure S2). A total of 23 medication-taking case groups and their corresponding control groups were generated. Case groups were defined as those taking medications classified at the same ATC level. Control groups comprised participants taking neither the case medication nor similar medications. Similar medications were defined as those sharing the first two ATC levels with the case medication, or medications containing the case medication active ingredients. Following standard quality control and genotype imputation methods (see Supplementary Appendix), 7,288,503 SNPs with minor allele frequency (MAF) > 0.01 were used in analyses. Case-control genome-wide association analyses were conducted using BOLT-LMM³⁵ with age, sex, assessment centre and 20 genetic principal components fitted as covariates.
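The case and control definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, "the first two ATC levels" is taken to mean the first three characters of an ATC code (for example `C10` for `C10AA`), and the additional rule about medications containing the case medication's active ingredients is not modelled.

```python
from typing import Iterable


def atc_level2(code: str) -> str:
    """First two ATC levels = first three characters, e.g. 'C10AA' -> 'C10'."""
    return code[:3]


def assign_group(participant_codes: Iterable[str], case_code: str) -> str:
    """Classify one participant for one medication-taking trait.

    Hypothetical helper, not the authors' pipeline:
    - 'case'     : takes at least one medication under the case ATC code
    - 'excluded' : takes a similar medication (same first two ATC levels)
    - 'control'  : takes neither
    The active-ingredient rule from the text is deliberately omitted.
    """
    codes = list(participant_codes)
    if any(code.startswith(case_code) for code in codes):
        return "case"
    if any(atc_level2(code) == atc_level2(case_code) for code in codes):
        return "excluded"
    return "control"


if __name__ == "__main__":
    # A participant on a fibrate (C10AB02) is neither case nor control for C10AA.
    print(assign_group(["C10AB02", "A10BA02"], "C10AA"))
```

Excluding users of similar medications from the control group keeps likely cases of the same underlying indication out of the controls, at the cost of a smaller control sample.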
Conditional analyses tested if SNPs associated with taking medications have been previously linked to their corresponding medication-specific related indications/traits³⁶,³⁷. Multi-trait-based conditional and joint (mtCOJO) analyses tested if medication-taking associated SNPs are also associated with their relevant main indications in UKB²⁵.

Genetic risk scores (GRS) for UKB individuals were generated for 8 diseases using SNP effect size estimates from published GWAS summary statistics (discovery sample data) (Table S2). These GRS were used to predict the medication-use traits related to these diseases (asthma mapped to two medication-use traits). Selection of the discovery sample data was based on relationship to the medication-taking traits, availability of GWAS summary statistics, cohort ancestry and no sample overlap with UKB. GRS were generated for a range of discovery data association P value thresholds (5 × 10⁻⁸, 1 × 10⁻⁵, 1 × 10⁻⁴, 0.001, 0.01, 0.05, 0.1, 0.5). The GRS were evaluated as the medication-taking odds ratio for each GRS decile (relative to the 1st decile).

LD score regression¹¹,²⁴ was used to estimate the proportion of variance attributable to genome-wide SNPs (h²_SNP) and to quantify genetic sharing at common variants across the 23 medication-taking traits and other traits. LD score regression for cell type specific analysis¹² was applied to test the h²_SNP enrichment in different tissues for each of the 23 medication-taking traits. Gene expression data of 205 tissues (53 from GTEx³⁸ and 152 from other sources³⁹,⁴⁰) were used for analyses. Summary-data-based Mendelian Randomization (SMR)¹³ was used to integrate our trait association with blood expression quantitative trait loci (eQTL, i.e., SNP-gene expression association) data⁴¹. Gene-based association analyses were conducted using MAGMA (v1.06)¹⁴ to identify genes associated with different medication-taking traits.
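The genetic risk score evaluation described in the Methods (P value thresholding followed by decile odds ratios relative to the 1st decile) can be illustrated with the following minimal NumPy sketch. It is not the authors' implementation: the function names and array layout are assumptions, LD pruning or clumping of SNPs is not shown, and the odds ratios are computed from raw case and control counts without covariate adjustment.

```python
import numpy as np


def genetic_risk_score(dosages, betas, pvals, p_threshold):
    """Effect-size-weighted sum of allele dosages over SNPs with P < threshold.

    dosages: (n_individuals, n_snps) array of allele counts in the target sample
    betas, pvals: per-SNP effect sizes and P values from the discovery GWAS
    """
    keep = pvals < p_threshold
    return dosages[:, keep] @ betas[keep]


def decile_odds_ratios(grs, is_case):
    """Unadjusted odds ratio of being a case in each GRS decile vs the 1st decile.

    is_case must be a boolean array. A decile with no controls yields an
    infinite odds value, so this sketch assumes every decile has controls.
    """
    ranks = np.argsort(np.argsort(grs))      # 0 = lowest score
    decile = (ranks * 10) // len(grs)        # 0..9, equal-sized rank bins
    odds = np.empty(10)
    for d in range(10):
        in_decile = decile == d
        cases = np.sum(is_case[in_decile])
        controls = np.sum(~is_case[in_decile])
        odds[d] = cases / controls
    return odds / odds[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dosages = rng.integers(0, 3, size=(1000, 50)).astype(float)
    betas = rng.normal(0, 0.1, size=50)
    pvals = rng.uniform(0, 1, size=50)
    grs = genetic_risk_score(dosages, betas, pvals, p_threshold=0.5)
    # Simulated case status that becomes more likely as the GRS increases.
    is_case = rng.uniform(size=1000) < 1 / (1 + np.exp(-(grs - grs.mean())))
    print(decile_odds_ratios(grs, is_case))
```

Repeating the calculation over the eight P value thresholds listed in the text would give one set of decile odds ratios per threshold.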
Gene-set association analyses were conducted using MAGMA (v1.06)¹⁴ with curated gene sets (c2.all) and gene ontology sets (c5.bp, c5.cc, c5.mf) from MSigDB (v5.2)¹⁵,¹⁶.

Mendelian Randomization (MR) was used to investigate putative causal relationships between the 23 medication-taking traits and 15 significantly correlated traits (selected from Table S2), using Generalized Summary-data-based MR (GSMR)²⁵. We required that all analyses had at least 7 genome-wide significant loci to use as MR instruments; the median number of SNP instruments was 65.

Analyses linking GWAS results to drugs and disease
To check whether associated genes from MAGMA and SMR encode effect-mediating targets for FDA-approved medications or corresponding indications, we used information from Santos et al.⁵, based on medications approved by the FDA before June 2015. For those approved later, we used the ChEMBL database²⁰. To check whether associated genes encode trait-relevant effect-mediating targets for drugs in clinical trial, we used ClinicalTrials.gov (https://www.clinicaltrials.gov/). The CLUE Touchstone tool (https://clue.io/)⁴² was used to check the correlation between signatures of drugs and knocking down a gene.

URLs
UK Biobank: http://www.ukbiobank.ac.uk; ClinicalTrials.gov: https://www.clinicaltrials.gov/; CLUE Touchstone tool: https://clue.io/.

345 References 346 347 1 Hunter, D. J. Gene-environment interactions in human diseases. Nature reviews. 348 Genetics 6, 287-298, (2005). 349 2 Nelson, M. R. et al. The support of human genetic evidence for approved drug 350 indications. Nature genetics 47, 856, (2015). 351 3 Zhou, X., Menche, J., Barabási, A.-L. & Sharma, A. Human symptoms–disease 352 network. Nature Communications 5, 4212, (2014). 353 4 Rzhetsky, A., Wajngurt, D., Park, N. & Zheng, T. Probing genetic overlap among 354 complex human phenotypes. Proceedings of the National Academy of Sciences of the 355 United States of America 104, 11694-11699, (2007). 356 5 Santos, R. et al. A comprehensive map of molecular drug targets. Nature reviews. 357 Drug discovery 16, 19-34, (2017). 358 6 Wu, Y., Zheng, Z., Visscher, P. M. & Yang, J. Quantifying the mapping precision of 359 genome-wide association studies using whole-genome sequencing data. Genome 360 Biology 18, 86, (2017). 361 7 Ehret, G. B. et al. Genetic variants in novel pathways influence blood pressure and 362 risk. Nature 478, 103-109, (2011). 363 8 Wain, L. V. et al. Novel Blood Pressure Locus and Gene Discovery Using Genome- 364 Wide Association Study and Expression Data Sets From Blood and the Kidney. 365 Hypertension, (2017). 366 9 Warren, H. R. et al. Genome-wide association analysis identifies novel blood pressure 367 loci and offers biological insights into cardiovascular risk. Nature genetics 49, 403- 368 415, (2017). 369 10 Willer, C. J. et al. Discovery and refinement of loci associated with lipid levels. 370 Nature genetics 45, 1274-1283, (2013). 371 11 Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from 372 polygenicity in genome-wide association studies. Nature genetics 47, 291, (2015). 373 12 Finucane, H. K. et al. Heritability enrichment of specifically expressed genes 374 identifies disease-relevant tissues and cell types. Nature genetics 50, 621-629, (2018). 375 13 Zhu, Z. et al. 
Integration of summary data from GWAS and eQTL studies predicts 376 complex trait gene targets. Nature genetics 48, 481, (2016). 377 14 de Leeuw, C. A., Mooij, J. M., Heskes, T. & Posthuma, D. MAGMA: Generalized 378 Gene-Set Analysis of GWAS Data. PLoS Computational Biology 11, e1004219, 379 (2015). 380 15 Subramanian, A. et al. Gene set enrichment analysis: A knowledge-based approach 381 for interpreting genome-wide expression profiles. Proceedings of the National 382 Academy of Sciences 102, 15545-15550, (2005). 383 16 Liberzon, A. et al. The Molecular Signatures Database Hallmark Gene Set Collection. 384 Cell Systems 1, 417-425, (2015). 385 17 Powell, N., Walker, M. M. & Talley, N. J. The mucosal immune system: master 386 regulator of bidirectional gut-brain communications. Nature reviews. 387 Gastroenterology & hepatology 14, 143-159, (2017). 388 18 Finan, C. et al. The druggable genome and support for target identification and 389 validation in drug development. Science translational medicine 9, (2017). 390 19 Nguyen, P. A., Deaton, A. M., Nioi, P. & Ward, L. D. Phenotypes associated with 391 genes encoding drug targets are predictive of clinical trial side effects. bioRxiv, 392 (2018). 393 20 Gaulton, A. et al. The ChEMBL database in 2017. Nucleic Acids Research 45, D945- 394 D954, (2017). bioRxiv preprint doi: https://doi.org/10.1101/501049; this version posted December 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

21 Visscher, P. M. et al. 10 Years of GWAS Discovery: Biology, Function, and Translation. American journal of human genetics 101, 5-22, (2017).
22 Corren, J. et al. Tezepelumab in Adults with Uncontrolled Asthma. New England Journal of Medicine 377, 936-946, (2017).
23 Collins, F. S. Reengineering Translational Science: The Time Is Right. Science translational medicine 3, 90cm17-90cm17, (2011).
24 Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nature genetics 47, 1236-1241, (2015).
25 Zhu, Z. et al. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nature Communications 9, 224, (2018).
26 Palermo, A. et al. BMI and BMD: The Potential Interplay between Obesity and Bone Fragility. International Journal of Environmental Research and Public Health 13, 544, (2016).
27 Jaracz, J., Gattner, K., Jaracz, K. & Górna, K. Unexplained Painful Physical Symptoms in Patients with Major Depressive Disorder: Prevalence, Pathophysiology and Management. CNS Drugs 30, 293-304, (2016).
28 de Leeuw, C. A., Neale, B. M., Heskes, T. & Posthuma, D. The statistical properties of gene-set analysis. Nature Reviews Genetics 17, 353, (2016).
29 Lutter, M. et al. The orexigenic hormone ghrelin defends against depressive symptoms of chronic stress. Nature neuroscience 11, 752, (2008).
30 Bali, A. & Jaggi, A. S. An Integrative Review on Role and Mechanisms of Ghrelin in Stress, Anxiety and Depression. Current drug targets 17, 495-507, (2016).
31 Huang, H.-J. et al. Ghrelin alleviates anxiety- and depression-like behaviors induced by chronic unpredictable mild stress in rodents. Behavioural Brain Research 326, 33-43, (2017).
32 Valderas, J. M., Starfield, B., Sibbald, B., Salisbury, C. & Roland, M. Defining comorbidity: implications for understanding health and health services. Annals of family medicine 7, 357-363, (2009).
33 Wong, J. et al. Treatment indications for antidepressants prescribed in primary care in Quebec, Canada, 2006-2015. JAMA 315, 2230-2232, (2016).
34 Sudlow, C. et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS medicine 12, e1001779, (2015).
35 Loh, P.-R., Kichaev, G., Gazal, S., Schoech, A. P. & Price, A. L. Mixed-model association for biobank-scale datasets. Nature genetics, (2018).
36 Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. American journal of human genetics 88, 76-82, (2011).
37 Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nature genetics 44, 369, (2012).
38 GTEx Consortium. The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans. Science (New York, N.Y.) 348, 648-660, (2015).
39 Pers, T. H. et al. Biological interpretation of genome-wide association studies using predicted gene functions. Nature Communications 6, 5890, (2015).
40 Fehrmann, R. S. N. et al. Gene expression analysis identifies global gene dosage sensitivity in cancer. Nature genetics 47, 115, (2015).
41 Westra, H. J. et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nature genetics 45, 1238-1243, (2013).

42 Subramanian, A. et al. A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles. Cell 171, 1437-1452.e1417, (2017).

[Figure 1: circular bar plot of UKB medications by ATC subgroup.
Inner ring, ATC first-level groups: A, Alimentary tract and metabolism; B, Blood and blood forming organs; C, Cardiovascular system; D, Dermatologicals; G, Genito urinary system and sex hormones; H, Systemic hormonal preparations; J, Antiinfectives for systemic use; L, Antineoplastic and immunomodulating agents; M, Musculo-skeletal system; N, Nervous system; P, Antiparasitic products, insecticides and repellents; R, Respiratory system; S, Sensory organs; V, Various.
Legend, disease/organ system groupings: auto-immune diseases and allergies; cardiovascular disease; digestive disease; diabetes (incl. complications) and endocrine disorders; eye diseases; hypertension; kidney diseases; pain; psychiatric disorders.
Legend, ATC code example N02BE01 (medication: Paracetamol; active ingredient: Acetaminophen): N, anatomic main group; 02, therapeutic main subgroup; B, pharmacologic subgroup; E, chemical subgroup; 01, chemical substance.]

Figure 1. Distribution of 1,752 UKB medications at the first three ATC levels.
The inner ring corresponds to the 1st level of the ATC code. The outer ring represents the first 3 levels of the ATC code (184 subgroups). The length of each bar represents the number of classified UKB medications assigned to that subgroup (numbers of participants are shown in Figure 2). Red bars are the 23 medication-taking traits used in analyses (selected based on participant numbers). The 23 medication-taking traits are grouped into 9 disease and organ system categories according to their main indications, which are highlighted using different colours (legend bottom left). The legend at the bottom right shows how ATC codes are assigned to each UKB medication.
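The ATC hierarchy shown in the Figure 1 legend can be sketched in a few lines of Python. This is an illustration only, not the authors' classification pipeline: the names `atc_levels` and `third_level_subgroup` are hypothetical, and the level boundaries (1, 3, 4, 5 and 7 characters) are read off the N02BE01 example in the legend.

```python
import re

# Cumulative prefix lengths of the five ATC levels: N | N02 | N02B | N02BE | N02BE01.
LEVEL_LENGTHS = (1, 3, 4, 5, 7)
LEVEL_NAMES = (
    "anatomic main group",
    "therapeutic main subgroup",
    "pharmacologic subgroup",
    "chemical subgroup",
    "chemical substance",
)
# Accepts a full code or any prefix that ends on a level boundary.
ATC_PATTERN = re.compile(r"[A-Z](\d{2}([A-Z]([A-Z](\d{2})?)?)?)?")


def atc_levels(code):
    """Return {level name: code prefix} for every level present in `code`."""
    code = code.strip().upper()
    if not ATC_PATTERN.fullmatch(code):
        raise ValueError(f"not a valid ATC code or prefix: {code!r}")
    return {
        name: code[:length]
        for name, length in zip(LEVEL_NAMES, LEVEL_LENGTHS)
        if len(code) >= length
    }


def third_level_subgroup(code):
    """Return the first-three-level prefix (e.g. 'N02B'), or None if the code is shorter."""
    return atc_levels(code).get("pharmacologic subgroup")


print(atc_levels("N02BE01"))
print(third_level_subgroup("N02BE01"))
```

Grouping medications by `third_level_subgroup` corresponds to the outer ring of the figure. Traits defined at other levels, such as C02 (level 2) or C10AA (level 4), are still accepted by `atc_levels`.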

[Figure 2: bar charts by medication category, showing the number of participants (cases and controls) and the number of independently associated SNPs. Categories:
A02B Drugs for peptic ulcer and gastro-oesophageal reflux disease (GORD)
A10 Drugs used in diabetes
B01A Antithrombotic agents
C01D Vasodilators used in cardiac diseases
C02 Antihypertensives
C03 Diuretics
C07 Beta blocking agents
C08 Calcium channel blockers
C09 Agents acting on the renin-angiotensin system
C10AA HMG CoA reductase inhibitors
H03A Thyroid preparations
L04 Immunosuppressants
M01A Anti-inflammatory and antirheumatic products, non-steroids
M05B Drugs affecting bone structure and mineralization
N02A Opioids
N02BA Salicylic acid and derivatives
N02BE Anilides
N02C Antimigraine preparations
N06A Antidepressants
R03A Adrenergics, inhalants
R03BA Glucocorticoids
R06A Antihistamines for systemic use
S01E Antiglaucoma preparations and miotics]

Figure 2. Summary of UKB medication-taking GWAS analyses.
Text on the right side of each bar gives the meaning of the ATC code for each medication-taking trait.

[Figure 3: nine panels plotting decile odds ratio (y-axis) against GRS decile 1 to 10 (x-axis). Panel details:
A10: GRS of type 2 diabetes; P < 1e−05; Nagelkerke R2 0.014; AUC 0.58
C09: GRS of systolic blood pressure; P < 1e−05; Nagelkerke R2 0.023; AUC 0.58
C10AA: GRS of low-density lipoprotein cholesterol; P < 5e−08; Nagelkerke R2 0.013; AUC 0.56
L04: GRS of rheumatoid arthritis; P < 0.001; Nagelkerke R2 0.013; AUC 0.59
M05B: GRS of femoral neck bone mineral density; P < 1e−05; Nagelkerke R2 0.007; AUC 0.57
N02C: GRS of migraine; P < 1e−05; Nagelkerke R2 0.013; AUC 0.59
N06A: GRS of major depression; P < 0.05; Nagelkerke R2 0.004; AUC 0.54
R03A: GRS of asthma; P < 1e−04; Nagelkerke R2 0.013; AUC 0.57
R03BA: GRS of asthma; P < 5e−08; Nagelkerke R2 0.009; AUC 0.56]

Figure 3. Odds ratio (OR) by genetic risk score (GRS) profile decile (1 = lowest, 10 = highest GRS), with OR reported relative to decile 1 as the reference.
OR and 95% confidence intervals (blue bars) were estimated using logistic regression. The P value in the bottom right hand corner of each plot refers to the P-value threshold in the discovery sample used to generate the GRS. Note: An increased GRS of femoral neck bone mineral density implies a lower density.
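The decile comparison described in the Figure 3 caption can be sketched as follows. This is a minimal illustration on simulated data, not the authors' analysis: the function name `decile_odds_ratios`, the simulation parameters, and the absence of covariates are all assumptions. With decile membership as the only predictor, logistic regression reproduces the odds ratios and Wald intervals of the corresponding 2x2 tables, so the sketch computes them directly from counts.

```python
import math
import random


def decile_odds_ratios(scores, is_case):
    """Return [(decile, OR, ci_low, ci_high)] with decile 1 as the reference.

    Assumes every decile contains at least one case and one control.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    counts = []  # (cases, controls) for deciles 1..10
    for d in range(10):
        members = order[d * n // 10:(d + 1) * n // 10]
        cases = sum(1 for i in members if is_case[i])
        counts.append((cases, len(members) - cases))

    ref_cases, ref_controls = counts[0]
    results = [(1, 1.0, None, None)]  # reference decile: OR fixed at 1, no interval
    for decile, (cases, controls) in enumerate(counts[1:], start=2):
        log_or = math.log((cases * ref_controls) / (controls * ref_cases))
        # Wald standard error of the log odds ratio from the 2x2 table.
        se = math.sqrt(1 / cases + 1 / controls + 1 / ref_cases + 1 / ref_controls)
        results.append((
            decile,
            math.exp(log_or),
            math.exp(log_or - 1.96 * se),
            math.exp(log_or + 1.96 * se),
        ))
    return results


# Simulated example (hypothetical effect sizes): risk rises with the score.
random.seed(1)
scores = [random.gauss(0.0, 1.0) for _ in range(20000)]
is_case = [random.random() < 1 / (1 + math.exp(2.0 - 0.5 * s)) for s in scores]
results = decile_odds_ratios(scores, is_case)
for decile, odds_ratio, low, high in results:
    print(decile, round(odds_ratio, 2))
```

An analysis that adjusts for covariates such as age, sex or ancestry components would need a fitted regression model rather than raw counts.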

[Figure 4: heatmap of genetic correlations, colour scale from −1 to 1, with asterisks marking individual cells. Columns (Medication): the 23 medication-taking traits, A02B to S01E. Rows (Trait/Disease/Disorder): BMI, EA, Ever Smoked, Former/Current Smoker, T2D, HDLC, LDLC, TC, TG, CAD, SBP, DBP, PP, RA, Forearm BMD, Lumbar Spine BMD, Femoral Neck BMD, MD, Neuroticism, Asthma, IOP.]

Figure 4. Genetic correlation of the 23 medication-taking traits and 21 diseases/traits related to them.
Abbreviations: Body mass index (BMI), Educational attainment (EA), Type 2 diabetes (T2D), High-density lipoprotein cholesterol (HDLC), Low-density lipoprotein cholesterol (LDLC), Total cholesterol (TC), Triglyceride (TG), Coronary artery disease (CAD), Systolic blood pressure (SBP), Diastolic blood pressure (DBP), Pulse pressure (PP), Rheumatoid arthritis (RA), Bone mineral density (BMD), Major depression (MD), Intraocular pressure (IOP).

[Figure 5: heatmap of Mendelian Randomisation estimates. Rows (exposures): Asthma, BMI, CAD, DBP, Femoral Neck BMD, HDLC, LDLC, Lumbar Spine BMD, MD, PP, RA, SBP, T2D, TC, TG. Columns (outcomes): A02B, A10, B01A, C01D, C02, C03, C07, C08, C09, C10AA, H03A, L04, M01A, M05B, N02A, N02BA, N02BE, N02C, N06A, R03A, R03BA, R06A, S01E. Significant cells are labelled with OR (P value).]

Figure 5. Mendelian Randomisation results using published SNPs associated with 15 diseases/traits as instruments.
Rows are the exposure traits and columns are the outcome medication traits. The significant effects after correcting for 345 tests (P value ≤ 1.4 × 10⁻⁴) are labelled with OR (P value). The OR is per SD in liability when the exposure is a disease. Abbreviations: Body mass index (BMI), Coronary artery disease (CAD), Diastolic blood pressure (DBP), Bone mineral density (BMD), High-density lipoprotein cholesterol (HDLC), Low-density lipoprotein cholesterol (LDLC), Major depression (MD), Pulse pressure (PP), Rheumatoid arthritis (RA), Systolic blood pressure (SBP), Type 2 diabetes (T2D), Total cholesterol (TC), Triglyceride (TG).
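The multiple-testing threshold in the Figure 5 caption can be checked with a short calculation. The sketch below assumes a Bonferroni correction at a family-wise error rate of 0.05. That rate is an assumption, since the caption states only the number of tests and the resulting threshold.

```python
n_exposures = 15   # diseases/traits used as exposures (from the caption)
n_outcomes = 23    # medication-taking traits used as outcomes
alpha = 0.05       # assumed family-wise error rate; not stated in the caption

n_tests = n_exposures * n_outcomes
threshold = alpha / n_tests

print(n_tests)               # 345
print(f"{threshold:.2e}")    # 1.45e-04
```

The product 15 × 23 gives the 345 tests, and 0.05 / 345 ≈ 1.45 × 10⁻⁴ is consistent with the reported 1.4 × 10⁻⁴ if that value was truncated rather than rounded.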

Acknowledgments
We thank members of The University of Queensland Program in Complex Trait Genomics group. This research was supported by the Australian National Health and Medical Research Council (Grant Numbers: 1113400, 1078901, 1078037). Yeda Wu is supported by the F.G. Meade Scholarship and UQ Research Training Scholarship from the University of Queensland. This study makes use of data from UK Biobank (Project ID: 12514) and we thank the UK Biobank participants and the UK Biobank team for generating an important research resource.

Author contributions
P.M.V., N.R.W. and Y.W. conceived and designed the experiment. Y.W. performed the analysis with assistance and guidance from E.M.B., Z.Z., K.E.K. and J.Y. Z.Z., K.E.K. and L.Y. contributed to data quality of the UKB data. Y.W., P.M.V. and N.R.W. wrote the manuscript with the participation of all authors.

Declaration of interests
We declare that all authors have no competing interests.