ADIPOSE GENE AND ISOFORM EXPRESSION EFFECTS ON CARDIOMETABOLIC TRAITS

Chelsea Kaitlin Raulerson

A dissertation submitted to the faculty at the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Bioinformatics and Computational Biology program in the School of Medicine.

Chapel Hill 2019

Approved by:

Karen L. Mohlke

Terrence S. Furey

Yun Li

Michael I. Love

Joel S. Parker

Daniel Pomp

© 2019 Chelsea Kaitlin Raulerson ALL RIGHTS RESERVED

ABSTRACT

Chelsea Kaitlin Raulerson: Adipose gene and isoform expression effects on cardiometabolic traits (Under the direction of Karen L. Mohlke and Terrence S. Furey)

Genome-wide association studies (GWAS) have identified thousands of genetic loci associated with cardiometabolic traits including type 2 diabetes (T2D), lipid levels, body fat distribution, and adiposity, although most causal genes remain unknown. We used subcutaneous adipose tissue RNA-seq data from 434 Finnish men from the Metabolic Syndrome in Men (METSIM) study to identify 9,687 primary and 2,785 secondary cis-expression quantitative trait loci (eQTL; <1 Mb from transcription start site, false discovery rate [FDR] <1%). Compared to primary eQTL signals, secondary eQTL signals were located further from transcription start sites, had smaller effect sizes, and were less enriched in adipose tissue regulatory elements. Among 2,843 cardiometabolic GWAS signals, 262 colocalized by linkage disequilibrium (LD) and conditional analysis with 318 transcripts as primary and conditionally distinct secondary eQTLs, including some across ancestries. Of cardiometabolic traits examined for adipose tissue eQTL colocalizations, waist-hip ratio (WHR) and circulating lipid traits had the highest percentage of colocalized eQTLs (15% and 14%, respectively). Among alleles associated with increased cardiometabolic GWAS risk, approximately half (53%) were associated with decreased gene expression level. Mediation analyses of colocalized genes and cardiometabolic traits within the 434 individuals provided further evidence that gene expression influences variant-trait associations. We further quantified exons and splice junctions to identify cis-exon-level eQTLs and cis-splice eQTLs (sQTLs). Using pathway analysis, we identified 46 pathways that potentially affect the transcriptional program of adipose tissue, including several that impact mRNA splicing, lipid droplet formation, and inflammation. Among the sQTLs, we identified 104 variants associated with 132 splice junctions found in 74 genes colocalized with cardiometabolic GWAS loci. Further, we highlighted plausible mechanisms for two novel splice junctions in ALDH3A2 and NR1H3 that may impact oxidative stress and cholesterol homeostasis. These results identify hundreds of candidate genes and mechanisms that may act in adipose tissue to influence cardiometabolic traits.

To my parents, Craig and Gail, whose encouragement to “keep my eye on the ball” got me this far. To my fiancée, Chris, who endured a lot of nervous “but what ifs” and still loves me. And to my mentors, Karen and Terry, without whose guidance I’d still be lost in the weeds. Thank you all for everything.

ACKNOWLEDGEMENTS

First, I would like to thank my mentors, Karen Mohlke and Terry Furey. I couldn’t ask for a better team to support me. Through many trials and tribulations with cell-type heterogeneity and PEER factors, you taught me the critical scientific skill of perseverance. I have been astonished to find that I can now predict your questions and future analysis suggestions, because of the care you put into explaining your thought processes over the years. Thank you for your wisdom and intuition and thank you for letting me try to do the analysis both ways.

I would also like to extend thanks to my committee members, Yun Li, Michael Love, Joel Parker, and Daniel Pomp. Your questions and suggestions in our meetings have been extremely helpful to my development as a scientist, especially in shaping the way I approached new problems and questions as they arose. Special thanks to Mike Love for developing the “unmix” segment of DESeq2 to help me with the heterogeneity of our adipose tissue.

I would also like to thank the members of the Mohlke and Furey labs past and present. Your encouragement through the rough times and your advice on various aspects of analysis have been critical to this dissertation. In particular, I would like to thank my “science big brothers” Jeremy Simon, Damien Croteau-Chonka, Bryan Quach, and Jim Davis, who gave me a bunch of important and unsolicited advice. I would also like to thank Cassie Spracklen, Ying Wu, and Alaine Broadaway for letting me pick their brains endlessly about statistical models and career development, and for time-saving code.

Financial support for this dissertation was provided by NIH T32GM067553.

Finally, I would like to thank my friends and family. Without your reassurance, I wouldn’t have gotten very far. To my Mom and Dad, thank you for making your home a quiet escape for visits and for all the delicious meals, home-cooked and otherwise. To Bob and Vicki, thank you for extending your love and hospitality to me through many “tours of Texas”. To my friends near and far, thank you for DnD and boardgame sessions for welcome distractions and staying in touch. Thank you to my animals, Schrödinger and Barkimedes, for the cuddles and for telling me when it’s too late to be working by herding me to bed. And last, but not at all least, thank you to my fiancé Chris. You have been my sounding board through many problems, my encouragement, and a wonderful, loving partner. Thank you.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER 1: INTRODUCTION
    Cardiometabolic traits and GWAS
    Expression Quantitative Trait Loci (eQTLs) and RNA-sequencing
    Adipose tissue and gene expression
    Deconvolution and cell-type heterogeneity
    Colocalization of GWAS and eQTL signals
    Expression-trait associations and mediation analysis
    METabolic Syndrome in Men (METSIM) Study
CHAPTER 2: ADIPOSE TISSUE GENE EXPRESSION ASSOCIATIONS REVEAL HUNDREDS OF CANDIDATE GENES FOR CARDIOMETABOLIC TRAITS
    Introduction
    Methods
        METSIM study participants and sample characteristics
        RNA extraction and sequencing
        Expression quantification and sample level QC
        eQTL mapping
        eQTL validation
        eQTL enrichment in adipose tissue open chromatin and Roadmap adipose nuclei chromatin states
        Selection of GWAS loci and GWAS-eQTL colocalization
        Trait-gene expression association
        Mediation analysis
    Results
        Analysis of tissue heterogeneity
        eQTL identification and characterization
        Colocalization of eQTLs and cardiometabolic GWAS loci
        Cross-ancestry colocalization analysis
        Cardiometabolic trait association with expression level of eQTL-GWAS colocalized genes
        Mediation analysis
    Discussion
    Acknowledgements
CHAPTER 3: SPLICE EQTL DETECTION REVEALS ISOFORM VARIATION IN CARDIOMETABOLIC TRAITS
    Introduction
    Methods
        METSIM study participants, genotypes and sequencing
        Gene, exon, and splice junction quantification
        Gene eQTL, exon eQTL, and sQTL detection
        Pathway analysis
        Colocalization with cardiometabolic GWAS loci
        Validation of a novel NR1H3 splice junction
    Results
        sQTL identification and characterization
        Colocalization of sQTLs and cardiometabolic GWAS loci
        Colocalizations at MADD-NR1H3 locus, by trait and quantification
    Discussion
    Acknowledgements
CHAPTER 4: DISCUSSION
REFERENCES

LIST OF TABLES

Table 2.1: Characteristics of the METSIM study participants in 434 sample subset
Table 2.2: Comparison of cis-eQTL discovery between sample subsets
Table 2.3: Comparison of known adipose cis-eQTLs between sample subsets
Table 2.4: Replication of lead eQTLs between METSIM and GTEx
Table 2.5: Enrichment in Roadmap adipose nuclei marks and METSIM adipose tissue ATAC-seq peaks
Table 2.6: Trans-eQTL discovery at three significance thresholds
Table 2.7: Replication of METSIM trans-eQTL for KLF14 target genes in TwinsUK
Table 2.8: 93 cardiometabolic diseases and quantitative traits key terms for which GWAS Catalog loci were examined
Table 2.9: Primary eQTLs (FDR <1%) colocalized with GWAS variants based on linkage disequilibrium and conditional analysis between GWAS and lead eQTL variants
Table 2.10: Primary and secondary eQTLs colocalized at GWAS loci
Table 2.11: Proportion of tested GWAS signals colocalized
Table 2.12: Secondary eQTLs (FDR <1%) colocalized with GWAS variants based on linkage disequilibrium and conditional analysis between GWAS and lead eQTL variants
Table 2.13: Coloc2 summaries of 173 genes found to be colocalized by conditional analysis
Table 2.14: Comparison of LD for loci with strongest GWAS signals for BMI identified in East Asians
Table 2.15: Gene expression - trait associations (unadjusted for BMI) for genes colocalized by LD
Table 2.16: Gene expression - trait associations (adjusted for BMI) for genes colocalized by LD
Table 2.17: Mediation of cardiometabolic traits through gene expression
Table 2.18: Mediation analysis of GWAS variants
Table 3.1: Gene, exon, and splice eQTL discovery with various PEER factors
Table 3.2: A summary of gene-, exon-, and splice junction eQTL discovery
Table 3.3: Pathway analysis of sQTL-specific genes
Table 3.4: sQTL colocalizations with cardiometabolic traits
Table 3.5: Distinct GWAS signals around the MADD-NR1H3 locus
Table 3.6: Pairwise LD between nine GWAS signals in the MADD-NR1H3 region
Table 3.7: Gene-level colocalizations at NR1H3
Table 3.8: Proxy variants for the lead sQTL variant at NR1H3 (rs1449626)

LIST OF FIGURES

Figure 1.1: Assignment of reads for different quantification methods
Figure 1.2: Colocalization between GWAS and eQTL signals
Figure 2.1: Percentages of different tissues across 550 samples
Figure 2.2: Selection and identification of PEER factors
Figure 2.3: Comparison of primary and secondary eQTL association signals
Figure 2.4: eQTL effect sizes for variants located in adipose tissue ATAC-seq peaks or
Figure 2.5: Cardiometabolic GWAS and eQTL colocalizations, stratified by trait
Figure 2.6: Extended Cloud City plots of cardiometabolic GWAS and eQTL colocalizations, stratified by trait
Figure 2.7: A WHRadjBMI signal colocalizes with the secondary eQTL for DGKQ
Figure 2.8: Comparison of GWAS and eQTLs found not to be colocalized by coloc2
Figure 2.9: Cross-ancestry comparison of colocalized GWAS – eQTL signals
Figure 3.1: The number of significant splice QTLs observed per gene
Figure 3.2: Overlap between genes identified by eQTLs on the gene-, exon- and splice-level
Figure 3.3: Novel splice junction that removes exon 4 of NR1H3
Figure 3.4: Colocalization of multiple GWAS and eQTL signals at MADD-NR1H3
Figure 3.5: Proxy variants for rs1449626 in Finns

LIST OF ABBREVIATIONS

BMI Body mass index

CI Confidence interval

COPD Chronic obstructive pulmonary disease

CPM Counts Per Million

CVD Cardiovascular disease

eQTL Expression quantitative trait locus

EST Expressed sequence tag

FDR False discovery rate

FGlu Fasting glucose

GWAS Genome-wide association study

HDL High density lipoprotein

LD Linkage disequilibrium

LDL Low density lipoprotein

M Million

Mb Megabase

METSIM METabolic Syndrome in Men

PSI Percent spliced in

RNA Ribonucleic acid

RNAseq Ribonucleic acid sequencing

sQTL Splice expression quantitative trait locus

TC Total cholesterol

TG Triglycerides

TMM Trimmed mean of M values

TPM Transcripts per million

TSS Transcription start site

T2D Type 2 diabetes

WHR Waist to hip ratio

WHRadjBMI Waist to hip ratio adjusted for BMI

2hrGlu 2 hour glucose test

CHAPTER 1: INTRODUCTION

Cardiometabolic traits and GWAS

Cardiovascular disease (CVD) and type 2 diabetes (T2D) present a substantial cost and public health burden in the US and worldwide1,2. In the United States, cardiovascular disease affects about 121.5M people and, in 2020, will cost an estimated 680 billion dollars3. An estimated 26M Americans are living with type 2 diabetes and about 101.2M are undiagnosed or have prediabetes3.

Cardiovascular disease and T2D share a cluster of risk factors, sometimes called metabolic syndrome, that are thought to contribute to disease progression4,5. These risk factors include obesity and body fat distribution, measured by body mass index (BMI) and waist-to-hip ratio (WHR); circulating lipid levels, such as high-density lipoprotein cholesterol (HDL), low-density lipoprotein cholesterol (LDL) and triglycerides; and clinical measures of glycemic traits, such as fasting insulin, proinsulin, and glucose6.

Risk factors for cardiometabolic diseases are heritable7–9 and are themselves complex traits, meaning that numerous genes are involved in the development of these clinical phenotypes. The complexity of these diseases and traits requires genome-wide efforts to detect the many genetic loci that contribute to disease etiology and progression.

Genome-wide association studies (GWAS) provide an unbiased approach to identify variants that are associated with clinical traits and diseases10. To date, GWAS have identified thousands of variant-trait associations for cardiometabolic diseases and risk factors, such as CVD, T2D, WHR, circulating lipid levels and others11–15. Many of these GWAS associations include multiple distinct associations within a region16, henceforth referred to as signals. These signals can be separated from each other using statistical methods like step-wise conditional analysis or GCTA17 but are not necessarily fully independent. In a recent T2D GWAS meta-analysis, variants at 243 loci were associated with disease, but multiple distinct signals were found at some loci for a total of 403 genome-wide signals12. Historically, European ancestry GWAS have had the largest sample sizes for complex traits, but GWAS of other ancestries, particularly East Asian ancestry, are rapidly increasing in sample size. This expansion is critical, because allele frequency can differ substantially between populations with different ancestries, resulting in changes in the lead variant(s) or even novel signals18.

While GWAS have identified regions of the genome that may influence the trait of interest, they do not provide evidence for which gene or genes are altered by the signal, or identify the mechanism of action. The identity of the putative genes and mechanisms at these loci is valuable because candidate genes could be analyzed biochemically to assess their effects on traits and disease and could eventually be used as potential targets for pharmaceutical interventions. To name these loci, some GWAS investigators attempt to identify a plausible biological candidate gene in the surrounding region but many simply name the locus after the nearest gene due to the high volume of signals detected genome-wide.

While a fraction of GWAS variants change the coding sequence of proteins, the majority of variants found in these analyses are non-coding and have unknown function10. One common hypothesis is that non-coding variants may influence clinical traits by altering the expression of nearby genes. Since gene expression levels have been found to be highly heritable19,20, one of the many methods that have been proposed to focus on plausible candidate genes is mapping expression quantitative trait loci (eQTLs)21. This method uses gene expression, assayed via microarray or RNA-sequencing, and genotypes from the same samples to perform association testing, ordinarily using linear regression, and identify eQTLs.

Expression Quantitative Trait Loci (eQTLs) and RNA-sequencing

An eQTL signal consists of one or more variants and a gene. For ease of use, eQTLs are sometimes counted by the number of genes significantly associated with at least one variant. Genetic variants that are significantly associated with genes are sometimes called eSNPs and their counterpart genes are sometimes called eGenes. eQTLs are detected locally (termed cis-eQTLs, which we define as within 1 Mb of the transcription start site [TSS] of the gene) and distally (termed trans-eQTLs and defined as either >1 Mb or >5 Mb away from the TSS). Distal eQTLs identify genetic variants that influence the expression level of transcripts located distally on the same or on other chromosomes. Distal eQTLs also commonly have a local association between the lead variant and a transcript encoding a transcription factor or other molecule that may modulate the expression level of other genes. Power to detect trans-eQTLs is still limited given current sample sizes due to multiple testing burden and modest effect sizes, especially when compared with model organisms22. Both cis- and trans-eQTL associations have been reported across many tissue types23. As with GWAS, eQTL studies can identify multiple distinct associations in a region by conditional analysis. Primary eQTLs are the associations identified without any conditional analysis, and all subsequent signals are called conditional eQTLs. Studies have shown that conditional eQTL signals are common genome-wide23 and their associated lead variants are farther away from the TSS than primary eQTL signals24. Allelic heterogeneity, a process by which multiple variants act through a gene on a trait, may account for the ‘missing heritability’ observed at GWAS loci25, rendering the inclusion of conditional eQTLs in current and future studies vital.
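The idea of conditioning on a primary signal can be illustrated with a small sketch. It assumes a single lead variant and uses ordinary least squares: regressing the lead variant out of both the candidate dosage and the expression is equivalent to including the lead variant as a covariate. All names (`slope`, `residualize`, `conditional_effect`) and the toy genotypes are hypothetical and chosen so that the arithmetic is easy to follow; they are not taken from the analyses in this dissertation.

```python
from statistics import mean


def slope(x, y):
    """Least-squares slope of y regressed on x (with an intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def residualize(values, covariate):
    """Residuals of `values` after regressing out `covariate`."""
    b = slope(covariate, values)
    v_bar, c_bar = mean(values), mean(covariate)
    return [v - v_bar - b * (c - c_bar) for v, c in zip(values, covariate)]


def conditional_effect(candidate, lead, expression):
    """Effect of `candidate` on expression after adjusting for `lead`."""
    return slope(residualize(candidate, lead), residualize(expression, lead))


# Hypothetical dosages for twelve samples.
lead = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
proxy = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 1]      # in LD with the lead variant
secondary = [0, 1, 2, 1, 0, 1, 2, 1, 0, 1, 2, 1]  # uncorrelated with the lead
# Toy expression driven by the lead and secondary variants only.
expression = [1.0 * l + 0.5 * s for l, s in zip(lead, secondary)]

marginal_proxy = slope(proxy, expression)
conditional_proxy = conditional_effect(proxy, lead, expression)
conditional_secondary = conditional_effect(secondary, lead, expression)
print(marginal_proxy, conditional_proxy, conditional_secondary)
```

In this toy example the proxy variant looks associated on its own but loses its effect once the lead variant is accounted for, whereas the secondary variant keeps its effect, which is the pattern expected of a conditionally distinct signal.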

Since changes in exon usage have been linked to important biological processes26–28, identifying the exons and splices present in tissue is critical to understanding the active pathways. RNA-sequencing provides the opportunity to capture differences in exon usage by estimating the read counts on the splice junction and exon level in addition to gene-level counts (Figure 1.1). Gene-level counts combine all reads that map to the gene of interest based on outside gene annotation, such as GENCODE29 or RefSeq30. These gene-level annotations frequently include different transcript types, such as protein-coding, long non-coding RNAs, pseudogenes, and others. By contrast, exon counts estimate the proportion of reads that fall within a specific exon31,32. Notably, Odhams et al. identified more genes involved in eQTL associations at the exon level than at the gene level33. Splice junction counts are based on the number of reads that span the exon junction, providing specific evidence of skipped exons34. Splice junction counts can then be adjusted for gene-level expression to mitigate the effect of gene-level variation on splice junction eQTL (sQTL) detection34. Previous studies have identified and evaluated sQTLs or exon-level eQTLs in lymphoblastoid cell lines33 and cultured fetal retinal pigment epithelium cells35 as well as whole blood36,37, adipose38, and brain tissue from the dorsolateral prefrontal cortex39,40. The Genotype-Tissue Expression (GTEx) project has also detected sQTLs across 9 tissue types41. Further application of eQTL and sQTL analyses to gene expression data sets may continue to identify novel associations.
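To illustrate how junction-spanning reads provide evidence of exon skipping, the sketch below computes a percent spliced in (PSI) value for a cassette exon from three junction counts. The formula, which averages the two inclusion junctions and divides by inclusion plus skipping reads, is one common convention and is an assumption here rather than the quantification used in this work. The function name is hypothetical, and the example counts are the junction counts shown in Figure 1.1.

```python
def percent_spliced_in(upstream_junction, downstream_junction, skipping_junction):
    """Estimate PSI for a cassette exon from junction-spanning read counts.

    upstream_junction / downstream_junction: reads joining the exon to its
    neighbours (inclusion evidence); skipping_junction: reads joining the
    neighbours directly (exclusion evidence).
    Returns None when no junction reads are available.
    """
    inclusion = (upstream_junction + downstream_junction) / 2
    total = inclusion + skipping_junction
    if total == 0:
        return None
    return inclusion / total


# Junction counts from Figure 1.1: 3-4 = 2, 4-5 = 2, 3-5 (skips exon 4) = 5.
psi_exon4 = percent_spliced_in(2, 2, 5)
print(f"PSI for exon 4: {psi_exon4:.3f}")
```

A ratio of this kind is relatively insensitive to overall gene expression level, which is one motivation for normalizing splice junction counts before testing for sQTLs.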

Adipose tissue and gene expression

Adipose tissue is involved in a wide variety of cardiovascular and metabolic processes, including angiogenesis42 and blood pressure43, body weight homeostasis and appetite regulation44, and insulin sensitivity and glucose homeostasis45,46. Adipocytes expand to store excess lipids, buffer lipid energy balance45,47, and release adipokines to regulate energy levels48. Subcutaneous adipose tissue in particular has been suggested to play a protective role in metabolic risk49. Adipose tissue can also buffer energy balance either by expanding the size of individual adipocytes or by forming new adipocytes; larger adipocytes have been linked to insulin resistance45,46, while proliferation of adipocytes confers lower risk.

While an estimated 94% of eQTLs are shared across two or more tissues50, some are tissue-specific due to the expression pattern of the gene, necessitating the study of tissues closely connected to disease etiology and progression.

Prior efforts have been made to identify the expression profile of subcutaneous adipose tissue and the variants that alter expression of genes and splices. Many studies have identified cis-eQTLs in adipose tissue51–55. Among these, several have compared adipose tissue with other metabolically active tissues such as liver and skeletal muscle53–55. Gamazon et al. identified adipose-specific eQTLs by comparing with other tissues within GTEx55. Studies have also mapped trans-eQTLs in adipose tissue, discovering several potential master regulators51–53,56,57. Small et al. identified a trans-eQTL signal originating at the KLF14 locus on chromosome 7 in subcutaneous adipose tissue from the MuTHER cohort (n=776 healthy female twins)57. Civelek et al. extended the gene network of KLF14 to 55 genes and identified a novel trans-eQTL at ZNF800, using subcutaneous adipose tissue from 770 Finnish men from the METSIM study52. Recently, Small et al. expanded the set of genes associated with KLF14 to 385, including a subnetwork regulated by SREBF157. The GTEx Consortium found that subcutaneous adipose shared the most sQTL signals with whole blood and the least with lung and ventricular heart. Ma et al. used six tissues from GTEx, including subcutaneous adipose, to identify sQTLs affecting the FTO locus.

Differences in sample size, participant characteristics such as population and sex, data collection, and analysis methods between these studies have led to some differences in the discovery of gene-level cis- and trans-eQTLs as well as sQTLs.

Deconvolution and cell-type heterogeneity

Bulk tissue expression analysis presents a unique set of challenges to isolating gene expression profiles, such as contamination from other tissues or heterogeneity of expression between different cell types native to the tissue of interest. Subcutaneous adipose tissue itself is heterogeneous and contains native cell types, such as adipocytes, preadipocytes, and endothelial cells, as well as infiltrating macrophages, which enter adipose tissue as part of the inflammatory processes surrounding obesity58,59. Additionally, the type of biopsy can contribute heterogeneity to the sample, with needle biopsies providing more potential for blood cell contamination than other methods60. Due to these challenges, deconvolution methods have been proposed to detect and remove contaminated samples to improve data quality or to attempt to isolate the signals of specific sub-populations of cells61–63. Deconvolution analyses can also be used to adjust for the cell-type component of the signal to allow more heterogeneous samples to be compared. Recently, Glastonbury et al. noted that adjustment for known and unknown factors that influence gene expression, using PEER or similar software, is sufficient to reduce the variability for reliable detection of gene-level cis-eQTLs64. However, studies with the goal of ascertaining a signal’s cell type of origin require more careful deconvolution methods.

Colocalization of GWAS and eQTL signals

The mere presence of an eQTL within a GWAS region does not necessarily indicate that the eQTL gene is implicated as a potential candidate gene for the effect of this genomic region on a clinical trait. An estimated 78% of genes genome-wide have at least one variant associated with their expression level36, suggesting genetic variants affect expression level of essentially all genes. GWAS variants may be associated with eQTL genes, sometimes called eGenes, due to low or moderate linkage disequilibrium (LD) with the lead eQTL variant. Linkage disequilibrium is a pattern of inheritance in which the observed frequency of genetic variants being inherited together differs from the expected frequency at which the variants would be inherited together at random. Due to LD, a GWAS variant may be significantly associated with expression level of the eQTL gene, but not be among the variants that have the strongest effect on gene expression (Figure 1.2). To ensure that GWAS and eQTL signals do not coincide by chance alone, colocalization methods are applied to formally test whether the GWAS and eQTL represent the same signal.
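The definition of LD given above can be made concrete with the standard two-locus statistics D and r², computed from haplotype and allele frequencies. The sketch below is illustrative only; the function name `ld_stats` and the example frequencies are assumptions, and in practice these quantities are estimated from phased genotype data or a reference panel.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, r2) for two biallelic variants.

    p_ab: observed frequency of haplotypes carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    D is the difference between the observed haplotype frequency and the
    frequency expected if the alleles were inherited together at random.
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("allele frequencies must be strictly between 0 and 1")
    d = p_ab - p_a * p_b
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


# Hypothetical frequencies: both alleles at 50%.
d_perfect, r2_perfect = ld_stats(0.50, 0.5, 0.5)    # always inherited together
d_none, r2_none = ld_stats(0.25, 0.5, 0.5)          # independent inheritance
d_moderate, r2_moderate = ld_stats(0.40, 0.5, 0.5)  # partial correlation
print(r2_perfect, r2_none, round(r2_moderate, 2))
```

A GWAS variant in only moderate LD with the lead eQTL variant, as in the third case, can still show a significant expression association without representing the same underlying signal.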

Numerous colocalization methods have been proposed, using both Bayesian and frequentist statistical models65–71. Some colocalization methods require access to primary genotypes and gene expression levels, while others require only summary-level data from both the GWAS and eQTL studies, which are often made publicly available. All of these methods are somewhat sensitive to the priors and assumptions65,72, thresholds for significance, or to the presence of multiple signals66. Colocalization is further complicated by changes in allele frequency when comparing eQTL studies of smaller sample size to large GWAS meta-analyses with heterogeneity of sample size from variant to variant, and by comparing eQTLs of one ancestry to GWAS in another.

Several studies have identified cis-eQTLs that colocalize with GWAS loci for T2D, WHR, BMI, and other cardiometabolic traits51–54. Civelek et al. found evidence of 140 eQTLs colocalized with 109 cardiometabolic GWAS traits52. The STARNET study compared cardiometabolic disease GWAS loci to eQTLs from seven vascular and metabolic tissues, including subcutaneous and visceral adipose tissue (n=600 patients with coronary artery disease) and found substantial overlap between adipose tissue and blood lipid phenotypes54. Sajuthi et al. mapped eQTLs in adipose and muscle tissue from 260 non-diabetic African Americans and found cis-eQTL associations with more than 54 genes at GWAS loci for T2D and obesity53. Gamazon et al. found enrichment of adipose-specific cis-eQTLs for diastolic blood pressure GWAS signals in GTEx (version 6)55. Additional GWAS-eQTL colocalizations may be detected by using larger eQTL sample sizes and different methods of measuring gene expression, as well as by analyzing additional GWAS loci.

Expression-trait associations and mediation analysis

Measured clinical traits within the eQTL study population present the opportunity to detect associations between gene expression and cardiometabolic risk factors. These relationships can be detected by applying linear regression to the gene and trait levels. For cardiometabolic traits, however, inclusion of BMI as a covariate presents a challenge, since the levels of approximately half of all genes expressed in adipose tissue are associated with BMI52. This may introduce the potential for collider bias, because BMI is also associated with other clinical traits and included as a covariate for traits like WHR, fasting glucose, and fasting insulin. Collider bias occurs when an artificial association is introduced due to both the exposure (gene) and the outcome (trait) associating with a third variable, in this case, BMI73. When interpreted with appropriate consideration of collider bias, gene-trait association can help to elucidate the genes that are associated with clinical traits.

Mediation analyses incorporate genetic variants to perform a more sophisticated test for the effect of a variant on a trait level through gene expression level. Mediation analyses with individual-level data for variant, trait and expression level are rare, because very few eQTL studies have been performed with tissue from individuals that have extensive clinical traits measured. Evidence of mediation can provide additional statistical support for the involvement of candidate genes in the modulation of traits, particularly at colocalized GWAS and eQTL signals.
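One common way to frame mediation is the product-of-coefficients decomposition, in which the total effect of a variant on a trait is split into a direct effect and an indirect effect acting through expression. The sketch below illustrates that decomposition with ordinary least squares on noise-free toy data; it is not necessarily the procedure used in Chapter 2, it omits significance testing, and all function names and values are hypothetical.

```python
from statistics import mean


def slope(x, y):
    """Least-squares slope of y regressed on x (with an intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def residualize(values, covariate):
    """Residuals of `values` after regressing out `covariate`."""
    b = slope(covariate, values)
    v_bar, c_bar = mean(values), mean(covariate)
    return [v - v_bar - b * (c - c_bar) for v, c in zip(values, covariate)]


def partial_slope(x, y, covariate):
    """Coefficient of x in the model y ~ x + covariate."""
    return slope(residualize(x, covariate), residualize(y, covariate))


def mediation(genotype, expression, trait):
    """Split the variant-trait effect into direct and indirect components."""
    a = slope(genotype, expression)                      # variant -> expression
    b = partial_slope(expression, trait, genotype)       # expression -> trait
    direct = partial_slope(genotype, trait, expression)  # variant -> trait
    total = slope(genotype, trait)
    indirect = a * b
    return {"total": total, "direct": direct, "indirect": indirect,
            "proportion_mediated": indirect / total}


# Hypothetical data: expression depends on genotype; trait depends on both.
genotype = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
other_variation = [-0.2, 0.0, 0.2, 0.0] * 3  # expression variation unrelated to genotype
expression = [0.5 * g + e for g, e in zip(genotype, other_variation)]
trait = [2.0 * x + 0.25 * g for x, g in zip(expression, genotype)]

result = mediation(genotype, expression, trait)
print(result)
```

With linear models fit by least squares, the total effect equals the direct effect plus the indirect effect, so the proportion mediated summarizes how much of the variant-trait association passes through expression in this toy setting.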

METabolic Syndrome in Men (METSIM) Study

The METSIM cohort is a population study of 10,197 Finnish men (ages 45-73) with the goal of examining the genetic and nongenetic factors that influence T2D, CVD, and related risk factors74. To address these goals, the METSIM investigators collected whole blood for genotyping and exome sequencing, subcutaneous adipose tissue for RNA-sequencing and DNA methylation, and extensive clinical phenotypes, including body fat distribution, CVD risk factors, and glycemic and lipid traits.

Subcutaneous adipose tissue was collected via needle biopsy from near the umbilicus within three months of the initial clinical phenotyping. A previous microarray study has been conducted with a subset of the METSIM cohort (n=770) to measure expression in adipose tissue52. The RNA-sequencing study of 434 Finnish men presented here is also a subset of METSIM, with 331 samples shared with the prior microarray study. RNA-sequencing provides a greater dynamic range than microarray expression measurements, allowing for finer resolution of relative expression level among very lowly and highly expressed transcripts75, and the ability to quantify transcripts or portions of transcripts that are not yet annotated, such as novel splice junctions34.

In the following chapters, I describe the analysis of subcutaneous adipose tissue gene expression levels from 434 participants in the METSIM study. In Chapter 2, I describe primary and secondary eQTLs and characterize these different sets for transcript types, distance to TSS, and enrichment in promoters and enhancers. I describe colocalizations between cardiometabolic GWAS loci and gene-level eQTL signals, including cross-ancestry analysis and complex loci that consist of multiple GWAS and eQTL signals. I also incorporate clinical trait data from the same samples to assess the potential for these candidate genes to mediate the relationship between genetic variants and cardiometabolic traits. In Chapter 3, I describe many of the same analyses using splice junctions to define sQTLs. In that chapter, I describe an example of a complex locus that includes nine GWAS signals for four traits, two eQTL signals, and an sQTL. The sQTL is for a novel splice junction, not previously annotated in GENCODE, and I demonstrate the validity of this mapping and predict the functional consequence of the splicing event on HDL cholesterol levels. In Chapter 4, I summarize and compare the findings from both studies, consider their limitations, and propose future directions.

[Figure 1.1 panels: Gene-level expression (Gene A counts: 3; Gene B counts: 6). Exon-level expression (Exon 3 counts: 5; Exon 4 counts: 3; Exon 5 counts: 6). Splice junction expression (splice junction 3-4 counts: 2; splice junction 4-5 counts: 2; splice junction 3-5 counts: 5).]

Figure 1.1: Assignment of reads for different quantification methods: Read counts in RNA-sequencing can be quantified on different levels. For each model (gene-, exon-, and splice-level), the reads overlapping different features are shown (left) and the resulting read counts are shown (right).

[Figure 1.2 panels: GWAS association (top), Gene D eQTL (not colocalized; center), and Gene C eQTL (colocalized; bottom), each plotted as -log10(p-value) against chromosomal position, with Genes A, B, C, and D marked along the position axis.]

Figure 1.2: Colocalization between GWAS and eQTL signals: This is a model of colocalization at a GWAS signal (top, dark blue). The lead GWAS variant has a significant association with expression level of both Gene D (center) and Gene C (bottom). However, the eQTL signal for Gene D does not match the pattern of association at the GWAS signal, and the GWAS variant is not among the peak of variants with the strongest association with Gene D’s expression level. By contrast, the eQTL signal for Gene C does match the pattern of association at the GWAS signal, and the eQTL peak variants align with the GWAS peak variants, indicating colocalization.


CHAPTER 2: ADIPOSE TISSUE GENE EXPRESSION ASSOCIATIONS REVEAL HUNDREDS OF CANDIDATE GENES FOR CARDIOMETABOLIC TRAITS1

Introduction

Excess adipose tissue, especially in central abdominal depots, is associated with increased cardiometabolic risk76,77 and mortality78. Subcutaneous adipose tissue expands to store additional lipids and serves as a buffering system for lipid energy balance, especially for fatty acids45,47, providing a protective role in metabolic risk49. However, expansion of adipocyte size, rather than formation of new adipocytes, has also been linked to insulin resistance45,46.

Genome-wide association studies (GWAS) have identified thousands of loci for hundreds of human traits, but most functional variants, genes affected by variants, and mechanisms remain elusive.

Identification of genetic variants associated with gene expression level (eQTLs) in relevant tissues has proven useful to link non-coding GWAS variants to plausible candidate genes that may influence complex traits50. While 94% of eQTLs are shared across at least two tissues50, some are specific to one tissue or a subset of tissues, necessitating the study of tissues that potentially contribute to GWAS traits to identify candidate genes.

Recently, eQTL studies have begun to identify additional eQTL signals through conditional analysis23,24,79 in addition to the more commonly reported primary eQTLs. These additional conditionally distinct secondary eQTL signals are widespread and are located farther than primary signals from the transcription start sites of the associated genes23,24. The additional eQTL signals have also been shown to colocalize with GWAS loci24, suggesting that they can detect additional candidate genes.

1 This chapter previously appeared in the American Journal of Human Genetics. The original citation is as follows: Raulerson, C. K. et al. Adipose Tissue Gene Expression Associations Reveal Hundreds of Candidate Genes for Cardiometabolic Traits. Am. J. Hum. Genet. 105, 773–787 (2019).


Previous studies have identified adipose tissue cis-eQTLs and tested for colocalizations of eQTL signals with cardiometabolic GWAS loci52,54,80,81. Additionally, GWAS have reported colocalized subcutaneous adipose cis-eQTLs with loci for body mass index (BMI), waist-hip ratio (WHR), WHR adjusted for BMI (WHRadjBMI), type 2 diabetes (T2D), circulating lipid levels, and adiponectin, a hormone produced by adipocytes that regulates glucose levels and fatty acid breakdown12,15,82–84. However, colocalization presents its own challenges, particularly when the GWAS and eQTL studies are from different ancestries or when multiple, conditionally distinct signals exist.

Here, we describe the analysis of subcutaneous adipose tissue gene expression levels from 434 participants in the METabolic Syndrome in Men (METSIM) study. METSIM participants have been well characterized for detailed clinical phenotypes, including metabolic and cardiovascular traits such as plasma lipids, anthropometric, and glycemic traits74. We identified and characterized primary and secondary cis-eQTL genes that colocalize with GWAS loci for BMI, cholesterol and triglyceride levels, WHR and WHRadjBMI, T2D, adiponectin, cardiovascular endpoints, and other cardiometabolic traits. We further associated gene expression level with cardiometabolic trait levels in the METSIM cohort and identified the genes that show the strongest evidence of mediating the variant-to-trait associations.

Methods

METSIM study participants and sample characteristics

METSIM is a population-based cohort composed of 10,197 males of Finnish ancestry from Kuopio, Finland74. For this analysis, we used a subset of 550 participants from whom subcutaneous adipose tissue had been collected near the umbilicus by needle biopsy. The METSIM study was approved by the Ethics Committee of the University of Eastern Finland and Kuopio University Hospital in Kuopio, Finland and carried out in accordance with the Helsinki Declaration. Written informed consent was obtained from all participants.

Genotypes were measured using the Illumina OmniExpress BeadChip array and the Illumina HumanCoreExome at the Center for Inherited Disease Research (Baltimore, MD). Genotyping quality control (QC) and imputation to the Haplotype Reference Consortium (HRC) panel85 have been previously described52. For this study, we filtered HRC-imputed genotypes to retain 7.8 million variants with minor allele frequency (MAF) > 0.01 and imputation quality r² > 0.3.

RNA extraction and sequencing

Following the adipose tissue biopsies, total mRNA was isolated using the Qiagen miRNeasy kit (Qiagen, Hilden, Germany), following the manufacturer’s instructions. mRNA was isolated with a polyA+ selection protocol (Illumina TruSeq RNA Sample Preparation Kit v2) and sequenced on the Illumina HiSeq 2000 platform at the University of California Los Angeles Neuroscience Genomics Core (UNGC) to an average sequencing depth of 45 million paired-end 50 bp reads.

Reads were filtered using the Fastx-toolkit, requiring 80% of bases to have Phred quality > 20. Reads containing linker and adapter sequences were removed using TagDust (see URLs). We used STAR (version 2.4.2a)86 to align reads to the hg19/GRCh37 reference sequence87, using GENCODE v19 (July 2013 freeze) as the annotation. Duplicate reads were retained. Read pairs with unpaired alignments were removed. On average, 82.2% of reads were uniquely mapped across all samples.

To ensure RNA samples were matched with the correct genotypes, we applied MixupMapper88, verifyBamID89, and GATK best practice guidelines (see URLs) to call variants from RNA-seq data and assign best matches between the RNA-seq and genotype data. We retained samples that matched DNA genotypes corresponding to the expected self-sample for at least one method.

Expression quantification and sample level QC

To identify fragments mapping to transcript isoforms, we used Salmon (version 0.7.2)90 and provided GENCODE v19 as annotation; to correct for bias in isoform abundance estimates due to technical biases resulting from GC content, we used the --gcBias option. Isoform abundances were collapsed to the gene level, and transcripts per million (TPM) values were calculated using tximport (version 1.0.3)91. To remove genes with limited evidence of expression, we retained genes with ≥ 5 reads in > 25% of samples, resulting in 21,735 of 57,821 genes for cis-eQTL analysis.
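The expression filter described above can be illustrated with a minimal sketch. The function name, the dictionary layout, and the toy counts are hypothetical and are not taken from the analysis code; only the thresholds (≥ 5 reads in > 25% of samples) come from the text.

```python
def filter_expressed_genes(counts, min_reads=5, min_fraction=0.25):
    """Keep genes with >= min_reads reads in more than min_fraction of samples.

    counts: dict mapping a gene ID to a list of per-sample read counts.
    """
    kept = {}
    for gene, per_sample in counts.items():
        n_expressed = sum(1 for c in per_sample if c >= min_reads)
        if n_expressed > min_fraction * len(per_sample):
            kept[gene] = per_sample
    return kept


# Hypothetical toy data: three genes measured in four samples.
toy_counts = {
    "geneA": [10, 7, 0, 5],  # >= 5 reads in 3 of 4 samples: kept
    "geneB": [5, 0, 0, 0],   # >= 5 reads in exactly 25% of samples: dropped
    "geneC": [0, 1, 2, 4],   # never reaches 5 reads: dropped
}
expressed = filter_expressed_genes(toy_counts)
```

Note that the strict inequality means a gene expressed in exactly 25% of samples is excluded.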


To assess the heterogeneity of tissue samples, we used the unmix function in DESeq263 to estimate the percent composition of whole blood, Epstein-Barr virus (EBV)-transformed lymphocytes, skeletal muscle, and subcutaneous adipose, using Genotype-Tissue Expression (GTEx) v7 median TPM values per tissue as reference55. We examined the expression of adipose-specific genes in three sample sets: all samples (n = 550) and the subsets composed of samples with approximately > 50% adipose composition (n = 434) and > 75% adipose composition (n = 387).

eQTL mapping

To adjust for the known and unknown technical factors that influence gene expression estimates, we inverse normal transformed gene-level TPMs, applied probabilistic estimation of expression residuals (PEER)92, and inverse normal transformed the residuals again, because both PEER and eQTL detection assume normally distributed data. To optimize for cis-eQTL discovery, we performed PEER analysis including differing numbers of factors (k = 10 to 90) at intervals of 10. We next performed cis-eQTL detection using FastQTL (v2.184)93 on all variants within 1 Mb of the TSS of each gene and assessed the total number of cis-eQTLs and the number of genes associated with ≥ 1 variant at each number of PEER factors. We selected 60 PEER factors to maximize the number of cis-eQTLs identified. We next implemented FastQTL permutation testing to estimate adjusted P-values and, to account for genome-wide eQTL testing of variants within 1 Mb of each transcript, determined the significance threshold as the P-value corresponding to 1% FDR using the qvalue package in R (P < 9.6 × 10⁻⁶).
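As an illustration of the inverse normal transformation step, the following is a minimal sketch of a rank-based transform. The function name, the quantile offset of 0.5, and the average-rank handling of ties are assumptions made for this example; the text does not specify which variant of the transform was used.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Rank-based inverse normal transform.

    Each value is replaced by the standard normal quantile of
    (rank - offset) / n. Tied values receive their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]


# Hypothetical TPM values for one gene across five samples.
tpm = [0.1, 250.0, 3.2, 17.5, 3.2]
transformed = inverse_normal_transform(tpm)
```

The transform preserves the ordering of samples while forcing the values onto a standard normal scale, which is why it can be applied both before and after residualization.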

To identify conditionally distinct cis-eQTLs, we first identified all genes with ≥ 1 significantly associated variant at FDR < 1% and the lead variant for each of these genes. We then included the dosage values of the lead variant for each gene as a covariate for eQTL mapping using FastQTL. Conditional secondary eQTLs were considered significant at the P-value threshold corresponding to FDR < 1% for primary cis-eQTLs (P < 9.6 × 10⁻⁶).

To identify trans-eQTLs, we removed 3,182 pseudogenes94 and genes with low mappability, as these have been observed to increase false positives in distal eQTL analyses36. We used RSeQC95 to calculate transcript integrity number (TIN)96, the distribution of deletions across reads, and the distribution of inserted nucleotides across reads from BAM files. We inverse normal transformed the gene expression for 18,553 remaining genes and adjusted for TIN, batch, age, insert size, deletion distribution, and either 0 or 3 PEER factors prior to analysis. To avoid collider bias, we used limited PEER factors in trans-eQTL mapping. Using QTLtools (v1.1)32, we performed association tests for variants > 1 Mb away from the transcription start site (TSS); the significance threshold for identifying trans-eQTLs was calculated using the Bonferroni correction (P < 3.4 × 10⁻¹³).
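The order of magnitude of the Bonferroni threshold can be checked with a short calculation. This is a rough sketch: the family-wise error rate of 0.05 is an assumption, the variant count is the rounded figure of 7.8 million, and the exclusion of variants within 1 Mb of each TSS is ignored, so the result only approximates the reported threshold.

```python
n_variants = 7.8e6   # approximate number of variants tested (rounded in the text)
n_genes = 18_553     # genes retained for trans-eQTL analysis
alpha = 0.05         # assumed family-wise error rate (not stated in the text)

# Upper bound on the number of tests: every variant against every gene.
n_tests = n_variants * n_genes
bonferroni_threshold = alpha / n_tests
print(f"{bonferroni_threshold:.2e}")
```

Under these assumptions the threshold falls between 3.4 × 10⁻¹³ and 3.5 × 10⁻¹³, consistent with the value used above.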

eQTL validation

For validation, we compared eQTLs with the significant eQTLs from the GTEx Project V7 release for both subcutaneous (n = 385) and visceral adipose tissue (n = 313). For variant-gene pairs available in both files, we matched the effect alleles to compare the direction of effect. To ensure that groups of variants associated with the same gene did not inflate replication estimates, we limited this analysis to the lead variants for each gene in either METSIM or GTEx. We used the lead variant for each gene from one study and assessed whether that variant was significant for the same gene in the other study. We then counted the number of variant-gene pairs that showed a consistent direction of effect and met several different P-value thresholds between 5 × 10⁻³ and 5 × 10⁻⁸ in both studies.
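The replication count described above can be sketched as follows. The record layout, field names, and toy values are hypothetical; the sketch assumes effect sizes in the two studies have already been aligned to the same effect allele.

```python
def count_replicated(pairs, p_threshold):
    """Count lead variant-gene pairs that pass the P-value threshold in both
    studies and show the same direction of effect.

    pairs: list of dicts with keys beta_a, p_a (study A) and beta_b, p_b
    (study B), with betas aligned to the same effect allele.
    """
    n = 0
    for pair in pairs:
        same_direction = pair["beta_a"] * pair["beta_b"] > 0
        both_significant = pair["p_a"] < p_threshold and pair["p_b"] < p_threshold
        if same_direction and both_significant:
            n += 1
    return n


# Hypothetical lead variant-gene pairs.
toy_pairs = [
    {"beta_a": 0.6, "p_a": 1e-20, "beta_b": 0.4, "p_b": 1e-9},    # replicated
    {"beta_a": 0.5, "p_a": 1e-12, "beta_b": -0.3, "p_b": 1e-7},   # opposite direction
    {"beta_a": -0.2, "p_a": 1e-6, "beta_b": -0.1, "p_b": 2e-3},   # weak in study B
]
n_strict = count_replicated(toy_pairs, 5e-8)
n_relaxed = count_replicated(toy_pairs, 5e-3)
```

Repeating the count over a grid of thresholds shows how replication depends on the stringency applied in both studies.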

eQTL enrichment in adipose tissue open chromatin and Roadmap adipose nuclei chromatin states

We compiled a set of adipose tissue transcriptional regulatory elements from tissue-derived adipose nuclei chromatin states from the Roadmap Epigenomics Consortium97 and METSIM adipose tissue accessible chromatin regions98 generated using an Assay for Transposase-Accessible Chromatin (ATAC-seq)99. For Roadmap chromatin states, we used the 18 state chromatin model and selected promoter states (1_TssA, 2_TssFlnk, 3_TssFlnkU, 4_TssFlnkD, 14_TssBiv) and enhancer states (7_EnhG1, 8_EnhG2, 9_EnhA1, 10_EnhA2, 11_EnhWk, 15_EnhBiv). We defined adipose tissue accessible chromatin regions as the union of ATAC-seq peaks across three METSIM subcutaneous adipose samples98. We also tested for enrichment among the subsets of ATAC-seq peaks that overlapped Roadmap adipose nuclei promoters and enhancers97, requiring that a peak be completely contained within the epigenomic region. We pruned eQTL variants based on linkage disequilibrium (LD, r² > 0.2) using swiss to prevent variants in nominal LD with each other from inflating enrichment estimates. We selected background variants using GREGOR100; variants were matched on allele frequency, distance to gene, and LD (r² > 0.8, 1000 Genomes phase 1, in a 1 Mb window). Background variants were then pruned with swiss using the same parameters as for significant eQTLs. We removed background variants that were themselves significant eQTLs at FDR < 1% (primary or secondary signals) or that were in high LD (r² > 0.8) with an eQTL lead variant (primary or secondary signals). We tested for enrichment using logistic regression, regressing the eQTL status of a variant (1 = significant eQTL, 0 = not eQTL) against whether that variant overlapped a given genomic annotation (1 = overlapped, 0 = not overlapped). An eQTL or background variant was considered to overlap a genomic annotation if it or any of its LD proxies (r² > 0.8, 1000 Genomes phase 1) overlapped the annotation. Enrichment was defined as the beta of the regression model, which is the log of the odds ratio that a variant is an eQTL if it overlaps the given genomic annotation.
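For a logistic regression with an intercept and a single binary predictor, the fitted slope equals the log odds ratio of the corresponding 2×2 table, which gives a simple way to see what the enrichment beta measures. The sketch below uses hypothetical counts and a hypothetical function name; it is not the enrichment code used in the analysis.

```python
import math


def log_odds_ratio(eqtl_overlap, eqtl_no_overlap, bg_overlap, bg_no_overlap):
    """Log odds ratio from a 2x2 table of eQTL status by annotation overlap.

    With one binary predictor, this equals the slope (beta) of a logistic
    regression of eQTL status on overlap status.
    """
    odds_ratio = (eqtl_overlap * bg_no_overlap) / (eqtl_no_overlap * bg_overlap)
    return math.log(odds_ratio)


# Hypothetical counts: 300 of 1,000 eQTL variants and 100 of 1,000 background
# variants overlap the annotation.
beta = log_odds_ratio(300, 700, 100, 900)
```

A beta of 0 indicates no enrichment, and positive values indicate that eQTL variants overlap the annotation more often than matched background variants.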

To compare the effect sizes of eQTL signals that overlap various genomic annotations, we used the lead eQTL variant and all variants in high LD (r² > 0.8) to identify overlaps with promoters, enhancers, and ATAC-seq peaks, stratified by primary and secondary eQTLs. We then used the effect size of the lead variant of the overlapping eQTLs to test for differences in the strength of effect among different annotations using a Wilcoxon rank sum test.

Selection of GWAS loci and GWAS-eQTL colocalization

We downloaded the NHGRI-EBI GWAS catalog101 in December 2017 and extracted 4,588 GWAS variants associated with one or more of 93 cardiometabolic traits (Table 2.8) at genome-wide significance (P < 5 × 10⁻⁸) and then added results from five additional large GWAS studies for cardiometabolic traits12–15,102. To ensure that GWAS variants representing the same signal did not inflate colocalization estimates, we performed LD pruning using swiss (see URLs) across all traits; variants were ranked by P-value, and variants in LD r² ≥ 0.8 in METSIM with the lead variant at each locus were removed. We defined the GWAS risk allele as the allele associated with a disease state or with the higher quantitative trait level, except for adiponectin and high-density lipoprotein (HDL) cholesterol levels, for which we defined the risk allele as the allele associated with a lower trait level, as lower levels indicate poorer health outcomes.
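The pruning step can be pictured as a greedy procedure: variants are visited in order of increasing P-value, and a variant is kept only if it is not in high LD with a variant already kept. The following is a minimal sketch of that idea, not the swiss implementation; the function name, data structures, and variant IDs are hypothetical.

```python
def ld_prune(variants, r2, threshold=0.8):
    """Greedy LD pruning.

    variants: list of (variant_id, p_value) tuples.
    r2: dict mapping frozenset({id1, id2}) to pairwise LD r^2; pairs that are
        absent are treated as r^2 = 0.
    Returns the IDs of retained lead variants, most significant first.
    """
    kept = []
    for vid, _p in sorted(variants, key=lambda v: v[1]):
        in_ld = any(
            r2.get(frozenset((vid, lead)), 0.0) >= threshold for lead in kept
        )
        if not in_ld:
            kept.append(vid)
    return kept


# Hypothetical variants and LD values.
toy_variants = [("rsA", 1e-30), ("rsB", 1e-12), ("rsC", 1e-9)]
toy_r2 = {
    frozenset(("rsA", "rsB")): 0.95,  # rsB tags the same signal as rsA
    frozenset(("rsA", "rsC")): 0.10,  # rsC is a distinct signal
}
lead_variants = ld_prune(toy_variants, toy_r2)
```

In this toy example, rsB is removed because it is in high LD with the more significant rsA, while rsC is retained as a separate signal.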


We performed initial colocalization analysis based on LD between a lead GWAS variant and a lead eQTL variant. We then performed conditional analysis in the eQTL data by providing genotypes for the lead GWAS variant to FastQTL as a covariate. We considered signals to be colocalized if: (i) the pairwise LD was high between the GWAS variant and eQTL variants (r² ≥ 0.8 in METSIM), and (ii) after conditioning on the GWAS variant, the lead eQTL variant no longer met the 1% FDR equivalent P-value (P > 9.6 × 10⁻⁶).
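The two colocalization criteria can be written as a small decision rule. The function and constant names are hypothetical; the thresholds are those stated above. The first toy call uses the LD and conditional P-value reported in this chapter for the SULT1A2 eQTL in Finns, and the second uses made-up values.

```python
FDR_1PCT_P = 9.6e-6  # P-value corresponding to 1% FDR for cis-eQTLs


def is_colocalized(ld_r2, p_conditional, r2_min=0.8, p_threshold=FDR_1PCT_P):
    """Apply the LD and conditional-analysis criteria.

    ld_r2: pairwise LD r^2 between the lead GWAS and lead eQTL variants.
    p_conditional: eQTL P-value of the lead eQTL variant after conditioning
        on the lead GWAS variant.
    """
    high_ld = ld_r2 >= r2_min
    signal_removed = p_conditional > p_threshold
    return high_ld and signal_removed


# r^2 = 0.93 and conditional P = 1.61e-3: both criteria met.
coloc_a = is_colocalized(0.93, 1.61e-3)
# Hypothetical case: low LD and a signal that survives conditioning.
coloc_b = is_colocalized(0.26, 1e-20)
```

Both conditions must hold, so high LD alone is not sufficient if the eQTL signal remains significant after conditioning on the GWAS variant.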

For eQTLs that appeared colocalized with GWAS signals based on LD, we applied coloc224 to further assess evidence of colocalization. Coloc2 is a Bayesian method that uses summary statistics to test whether variants underlying a signal are shared between two studies. For each gene, we included all variants located within 1 Mb of the TSS for which summary statistics were available in both the GWAS and eQTL studies. Coloc2 estimates the posterior probabilities of five hypotheses concurrently (H0, no association signal in either the GWAS or eQTL; H1, only the GWAS has an association signal; H2, only the eQTL has an association signal; H3, both datasets have an association signal, but they are not the same; H4, the GWAS and eQTL association signals are colocalized). To make our results more comparable with prior colocalization studies, we set prior probabilities for coloc2 to values comparable to coloc’s66 default settings, rather than allowing the software to generate its own priors. We used coloc’s default priors for p1 and p2, the prior probabilities that a variant is causal for either GWAS or eQTL, respectively (p1 and p2 = 1 × 10⁻⁴). We set p12, the prior probability that a variant is causal to both GWAS and eQTL, to 1 × 10⁻⁶. We selected this more conservative value than coloc’s default p12 prior (1 × 10⁻⁵) because we used coloc2 to provide additional support for colocalizations identified by LD and conditional analysis. We evaluated the posterior probabilities that correspond to H3 and H4, that both the GWAS and the eQTL have signals, but that they are not the same (PP3) and that the GWAS and eQTL signals colocalize (PP4), respectively. We considered the signals to have strong evidence of colocalization if PP4 > 0.8.
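To show how the priors p1, p2, and p12 enter the five hypotheses, the following is a minimal sketch of the posterior calculation used by coloc-style methods, assuming at most one causal variant per trait and taking per-variant Bayes factors as given. It is not the coloc2 source code; the function name and toy Bayes factors are hypothetical, and practical implementations work with log Bayes factors for numerical stability.

```python
def coloc_posteriors(bf_gwas, bf_eqtl, p1=1e-4, p2=1e-4, p12=1e-6):
    """Posterior probabilities of H0-H4 from per-variant Bayes factors.

    bf_gwas, bf_eqtl: Bayes factors (not logs) for the same ordered variants.
    Returns [PP0, PP1, PP2, PP3, PP4].
    """
    sum_gwas = sum(bf_gwas)
    sum_eqtl = sum(bf_eqtl)
    sum_both = sum(g * e for g, e in zip(bf_gwas, bf_eqtl))
    weights = [
        1.0,                                         # H0: no signal
        p1 * sum_gwas,                               # H1: GWAS signal only
        p2 * sum_eqtl,                               # H2: eQTL signal only
        p1 * p2 * (sum_gwas * sum_eqtl - sum_both),  # H3: two distinct variants
        p12 * sum_both,                              # H4: one shared variant
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-variant region with strong support at the same variant.
shared = coloc_posteriors([1.0, 1e5, 1.0], [1.0, 1e5, 1.0])
# Hypothetical region with strong support at different variants.
distinct = coloc_posteriors([1e5, 1.0, 1.0], [1.0, 1.0, 1e5])
```

In the first toy region PP4 dominates, while in the second PP3 dominates. Because H4 is weighted by p12, lowering p12 from 1 × 10⁻⁵ to 1 × 10⁻⁶ makes the test more conservative for the same data.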

Trait-gene expression association

For 287 genes that showed evidence of GWAS signal colocalization with a primary eQTL signal based on LD and conditional analysis, we tested for association between gene expression level and each of the 20 cardiometabolic traits measured in the 434 individuals (for a list of traits, see Tables 2.15 and 2.16). We adjusted traits for age and adjusted gene expression levels for TIN, sequencing batch, and age. Following inverse normal transformation of both trait and expression level residuals, we used linear regression to test for association between the gene expression levels and traits. Because adipose expression level of a large number of genes is associated with BMI, we also performed these analyses additionally adjusting both gene expression levels and traits for BMI. The P-value corresponding to 5% FDR was calculated using the qvalue package in R.

Mediation analysis

Mediation analyses were conducted in 434 METSIM individuals using a procedure described by Imai et al.103, similar to that of Huang et al.104, implemented in R and C++, assuming an additive effect and modified for a continuous outcome. We performed mediation analyses only when the eQTL and GWAS data were colocalized and trait data for the GWAS trait itself or a similar trait was available for METSIM individuals, specifically testing fasting glucose, fasting insulin, Matsuda index, and HOMA-β for T2D. We determined the effect of a variant on a trait through the mediator of gene expression, the effect of the variant on the trait when the mediator is held constant, and the combined effect, which is the sum of the mediation and variant-trait effects. This method uses two least squares regression models, where the first model uses the variant genotype, coded as 0, 1, or 2, as the predictor and the mediator of gene expression as the response, and the second model uses both gene expression level and variant genotype as predictors and the trait outcome as the response. The mediation effect was estimated by the product of coefficients from the two linear regression models. We assumed that the individual effects were linear and that there is not an interactive effect between the variant and mediator on the outcome. Confidence intervals for the mediating effect were calculated using Monte Carlo simulations. We set α = 0.05 and calculated 95% confidence intervals corresponding to a possible range for the estimates of the effect. Confidence intervals that did not include 0 were considered evidence of a mediating effect.
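The product-of-coefficients estimate and its Monte Carlo confidence interval can be sketched as follows. This is a simplified illustration, not the R and C++ implementation used in the analysis: it takes the two regression coefficients and their standard errors as inputs, assumes their sampling distributions are independent and normal, and uses hypothetical names and toy values throughout.

```python
import random


def monte_carlo_mediation_ci(a, se_a, b, se_b, n_draws=10_000, alpha=0.05, seed=1):
    """Monte Carlo confidence interval for a mediation effect a * b.

    a, se_a: coefficient and standard error for variant -> expression.
    b, se_b: coefficient and standard error for expression -> trait,
        adjusted for the variant.
    Returns (point_estimate, lower, upper).
    """
    rng = random.Random(seed)
    draws = sorted(
        rng.gauss(a, se_a) * rng.gauss(b, se_b) for _ in range(n_draws)
    )
    lower = draws[int((alpha / 2) * n_draws)]
    upper = draws[int((1 - alpha / 2) * n_draws) - 1]
    return a * b, lower, upper


# Hypothetical coefficients: a clear eQTL effect and expression-trait effect.
effect, ci_low, ci_high = monte_carlo_mediation_ci(0.5, 0.05, 0.4, 0.05)
# Hypothetical null case: the variant has no effect on expression.
null_effect, null_low, null_high = monte_carlo_mediation_ci(0.0, 0.1, 0.4, 0.05)
```

In the first toy case the interval excludes 0, which would be read as evidence of mediation; in the null case the interval spans 0.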


Results

Analysis of tissue heterogeneity

To identify cis-eQTLs for subcutaneous adipose tissue, we performed RNA-seq on needle biopsy samples obtained from 550 Finnish men in the METSIM study (Table 2.1). Adipose tissue is composed of adipocytes, pre-adipocytes, endothelial cells, and various immune cells58,59, and needle biopsies of adipose tissue can include whole blood and/or muscle. The resulting heterogeneity between samples can confound the analysis of bulk tissue transcriptomics105–107, but its effects can be mitigated by tissue deconvolution methods. We estimated tissue composition using tissue from GTEx as reference tissue profiles (Figure 2.1). We found wide variation in the estimated percentage of adipose tissue within our samples. To determine whether to limit the analysis to higher purity samples or to retain the larger sample size including lower quality samples, we compared eQTL results from the set of all 550 samples to subsets of samples with estimated adipose tissue percentage of approximately >50% (n=434) and >75% (n=387). For all three sample sets, we performed cis-eQTL analyses with a range of PEER factor corrections using 7.8M variants and 21,735 genes. We then compared the strength of association at three previously described adipose eQTL loci52,80 and counted the total number of variant-gene associations observed (Tables 2.2, 2.3). The known eQTLs showed the strongest associations within the >50% adipose subset (Table 2.3). The >50% adipose subset also had the most significant cis-eQTL variants (1.6M vs. 1.5M and 1.4M for the >50%, all, and >75% sample sets, respectively) and the most variant-gene pairs (3.0M vs. 2.8M and 2.5M, respectively; Table 2.2). These results suggest that the heterogeneity between samples in the full set of 550 samples and the smaller sample size of the 387 samples with >75% adipose attenuated the association signals and that requiring samples to contain >50% adipose tissue (n=434) yielded a set of cis-eQTL results that best correspond to adipose tissue; thus, we performed all subsequent analyses using the subset of 434 samples.

eQTL identification and characterization

We performed genome-wide cis-eQTL analyses using ~7.8 million genetic variants (MAF > 0.01) from the 434 samples with >50% adipose tissue and 21,735 genes. Of these, 9,687 genes were associated with at least one variant located within 1 Mb of the TSS (FDR < 1%, P < 9.6 × 10⁻⁶). The first and second PEER factors correlated with TIN96 and estimated percent adipose tissue, respectively (Figure 2.2).

To examine the validity of our results, we compared the METSIM subcutaneous adipose tissue eQTL results to those from the GTEx project’s subcutaneous (n=385) and visceral (n=313) adipose eQTL analyses. Overall, GTEx detected fewer cis-eQTLs (FDR<5%) than METSIM (Table 2.4). At a significance threshold of P < 5 × 10⁻⁵, which meets the FDR <5% threshold in both studies, 4,442 (95.4%) and 3,370 (96.4%) of the lead eQTL variants identified in METSIM also showed significant associations with the same gene and in a consistent direction of effect in the GTEx subcutaneous and visceral adipose data, respectively. Combined, these results demonstrated consistency of the METSIM subcutaneous adipose tissue eQTLs with GTEx eQTLs.

Secondary association signals are frequently identified at eQTL loci and may reflect more complex gene regulation23,24. Of 9,687 genes that exhibited primary eQTL signals, 2,785 genes (28.7%) showed a significant secondary cis-eQTL (FDR < 1%, P < 9.6 × 10⁻⁶). The proportions of transcript types with primary and secondary cis-eQTLs were not significantly different from each other (χ² P = 0.42, Figure 2.3A). Secondary eQTLs have been reported to be located farther away from the TSS than primary eQTLs23,24. We also observed this pattern; distances to the TSS were significantly greater for secondary eQTLs (median=40.3 kb) than for primary eQTLs (median=26.7 kb, Mann-Whitney U test, P < 2.2 × 10⁻¹⁶; Figure 2.3B). The identification of secondary eQTL signals in subcutaneous adipose tissue greatly expanded the set of cis-eQTL signals available for further investigation.

To characterize the potential function of the primary and secondary eQTL signals, we evaluated the enrichment and effect sizes of lead variants and their LD proxies in adipose nuclei chromatin states (promoters, enhancers) defined by the Roadmap Epigenomics Project97 and in METSIM adipose tissue open chromatin as measured by ATAC-seq98. Lead variants of primary eQTL signals were more enriched in open chromatin, promoters, and enhancers than matched control variants, especially in open chromatin regions within promoters (P < 2 × 10⁻³⁰⁸; β = 2.83). Secondary eQTL variants showed a similar but weaker pattern of enrichment in all categories, even for a matched number of eQTL genes (Table 2.5; Figure 2.3C). Further, effect sizes for primary eQTL variants were significantly larger than those of secondary eQTL variants found within promoters and enhancers (Wilcoxon rank sum test, P = 4.3 × 10⁻¹³¹; Figure 2.3D). Including both primary and secondary eQTL variants, effect sizes of variants within promoters were significantly larger than variants within enhancers (P = 8.1 × 10⁻¹²; Figure 2.4), and effect sizes of variants within open chromatin in promoters were significantly larger than promoter variants generally (P = 3.2 × 10⁻⁵). We observed no significant difference in the effect sizes between variants within open chromatin in enhancers and all enhancer variants. These results indicate that eQTL variants within promoters, particularly those in open chromatin, have larger effects on expression levels of nearby associated genes than those variants in enhancers.

To identify variants that influence the expression of distal genes and contribute to regulatory networks in adipose tissue, we performed a trans-eQTL analysis using 7.8 M variants (MAF > 0.01) and 18,553 protein-coding and lncRNA transcripts, requiring variants to be > 1 Mb away from the TSS. Accounting for the number of tests performed (Bonferroni, P < 3.4 × 10⁻¹³), we identified 4,432 target genes, and at a relaxed significance threshold (P < 5 × 10⁻⁸), we identified 13,953 target genes, representing 24% and 75% of genes tested, respectively (Table 2.6). At the known adipose-specific trans-eQTL hub near KLF1452,57, which is also a known GWAS locus for HDL-cholesterol108 and type 2 diabetes (T2D)109, we identified rs4731702 as the lead KLF14 cis-eQTL variant and validated (P < 5 × 10⁻⁸) the distal association of this variant with expression level of one known target gene, PRMT2, and two additional target genes, SNX14 and RBBP7, not previously reported52,57 (Table 2.7). This male-only METSIM study detected fewer trans-eQTL associations for KLF14 than the larger female-only TwinsUK study57, consistent with power and the reported stronger effects in females at this locus110.

Colocalization of eQTLs and cardiometabolic GWAS loci

While thousands of cardiometabolic GWAS loci have been identified, many of the genes underlying these associations remain unknown. We evaluated cis-eQTLs for evidence of colocalization with 2,843 GWAS signals (variants clumped by LD r² ≥ 0.8) associated with one or more of 93 cardiometabolic traits and available in our cis-eQTL data (Table 2.8). Of the 2,843 GWAS signals, 874 (30.7%) were significantly associated with at least one nearby (<1 Mb) gene at FDR < 1% (Table 2.9). Using LD and conditional analysis (see Methods), we identified colocalizations of 231 GWAS variants with expression of 287 genes, described further below as GWAS-colocalized primary eQTLs.


Colocalized GWAS and eQTL signals suggest potential candidate genes for cardiometabolic risk. Among the 287 primary eQTL genes colocalized with a GWAS trait, eQTLs for 116 genes were identified at BMI loci, 62 genes for lipid traits, 57 genes for WHRadjBMI, 37 genes for T2D, and 10 genes for cardiovascular disease (Table 2.10). These colocalizations were based on differing numbers of GWAS signals per trait; to identify the proportion of GWAS signals with colocalized eQTLs for each trait, we compared the total number of signals reported by five recent GWAS publications for BMI, lipids, WHR, WHRadjBMI, and T2D, which reported between 79 and 851 conditionally distinct GWAS signals per trait, to the number of signals with a colocalized eQTL signal from those same studies. Across traits from five studies12–15,102, the proportion of signals found to be colocalized ranged from 8% for T2D and BMI to 12% for WHRadjBMI, 14% for all lipids, and 15% for WHR (Table 2.11). These results suggest that adipose tissue gene expression levels are relevant to a broad set of cardiometabolic and other GWAS traits, potentially reflecting the cell types present in adipose tissue.

Genes may influence disease risk by downregulating or upregulating expression level or function111. To assess the direction of effect of GWAS signals on gene expression level, we matched the GWAS risk alleles to the gene expression effect alleles (see Methods). Across all primary colocalized loci, the GWAS trait risk allele corresponded to decreased gene expression for 53% of transcripts (Table 2.10; Figure 2.5; Figure 2.6), with a range from 44% for LDL loci to 65% for WHR loci. These results show that for common complex traits, disease risk is associated with decreased gene expression level for about half of all colocalized signals.

We also evaluated secondary eQTL signals for colocalization with the initial 2,843 GWAS signals. Of 154 lead GWAS variants associated (FDR < 1%) with a secondary eQTL signal for at least one nearby gene (Table 2.12), 31 eQTL signals for 31 transcripts were colocalized at GWAS loci. Of these, 10 (32.3%) variants showed evidence of colocalization with at least one primary eQTL gene and with a different secondary eQTL gene. The other 21 variants (67.7%) did not demonstrate any evidence of colocalization with a primary eQTL signal, such that no gene would have been identified using primary eQTLs alone. We did not observe the same gene to be colocalized with a GWAS variant for both a primary and secondary eQTL. For example, the lead variant rs11724804 from a WHR and WHRadjBMI locus was not colocalized with the primary eQTL signal for DGKQ via conditional analysis (rs11731377; LD r² = 0.26) but was colocalized with the secondary eQTL signal for DGKQ, rs13101828, based on LD and conditional analysis (r² = 1.0; Figure 2.7). In hepatic cells, silencing DGKQ decreased the ability of a synthetic FXR ligand to promote phosphorylation of mTOR, Akt, and FoxO1112, and misregulation of the mTOR pathway has been shown to result in peripheral insulin resistance and to promote adipogenesis in adipose tissue113. These results suggest that colocalization of secondary eQTLs can identify substantially more GWAS loci for which at least one potential candidate gene can provide insights into GWAS biology than primary eQTL colocalizations alone.

For a subset of the 287 GWAS-colocalized primary eQTLs for which GWAS summary statistics were available (n=173; BMI, WHR, WHRadjBMI, and T2D), we further investigated colocalization of primary eQTL signals by applying coloc2, a Bayesian colocalization test that uses summary statistics24,66 (Table 2.13). Of 173 genes tested, 81 showed strong or moderate evidence of GWAS-eQTL colocalization (coloc2 posterior probability H4 > 0.8 and H4 > 0.5, respectively). An additional 82 genes showed strong or moderate posterior probabilities that did not support colocalized signals (coloc2 H3 > 0.8 and H3 > 0.5, respectively), which was especially unexpected for nine signals with the same lead variant in both the GWAS and eQTL data and 31 signals with very strong pairwise LD (r² > 0.95) between lead GWAS and eQTL variants. A comparison of GWAS and eQTL association plots (e.g. Figure 2.8; Table 2.13) showed the GWAS and eQTL signals are very similar and suggests that coloc2 is sensitive to additional nearby GWAS and eQTL signals.

Cross-ancestry colocalization analysis

Colocalization methods rely on examining association patterns in GWAS and eQTL studies for similarity; association patterns can differ due to the extent of LD in a region, which can vary by ancestry. We further examined LD patterns in cis-eQTLs discovered in Finns for a subset of the BMI GWAS loci that were discovered in individuals from Japan (Table 2.9, Table 2.14, Figure 2.9). We evaluated the LD r² values between the GWAS variant and eSNP in both the Finnish eQTL population and in East Asian 1000G data. Of the six loci (corresponding to nine eQTL transcripts) that met the GWAS variant-eSNP colocalization LD threshold (r² > 0.8) in Finns, lead variants at four loci had similar pairwise LD in East Asians and two had lower LD in East Asians (Table 2.14).
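To make the pairwise LD comparison concrete, the sketch below computes r² as the squared Pearson correlation of genotype dosages and then compares the value in two populations against the 0.8 threshold. This is an illustrative assumption about how r² could be computed from unphased dosages, not the pipeline used in this study; the function names and toy dosages are hypothetical, while the r² values 0.93 and 0.22 are those reported for the SULT1A2 locus.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two variants' genotype dosages (0, 1, 2)."""
    n = len(dosages_a)
    if n != len(dosages_b) or n < 2:
        raise ValueError("need two dosage vectors of equal length >= 2")
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic variant")
    return cov * cov / (var_a * var_b)


def compare_ld(r2_eqtl_population, r2_gwas_population, threshold=0.8):
    """Describe whether a GWAS variant-eSNP pair passes the LD threshold in each population."""
    in_eqtl = r2_eqtl_population > threshold
    in_gwas = r2_gwas_population > threshold
    if in_eqtl and in_gwas:
        return "high LD in both populations"
    if in_eqtl:
        return "high LD in the eQTL population only"
    if in_gwas:
        return "high LD in the GWAS population only"
    return "low LD in both populations"


# Toy dosages (hypothetical): two variants that are perfectly correlated.
print(ld_r2([0, 1, 2, 1], [0, 1, 2, 1]))
# r2 values reported in the text for the SULT1A2 locus (Finns, East Asians).
print(compare_ld(0.93, 0.22))
```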


As an example of similar LD patterns across populations, the lead BMI-associated variant near GON4L in Japanese individuals114, rs860295, was also the lead eSNP for DAP3 (Figure 2.9A). Similar to previous examples comparing GWAS and eQTL signals within one ancestry, the similar patterns of GWAS and eQTL association signals here suggest DAP3 as a candidate gene for the observed association with BMI. DAP3 has been shown to mediate interferon gamma-induced cell death115 and IFNG is involved in pro-inflammatory responses related to obesity116. Of note, a GWAS in Europeans102 also identified BMI-associated variants in this chromosomal region, although the European and Japanese GWAS variants exhibited low pairwise LD (r² = 0.32 in 1000 Genomes East Asians and r² = 0.35 in Finns) and the European signal was not colocalized with any genes (Table 2.9). While further characterization of the locus may identify additional candidate genes, the shared LD patterns support cross-ancestry conclusions of colocalized GWAS and eQTL signals.

Two GWAS-eQTL signals we considered to be colocalized showed high LD (r² ≥ 0.8) between the lead GWAS and eSNP in Finns but not in East Asians. The lead GWAS variant at the IL27/NUPR1 East Asian BMI locus, rs62034325, is a cis-eQTL for SULT1A2 (P = 1.03 × 10⁻²⁷; β = 0.67; Figure 2.9B). LD between the lead GWAS variant and the eSNP, rs28698667, is high in Finns (r² = 0.93) but low in East Asians (r² = 0.22). However, LD between the lead GWAS variant and other near-lead eQTL variants (e.g. rs62034322, P = 1.02 × 10⁻²⁷; β = 0.67) is high in both Finns (r² = 1.0) and East Asians (r² = 0.89), and conditional analyses in the eQTL data (Pcond = 1.61 × 10⁻³) showed that these signals are colocalized. The SULT1A2 eQTL signal in Finns consists of more variants than the BMI signal in Japanese, and the BMI-associated variants appear to be a subset of the eQTL variants. SULT1A2 catalyzes the sulfate conjugation of estrogens, estrogenic alkylphenols, and 17-β-estradiol, to facilitate their removal from the body117, and increased levels of 17-β-estradiol are associated with obesity118; in addition, SULT1A1 levels were altered in adipose tissue of rats on a high fat diet119. These results are consistent with the dependence of all colocalization methods on patterns of association that vary by ancestry and suggest that, regardless of GWAS study ancestry, LD-based methods to assess colocalization should focus on the eQTL study population.


Cardiometabolic trait association with expression level of eQTL-GWAS colocalized genes

To further investigate the 287 genes identified based on colocalized primary eQTL-GWAS signals, we tested for association of expression level with 20 cardiometabolic traits measured in the same 434 METSIM individuals. Because BMI is correlated with the expression level of a high percentage (51%) of all genes in our study, we performed gene-trait associations with and without adjustment for BMI (Tables 2.15 and 2.16). Without adjustment for BMI, expression levels of 154 genes were associated with at least one trait, and with adjustment for BMI, expression levels of 49 genes were associated with a trait. The difference between the analyses with and without adjustment for BMI may demonstrate the influence of this trait on adipose tissue gene expression levels. This difference also could be due to gene expression and cardiometabolic traits influencing BMI (collider bias). We recommend caution in interpreting association results when a gene’s expression level is associated with BMI (Table 2.15), as collider bias can alter the effect size and significance level of the association. The traits associated with expression level of the most genes were Matsuda index, WHR, and triglycerides. For example, at a WHRadjBMI locus, EIF4G2, also known as DAP5, was significantly associated with fasting free fatty acids (P = 3.0 × 10⁻⁵) and with Matsuda index, a measure of insulin sensitivity120 (P = 7.2 × 10⁻⁵) (Table 2.16), and EIF4G2 was not associated with BMI (Table 2.15), suggesting that this gene may affect these traits. EIF4G2 is thought to play a role in the interferon gamma-induced cell death pathway121. Association between adipose gene expression level and clinical traits provides an additional line of evidence that changes in expression of the colocalized genes may influence quantitative traits and disease risk.
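The effect of adjusting for BMI can be illustrated with a small regression sketch. It assumes a simplified model with BMI as the only covariate (the actual analysis used additional covariates described in Methods) and uses residualization: the coefficient of expression in a regression of trait on expression and BMI equals the slope obtained after removing the BMI-explained part from both variables. All function names and the toy data are hypothetical.

```python
def slope(y, x):
    """Ordinary least squares slope of y regressed on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return s_xy / s_xx


def residuals(y, x):
    """Residuals of y after simple linear regression on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b = slope(y, x)
    return [yi - mean_y - b * (xi - mean_x) for xi, yi in zip(x, y)]


def expression_trait_slopes(expression, trait, bmi):
    """Return (unadjusted, BMI-adjusted) slopes of trait on expression."""
    unadjusted = slope(trait, expression)
    adjusted = slope(residuals(trait, bmi), residuals(expression, bmi))
    return unadjusted, adjusted


# Toy data (hypothetical): the trait depends on BMI only, and expression tracks BMI.
bmi = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0]
expression = [b + d for b, d in zip(bmi, [1.0, -1.0, 1.0, -1.0, 1.0, -1.0])]
trait = [2.0 * b for b in bmi]
unadjusted, adjusted = expression_trait_slopes(expression, trait, bmi)
print(unadjusted, adjusted)
```

In this toy case the unadjusted slope is large while the adjusted slope is essentially zero, mirroring how many gene-trait associations disappear after adjustment for BMI. The sketch does not address collider bias, which depends on the causal direction among BMI, expression, and the trait.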

Mediation analysis

We next evaluated colocalized signals for evidence that gene expression mediates the effect of a GWAS variant on a cardiometabolic trait. We estimated the mediating effect of a variant on a trait level through gene expression level, adjusted for age, relevant covariates, and BMI (except when evaluating BMI as an outcome; see Methods for covariates). Of note, observing mediation depends on detecting effects of variants on traits in only 434 individuals. Among the colocalized loci, we only tested mediation paths at loci for which a trait comparable to the reported GWAS signal had been measured in the individuals with gene expression data; this choice resulted in testing the same variant and gene for multiple traits in some cases, leading to more potential mediations than colocalized signals. Of 348 potential mediations, 90 (25.9%) showed evidence of gene expression mediating the effect of an association signal on a trait (the 95% CI of a mediation effect, ME, does not include 0; Table 2.17). We focused on signals with nominal evidence of the combined effect of mediation and a direct effect of the GWAS variant on the trait (P < 0.05, Table 2.18). Of these, at least four loci have functions relevant to cardiometabolic traits. For example, at a well-characterized GWAS locus for adiponectin levels, adipose expression level of CDH13, which encodes a receptor for high molecular weight adiponectin multimers122, mediates the effect of rs12051272 on plasma adiponectin levels (ME = 0.34; CI [0.086, 0.63]). At another GWAS locus, MLXIPL, also known as ChREBP, mediates the variant effect on triglyceride levels (ME = -0.12; CI [-0.21, -0.40]). In adipose tissue, MLXIPL has been shown to regulate glucose homeostasis and fatty acid synthesis123. In addition, RSPO3 mediates the effect of rs72959041 on triglycerides (ME = -0.12; CI [-0.21, -0.043]), and RSPO3 activates the Wnt/β-catenin pathway which promotes angiogenesis124, suggesting a potential role in the expansion of adipose tissue. Finally, JAZF1 mediates the effect of rs1708302 on BMI (ME = 0.17; CI [0.059, 0.30]); in adipocytes, overexpression of JAZF1 inhibits lipid accumulation and regulates lipid metabolism125. In addition to these well-characterized loci, expression level of CATSPERZ, also known as TEX40, mediates an association with BMI (ME = -0.23; CI [-0.44, -0.065]), and expression level of ADH1A mediates an association with WHR (ME = -0.01; CI [-0.016, -0.005]). CATSPERZ encodes a calcium ion channel and ADH1A encodes a member of the alcohol dehydrogenase family; while neither gene has yet been directly linked to body size or fat distribution, these mediation results support the eQTL colocalization and suggest that these genes play a role in obesity.
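The criterion that a mediation effect's 95% CI excludes 0 can be illustrated as follows. This is a minimal sketch under stated assumptions, not the estimator described in Methods: it uses a product-of-coefficients mediation effect with a percentile bootstrap interval, omits all covariates (age, BMI, and others), and runs on simulated data. Every function name and parameter is hypothetical.

```python
import random


def _centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def mediation_effect(genotype, expression, trait):
    """Product-of-coefficients mediation effect a * b.

    a: slope of expression regressed on genotype.
    b: coefficient of expression in the regression trait ~ expression + genotype.
    """
    g, e, t = _centered(genotype), _centered(expression), _centered(trait)
    s_gg = sum(x * x for x in g)
    s_ee = sum(x * x for x in e)
    s_ge = sum(x * y for x, y in zip(g, e))
    s_gt = sum(x * y for x, y in zip(g, t))
    s_et = sum(x * y for x, y in zip(e, t))
    a = s_ge / s_gg
    # Solve the 2x2 normal equations for the expression coefficient.
    b = (s_et * s_gg - s_gt * s_ge) / (s_ee * s_gg - s_ge * s_ge)
    return a * b


def bootstrap_ci(genotype, expression, trait, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mediation effect."""
    rng = random.Random(seed)
    n = len(genotype)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(mediation_effect(
            [genotype[i] for i in idx],
            [expression[i] for i in idx],
            [trait[i] for i in idx],
        ))
    estimates.sort()
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Simulated data (hypothetical): genotype affects the trait only through expression.
sim = random.Random(1)
genotype = [sim.choice((0, 1, 2)) for _ in range(300)]
expression = [g + sim.gauss(0, 1) for g in genotype]
trait = [0.5 * e + sim.gauss(0, 1) for e in expression]

me = mediation_effect(genotype, expression, trait)
lower, upper = bootstrap_ci(genotype, expression, trait)
print(f"ME = {me:.2f}; 95% CI [{lower:.2f}, {upper:.2f}]; excludes 0: {lower > 0 or upper < 0}")
```

With only a few hundred individuals, the interval width is driven largely by the precision of the genotype-trait path, which is the power limitation noted above.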

Discussion

We identified primary and secondary cis-eQTLs in subcutaneous adipose from 434 Finnish males from the METSIM study and linked their gene expression levels to cardiometabolic GWAS signals and measures of 20 cardiometabolic traits. By colocalizing eQTL signals with 262 cardiometabolic trait GWAS signals, we identified 318 candidate genes, the highest proportion of which corresponded to traits with multiple biological pathways relevant to adipose tissue: WHR (15% of GWAS signals), lipids (14% of GWAS signals), and T2D (8% of GWAS signals). Additional integration of clinical trait information from the eQTL study participants through gene-trait association and mediation analysis provided further support of a role for these genes in cardiometabolic trait variation.

While GWAS have successfully identified thousands of genomic regions associated with cardiometabolic diseases and complex traits, the genes and mechanisms responsible for many of these loci remain unknown. Identifying candidate transcripts for GWAS loci is relatively straightforward when one of the lead GWAS variants is predicted to cause a loss of gene function, but identifying candidate genes with a testable link to function is more ambiguous when all candidate GWAS variants are noncoding. Colocalized eQTL signals provide one of several approaches to identify reasonable candidate genes126,127. The value of eQTLs derives from the low prior probability that GWAS variants will by chance also be functional variants associated with a gene’s expression level. For our main colocalization analysis, we assumed that candidate functional variants would be among those that show the strongest association with both a GWAS trait and gene expression level. Consistent with widespread evidence that expression levels of most genes are regulated by genetic variants128, 31% of lead GWAS variants in this study showed nominal evidence of association (FDR < 1%) with expression levels of at least one transcript. Further evaluation of signal colocalization using one or more methods65–69 is critical, because here only 1.3% of the associations showed the identical lead variant to be associated with both the GWAS and gene expression traits. Interpretation of colocalization can be challenging as comparison of signals is affected by sampling heterogeneity, the presence of multiple and/or differing GWAS and eQTL signals per locus, and differences in LD patterns between GWAS and eQTL study participants.

The use of eQTLs to detect candidate genes at GWAS loci is becoming more widespread, but colocalizations often only consider primary eQTL signals, which are most easily available via summary data. While primary eQTLs identified 90% of the candidate genes detected here, analysis of secondary eQTLs detected another 21 genes that would have been missed by summary level analysis. These secondary eQTLs may reflect parallel biology with primary eQTL signals or they may derive from a different underlying cell type present in bulk tissue samples. As eQTL study sample sizes increase and single-cell or in silico cell-type deconvolution technologies improve, studies will have more potential to detect multiple regulatory effects per gene.


Colocalization methods are imperfect and sensitive to the thresholds applied. In the LD and conditional analysis approach, we applied a threshold for LD between the GWAS variant and the eSNP as well as a threshold for evidence of association after conditional analysis. These thresholds are arbitrary and sensitive to the strength of the original eQTL signal, the use of lead variants subject to sampling variability, and a conditional analysis step that requires individual-level data. In comparison, coloc2 requires setting prior probabilities and thresholds for interpreting posterior probabilities and is sensitive to the presence of multiple signals in either the eQTL or the GWAS data. Further development of colocalization methods may improve the use of eQTLs for identifying candidate genes for GWAS loci.
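The threshold sensitivity of the LD and conditional analysis approach can be shown with a short sketch. It assumes a pair is called colocalized when LD between the lead GWAS variant and the eSNP exceeds an r² cutoff and the association is attenuated after conditioning (conditional P above a cutoff). The function name and the conditional P cutoffs of 0.01 and 0.1 are hypothetical and are not the thresholds used in this study; the r² and Pcond values are those reported for the EYA2 signal in Figure 2.8.

```python
def colocalized_by_ld_and_conditional(r2, p_cond, r2_min, p_cond_min):
    """Return True when LD is high and the signal is attenuated after conditioning.

    r2: LD between the lead GWAS variant and the lead eQTL variant (eSNP).
    p_cond: P-value of the lead variant after conditioning on the other lead variant.
    r2_min, p_cond_min: user-chosen cutoffs (both are assumptions in this sketch).
    """
    return r2 > r2_min and p_cond > p_cond_min


# Values reported for the EYA2 signal (Figure 2.8); the P cutoffs are hypothetical.
eya2_r2, eya2_p_cond = 0.94, 0.058
for p_cond_min in (0.01, 0.1):
    call = colocalized_by_ld_and_conditional(eya2_r2, eya2_p_cond, 0.8, p_cond_min)
    print(f"conditional P cutoff {p_cond_min}: colocalized = {call}")
```

The same pair is called colocalized under one cutoff and not under the other, which is why the choice of thresholds matters for borderline signals.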

We provide evidence for cross-ancestry GWAS-eQTL colocalizations and demonstrate the importance of evaluating colocalization using the LD structure of the eQTL study population. When the lead variants for the GWAS and eQTL studies are the same or in high LD in both populations, colocalization is straightforward. At these loci, conditional analysis showed evidence of colocalization. However, when the LD structure differs substantially between populations, closer comparison of the patterns of association in GWAS and eQTL data is required to ensure that the conclusion of colocalization is warranted. The results of colocalization across populations can be influenced especially when the pattern of association is broader due to LD in one population than in another and the selection of a different lead variant within the set of highly associated variants would result in high LD between the GWAS variant and eSNP in both ancestries. Although some loci can be compared cleanly across ancestries, as eQTL studies in more diverse populations become available, comparing GWAS and eQTL signals within ancestries will likely provide stronger evidence of colocalization.

To further investigate causal relationships between genetic variants and clinical traits, we used the extensive phenotypic trait information available for the METSIM eQTL study participants and identified 49 genes associated with cardiometabolic traits at colocalized GWAS loci. Although Matsuda index, WHR, and triglycerides showed the most associations with genes, these traits did not exhibit the most variation between individuals, suggesting that even small changes in traits can result in association with colocalized genes. Because BMI is associated with nearly half of all genes in our data, the direction of the relationship between BMI, related traits, and gene expression is hard to determine; expression level of a gene may affect BMI, or BMI may influence expression level of a gene and a trait independently. We further tested for effects of the variants mediated through gene expression on the GWAS trait or a related trait. The six highlighted mediations (Table 2) show not only an association among each variant, gene, and trait, but also a directional, and potentially causal, path. This mediation analysis method is limited by statistical power because it uses only the hundreds of individuals with expression data to detect an effect of the genotype on the trait, whereas thousands of samples were used to originally detect these associations in GWAS. To mitigate this limitation, we confined our analysis to established GWAS loci and required the combined effect of the genotype on the trait and the mediation to be significant. The results of both gene-trait associations and mediation analysis can be used to guide efforts in functional tests of genes for mechanistic effects on cardiometabolic traits.

The 318 candidate genes identified here correspond to GWAS loci across a broad range of cardiometabolic traits. These genes may act to influence obesity, diabetes, and cardiovascular traits and should be prioritized for future functional analysis. As larger subcutaneous adipose data sets become available in diverse ancestries, multiple eQTL signals will become easier to identify and colocalize with GWAS signals from across populations.

Acknowledgements

This work was made possible by the contributions of many other scientists. Arthur Ko resolved sample swapping in the original METSIM adipose tissue dataset. John C Kidd wrote and performed the mediation analysis portion of this work, under the direction of his advisor, Dan-Yu Lin. Kevin W Currin performed the enrichment analysis of eQTLs in promoters, enhancers, and METSIM open chromatin. Sarah M Brotman ran the trans-eQTL analyses presented here. Maren E Cannon performed the ATAC-sequencing for open chromatin in METSIM adipose tissue that was used in the enrichment analysis. Ying Wu provided code for portions of this work and gave advice. Other scientists provided advice to the project for eQTL detection and statistical rigor: Cassandra N Spracklen, Mete Civelek, Aldons J Lusis.

The METSIM study is run by Markku Laakso and Johanna Kuusisto, who are responsible for the tissue samples and the genotype collection in this study. The following people were involved in METSIM genotyping and imputation to the Haplotype Reference Consortium: Anne U Jackson, Heather M Stringham, Ryan P Welch, Christian Fuchsberger, Adam E Locke, Narisu Narisu, Francis S Collins, Michael Boehnke, Laura J Scott. Päivi Pajukanta provided partial funding for the adipose tissue RNA-sequencing and advice throughout the duration of the project. Finally, this work was performed with the advice and direction of Terrence S Furey and Karen L Mohlke.


Figure 2.1: Percentages of different tissues across 550 samples: Stacked bar plot of estimated tissue composition from the full set of 550 subcutaneous adipose tissue needle biopsy samples from the METSIM study. For each individual (x-axis), the estimated percentages of subcutaneous adipose, skeletal muscle, whole blood, and EBV-transformed lymphocytes identified using DESeq2 are shown.

Figure 2.2: Selection and identification of PEER factors: PEER factor curve (A) showing the number of PEER factors (x-axis) used to identify genes with a significant eQTL association (FDR <5%, y-axis). For panels B & C, samples (n=434) are plotted by their values for PEER factors 1 (x-axis) and 2 (y-axis). These plots are colored by median TIN value (B) and estimated percent of adipose tissue (C).


Figure 2.3: Comparison of primary and secondary eQTL association signals: (A) For each type of transcript (x-axis), the number of transcripts associated with at least one nearby variant that met the genome-wide FDR <1% is shown (y-axis). (B) Density plot showing the distance from the lead eQTL variant to the TSS. Dashed lines represent the median distance to the TSS for primary (purple) and secondary (green) eQTLs. (C) Enrichment of primary and secondary eQTLs in chromatin marks from Roadmap adipose nuclei and in open chromatin ATAC-seq peaks from adipose tissue. The horizontal lines represent the logistic regression coefficient for enrichment, with vertical lines representing the 95% confidence interval around the coefficient. (D) Absolute value of the effect sizes of eQTL variants that overlap Roadmap adipose nuclei promoters and enhancers. Black horizontal lines represent the median effect size for primary and secondary eQTL variants.


Figure 2.4: eQTL effect sizes for variants located in adipose tissue ATAC-seq peaks or adipose nuclei chromatin marks: Absolute value of the effect sizes for variants from combined primary and secondary eQTL signals that overlap promoter or enhancer regions from Roadmap adipose nuclei and ATAC-seq adipose tissue. Black lines represent the median effect sizes of primary and secondary eQTL variants within each chromatin mark.


Figure 2.5: Cardiometabolic GWAS and eQTL colocalizations, stratified by trait: Each point represents the lead GWAS variant of a signal colocalized by LD and conditional analysis with an eQTL for the named gene. Plots show eQTLs at GWAS loci for (A) T2D, (B) WHRadjBMI, and (C) lipids. The x-axis shows chromosomal positions of colocalized signals and the y-axis shows the -log10 P-values of the GWAS variant’s association with gene expression level in adipose tissue. After matching the effect allele from the eQTL study with the risk allele from the GWAS, risk alleles that are associated with increased gene expression level are shown in red and risk alleles that are associated with decreased gene expression level are shown in blue. Triangles indicate that more than one gene is colocalized with the GWAS variant (other genes would appear at the same x-axis position). Variants colocalized as secondary eQTLs are designated by a star after the gene name. Only the strongest associations for the named traits are shown; full results can be found in Supplemental Table 9.


Figure 2.6: Extended Cloud City plots of cardiometabolic GWAS and eQTL colocalizations, stratified by trait: Each point represents a lead GWAS variant associated with a nearby gene associated with various cardiometabolic traits and colocalized by LD and conditional analysis. The x-axis shows chromosomal positions of colocalized signals and the y-axis shows the -log10 P-values of the GWAS variant’s association with gene expression level in adipose tissue. After matching the effect allele from the eQTL study with the risk allele from the GWAS, risk alleles that are associated with increased gene expression level are shown in red and risk alleles that are associated with decreased gene expression level are shown in blue. Triangles indicate that more than one gene is colocalized with the GWAS variant (other genes would appear in columns). Variants colocalized as secondary eQTLs are designated by a star after the gene name. Variants associated with BMI are separated into two plots, stratified by increased expression (A) or decreased expression (B). Lipid traits were separated by HDL (C), LDL (D), Triglycerides (E), Total Cholesterol (F). Additionally, variants from WHR (G), CVD (H) and Adiponectin (I) were plotted individually.


Figure 2.7: A WHRadjBMI signal colocalizes with the secondary eQTL for DGKQ: Locus plots for the DGKQ locus, colored by two distinct signals present in the METSIM eQTL data. The lead GWAS variant associated with WHRadjBMI at this locus, rs11724804 (top), is associated with DGKQ expression, but is not in LD with the variant most strongly associated with DGKQ expression, rs11731377 (eSNP; middle). After conditional analysis, a second eQTL signal for DGKQ is apparent and is colocalized with the lead GWAS variant (bottom). Variants are colored by the strength of their LD with the lead variant for each signal (diamond), with darker colors indicating stronger LD.


Figure 2.8: Comparison of GWAS and eQTLs found not to be colocalized by coloc2: Locus plots showing the regional associations for GWAS (top) and gene expression level (bottom) results. (A) The EYA2 eQTL is colocalized with T2D by LD and reciprocal conditional analysis (LD r² = 0.94, Pcond = 0.058), but the posterior probability that they are colocalized is more marginal (PP4 = 0.67). There are two signals in both the GWAS and eQTL data, which are not captured well by coloc2, according to the assumptions of the software. (B) At the FTO locus, we identify a colocalization between the lead T2D GWAS variant and eSNP for AKTIP (LD r² = 0.98, Pcond = 0.065). Results from coloc2 suggest that these two signals are not colocalized (PP3 = 1.0). There are two independent signals in the GWAS data, one near FTO and another near AKTIP. In the eQTL data, there is only one signal for AKTIP, since the association with FTO gene expression was tested separately. This is likely the cause of the discrepancy.


Figure 2.9: Cross-ancestry comparison of colocalized GWAS-eQTL signals: Locus plots showing the regional associations for BMI in Japanese individuals, colored by East Asian LD (EAS) (top) and gene expression in Finns (bottom), colored by Finnish LD. (A) A DAP3 eQTL signal is colocalized with a BMI signal near GON4L. The lead variant is shared between the GWAS and eQTL studies and the patterns of association are similar. (B) A SULT1A2 eQTL signal is colocalized with a BMI signal near IL27/NUPR1. The LD between the lead eQTL and GWAS variants is low in East Asians (r² = 0.22), but high in Finns (r² = 0.93).


| Abbreviation | Metabolic trait | Mean ± standard deviation | Median (25th, 75th percentile) |
|---|---|---|---|
| Age | Age (years) | 54.6 ± 5.3 | 54.0 (51.0, 58.5) |
| BMI | Body mass index (kg/m²) | 26.6 ± 3.5 | 26.1 (24.4, 28.4) |
| WHR | Waist-to-hip ratio | 0.96 ± 0.06 | 0.96 (0.92, 1.00) |
| WC | Waist circumference (cm) | 96.6 ± 10.3 | 95.0 (90.0, 102.0) |
| HIP | Hip circumference (cm) | 100.8 ± 7.0 | 100.0 (96.0, 104.0) |
| FFMASS | Fat-free mass (%) | 78.8 ± 5.1 | 79.2 (76.3, 82.0) |
| FFA | Fasting plasma free fatty acid (mmol/l) | 0.35 ± 0.13 | 0.33 (0.25, 0.43) |
| TG | Total triglycerides (mmol/l) | 1.37 ± 0.93 | 1.16 (0.87, 1.59) |
| CHOL | Total cholesterol (mmol/l) | 5.50 ± 0.89 | 5.40 (4.88, 6.07) |
| LDL-C | LDL cholesterol (mmol/l) | 3.48 ± 0.77 | 3.42 (2.96, 3.96) |
| HDL-C | HDL cholesterol (mmol/l) | 1.50 ± 0.38 | 1.45 (1.21, 1.70) |
| Adiponectin | Plasma adiponectin (µg/ml) | 7.44 ± 3.55 | 6.70 (5.00, 9.00) |
| Matsuda | Matsuda composite insulin sensitivity index (mg/dl, mU/l) | 7.52 ± 3.97 | 6.81 (4.33, 9.98) |
| HOMA-B | Insulin resistance index based on homeostasis assessment | 66.8 ± 41.3 | 54.1 (40.7, 80.8) |
| Glucose | Fasting plasma glucose (mmol/l) | 5.71 ± 0.47 | 5.70 (5.40, 5.95) |
| Insulin | Fasting plasma insulin (mU/l) | 44.1 ± 29.3 | 34.8 (24.6, 54.6) |
| Proinsulin | Fasting plasma proinsulin (pmol/l) | 12.9 ± 6.38 | 11.5 (8.70, 15.2) |
| HbA1c | Blood glycated HbA1c (%) | 5.64 ± 0.31 | 5.60 (5.40, 5.80) |
| SBP | Systolic blood pressure (mm Hg) | 133.3 ± 14.7 | 132.0 (122.0, 140.7) |
| DBP | Diastolic blood pressure (mm Hg) | 86.9 ± 8.80 | 86.7 (80.7, 92.7) |
| CRP | High sensitivity C-reactive protein (mg/l) | 2.15 ± 5.20 | 0.99 (0.46, 2.18) |
| IL1RA | Interleukin-1 receptor antagonist (pg/ml) | 200.4 ± 145.9 | 167.0 (126.4, 225.5) |
| GFR | Glomerular filtration rate | 0.091 ± 0.01 | 0.089 (0.082, 0.098) |
| TotFA | Total fatty acids (mmol/l) | 14.4 ± 3.28 | 13.7 (12.1, 16.0) |
| T2D & IGT | Type 2 diabetes & Impaired glucose tolerant (count) | 32 | |
| NGT | Normal glucose tolerant (count) | 355 | |

Table 2.1: Characteristics of the METSIM study participants in 434 sample subset: This table contains the trait characteristics of 434 samples from the METSIM study. An extended version of this table is available in Raulerson CK et al. (AJHG 2019).

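| Sample subset | Number of PEER factors | # variant-gene pairs | # variants | # genes |
|---|---|---|---|---|
| n=550, >0% adipose (P < 6.07e-4) | 20 | 2.55M | 1.40M | 15,999 |
| n=550, >0% adipose (P < 6.07e-4) | 40 | 2.74M | 1.48M | 16,286 |
| n=550, >0% adipose (P < 6.07e-4) | 60 | 2.82M | 1.51M | 16,356 |
| n=550, >0% adipose (P < 6.07e-4) | 80 | 2.86M | 1.53M | 16,393 |
| n=434, >50% adipose (P < 6.06e-4) | 20 | 2.74M | 1.49M | 15,788 |
| n=434, >50% adipose (P < 6.06e-4) | 40 | 2.95M | 1.57M | 16,043 |
| n=434, >50% adipose (P < 6.06e-4) | 60 | 3.02M | 1.60M | 16,135 |
| n=434, >50% adipose (P < 6.06e-4) | 80 | 3.05M | 1.61M | 16,183 |
| n=387, >75% adipose (P < 3.79e-4) | 20 | 2.30M | 1.29M | 14,442 |
| n=387, >75% adipose (P < 3.79e-4) | 40 | 2.45M | 1.35M | 14,600 |
| n=387, >75% adipose (P < 3.79e-4) | 60 | 2.50M | 1.38M | 14,715 |
| n=387, >75% adipose (P < 3.79e-4) | 80 | 2.50M | 1.38M | 14,712 |

Table 2.2: Comparison of cis-eQTL discovery between sample subsets: cis-eQTL discovery statistics reported for chr 1-22 based on FDR<5% calculated from FastQTL permutation test and qvalue package. Analyses were based on 21,735 genes and 7.8M variants (hg19, GENCODE v19 comprehensive, Haplotype Reference Consortium-imputed genotypes with MAF > 1%). The number of variant-gene pairs and variants are displayed in the millions (M).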


| Variant | Gene | n=550, >0% adipose (P < 6.07e-4): P | β | n=434, >50% adipose (P < 6.06e-4): P | β | n=387, >75% adipose (P < 3.79e-4): P | β |
|---|---|---|---|---|---|---|---|
| rs4731702 | KLF14 | 1.30E-15 | 0.48 | 3.29E-16 | 0.54 | 4.06E-16 | 0.56 |
| rs76786086 | ADIPOQ | 6.63E-04 | -0.71 | 1.03E-09 | -1.69 | 1.98E-09 | -1.66 |
| rs2239857 | CDH13 | 1.86E-23 | 0.84 | 4.35E-26 | 1.00 | 7.71E-24 | 1.00 |

Table 2.3: Comparison of known adipose cis-eQTLs between sample subsets: Results for the same variant-gene pair in the three sample subsets. These eQTL genes were identified in Civelek et al. (2017). The named variant (column A) was the lead variant in each subset except for CDH13 and n=550, where it was the second best variant, which is in high LD with the identified lead. Thresholds match the FDR < 5% in each sample set.


Number (%) of adipose cis-eQTLs observed in the comparison study at each P-value threshold:

| Discovery study | Comparison study | Number of eQTLs in the discovery study (FDR<5%) | Number of these eQTLs available in the comparison study | P < 5 × 10⁻³ | P < 5 × 10⁻⁴ | P < 5 × 10⁻⁵ | P < 5 × 10⁻⁶ | P < 5 × 10⁻⁷ | P < 5 × 10⁻⁸ |
|---|---|---|---|---|---|---|---|---|---|
| METSIM subcutaneous adipose | GTEx subcutaneous adipose | 16,135 | 4,657 | 4,608 (99.0%) | 4,602 (98.8%) | 4,442 (95.4%) | 3,815 (81.9%) | 3,315 (71.2%) | 2,912 (62.5%) |
| GTEx subcutaneous adipose | METSIM subcutaneous adipose | 11,862 | 4,869 | 4,804 (98.7%) | 4,754 (97.6%) | 4,339 (89.1%) | 3,774 (77.5%) | 3,277 (67.3%) | 2,817 (57.9%) |
| METSIM subcutaneous adipose | GTEx visceral adipose | 16,135 | 3,493 | 3,457 (99.0%) | 3,455 (98.9%) | 3,370 (96.5%) | 2,902 (83.1%) | 2,493 (71.4%) | 2,171 (62.2%) |
| GTEx visceral adipose | METSIM subcutaneous adipose | 8,990 | 3,786 | 3,731 (98.6%) | 3,704 (97.8%) | 3,448 (91.1%) | 2,977 (78.6%) | 2,529 (66.8%) | 2,162 (57.1%) |

Table 2.4: Replication of lead eQTLs between METSIM and GTEx: The number of eQTLs in each discovery study is counted by the number of genes with at least one significant eSNP. The numbers of cis-eQTLs observed are shown at various p-value thresholds requiring the same direction of effect. The percent of eQTLs that matched the direction of effect from the discovery study is based on the number of SNPs available in both studies. Variants in GTEx were only reported at the FDR<5% threshold.

| eQTL set | Genomic annotation | Coefficient | 95% confidence interval | P |
|---|---|---|---|---|
| First Signals | All ATAC peaks | 2.53 | 2.48, 2.58 | < 2e-308 |
| Second Signals | All ATAC peaks | 2.17 | 2.08, 2.27 | < 2e-308 |
| GWAS-Coincident First Signals | All ATAC peaks | 3.20 | 2.93, 3.49 | 7.40E-111 |
| First Signals | ATAC peaks in Enhancers | 2.01 | 1.93, 2.09 | < 2e-308 |
| Second Signals | ATAC peaks in Enhancers | 1.84 | 1.68, 1.99 | 7.40E-119 |
| GWAS-Coincident First Signals | ATAC peaks in Enhancers | 2.67 | 2.31, 3.00 | 1.20E-52 |
| First Signals | ATAC peaks in Promoters | 2.83 | 2.77, 2.89 | < 2e-308 |
| Second Signals | ATAC peaks in Promoters | 2.52 | 2.41, 2.63 | < 2e-308 |
| GWAS-Coincident First Signals | ATAC peaks in Promoters | 3.47 | 3.18, 3.76 | 6.80E-124 |
| First Signals | ATAC peaks in Stretch Enhancers | 2.12 | 2.00, 2.23 | 1.90E-271 |
| Second Signals | ATAC peaks in Stretch Enhancers | 1.93 | 1.70, 2.14 | 3.50E-65 |
| GWAS-Coincident First Signals | ATAC peaks in Stretch Enhancers | 2.75 | 2.28, 3.17 | 6.80E-34 |
| First Signals | ATAC peaks in Typical Enhancers | 1.82 | 1.69, 1.95 | 5.00E-161 |
| Second Signals | ATAC peaks in Typical Enhancers | 1.72 | 1.46, 1.96 | 2.90E-42 |
| GWAS-Coincident First Signals | ATAC peaks in Typical Enhancers | 2.28 | 1.69, 2.80 | 4.70E-16 |
| First Signals | Roadmap Enhancers | 1.68 | 1.64, 1.73 | < 2e-308 |
| Second Signals | Roadmap Enhancers | 1.45 | 1.37, 1.53 | 5.60E-277 |
| GWAS-Coincident First Signals | Roadmap Enhancers | 2.81 | 2.48, 3.17 | 4.50E-59 |
| First Signals | Roadmap Promoters | 2.46 | 2.41, 2.51 | < 2e-308 |
| Second Signals | Roadmap Promoters | 2.11 | 2.02, 2.19 | < 2e-308 |
| GWAS-Coincident First Signals | Roadmap Promoters | 3.06 | 2.78, 3.35 | 2.70E-97 |
| First Signals | Stretch Enhancers | 1.38 | 1.31, 1.44 | < 2e-308 |
| Second Signals | Stretch Enhancers | 1.31 | 1.20, 1.41 | 3.50E-132 |
| GWAS-Coincident First Signals | Stretch Enhancers | 2.42 | 2.14, 2.70 | 1.60E-64 |
| First Signals | Typical Enhancers | 1.81 | 1.75, 1.87 | < 2e-308 |
| Second Signals | Typical Enhancers | 1.58 | 1.47, 1.68 | 1.50E-197 |
| GWAS-Coincident First Signals | Typical Enhancers | 2.51 | 2.23, 2.78 | 6.90E-70 |

Table 2.5: Enrichment in Roadmap adipose nuclei marks and METSIM adipose tissue ATAC-seq peaks: The top 2,534 primary signals were selected to match the number of lead variants in the secondary signal set. Promoters and enhancers were identified using Roadmap adipose nuclei. ATAC peaks were identified from METSIM adipose tissue samples. 95% confidence intervals were estimated from logistic regression.


Number of trans-eQTL eGenes without adjustment for PEER factors:

| Transcript type | P < 1e-4, > 1 Mb | P < 1e-4, > 5 Mb | P < 5e-8, > 1 Mb | P < 5e-8, > 5 Mb | P < 3.4e-13, > 1 Mb | P < 3.4e-13, > 5 Mb |
|---|---|---|---|---|---|---|
| protein-coding | 14,902 | 14,902 | 12,138 | 12,106 | 3,613 | 3,575 |
| antisense | 1,734 | 1,734 | 1,306 | 1,300 | 363 | 357 |
| lincRNA | 1,357 | 1,357 | 1,043 | 1,035 | 266 | 260 |
| other | 560 | 560 | 425 | 423 | 155 | 151 |
| total | 18,553 | 18,553 | 14,912 | 14,864 | 4,397 | 4,343 |

Number of trans-eQTL eGenes with adjustment for 3 PEER factors:

| Transcript type | P < 1e-4, > 1 Mb | P < 1e-4, > 5 Mb | P < 5e-8, > 1 Mb | P < 5e-8, > 5 Mb | P < 3.4e-13, > 1 Mb | P < 3.4e-13, > 5 Mb |
|---|---|---|---|---|---|---|
| protein-coding | 14,902 | 14,902 | 11,352 | 11,304 | 3,691 | 3,635 |
| antisense | 1,734 | 1,734 | 1,216 | 1,211 | 347 | 340 |
| lincRNA | 1,357 | 1,357 | 986 | 976 | 244 | 237 |
| other | 560 | 560 | 399 | 394 | 150 | 146 |
| total | 18,553 | 18,553 | 13,953 | 13,885 | 4,432 | 4,358 |

Table 2.6: Trans-eQTL discovery at three significance thresholds: eGenes are reported if at least 1 variant > 1 Mb (or the subset > 5 Mb) away from the transcription start site shows significant eQTL association. P < 1e-4 is equivalent to the p-value with an FDR of < 5% using the Storey method in Small et al. Nat Gen 2018 (PMID 29632379). P < 5e-8 is an approximate genome-wide threshold. P < 4.66e-9 is equivalent to the p-value with an FDR < 5% using the Benjamini Hochberg method. Analyses were based on 18,553 genes and 7.8M variants (hg19, GENCODE v19 comprehensive excluding pseudogenes and genes with low mappability, Haplotype Reference.

                 Without PEER factors        With 3 PEER factors
                 (B)         (C)             (D)         (E)             (F)          (G)
(A) Gene         METSIM β    METSIM P-value  METSIM β    METSIM P-value  TwinsUK β    TwinsUK P-value
PRMT2            -0.277      4.53E-09        -0.264      2.28E-08        -0.233       2.24E-06
CCDC153          -0.250      1.25E-07        -0.231      1.17E-06        NA           NA
MAP3K4           -0.219      4.33E-06        -0.200      2.75E-05        NA           NA
GNAL             -0.218      4.38E-06        -0.205      1.60E-05        NA           NA
UNC13B           -0.214      7.14E-06        -0.222      3.07E-06        -0.226       2.55E-07
PARP6            -0.212      8.71E-06        NA          NA              NA           NA
FGFR1OP          -0.211      8.91E-06        -0.227      1.73E-06        NA           NA
FAM109A          -0.209      1.19E-05        -0.194      4.82E-05        -0.145       3.01E-04
RALGPS1          -0.208      1.28E-05        -0.187      8.99E-05        NA           NA
SMYD4            -0.208      1.31E-05        NA          NA              NA           NA
LSM11            -0.207      1.43E-05        NA          NA              NA           NA
ANAPC4           -0.206      1.45E-05        -0.198      3.21E-05        NA           NA
HMBS             0.205       1.69E-05        NA          NA              NA           NA
RP5-942I16.1     -0.203      2.13E-05        NA          NA              NA           NA
FAM211B          -0.199      3.02E-05        NA          NA              NA           NA
BBS2             -0.195      4.33E-05        -0.194      4.91E-05        NA           NA
LZTS3            -0.194      4.53E-05        -0.217      4.88E-06        NA           NA
SNX14            -0.194      4.74E-05        -0.271      9.71E-09        NA           NA
DCLK2            -0.194      4.94E-05        NA          NA              NA           NA
BEAN1            -0.192      5.72E-05        NA          NA              NA           NA
MPP3             -0.189      7.20E-05        NA          NA              NA           NA
KIF13B           -0.189      7.65E-05        NA          NA              NA           NA
PDXP             0.188       7.95E-05        NA          NA              NA           NA
APBA1            -0.188      8.06E-05        NA          NA              NA           NA
RP1-137D17.2     -0.187      9.15E-05        NA          NA              NA           NA
USP4             -0.187      9.23E-05        NA          NA              NA           NA
EPN2             -0.186      9.93E-05        NA          NA              NA           NA
RBBP7            NA          NA              -0.264      2.47E-08        NA           NA
MAGED2           NA          NA              -0.251      1.18E-07        -0.264       3.46E-10
HSPB11           NA          NA              -0.237      5.67E-07        NA           NA
TWSG1            NA          NA              -0.229      1.46E-06        NA           NA
CHCHD7           NA          NA              -0.222      2.88E-06        NA           NA
URI1             NA          NA              -0.221      3.47E-06        NA           NA
UBE2Q2           NA          NA              -0.220      3.60E-06        -0.204       4.51E-06
MFSD10           NA          NA              -0.219      4.24E-06        NA           NA
GNB1             NA          NA              -0.218      4.64E-06        -0.208       2.56E-06
RAN              NA          NA              -0.218      4.76E-06        NA           NA
HNRNPU-AS1       NA          NA              -0.210      9.87E-06        NA           NA
APLF             NA          NA              -0.209      1.09E-05        NA           NA
COX20            NA          NA              -0.208      1.21E-05        NA           NA
CCBL2            NA          NA              -0.208      1.24E-05        NA           NA
ANKMY2           NA          NA              -0.208      1.24E-05        NA           NA
SRI              NA          NA              -0.208      1.29E-05        NA           NA
DUSP1            NA          NA              -0.207      1.44E-05        NA           NA
DYNLT1           NA          NA              -0.204      1.93E-05        NA           NA
SAP30            NA          NA              -0.203      1.96E-05        NA           NA
KPNA5            NA          NA              -0.203      2.05E-05        NA           NA
ARHGAP12         NA          NA              -0.201      2.48E-05        NA           NA
TM2D3            NA          NA              -0.200      2.73E-05        -0.191       9.82E-05
PLK3             NA          NA              -0.199      3.00E-05        NA           NA
EIF4E3           NA          NA              -0.198      3.13E-05        -0.252       5.64E-10
PNMA1            NA          NA              -0.197      3.49E-05        -0.292       1.06E-08
MEAF6            NA          NA              -0.197      3.65E-05        NA           NA
ADD1             NA          NA              -0.196      3.86E-05        NA           NA
LSM11            NA          NA              -0.195      4.21E-05        NA           NA
RNF170           NA          NA              -0.193      5.06E-05        NA           NA
RHNO1            NA          NA              -0.193      5.25E-05        NA           NA
PTPLA            NA          NA              -0.192      5.77E-05        NA           NA
MOCS2            NA          NA              -0.192      5.83E-05        -0.178       6.89E-05
PPP1R15A         NA          NA              -0.191      6.11E-05        NA           NA
SLC9A6           NA          NA              -0.191      6.27E-05        NA           NA
C8orf82          NA          NA              -0.189      7.56E-05        -0.284       6.76E-13
RRP9             NA          NA              0.188       7.97E-05        0.248        9.35E-08
CYR61            NA          NA              -0.188      8.38E-05        NA           NA
DNAH11           NA          NA              -0.187      8.58E-05        NA           NA
TKT              NA          NA              0.187       8.60E-05        0.214        9.10E-07
NLRX1            NA          NA              0.187       9.16E-05        NA           NA
SUPT3H           NA          NA              -0.187      9.22E-05        NA           NA
SETDB2           NA          NA              -0.186      9.37E-05        NA           NA

Table 2.7: Replication of METSIM trans-eQTL for KLF14 target genes in TwinsUK. Genes are listed if the METSIM eQTL for rs4731702 (chr7:130433384) was significant (P < 1e-4) with 3 PEER factors. Values are shown as NA in columns B and C if not reported (P < 0.05) without PEER factors, in columns D and E if not reported with 3 PEER factors, and in columns F and G if not reported for TwinsUK in Small KS et al. Nat Genet 2018. The TwinsUK study consists of 856 female twins with expression levels measured from subcutaneous adipose tissue punch biopsies (1000 Genomes-imputed genotypes, genes annotated by GENCODE v10).

Trait Category   GWAS Trait
diabetes         Type 2 diabetes
diabetes         Diabetes (gestational)
diabetes         Type 2 diabetes and other traits
glycemic         Fasting glucose-related traits
glycemic         Fasting insulin-related traits
glycemic         Fasting plasma glucose
glycemic         Fasting glucose-related traits (interaction with BMI)
glycemic         Fasting insulin-related traits (interaction with BMI)
glycemic         Glycated hemoglobin levels
glycemic         Glycemic traits
glycemic         Insulin-related traits
glycemic         Proinsulin levels
glycemic         Two-hour glucose challenge
obesity          Adiposity
obesity          BMI
obesity          Body mass index
obesity          Body mass (lean)
obesity          HIP
obesity          HIPadjBMI
obesity          Obesity
obesity          Obesity (early onset extreme)
obesity          Obesity (extreme)
obesity          Obesity-related traits
obesity          Visceral fat
obesity          Visceral adipose tissue/subcutaneous adipose tissue ratio
obesity          Waist circumference
obesity          Waist circumference and related phenotypes
obesity          Waist Circumference - Triglycerides (WC-TG)
obesity          Waist-hip ratio
obesity          WCadjBMI
obesity          Weight
obesity          WHR
obesity          WHRadjBMI
lipids           Apolipoprotein Levels
lipids           Cholesterol
lipids           Cholesterol total
lipids           HDL cholesterol
lipids           HDL Cholesterol - Triglycerides (HDLC-TG)
lipids           Hypertriglyceridemia
lipids           LDL cholesterol
lipids           LDL (oxidized)
lipids           Lipid metabolism phenotypes
lipids           Lipid traits
lipids           Lipoprotein-associated phospholipase A2 activity and mass
lipids           Lipoprotein-associated phospholipase A2 activity change in response to statin therapy
lipids           Oleic acid (18:1n-9) plasma levels
lipids           Palmitic acid (16:0) plasma levels
lipids           Palmitoleic acid (16:1n-7) plasma levels
lipids           Phospholipid levels (plasma)
lipids           Phytosterol levels
lipids           Plasminogen activator inhibitor type 1 levels (PAI-1)
lipids           Resistin levels
lipids           Sphingolipid levels
lipids           Stearic acid (18:0) plasma levels
lipids           Triglycerides
lipids           Triglycerides-Blood Pressure (TG-BP)
MetS             Metabolic syndrome
MetS             Metabolic syndrome (bivariate traits)
MetS             Metabolic traits
cardiac          Abdominal aortic aneurysm
cardiac          Ankle-brachial index
cardiac          Aortic root size
cardiac          Aortic stiffness
cardiac          Aortic-valve calcification
cardiac          Arterial stiffness
cardiac          Cardiac hypertrophy
cardiac          Cardiac structure and function
cardiac          Cardiac Troponin-T levels
cardiac          Cardiovascular disease risk factors
cardiac          Cardiovascular heart disease in diabetics
cardiac          Carotid atherosclerosis in HIV infection
cardiac          Carotid intima media thickness
cardiac          Coronary arterial lesions in patients with Kawasaki disease
cardiac          Coronary heart disease
cardiac          Hypertension
cardiac          Hypertension risk in short sleep duration
cardiac          Intracranial aneurysm
cardiac          Lp(a) levels
cardiac          Moyamoya disease
cardiac          Myocardial infarction
cardiac          Myocardial infarction (early onset)
cardiac          Pulmonary arterial hypertension (without BMPR2 mutations)
cardiac          Stroke (ischemic)
cardiac          Thiazide-induced adverse metabolic effects in hypertensive patients
cardiac          Vascular dementia
cardiac          Venous thromboembolism
cardiac          Venous thromboembolism (gene x gene interaction)
cardiac          Coronary artery calcification
cardiac          Thoracic aortic aneurysms and dissections
cardiac          Stroke
other            circulating leptin levels adjusted for BMI
other            circulating leptin levels
other            adiponectin measurement

Table 2.8: 93 cardiometabolic diseases and quantitative traits key terms for which GWAS Catalog loci were examined

GWAS lead variant GWAS variant association with Lead eQTL variant (eSNP) Lead eQTL variant eQTL gene association with eQTL gene (eSNP) conditioned on GWAS variant

GWAS rsID Risk GWAS eQTL gene Pinitial βinitial eSNP rsID Eff Pinitial βiniti Pcond βcond LD 2 Allel Trait All al r e rs1534696 C WHRadjB SNX10 2.71E-76 -1.08 rs1534696 A 2.71E- 1.08 9.76E-01 0.00E+00 - MI 76 rs41279633 T LDL NPC1L1 4.80E-51 1.06 rs4127963 T 4.80E- 1.06 9.98E-01 3.09E+11 - 3 51 rs1061810 A T2D HSD17B12 7.00E-39 -0.88 rs1061810 A 7.00E- - 8.10E-01 3.44E+13 - 39 0.88 rs56271783 C WHRadjB VEGFB 3.37E-37 -1.51 rs5627178 C 3.37E- - 9.91E-01 5.48E+11 - MI 3 37 1.51 rs78058190 G HDL WNT6 4.61E-36 -1.65 rs7805819 G 4.61E- - 9.12E-01 - - 0 36 1.65 2.93E+13 rs2243312 A HDL RP11- 6.20E-35 0.77 rs2243312 A 6.20E- 0.77 4.30E-01 0.00E+00 - 59

217B7.2 35 rs17326656 T WHRadjB LHCGR 3.32E-28 -1.00 rs1732665 T 3.32E- - 9.84E-01 - - MI 6 28 1.00 9.17E+11 rs8126001 C WHRadjB OPRL1 1.07E-27 -0.71 rs8126001 T 1.07E- 0.71 5.61E-01 0.00E+00 - MI 27 rs886444 G BMI CDK5RAP3 2.27E-24 -0.67 rs886444 G 2.27E- - 9.95E-01 - - 24 0.67 4.74E+11 rs28451064 A WHRadjB LINC00310 5.38E-22 -0.85 rs2845106 A 5.38E- - 9.18E-01 - - MI 4 22 0.85 2.96E+13 rs1412444 T Coronary LIPA 3.50E-20 0.61 rs1412444 T 3.50E- 0.61 9.97E-01 9.88E+11 - heart 20 disease rs3814883 T BMI INO80E 2.84E-19 -0.59 rs3814883 T 2.84E- - 9.36E-01 2.44E+13 - 19 0.59 rs13095652 C BMI DOCK3 2.57E-17 -0.60 rs1309565 C 2.57E- - 8.80E-01 - - 2 17 0.60 1.47E+13 rs11926707 C T2D PTH1R 2.90E-17 -0.59 rs1192670 C 2.90E- - 8.92E-01 - - 7 17 0.59 7.20E+12 rs5112 G TG APOC1P1 1.98E-16 -0.53 rs5112 G 1.98E- - 3.07E-01 0.00E+00 - 16 0.53

rs4731702 C HDL KLF14 3.29E-16 -0.54 rs4731702 T 3.29E- 0.54 9.81E-01 - - 16 6.88E+12 rs8103017 G WHRadjB SSC5D 3.75E-15 0.58 rs8103017 G 3.75E- 0.58 9.94E-01 3.39E+11 - MI 15 rs860295 G BMI DAP3 8.00E-14 0.51 rs860295 G 8.00E- 0.51 9.96E-01 2.58E+11 - 14 rs3768321 T T2D PABPC4 1.73E-13 -0.66 rs3768321 T 1.73E- - 9.54E-01 1.33E+13 - 13 0.66 rs638722 C WHRadjB C1orf213 1.95E-13 -0.56 rs638722 C 1.95E- - 1.00E+00 - - MI 13 0.56 3.77E+09 rs2925979 T WHRadjB CMIP 2.24E-13 0.53 rs2925979 T 2.24E- 0.53 9.61E-01 1.50E+13 - MI 13 rs1979527 A WHRadjB ABTB1 2.53E-11 0.47 rs1979527 A 2.53E- 0.47 9.99E-01 - - MI 11 1.12E+11 rs76895963 T T2D CCND2 3.39E-10 -1.53 rs7689596 T 3.39E- - 9.27E-01 0.00E+00 - 3 10 1.53 rs12325187 C WHRadjB RP11- 5.37E-10 -0.44 rs1232518 C 5.37E- 0.44 9.64E-01 1.17E+12 - 60

MI 433P17.3 7 10

rs603424 A Metabolic SCD 1.66E-09 -0.59 rs603424 A 1.66E- - 9.96E-01 - - traits 09 0.59 3.17E+11 rs1534696 C WHRadjB CBX3 2.00E-09 -0.41 rs1534696 A 2.00E- 0.41 7.52E-01 0.00E+00 - MI 09 rs603424 A Metabolic AL139819.1 6.73E-09 -0.57 rs603424 A 6.73E- - 9.96E-01 - - traits 09 0.57 3.07E+11 rs333947 A HDL CSF1 2.85E-08 0.46 rs333947 A 2.85E- 0.46 1.00E+00 0.00E+00 - 08 rs11688682 G T2D INHBB 3.47E-08 0.42 rs1168868 G 3.47E- - 9.96E-01 - - 2 08 0.42 1.56E+11 rs6087685 C Venous PROCR 1.53E-07 -0.38 rs6087685 C 1.53E- - 9.97E-01 - - thromboem 07 0.38 2.49E+11 bolism rs740157 A BMI RP5- 3.09E-07 -0.33 rs740157 G 3.09E- 0.33 9.93E-01 6.65E+11 - 899E9.1 07 rs12946454 T SBP HEXIM1 6.25E-07 -0.38 rs1294645 T 6.25E- - 1.00E+00 - - 4 07 0.38 6.52E+09 rs7134375 A BMI PDE3A 9.51E-07 -0.34 rs7134375 A 9.51E- - 9.94E-01 - - 07 0.34 1.57E+12 rs13139571 C DBP RP11- 3.31E-06 -0.36 rs1313957 C 3.31E- - 9.94E-01 3.26E+11 - 588K22.2 1 06 0.36

rs7932891 A WHRadjB EIF4G2 6.23E-06 -0.31 rs7932891 A 6.23E- - 9.94E-01 4.34E+11 - MI 06 0.31 rs11277705 A HDL POLR2C 6.68E-06 0.51 rs1127770 A 6.68E- 0.51 9.19E-01 3.77E+13 - 1 51 06 rs9299 T BMI HOXB7 8.00E-06 0.34 rs9299 T 8.00E- 0.34 9.78E-01 - - 06 2.74E+12 rs8070454 C BMI MED24 5.26E-65 0.99 rs4795412 C 5.14E- 0.99 4.40E-01 2.63E+01 1.00 65 rs12489828 T WHRadjB NT5DC2 6.82E-44 -0.87 rs7614981 A 6.82E- - 8.11E-01 1.10E+13 1.00 MI 44 0.87 rs316611 T BMI NDUFAF1 2.93E-39 -0.88 rs1139675 T 3.53E- - 2.47E-02 -6.41E- 1.00 48 40 0.90 01 rs2222543 G WHRadjB CPED1 2.85E-35 0.78 rs4731009 C 2.81E- - 7.15E-01 - 1.00 MI 35 0.78 8.11E+00 rs35565646 A WHRadjB DISP2 1.27E-32 0.99 rs3803360 T 1.26E- 0.99 7.42E-01 3.46E+01 1.00 MI 32 rs7187776 G BMI RP11- 2.29E-21 -0.60 rs8049439 C 2.29E- - 9.62E-01 7.07E+12 1.00 61

1348G14.4 21 0.60

rs17207196 C BMI AC018720.1 6.95E-16 0.56 rs6247766 T 6.28E- - 5.56E-01 - 1.00 0 9 16 0.57 1.64E+00 rs211718 T Metabolic ACADM 3.43E-11 0.49 rs1384939 C 3.00E- 0.50 8.94E-02 1.06E+01 1.00 traits 88 11 rs1708302 C T2D JAZF1 9.77E-11 -0.43 rs1513272 C 9.62E- - 4.72E-01 - 1.00 11 0.43 1.43E+01 rs672356 G WHRadjB CCDC144B 6.34E-07 -0.37 rs2461838 G 6.21E- - 2.46E-01 - 1.00 MI 07 0.37 2.54E+01 rs2652822 C Metabolic LACTB 1.43E-37 0.77 rs1126308 C 1.42E- 0.77 7.90E-01 1.31E+01 1.00 traits 37 rs7184597 T Obesity NPIPB6 1.08E-14 0.53 rs1164665 C 1.08E- 0.53 1.65E-01 1.58E+02 1.00 3 14 rs912768 C BMI PRDX6 1.26E-12 -0.50 rs1886638 G 1.26E- - 1.00E+00 0.00E+00 1.00 12 0.50 rs1308362 A BMI MRAS 1.57E-17 0.66 rs1308247 T 1.55E- - 9.14E-02 - 1.00 1 17 0.66 2.07E+02 rs1063355 G T2D HLA-DQA1 3.24E-86 -1.14 rs9273493 C 1.80E- - 1.33E-01 -6.91E- 1.00 86 1.17 01 rs2049805 T blood urea GBAP1 2.56E-46 0.88 rs2974929 T 1.60E- 0.88 3.04E-01 1.35E+00 1.00 nitrogen 46 rs4897182 G T2D CENPW 2.95E-08 -0.37 rs4897179 A 1.19E- - 1.37E-02 - 1.00 08 0.38 2.38E+00

rs1075901 C BMI ZSWIM7 4.66E-101 -1.14 rs9709250 G 4.59E- - 2.23E-01 - 1.00 101 1.14 1.30E+02 rs10049088 C WHRadjB TIPARP 9.50E-35 -0.84 rs5640631 C 7.65E- - 5.01E-01 - 1.00 MI 1 35 0.84 1.09E+00 rs757608 A WHRadjB C17orf82 5.69E-08 -0.40 rs882367 C 5.62E- - 4.59E-01 - 1.00 MI 08 0.40 1.92E+01 rs950732 T WHR MARCH5 1.68E-12 -0.46 rs1078603 G 1.56E- - 3.23E-01 - 1.00 7 12 0.46 5.82E+00 rs1933182 C eGFRcrea SYPL2 4.36E-21 -0.67 rs2043059 A 1.51E- - 7.16E-02 - 1.00 21 0.68 1.67E+00 rs12574668 A BMI C11orf49 7.42E-08 -0.60 rs2902858 C 5.25E- - 3.95E-01 -8.24E- 1.00 08 0.61 01 rs10948059 C HDL PEX6 2.07E-93 -1.11 rs6907751 C 1.08E- - 2.50E-01 -9.65E- 1.00 93 1.12 01 rs11102965 C LDL CELSR2 8.02E-13 0.48 rs3429302 T 7.89E- 0.48 7.17E-01 3.55E+00 1.00 1 13 rs1549293 C BMI KAT8 1.34E-30 0.71 rs9972727 G 1.32E- 0.71 8.51E-01 1.17E+00 1.00 62

30

rs6700838 C BMI TAL1 3.53E-22 -0.68 rs6178266 A 3.33E- - 5.77E-01 - 1.00 5 22 0.68 3.30E+00 rs926279 A BMI LINC00680 4.57E-71 -1.10 rs4928688 C 1.49E- - 1.18E-01 - 1.00 71 1.14 1.63E+00 rs9341990 A WHRadjB TBX18 8.60E-06 -0.32 rs9362083 T 8.60E- - 5.99E-01 - 1.00 MI 06 0.32 3.66E+02 rs8396 C Metabolic PPID 2.88E-42 -1.01 rs9918016 A 2.84E- - 2.10E-01 - 1.00 traits 42 1.01 1.06E+02 rs17764730 C WHRadjB CTC- 1.27E-22 0.94 rs6888037 T 4.11E- 0.95 7.48E-02 2.08E+00 1.00 MI 228N24.3 23 rs1045241 C WHRadjB TNFAIP8 1.88E-12 -0.51 rs7703744 C 1.81E- - 7.71E-01 -8.40E- 1.00 MI 12 0.51 01 rs9846123 T BMI WDR6 8.77E-87 1.12 rs6768433 G 2.95E- 1.12 6.18E-03 1.64E+00 1.00 88 rs9846123 T BMI ARIH2 5.99E-10 -0.42 rs6768433 G 4.98E- - 5.35E-01 -5.59E- 1.00 10 0.42 01 rs8070454 C BMI PSMD3 7.62E-08 0.36 rs8081692 A 6.44E- 0.36 3.00E-01 2.18E+00 1.00 08 rs1789882 A WHR ADH1A 1.86E-50 -1.21 rs904092 A 1.47E- - 3.86E-01 - 1.00 50 1.21 3.21E+00 rs12964689 A BMI NPC1 4.95E-18 -0.54 rs6507716 G 4.91E- 0.54 4.20E-01 4.23E+01 1.00 18

rs13391552 A Metabolic ALMS1P 1.49E-09 0.48 rs6736866 G 1.16E- 0.49 7.78E-02 5.91E+00 1.00 traits 09 rs709400 G BMI AL049840.1 5.09E-32 0.78 rs5588559 T 5.03E- 0.78 3.22E-01 6.43E+01 1.00 2 32 rs9435732 C WHRadjB MFAP2 1.94E-16 0.57 rs5626511 C 7.86E- 0.57 7.97E-02 1.62E+00 1.00 MI 7 17 rs2657879 G Metabolic SPRYD4 1.87E-16 0.66 rs7311487 C 1.30E- 0.67 3.93E-01 7.93E-01 1.00 traits 2 16 rs6700838 C BMI RP1- 1.05E-10 -0.47 rs3408721 C 1.02E- - 6.94E-01 - 1.00 18D14.7 0 10 0.47 2.47E+00 rs823074 T BMI PM20D1 2.65E-46 0.87 rs708727 G 2.50E- 0.87 7.25E-01 7.12E-01 1.00 46 rs58749629 A Abdominal NEURL2 2.10E-17 0.82 rs7270354 A 2.02E- 0.82 6.72E-01 3.10E+00 1.00 aortic 17 aneurysm rs709400 G BMI RP11- 1.89E-50 0.94 rs7359090 G 1.65E- 0.94 5.67E-01 1.73E+00 1.00 63

73M18.9 50

rs524281 C BMI PACS1 1.13E-33 -0.89 rs489337 A 3.70E- - 1.12E-01 - 1.00 34 0.89 1.37E+00 rs2581787 T T2D RFT1 2.60E-16 -0.54 rs2564924 T 1.21E- - 1.07E-01 - 0.99 16 0.54 1.53E+00 rs1075901 C BMI ADORA2B 1.18E-48 -0.89 rs4792712 G 1.17E- - 6.51E-04 - 0.99 48 0.89 8.00E+02 rs6684514 G Glycated C1orf85 8.14E-25 0.76 rs1204086 G 6.09E- 0.76 4.46E-01 6.77E-01 0.99 hemoglobin 8 25 levels rs2306589 T WHR ZNHIT3 2.96E-07 -0.34 rs8882 A 9.70E- - 4.09E-02 - 0.99 08 0.35 1.15E+00 rs450231 G BMI ANKS6 2.05E-21 0.65 rs7035722 C 5.95E- 0.65 4.15E-02 1.82E+00 0.99 22 rs2814992 G BMI UHRF1BP1 6.68E-48 0.92 rs3414161 A 6.17E- 0.93 2.92E-02 8.64E-01 0.99 4 49 rs4253772 C LDL CDPF1 2.98E-15 0.99 rs4555093 C 1.01E- 1.01 1.32E-01 1.35E+00 0.99 7 15 rs7871866 C BMI SLC27A4 4.87E-10 -0.52 rs4734 G 3.78E- - 4.62E-01 -7.32E- 0.99 10 0.52 01 rs7871866 C BMI RP11- 3.05E-07 0.43 rs4734 G 2.99E- 0.43 7.96E-01 2.61E-01 0.99 339B21.13 07 rs1700137 C BMI MMP16 5.30E-09 -0.43 rs2867115 T 4.32E- - 5.25E-01 -3.74E- 0.99 3 09 0.43 01

rs7739232 A HIPadjBMI KLHL31 6.95E-29 -1.33 rs7990342 G 6.20E- - 5.68E-01 - 0.99 4 29 1.33 2.97E+00 rs9884482 C FIns TET2 1.76E-07 -0.37 rs1264325 G 1.21E- - 2.70E-01 - 0.99 7 07 0.38 1.06E+00 rs12964689 A BMI C18orf8 1.15E-08 -0.36 rs1245726 T 3.47E- 0.37 8.91E-02 6.67E-01 0.99 1 09 rs17207196 C BMI POM121C 2.88E-20 -0.64 rs794373 C 1.55E- 0.63 2.65E-01 5.54E-01 0.99 20 rs6859752 T WHRadjB SH3PXD2B 3.04E-19 0.66 rs6866204 A 1.59E- 0.68 1.54E-03 1.88E+00 0.99 MI 20 rs1730859 G BMI PRMT6 1.09E-25 -0.76 rs1240863 A 9.89E- - 5.88E-01 -4.79E- 0.99 4 26 0.76 01 rs2294120 A T2D ZNF34 3.33E-12 0.47 rs7004501 C 2.91E- 0.47 5.91E-01 3.61E-01 0.99 12 rs72875462 C LDL ABCG5 4.79E-08 -0.60 rs7533144 G 2.35E- - 1.59E-01 - 0.99 4 08 0.61 1.35E+00 rs926279 A BMI XXbac- 3.72E-44 0.92 rs1970605 G 1.01E- 0.94 3.43E-02 2.66E+00 0.99 64

BPGBPG55 44

C20.2 rs1106908 G BMI DHRS11 3.26E-33 -0.78 rs4796225 A 3.11E- - 2.26E-01 - 0.99 33 0.78 2.39E+01 rs7479183 T WHRadjB PIDD 8.20E-48 -0.91 rs1124631 G 8.17E- - 1.24E-02 - 0.99 MI 4 49 0.91 1.88E+00 rs12145743 T HDL RRNAD1 2.09E-08 -0.40 rs3806415 C 1.14E- - 1.96E-01 -8.96E- 0.99 08 0.41 01 rs11672254 G LDL FPR3 1.61E-06 0.33 rs1769509 A 9.74E- 0.34 2.99E-01 4.99E-01 0.99 6 07 rs10779751 A BMI MTOR 1.05E-13 0.55 rs1273936 T 4.81E- 0.55 1.86E-01 8.57E-01 0.99 8 14 rs926279 A BMI XXbac- 1.45E-58 -1.02 rs2398021 T 3.44E- - 3.98E-03 - 0.99 BPGBPG55 60 1.09 1.60E+00 C20.1 rs12679556 G WHRadjB EYA1 1.97E-16 0.62 rs7843475 G 1.14E- 0.62 2.57E-01 1.05E+00 0.99 MI 16 rs6764533 A BMI CTD- 1.76E-06 -0.35 rs7372674 A 1.09E- - 3.30E-01 -4.17E- 0.99 2002J20.1 06 0.35 01 rs6558295 G Metabolic KIAA1875 9.87E-39 -1.36 rs1325044 A 8.90E- - 2.46E-02 - 0.99 traits 6 40 1.38 1.79E+00 rs6558295 G Metabolic MAF1 4.55E-07 -0.57 rs1154147 T 3.89E- - 5.83E-01 -5.35E- 0.99 traits 5 07 0.57 01

rs1020548 G BMI BEND6 6.19E-22 -0.78 rs6904924 A 1.79E- - 1.16E-01 -7.11E- 0.99 22 0.78 01 rs17207196 C BMI NSUN5P1 7.90E-11 0.46 rs794375 C 2.12E- - 6.12E-02 -9.57E- 0.99 11 0.47 01 rs2820311 G BMI LMOD1 6.11E-41 -0.90 rs2819346 C 8.93E- - 3.77E-03 -8.96E- 0.99 43 0.92 01 rs17173637 C HDL AOC1 1.76E-20 0.82 rs7787577 G 2.76E- 0.83 5.11E-02 1.02E+00 0.99 21 chr6:324483 G WHRadjB HLA-DRB6 1.24E-64 0.97 rs1168403 T 9.68E- 0.99 2.33E-04 2.11E+00 0.99 98 MI 30 67 rs73858966 A WHRadjB COL8A1 3.71E-07 0.63 rs7386046 T 2.97E- 0.65 3.64E-01 2.10E+00 0.98 MI 4 07 rs4281707 G T2D AKTIP 4.11E-12 0.48 rs8055279 T 1.48E- 0.49 6.49E-02 1.36E+00 0.98 12 rs4372913 G WHRadjB AC017074.2 3.43E-07 0.38 rs6216882 C 9.98E- 0.39 5.84E-02 9.01E-01 0.98 MI 1 08 rs2513987 G WHR RP11- 3.51E-11 -0.50 rs634366 C 8.57E- - 2.74E-02 - 0.98 65

690D19.3 12 0.51 1.48E+00

rs13240600 A BMI CPSF4 2.46E-12 0.57 rs1025775 A 1.00E- 0.58 8.26E-02 1.62E+00 0.98 8 12 rs13240600 A BMI CPSF4 2.46E-12 0.57 rs1025775 A 1.00E- 0.58 8.26E-02 1.62E+00 0.98 8 12 rs4886506 T BMI SCAPER 3.68E-15 -0.53 rs1163040 G 2.94E- - 1.04E-02 - 0.98 2 16 0.55 1.07E+00 rs2013208 C HDL RBM6 2.17E-68 -0.99 rs2856234 G 2.73E- 1.00 2.32E-04 1.16E+00 0.98 71 rs7122539 G BMI RP11- 6.29E-07 -0.35 rs7931485 G 2.94E- - 2.14E-01 -4.59E- 0.98 867G23.1 07 0.36 01 rs17145738 T TG BCL7B 2.09E-14 0.77 rs1324426 C 1.27E- 0.78 2.19E-01 1.94E+00 0.98 8 14 rs10463416 A WHRadjB ABLIM3 4.59E-17 0.58 rs6889927 A 2.07E- 0.61 5.89E-03 1.07E+00 0.98 MI 18 rs6684514 G Glycated CCT3 6.44E-20 0.68 rs2075166 G 2.63E- 0.69 1.73E-01 5.65E-01 0.98 hemoglobin 20 levels rs754523 G LDL APOB 1.71E-30 0.82 rs1169387 C 1.60E- 0.81 4.73E-01 4.38E-01 0.98 0 30 rs4731702 C HDL RP11- 1.74E-07 -0.35 rs1215462 C 9.33E- 0.36 1.13E-01 1.28E+00 0.98 138A9.1 7 08

rs4731702 C HDL AC016831.7 7.08E-06 -0.30 rs1215462 C 4.19E- 0.31 1.30E-01 1.24E+00 0.98 7 06 rs6047259 T WHRadjB PLK1S1 2.38E-14 -0.53 rs6047255 G 1.97E- - 5.37E-01 -6.52E- 0.98 MI 14 0.54 01 rs12051272 G Adiponecti CDH13 3.15E-25 -0.99 rs2239857 C 4.35E- - 4.77E-02 -9.61E- 0.98 n 26 1.00 01 rs4281707 G T2D RBL2 4.15E-24 0.69 rs2908797 C 3.68E- 0.68 5.90E-01 4.75E-01 0.98 24 rs10832778 G BMI RP1- 7.01E-11 0.42 rs7928810 A 5.82E- 0.42 5.17E-01 3.09E-01 0.98 239B22.5 11 rs13134327 A Glycated RP11- 5.36E-20 -0.62 rs1838518 C 5.31E- - 1.39E-02 - 0.98 hemoglobin 673E1.1 21 0.64 1.25E+00 levels rs12325187 C WHRadjB ZNF75A 1.67E-51 -0.97 rs1761186 T 9.89E- 1.00 1.69E-02 8.82E-01 0.98 MI 6 53 rs6463489 T BMI FBXL18 6.28E-13 0.97 rs1022720 C 2.68E- 1.00 1.92E-01 8.91E-01 0.98 66

1 13

rs2710323 C BMI STAB1 2.11E-07 -0.36 rs2577831 A 9.97E- - 1.67E-01 -7.25E- 0.98 08 0.37 01 rs2814992 G BMI SPC 1.44E-20 -0.63 rs2814951 T 9.00E- - 2.41E-01 -4.09E- 0.98 21 0.64 01 rs7350438 C WHRadjB NET1 9.63E-08 -0.37 rs1090586 G 5.71E- - 2.83E-01 -5.99E- 0.97 MI 0 08 0.39 01 rs4660293 G HDL OXCT2 4.16E-11 0.52 rs6178137 G 2.43E- 0.55 8.53E-03 9.78E-01 0.97 1 12 rs11784833 C LDL GRINA 2.87E-11 0.43 rs5795797 A 1.50E- 0.44 2.59E-01 4.37E-01 0.97 4 11 rs12681990 C T2D ZNF703 1.11E-18 -0.84 rs1011065 C 6.96E- - 9.57E-04 - 0.97 1 21 0.88 1.22E+00 rs12369009 T BMI RP3- 9.59E-06 -0.40 rs7953810 C 3.14E- - 6.64E-02 - 0.97 473L9.4 06 0.42 1.05E+00 rs6058635 C BMI TSPY26P 3.78E-21 0.63 rs6141640 T 2.15E- - 2.87E-01 -5.59E- 0.97 21 0.63 01 rs2820311 G BMI IPO9 2.51E-07 -0.38 rs2644134 A 6.40E- - 5.00E-02 -8.91E- 0.97 08 0.40 01 rs74551598 A TC AKNA 2.07E-13 -0.51 rs4979387 G 3.49E- - 5.40E-02 -6.87E- 0.97 14 0.52 01 rs272889 A Metabolic AC034220.3 3.31E-48 -0.87 rs272852 A 1.47E- 0.89 9.98E-04 8.26E-01 0.97 traits 50

rs2448 T WHR FST 5.41E-06 0.39 rs5906173 A 2.60E- 0.40 2.19E-01 5.43E-01 0.97 8 06 rs4805834 T creatinine SLC7A9 8.31E-14 0.81 rs1215099 T 1.48E- 0.86 6.51E-02 9.28E-01 0.97 2 14 rs11600990 C BMI TEX40 3.79E-06 0.46 rs477895 T 5.74E- 0.48 2.58E-02 9.78E-01 0.97 07 rs7918400 T T2D RP11- 1.65E-29 0.72 rs7094871 G 6.72E- - 1.56E-01 -5.44E- 0.96 57H14.2 30 0.73 01 rs10268050 C BMI NUDCD3 1.68E-21 0.75 rs6463234 A 1.20E- 0.77 3.29E-01 4.97E-01 0.96 21 rs12430764 A WHR GPC6 5.64E-12 0.46 rs1328369 T 2.21E- 0.50 4.79E-03 8.85E-01 0.96 13 rs2230590 C BMI MST1R 2.64E-06 0.32 rs9862795 T 1.67E- 0.32 3.50E-01 3.24E-01 0.96 06 rs3111316 A T2D FARSA 1.84E-13 -0.54 rs2965220 C 1.97E- - 2.08E-02 -9.54E- 0.96 14 0.56 01 rs10164099 C WHR RP11- 2.97E-08 -0.51 rs5868335 C 1.09E- - 1.59E-01 -6.17E- 0.96 67

95O2.5 6 08 0.52 01

rs2453580 T eGFRcrea SLC47A1 4.78E-26 -0.75 rs2453584 G 2.38E- - 1.49E-01 -4.82E- 0.96 26 0.76 01 rs3829849 T BMI LMX1B 3.71E-06 -0.31 rs1044828 T 2.46E- - 3.52E-01 -4.61E- 0.96 5 06 0.32 01 rs1801214 T T2D WFS1 1.83E-23 0.64 rs5018648 G 5.59E- 0.67 6.69E-04 7.42E-01 0.96 26 rs11869286 G HDL PGAP3 1.83E-10 0.44 rs903502 C 5.22E- - 1.07E-01 -5.84E- 0.96 11 0.45 01 rs6980507 A Glycated SMIM19 8.02E-62 1.02 rs3099936 T 1.15E- 1.06 2.75E-03 8.52E-01 0.96 hemoglobin 63 levels rs2710323 C BMI GNL3 8.13E-30 -0.75 rs7646741 G 2.71E- - 9.47E-02 -5.18E- 0.95 30 0.76 01 rs516946 C T2D ANK1 3.21E-06 0.39 rs3802315 G 1.85E- 0.40 3.00E-01 4.56E-01 0.95 06 rs3825061 T WHR HMBS 2.01E-18 0.68 rs4614 G 5.17E- 0.68 8.69E-02 5.30E-01 0.95 19 rs9931407 T BMI CTC- 6.84E-06 -0.50 rs7550745 C 1.74E- - 3.32E-02 - 0.95 277H1.7 7 06 0.53 1.48E+00 rs9838283 A BMI RP11- 2.99E-07 0.56 rs1307534 T 2.19E- 0.57 1.48E-02 9.11E-01 0.95 804H8.6 4 08

rs79598313 T LDL ZDHHC18 5.88E-09 1.00 rs1820509 T 4.96E- 0.99 4.98E-01 6.54E-01 0.95 89 09 rs1045411 C WHR HMGB1 4.82E-15 0.53 rs7316731 C 1.03E- 0.55 8.11E-02 5.53E-01 0.95 3 15 rs7307277 A WHRadjB RP11- 3.51E-06 -0.36 rs3789967 T 2.14E- - 3.22E-01 -3.05E- 0.95 MI 380L11.4 06 0.36 01 rs11602339 T BMI MTCH2 5.82E-31 0.76 rs4752856 A 1.11E- 0.79 4.79E-03 7.09E-01 0.95 32 rs3825807 A Coronary ADAMTS7 4.16E-06 0.33 rs1205052 T 2.96E- 0.38 1.36E-02 6.22E-01 0.95 heart 5 07 disease rs11123170 G blood urea PAX8 1.50E-58 -1.00 rs7589901 C 2.74E- - 3.25E-05 -9.31E- 0.95 nitrogen 62 1.02 01 rs2294120 A T2D ZNF251 2.88E-09 0.40 rs2730059 C 6.84E- - 2.23E-03 -7.97E- 0.94 11 0.44 01 rs75584603 C WHRadjB HOXA11-AS 2.22E-08 1.32 rs1482561 A 3.13E- 1.35 1.65E-03 2.20E+00 0.94 MI 16 10 rs4932265 T T2D C15orf38- 3.01E-40 0.92 rs4932143 G 2.64E- 0.93 1.26E-04 7.67E-01 0.94 68

AP3S2 43

rs6063048 G T2D EYA2 3.86E-09 0.45 rs1206773 G 6.90E- 0.47 5.85E-02 6.31E-01 0.94 10 rs7941030 T HDL UBASH3B 2.01E-07 -0.35 rs1089287 G 3.23E- - 4.42E-02 -5.80E- 0.93 3 08 0.38 01 rs2294120 A T2D RPL8 2.82E-45 -0.87 rs2954654 G 1.62E- 0.87 1.07E-02 6.43E-01 0.93 46 rs7187776 G BMI ATXN2L 1.11E-13 -0.48 rs6203661 G 1.78E- - 4.40E-05 - 0.93 6 16 0.53 1.00E+00 rs9931407 T BMI FHOD1 1.96E-09 -0.66 rs7279803 G 2.12E- - 3.66E-02 -6.17E- 0.93 2 10 0.67 01 rs72959041 A WHRadjB RSPO3 1.87E-16 1.25 rs7296100 A 6.70E- 1.27 9.19E-02 8.37E-01 0.93 MI 7 17 rs7187776 G BMI SH2B1 3.40E-25 -0.65 rs8062405 G 1.25E- - 1.02E-02 -6.13E- 0.93 26 0.67 01 rs7187776 G BMI SH2B1 3.40E-25 -0.65 rs8062405 G 1.25E- - 1.02E-02 -6.13E- 0.93 26 0.67 01 rs62034325 G BMI SULT1A2 1.03E-27 0.67 rs2869866 T 9.58E- 0.69 1.61E-03 9.28E-01 0.93 7 30 rs2972144 G T2D IRS1 1.78E-15 -0.54 rs1515110 T 4.32E- - 8.34E-02 -4.41E- 0.93 16 0.55 01

rs1544474 G WHRadjB LAMB1 7.13E-21 -0.60 rs2021885 T 4.75E- - 2.06E-02 -6.02E- 0.92 MI 22 0.63 01 rs2034088 T HIPadjBMI VPS53 3.96E-07 0.34 rs7214241 G 2.23E- 0.38 8.88E-03 6.58E-01 0.92 08 rs12748152 T HDL RP5- 3.60E-14 -0.93 rs1275446 G 8.05E- - 8.49E-02 -8.60E- 0.92 968P14.2 8 15 0.94 01 rs4932265 T T2D AP3S2 2.69E-17 0.62 rs3560832 C 6.37E- 0.65 5.06E-04 7.38E-01 0.92 7 20 rs9291467 T BMI ANAPC4 9.94E-51 0.88 rs1002781 G 3.99E- 0.90 8.07E-05 9.19E-01 0.92 4 54 rs7102 C WHRadjB LITAF 1.54E-65 -1.07 rs3784924 G 4.70E- - 1.43E-03 -5.94E- 0.92 MI 66 1.09 01 rs9846123 T BMI NCKIPSD 1.01E-44 -0.88 rs6788650 T 7.09E- - 1.31E-05 -9.10E- 0.92 49 0.91 01 rs919433 A BMI AC013264.2 3.03E-10 0.42 rs1093177 C 3.09E- 0.43 2.42E-02 6.57E-01 0.92 9 11 rs7595 C Obesity- TSEN34 2.05E-12 0.57 rs36623 G 1.94E- 0.56 2.95E-01 2.91E-01 0.92 69

related 12

traits rs9846123 T BMI CCDC36 5.37E-09 -0.40 rs1192855 T 2.39E- - 1.20E-04 -9.36E- 0.91 2 11 0.46 01 rs9846123 T BMI RP11-3B7.1 1.99E-08 -0.39 rs1192855 T 3.22E- - 1.34E-03 -7.85E- 0.91 2 10 0.43 01 rs9838283 A BMI HEMK1 2.48E-08 0.60 rs1113025 G 8.69E- 0.62 1.54E-01 6.19E-01 0.91 2 09 rs11917361 C WHRadjB CCDC12 6.44E-21 0.65 rs7306910 T 8.27E- 0.67 3.34E-03 6.95E-01 0.91 MI 4 23 rs6884702 G T2D RP11- 2.98E-13 -0.47 rs5730097 T 1.72E- - 1.39E-02 -6.78E- 0.91 53O19.3 2 14 0.50 01 rs10190332 T BMI RP11- 1.03E-15 0.55 rs777589 C 7.21E- 0.58 2.05E-02 4.99E-01 0.91 493E12.2 17 rs8126001 C WHRadjB RGS19 1.35E-16 -0.56 rs6062343 A 6.20E- 0.56 1.58E-01 3.79E-01 0.90 MI 17 rs2275426 A BMI MAST2 5.75E-08 0.38 rs4626927 T 4.60E- 0.38 3.40E-01 2.24E-01 0.90 08 rs4653017 T BMI PHC2 3.07E-07 -0.38 rs1324476 A 2.18E- - 3.59E-01 -2.63E- 0.90 07 0.38 01 rs7902274 T LDL C10orf88 3.37E-06 -0.33 rs1891111 G 4.58E- - 2.61E-02 -6.59E- 0.90 07 0.35 01

rs59774409 T TG RCN3 1.06E-14 -1.44 rs1125907 G 2.13E- - 5.04E-02 - 0.90 88 15 1.56 1.12E+00 rs4846914 G HDL GALNT2 1.18E-15 -0.53 rs1077983 C 4.06E- 0.54 7.90E-02 3.51E-01 0.90 5 16 rs6738445 C BMI AC068039.4 6.16E-35 -0.88 rs1450188 G 5.10E- - 5.09E-03 -5.78E- 0.90 99 36 0.91 01 rs11715915 C FGlu APEH 3.21E-07 -0.34 rs1308579 A 2.62E- 0.37 2.51E-02 4.53E-01 0.90 1 08 rs6494481 T BMI TRIP4 1.41E-22 0.92 chr15:6485 T 9.18E- 0.91 8.50E-02 5.04E-01 0.90 7043 23 rs704 A LDL CTD- 2.48E-07 0.36 rs2073867 C 5.61E- 0.37 9.03E-02 3.78E-01 0.89 2350C19.1 08 rs12748152 T HDL PIGV 3.96E-11 -0.82 rs3447378 G 1.39E- - 1.20E-01 -6.11E- 0.89 8 11 0.84 01 rs2928148 A eGFRcrea INO80 2.41E-06 -0.32 rs9796 A 2.24E- - 3.22E-02 -3.84E- 0.89 07 0.36 01 rs6695980 T TG ATP2B4 2.33E-31 -0.71 rs6685593 A 2.21E- 0.75 1.45E-05 8.81E-01 0.89 70

35

rs34699 G WHR MAST4-AS1 5.24E-32 -0.78 rs706702 G 3.97E- - 1.36E-05 -7.77E- 0.89 36 0.83 01 rs2854322 T TC EVI2A 8.27E-07 0.35 rs2952994 T 5.71E- 0.38 1.27E-02 6.21E-01 0.89 08 rs181362 T HDL CCDC116 4.31E-21 -0.61 rs5754344 G 3.86E- - 1.82E-03 -7.97E- 0.89 23 0.64 01 rs524281 C BMI RP11- 2.28E-19 0.69 rs1075078 G 3.29E- 0.74 2.90E-04 8.48E-01 0.89 755F10.1 1 22 rs13333747 T WHR PKD1 3.77E-09 0.50 rs35786 T 1.99E- 0.53 1.72E-01 3.45E-01 0.88 09 rs10502148 C WHRadjB C11orf52 1.31E-08 -0.40 rs1089129 C 8.16E- - 1.00E-03 -6.43E- 0.88 MI 5 11 0.47 01 rs10502148 C WHRadjB C11orf52 1.31E-08 -0.40 rs1089129 C 8.16E- - 1.00E-03 -6.43E- 0.88 MI 5 11 0.47 01 rs4372836 T BMI AC097724.3 2.06E-12 -0.50 rs7607844 T 1.34E- - 2.03E-02 -5.82E- 0.87 13 0.54 01 rs435306 G HDL PLTP 5.83E-25 0.78 rs6104410 G 1.16E- 0.79 4.21E-03 6.46E-01 0.87 26 rs2110073 T Glycated U47924.30 1.84E-08 -0.74 rs5615765 C 9.19E- - 1.72E-01 -5.12E- 0.87 hemoglobin 6 09 0.77 01 levels

rs2276390 G WHR CRYAB 2.34E-13 -0.51 rs1241785 G 7.17E- - 4.67E-02 -3.32E- 0.87 1 14 0.53 01 rs11917361 C WHRadjB SETD2 1.28E-06 0.35 rs1215249 A 1.62E- 0.38 4.59E-02 3.94E-01 0.87 MI 3 07 rs11917361 C WHRadjB ELP6 1.37E-14 -0.54 rs1778488 C 9.85E- - 1.34E-04 -6.91E- 0.87 MI 2 18 0.59 01 rs3814883 T BMI HIRIP3 5.88E-09 -0.39 rs8054556 A 3.48E- - 1.83E-02 -4.69E- 0.87 10 0.42 01 rs7025486 A Abdominal DAB2IP 2.36E-07 -0.38 rs4837886 C 1.26E- - 2.29E-01 -2.74E- 0.87 aortic 07 0.38 01 aneurysm rs2048327 C Coronary SLC22A3 6.94E-13 0.54 rs1045578 T 8.95E- 0.57 3.89E-02 4.61E-01 0.87 heart 2 14 disease rs6558295 G Metabolic CYC1 4.21E-11 -0.74 rs1133329 C 1.48E- - 1.04E-02 -8.72E- 0.86 traits 67 12 0.92 01 rs2581787 T T2D SFMBT1 1.38E-11 0.45 rs3533315 T 3.31E- 0.46 7.09E-02 3.38E-01 0.86 71

5 12

rs12145743 T HDL HDGF 5.09E-06 0.33 rs1126453 C 6.82E- 0.38 2.26E-03 5.43E-01 0.86 1 08 rs7251505 G WHRadjB CEBPA-AS1 5.06E-16 -1.33 rs1177744 G 2.46E- - 2.33E-01 - 0.86 MI 95 16 1.37 1.38E+00 rs9438900 G LDL TMEM50A 1.21E-23 -0.65 rs3093578 A 5.72E- - 9.54E-05 -6.28E- 0.85 27 0.67 01 rs12680842 G BMI RP11- 5.81E-13 0.50 rs1254320 T 4.37E- 0.54 1.79E-03 4.83E-01 0.85 267M23.4 7 15 rs630337 C LDL RHCE 1.78E-24 -0.66 rs665372 C 1.21E- 0.65 3.54E-02 3.49E-01 0.85 24 rs223051 T BMI RCN1 1.64E-07 -0.38 rs223044 G 1.16E- - 1.01E-03 -6.03E- 0.85 09 0.42 01 rs8017377 A LDL NYIN 6.55E-06 -0.30 rs6573778 T 5.70E- - 2.74E-01 -1.72E- 0.84 06 0.31 01 rs709400 G BMI TRMT61A 7.08E-13 0.50 rs5744749 A 3.16E- 0.55 1.11E-03 5.75E-01 0.84 4 15 rs709400 G BMI APOPT1 1.40E-09 0.42 rs2274269 A 7.58E- 0.50 4.67E-05 7.27E-01 0.84 13 rs6671166 A TG TPM3 7.05E-06 0.31 rs1079704 A 2.02E- 0.32 1.19E-01 2.89E-01 0.84 7 06 rs2516739 G BMI NTHL1 1.43E-07 -0.42 rs3211995 G 3.66E- - 6.14E-02 -3.24E- 0.84 08 0.47 01

rs2836961 C BMI PSMG1 2.21E-16 0.53 rs2836949 A 1.85E- 0.58 1.75E-04 6.36E-01 0.83 19 rs998732 A BMI YJEFN3 1.13E-16 0.89 rs7252453 G 7.96E- 0.91 1.34E-05 9.78E-01 0.83 21 rs998732 A BMI YJEFN3 1.13E-16 0.89 rs7252453 G 7.96E- 0.91 1.34E-05 9.78E-01 0.83 21 rs12369009 T BMI RP3- 4.13E-13 0.64 rs2233861 A 4.19E- 0.72 2.68E-03 7.02E-01 0.83 462E2.5 15 chr10:65318 G WHR BF2 1.53E-12 0.46 rs1082215 A 2.37E- - 2.30E-04 -6.65E- 0.83 766 3 15 0.51 01 rs28451064 A WHRadjB MRPS6 1.55E-06 -0.44 rs9305545 G 7.11E- - 1.83E-01 -3.16E- 0.83 MI 07 0.43 01 rs9988 T WHR MRPS7 3.33E-11 0.73 rs2306217 T 2.50E- 0.78 9.83E-02 4.35E-01 0.83 11 rs2836961 C BMI BRWD1-IT2 3.28E-10 -0.41 rs7279497 T 1.47E- - 1.22E-01 -2.71E- 0.83 10 0.42 01 rs7187776 G BMI EIF3CL 9.08E-10 0.40 rs6565259 C 8.50E- 0.47 1.49E-04 5.99E-01 0.83 72

13

rs2066938 G Metabolic UNC119B 1.46E-07 0.39 rs1043138 G 6.64E- 0.41 1.45E-02 4.07E-01 0.83 traits 4 09 rs17207196 C BMI STAG3L1 2.19E-18 -0.61 rs1167827 G 1.39E- - 7.56E-03 -4.16E- 0.83 19 0.60 01 rs7798002 T WHRadjB AC003090.1 5.56E-07 -0.38 rs5637664 A 9.11E- - 2.42E-03 -6.55E- 0.83 MI 5 09 0.43 01 rs7871866 C BMI RP11- 4.04E-12 -0.58 rs7847624 G 9.53E- - 5.18E-03 -4.91E- 0.82 339B21.15 14 0.60 01 rs672356 G WHRadjB RP1- 9.54E-14 -0.55 rs8080966 C 8.63E- - 1.80E-05 -6.45E- 0.82 MI 178F10.3 18 0.61 01 rs6843738 G BMI MANBA 2.29E-15 0.53 rs223492 G 2.30E- 0.56 2.23E-03 4.85E-01 0.82 17 rs2633310 G T2D FUT11 8.98E-15 0.50 rs7072988 C 4.89E- 0.56 1.19E-04 5.41E-01 0.82 18 rs17826219 A BMI AC005562.1 1.85E-07 -0.66 rs216403 A 8.46E- - 1.45E-02 -7.61E- 0.81 09 0.74 01 rs7184597 T Obesity RP11- 8.62E-06 -0.31 rs2904880 C 4.60E- - 1.74E-02 -4.12E- 0.81 1348G14.1 07 0.38 01 rs498240 G BMI NELFE 1.84E-09 1.03 rs1425343 C 2.78E- 1.06 2.58E-02 2.14E+00 0.81 65 10 rs7479183 T WHRadjB AP006621.6 3.67E-12 -0.47 rs4963156 T 5.21E- 0.50 3.50E-03 4.39E-01 0.81 MI 14

rs7795371 A WHRadjB TMEM60 3.66E-17 0.57 rs1544459 T 7.34E- 0.60 4.31E-04 5.47E-01 0.81 MI 20 rs761423 T BMI MST1L 2.43E-13 0.53 rs7517838 C 1.29E- 0.57 1.10E-01 3.39E-01 0.81 13 rs12138803 T WHRadjB PIGC 7.64E-06 0.36 rs1132148 G 1.08E- 0.49 2.28E-04 6.31E-01 0.81 MI 08 rs17154889 C WHRadjB PPIP5K2 8.55E-08 0.39 rs187427 C 3.91E- 0.47 8.39E-04 6.02E-01 0.81 MI 10 rs6949451 T WHRadjB HOXA-AS4 5.67E-20 0.61 rs6461991 C 2.39E- 0.65 4.56E-04 4.99E-01 0.80 MI 22 rs12448257 A BMI DNASE1 4.90E-11 0.60 rs7199766 A 7.36E- 0.66 3.51E-03 5.69E-01 0.80 13 rs3731861 C WHRadjB AAMP 2.18E-08 -0.44 rs9751327 G 2.82E- - 2.89E-02 -3.32E- 0.80 MI 09 0.45 01 rs704 A LDL TMEM199 1.28E-08 0.39 rs967645 T 5.13E- 0.45 9.75E-04 5.03E-01 0.80 11 rs6499165 A Metabolic PLA2G15 3.37E-12 0.49 rs8056893 C 1.87E- 0.58 1.24E-04 6.05E-01 0.80 traits 15

Table 2.9: Primary eQTLs (FDR < 1%) colocalized with GWAS variants based on linkage disequilibrium and conditional analysis between GWAS and lead eQTL variants. Shown above are LD and conditional analysis values for GWAS variants and eQTL variants found to be colocalized. We matched the eQTL allele to the GWAS risk-increasing allele. This table is sorted by the LD r2 between the lead eQTL and the GWAS variant and then by the initial p-value of the GWAS variant's association with a nearby gene. LD r2 was calculated in 10K Finns from METSIM. An extended version of this table is available in Raulerson CK et al. (AJHG 2019).

Trait | GWAS signals associated with ≥ 1 gene | Primary signals: colocalized GWAS signals | Primary signals: colocalized genes | Primary signals: genes (%) for which the trait risk allele is associated with decreased expression | Secondary signals: colocalized GWAS signals | Secondary signals: colocalized genes | Secondary signals: genes (%) for which the trait risk allele is associated with decreased expression
BMI | 310 | 85 | 116 | 60 (52%) | 12 | 11 | 4 (36%)
All Lipids | 186 | 56 | 62 | 31 (50%) | 7 | 7 | 3 (43%)
HDL | 73 | 25 | 30 | 18 (60%) | 3 | 3 | 2 (67%)
LDL | 64 | 16 | 18 | 8 (44%) | 1 | 1 | 0 (0%)
Triglycerides | 42 | 18 | 19 | 11 (58%) | 2 | 2 | 1 (50%)
Total Cholesterol | 47 | 13 | 13 | 6 (46%) | 2 | 2 | 1 (50%)
WHRadjBMI | 157 | 48 | 57 | 35 (61%) | 9 | 9 | 5 (56%)
WHR | 134 | 40 | 49 | 32 (65%) | 6 | 6 | 3 (50%)
T2D | 115 | 28 | 37 | 20 (54%) | 7 | 8 | 4 (50%)
Cardiovascular Disease | 28 | 9 | 10 | 6 (60%) | - | - | -
Adiponectin | 11 | 4 | 4 | 2 (50%) | - | - | -
Any Trait | 874 | 231 | 287 | 152 (53%) | 31 | 31 | 15 (48%)

Table 2.10: Primary and secondary eQTLs colocalized at GWAS loci. Colocalized eQTL (FDR < 1%; P < 9.63 × 10^-6) among 2,843 GWAS signals (variants clumped by LD r2 > 0.8) associated with one or more of 93 cardiometabolic traits. Each GWAS variant may colocalize with an eQTL for more than one gene. GWAS signals reported for multiple traits are included in counts of colocalized eQTLs for each trait, so signals and genes are not unique by row. Percentages expressed as the number of genes for which the GWAS risk allele was associated with decreased expression among all colocalized genes for that trait. No secondary eQTL signals were found to be colocalized with GWAS loci for cardiovascular disease or adiponectin.
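The percentage columns of Table 2.10 can be reproduced from the gene counts. The sketch below is illustrative only: the function name `percent_decreased` is hypothetical, and rounding to the nearest whole percent is an assumption that happens to match the values printed in the table.

```python
def percent_decreased(decreased_genes: int, colocalized_genes: int) -> int:
    """Percent of colocalized genes whose expression is lower with the GWAS risk allele.

    Rounding to a whole percent is an assumption made for this illustration.
    """
    if colocalized_genes <= 0:
        raise ValueError("need at least one colocalized gene")
    return round(100 * decreased_genes / colocalized_genes)


# Counts copied from the primary-signal columns of Table 2.10:
# (genes with decreased expression, colocalized genes)
primary = {"BMI": (60, 116), "WHR": (32, 49), "Any Trait": (152, 287)}
primary_pct = {trait: percent_decreased(d, n) for trait, (d, n) in primary.items()}
```

For example, the BMI row has 60 of 116 colocalized genes with decreased expression, which rounds to the 52% shown in the table.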

Primary eQTL signals

Trait | GWAS signals | Signals tested for colocalization | Colocalized GWAS signals | Percent (%) of tested GWAS variants found to be colocalized
BMI | 851 | 257 | 69 | 8%
All Lipids | 79 | 31 | 11 | 14%
HDL | 29 | 10 | 2 | 7%
LDL | 17 | 8 | 3 | 18%
Triglycerides | 19 | 8 | 4 | 21%
Total Cholesterol | 15 | 7 | 2 | 13%
WHRadjBMI | 381 | 147 | 45 | 12%
WHR | 204 | 107 | 30 | 15%
T2D | 263 | 86 | 22 | 8%
Any Trait | 1777 | 545 | 177 | 10%

Table 2.11: Proportion of tested GWAS signals colocalized. Proportions of all GWAS signals (column B), by trait, that were colocalized with adipose eQTLs (column E). Signals were determined to be colocalized by LD & reciprocal conditional analysis. GWAS signals indicates that the GWAS signal was looked up in the eQTL data, but does not indicate whether any gene was associated with it. Colocalized signals were required to be a subset of all GWAS signals, from the following papers: Pulit et al. (Hum Mol Genet 2019) for WHR & WHRadjBMI, Yengo et al. (Hum Mol Genet 2018) for BMI, Mahajan et al. (Nat. Genet. 2018) for T2D, Xue et al. (Nat Commun 2018) for T2D, and Hoffmann et al. (Nat Genet 2018) for Lipids.


GWAS rsID | Risk Allele | GWAS Trait | eQTL gene | GWAS variant Pinitial | GWAS variant βinitial | eSNP rsID | Effect Allele | eSNP Pinitial | eSNP βinitial | eSNP Pcond (conditioned on GWAS variant) | eSNP βcond (conditioned on GWAS variant) | LD r2
rs6518681 | G | T2D | CTA-85E5.10 | 1.85E-16 | 0.78 | rs6518681 | G | 1.85E-16 | 0.78 | 9.97E-01 | 4.12E+11 | -
rs4883201 | A | TC | M6PR | 6.47E-15 | 0.67 | rs4883201 | A | 6.47E-15 | 0.67 | 9.91E-01 | 6.73E+11 | -
rs235374 | C | HDL | PTTG1IP | 4.26E-09 | 0.36 | rs235374 | C | 4.26E-09 | 0.36 | 9.93E-01 | -1.27E+12 | -
rs758747 | T | BMI | CLUAP1 | 5.17E-08 | 0.48 | rs758747 | T | 5.17E-08 | 0.48 | 9.87E-01 | 1.38E+12 | -
rs2240108 | C | BMI | SLC48A1 | 1.46E-07 | 0.51 | rs2240108 | C | 1.46E-07 | 0.51 | 9.91E-01 | 1.95E+12 | -
rs7932891 | A | WHRadjBMI | ZBED5 | 5.12E-07 | -0.34 | rs7932891 | A | 5.12E-07 | -0.34 | 9.84E-01 | 1.12E+12 | -
rs7102 | C | WHRadjBMI | SNN | 2.49E-06 | -0.32 | rs7102 | C | 2.49E-06 | -0.32 | 9.99E-01 | -1.89E+11 | -
chr19:55015902 | T | LDL | CTD-2587H19.2 | 3.49E-06 | 0.42 | rs34157964 | T | 3.49E-06 | 0.42 | 9.68E-01 | 3.46E+12 | -
rs3746429 | T | BMI | TRPC4AP | 6.45E-06 | -0.41 | rs3746429 | T | 6.45E-06 | -0.41 | 9.85E-01 | 1.93E+12 | -
rs1575972 | A | T2D | DMRTA1 | 6.70E-06 | 1.08 | rs1575972 | A | 6.70E-06 | 1.08 | 9.87E-01 | -3.01E+12 | -
rs28453840 | G | TC | MCM6 | 5.39E-06 | -0.78 | rs11684545 | G | 5.32E-06 | -0.78 | 1.10E-01 | -1.51E+02 | 1.00
rs3111316 | A | T2D | GCDH | 9.79E-07 | 0.47 | rs10419627 | A | 9.78E-07 | 0.47 | 7.78E-02 | 1.57E+03 | 1.00
rs7307277 | A | WHRadjBMI | ZNF664 | 1.45E-16 | -0.56 | rs34114498 | G | 1.41E-16 | -0.56 | 3.12E-01 | -1.76E+01 | 1.00
rs11724804 | G | WHR | DGKQ | 3.19E-08 | -0.39 | rs13101828 | G | 2.90E-08 | 0.39 | 6.69E-01 | 3.70E-01 | 1.00
rs12044597 | G | BMI | RP1-140A9.1 | 2.90E-07 | 0.31 | rs17162854 | G | 2.55E-07 | 0.31 | 4.24E-01 | 1.41E+00 | 1.00
rs1531583 | T | T2D | PCGF3 | 1.52E-28 | -1.20 | rs10213196 | T | 6.02E-29 | -1.23 | 1.75E-01 | -1.17E+00 | 1.00
rs6463489 | T | BMI | AC092171.4 | 1.07E-11 | 0.83 | rs2966432 | T | 8.31E-12 | 0.83 | 3.67E-01 | 2.23E+00 | 1.00
chr16:57170189 | T | HDL | PLLP | 1.10E-06 | -0.58 | rs116029244 | G | 8.79E-07 | -0.57 | 5.11E-01 | -5.50E-01 | 0.98
rs35565646 | A | WHRadjBMI | IVD | 5.36E-12 | -0.56 | rs11636147 | G | 3.63E-13 | -0.60 | 5.67E-03 | -1.32E+00 | 0.98
rs10876528 | A | WHR | HOXC4 | 1.36E-10 | 0.41 | rs12319419 | C | 4.11E-11 | 0.42 | 1.26E-01 | 4.48E-01 | 0.96
rs7715806 | C | BMI | POC5 | 2.72E-06 | 0.36 | rs6453143 | G | 1.35E-06 | 0.38 | 1.32E-01 | 2.53E-01 | 0.96
rs1063355 | G | T2D | HLA-DQB1 | 6.53E-17 | -0.46 | rs1071630 | C | 2.29E-18 | -0.51 | 9.04E-03 | -4.33E-01 | 0.94
rs7811265 | A | TG | BCL7B | 4.12E-11 | 1.00 | rs35368205 | C | 1.63E-11 | 0.97 | 7.34E-02 | -6.08E-01 | 0.92
rs3931020 | C | Resistin levels | RP11-17E13.2 | 5.42E-10 | -0.38 | rs438014 | C | 1.85E-11 | -0.42 | 7.21E-03 | -5.80E-01 | 0.91
rs10507524 | C | WHRadjBMI | SMIM2-AS1 | 9.67E-09 | 0.40 | rs9533776 | T | 2.03E-09 | 0.40 | 8.08E-02 | 4.33E-01 | 0.88
rs719802 | T | BMI | ANKK1 | 1.28E-06 | -0.32 | rs7115082 | C | 1.15E-07 | -0.35 | 2.82E-02 | -4.29E-01 | 0.87
rs307058 | A | WHRadjBMI | NUDT6 | 3.32E-11 | 0.68 | rs303089 | C | 7.45E-13 | 0.79 | 4.60E-03 | 6.35E-01 | 0.84
rs208015 | T | BMI | CDK5RAP3 | 8.96E-12 | -0.68 | rs55982711 | A | 4.05E-14 | -0.69 | 9.99E-04 | -6.09E-01 | 0.83
rs2307111 | T | BMI | POC5 | 3.05E-06 | 0.35 | rs6453143 | G | 1.35E-06 | 0.38 | 1.32E-01 | 2.53E-01 | 0.82
rs2820311 | G | BMI | TIMM17A | 7.45E-09 | -0.40 | rs4950743 | C | 1.67E-10 | -0.44 | 6.57E-03 | -4.41E-01 | 0.82
rs1063355 | G | T2D | HLA-DQB1-AS1 | 9.30E-17 | -0.49 | rs74455114 | T | 2.57E-18 | -0.51 | 2.14E-03 | -3.58E-01 | 0.82
rs761423 | T | BMI | NBPF1 | 7.23E-06 | 0.32 | rs7517838 | C | 8.77E-08 | 0.40 | 6.17E-04 | 3.14E-01 | 0.81

Table 2.12: Secondary eQTLs (FDR <1%) colocalized with GWAS variants based on linkage disequilibrium and conditional analysis between GWAS and lead eQTL variants. Shown above are LD and conditional analysis values for GWAS variants and eQTL variants found to be colocalized. We matched the eQTL allele to the GWAS risk increasing allele. This table is sorted by the LD r2 between the lead eQTL and the GWAS variant and then by the initial p-value of the GWAS variant's association with a nearby gene. LD r2 was calculated in 10K Finns from METSIM. An extended version of this table is available in Raulerson CK et al. (AJHG 2019).


Gene | # of SNPs | coloc2 PP H0 | coloc2 PP H1 | coloc2 PP H2 | coloc2 PP H3 | coloc2 PP H4 | Trait | Source of GWAS Summary Statistics (Publication) | GWAS-eSNP LD r2 | Interpretation of coloc2 evidence
AC018720.10 | 407 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 1.000 | Strong evidence of colocalization; PP H4 > .8
NSUN5P1 | 414 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.987 | Strong evidence of colocalization; PP H4 > .8
LHCGR | 6310 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Strong evidence of colocalization; PP H4 > .8
POM121C | 410 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.991 | Strong evidence of colocalization; PP H4 > .8
GNL3 | 969 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.955 | Strong evidence of colocalization; PP H4 > .8
VEGFB | 3560 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Strong evidence of colocalization; PP H4 > .8
SSC5D | 7320 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Strong evidence of colocalization; PP H4 > .8
LINC00310 | 5110 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Strong evidence of colocalization; PP H4 > .8
MST1L | 1060 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.809 | Strong evidence of colocalization; PP H4 > .8
CMIP | 9100 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Strong evidence of colocalization; PP H4 > .8
TRIM24 | 5300 | 0.00 | 0.00 | 0.00 | 0.01 | 0.99 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Strong evidence of colocalization; PP H4 > .8
RP11-1348G14.4 | 566 | 0.00 | 0.00 | 0.00 | 0.02 | 0.98 | BMI | Yengo et al. (Hum Mol Genet 2018) | 1.000 | Strong evidence of colocalization; PP H4 > .8
SLC27A4 | 1140 | 0.00 | 0.00 | 0.00 | 0.03 | 0.97 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.992 | Strong evidence of colocalization; PP H4 > .8
PM20D1 | 1380 | 0.00 | 0.00 | 0.00 | 0.03 | 0.97 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.996 | Strong evidence of colocalization; PP H4 > .8
TNFAIP8 | 6220 | 0.00 | 0.00 | 0.00 | 0.03 | 0.97 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.998 | Strong evidence of colocalization; PP H4 > .8
CPED1 | 4620 | 0.00 | 0.00 | 0.00 | 0.04 | 0.96 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 1.000 | Strong evidence of colocalization; PP H4 > .8
RP1-239B22.5 | 1560 | 0.00 | 0.00 | 0.00 | 0.04 | 0.96 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.977 | Strong evidence of colocalization; PP H4 > .8
STAB1 | 835 | 0.00 | 0.00 | 0.00 | 0.04 | 0.96 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.977 | Strong evidence of colocalization; PP H4 > .8
RSPO3 | 3640 | 0.00 | 0.00 | 0.00 | 0.04 | 0.96 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.932 | Strong evidence of colocalization; PP H4 > .8
ADH1A | 6140 | 0.00 | 0.00 | 0.00 | 0.05 | 0.95 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.997 | Strong evidence of colocalization; PP H4 > .8
YWHAZ | 1740 | 0.00 | 0.00 | 0.00 | 0.05 | 0.95 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.989 | Strong evidence of colocalization; PP H4 > .8
RCN1 | 1300 | 0.00 | 0.00 | 0.00 | 0.05 | 0.95 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.848 | Strong evidence of colocalization; PP H4 > .8
RP11-339B21.13 | 1130 | 0.00 | 0.00 | 0.00 | 0.05 | 0.95 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.992 | Strong evidence of colocalization; PP H4 > .8
INHBB | 3660 | 0.00 | 0.00 | 0.01 | 0.04 | 0.95 | T2D | Mahajan et al. (Nat. Genet. 2018) | same lead variant | Strong evidence of colocalization; PP H4 > .8
MRPS6 | 5090 | 0.00 | 0.01 | 0.00 | 0.04 | 0.95 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.833 | Strong evidence of colocalization; PP H4 > .8
CTD-2002J20.1 | 780 | 0.00 | 0.01 | 0.00 | 0.05 | 0.94 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.988 | Strong evidence of colocalization; PP H4 > .8
JAZF1 | 4720 | 0.00 | 0.00 | 0.00 | 0.06 | 0.94 | T2D | Mahajan et al. (Nat. Genet. 2018) | 1.000 | Strong evidence of colocalization; PP H4 > .8
NTHL1 | 922 | 0.00 | 0.00 | 0.00 | 0.06 | 0.94 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.837 | Strong evidence of colocalization; PP H4 > .8
TEX40 | 910 | 0.00 | 0.01 | 0.00 | 0.05 | 0.94 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.965 | Strong evidence of colocalization; PP H4 > .8
CTC-228N24.3 | 5350 | 0.00 | 0.00 | 0.00 | 0.06 | 0.94 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.998 | Strong evidence of colocalization; PP H4 > .8
HEY2 | 1290 | 0.00 | 0.00 | 0.00 | 0.06 | 0.94 | BMI | Yengo et al. (Hum Mol Genet 2018) | 1.000 | Strong evidence of colocalization; PP H4 > .8
RP1-18D14.7 | 1300 | 0.00 | 0.00 | 0.00 | 0.07 | 0.93 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.996 | Strong evidence of colocalization; PP H4 > .8
TAL1 | 1310 | 0.00 | 0.00 | 0.00 | 0.07 | 0.93 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.999 | Strong evidence of colocalization; PP H4 > .8
DHRS11 | 983 | 0.00 | 0.00 | 0.00 | 0.07 | 0.93 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.990 | Strong evidence of colocalization; PP H4 > .8
C10orf112 | 2690 | 0.00 | 0.00 | 0.00 | 0.07 | 0.93 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.912 | Strong evidence of colocalization; PP H4 > .8
C17orf82 | 2970 | 0.00 | 0.00 | 0.00 | 0.08 | 0.92 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.999 | Strong evidence of colocalization; PP H4 > .8
MED24 | 965 | 0.00 | 0.00 | 0.00 | 0.08 | 0.92 | BMI | Yengo et al. (Hum Mol Genet 2018) | 1.000 | Strong evidence of colocalization; PP H4 > .8
HMGB1 | 6270 | 0.00 | 0.00 | 0.00 | 0.09 | 0.91 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.948 | Strong evidence of colocalization; PP H4 > .8
BEND6 | 1260 | 0.00 | 0.00 | 0.00 | 0.09 | 0.91 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.988 | Strong evidence of colocalization; PP H4 > .8
OPRL1 | 4070 | 0.00 | 0.00 | 0.00 | 0.09 | 0.91 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Strong evidence of colocalization; PP H4 > .8
DISP2 | 4460 | 0.00 | 0.00 | 0.00 | 0.09 | 0.91 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 1.000 | Strong evidence of colocalization; PP H4 > .8
KAT8 | 498 | 0.00 | 0.00 | 0.00 | 0.10 | 0.91 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.999 | Strong evidence of colocalization; PP H4 > .8
TIPARP | 5480 | 0.00 | 0.00 | 0.00 | 0.11 | 0.89 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 1.000 | Strong evidence of colocalization; PP H4 > .8
MST1R | 640 | 0.00 | 0.01 | 0.00 | 0.09 | 0.89 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.964 | Strong evidence of colocalization; PP H4 > .8
NPC1 | 1350 | 0.00 | 0.00 | 0.00 | 0.12 | 0.88 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.997 | Strong evidence of colocalization; PP H4 > .8
ZNHIT3 | 4330 | 0.00 | 0.00 | 0.00 | 0.12 | 0.87 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.993 | Strong evidence of colocalization; PP H4 > .8
INO80E | 357 | 0.00 | 0.00 | 0.00 | 0.13 | 0.87 | BMI | Yengo et al. (Hum Mol Genet 2018) | same lead variant | Strong evidence of colocalization; PP H4 > .8
IPO9 | 1660 | 0.00 | 0.00 | 0.00 | 0.14 | 0.86 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.971 | Strong evidence of colocalization; PP H4 > .8
CPSF4 | 853 | 0.00 | 0.00 | 0.00 | 0.16 | 0.84 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.982 | Strong evidence of colocalization; PP H4 > .8
PKD1 | 5580 | 0.00 | 0.00 | 0.00 | 0.16 | 0.84 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.884 | Strong evidence of colocalization; PP H4 > .8
MFAP2 | 5130 | 0.00 | 0.00 | 0.00 | 0.17 | 0.83 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.997 | Strong evidence of colocalization; PP H4 > .8
MTOR | 1160 | 0.00 | 0.00 | 0.00 | 0.17 | 0.83 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.989 | Strong evidence of colocalization; PP H4 > .8
AC097724.3 | 2030 | 0.00 | 0.00 | 0.00 | 0.19 | 0.81 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.873 | Strong evidence of colocalization; PP H4 > .8
C18orf8 | 1290 | 0.00 | 0.00 | 0.00 | 0.20 | 0.80 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.992 | Moderate evidence of colocalization; PP H4 > .5
RGS19 | 4070 | 0.00 | 0.00 | 0.00 | 0.21 | 0.79 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.905 | Moderate evidence of colocalization; PP H4 > .5
ABTB1 | 5190 | 0.00 | 0.00 | 0.00 | 0.22 | 0.78 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Moderate evidence of colocalization; PP H4 > .5
STAG3L1 | 392 | 0.00 | 0.00 | 0.00 | 0.22 | 0.78 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.830 | Moderate evidence of colocalization; PP H4 > .5
AP3S2 | 4580 | 0.00 | 0.00 | 0.00 | 0.23 | 0.78 | T2D | Mahajan A (biorxiv 2017) | 0.921 | Moderate evidence of colocalization; PP H4 > .5
C15orf38-AP3S2 | 4580 | 0.00 | 0.00 | 0.00 | 0.23 | 0.78 | T2D | Mahajan A (biorxiv 2017) | 0.939 | Moderate evidence of colocalization; PP H4 > .5
ZSWIM7 | 1160 | 0.00 | 0.00 | 0.00 | 0.24 | 0.77 | BMI | Yengo et al. (Hum Mol Genet 2018) | 1.000 | Moderate evidence of colocalization; PP H4 > .5
LITAF | 8230 | 0.00 | 0.00 | 0.00 | 0.24 | 0.76 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.920 | Moderate evidence of colocalization; PP H4 > .5
ABLIM3 | 5300 | 0.00 | 0.00 | 0.00 | 0.25 | 0.75 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.980 | Moderate evidence of colocalization; PP H4 > .5
ADORA2B | 1200 | 0.00 | 0.00 | 0.00 | 0.26 | 0.74 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.994 | Moderate evidence of colocalization; PP H4 > .5
PSMD3 | 1010 | 0.00 | 0.00 | 0.01 | 0.25 | 0.74 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.997 | Moderate evidence of colocalization; PP H4 > .5
TBX18 | 5430 | 0.00 | 0.04 | 0.00 | 0.22 | 0.74 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.998 | Moderate evidence of colocalization; PP H4 > .5
COL8A1 | 5030 | 0.00 | 0.00 | 0.00 | 0.31 | 0.69 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.985 | Moderate evidence of colocalization; PP H4 > .5
EYA2 | 4280 | 0.00 | 0.00 | 0.10 | 0.23 | 0.67 | T2D | Mahajan et al. (Nat. Genet. 2018) | 0.936 | Moderate evidence of colocalization; PP H4 > .5
EEF1G | 4630 | 0.00 | 0.01 | 0.00 | 0.33 | 0.66 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.952 | Moderate evidence of colocalization; PP H4 > .5
AC068039.4 | 1450 | 0.00 | 0.00 | 0.00 | 0.34 | 0.66 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.896 | Moderate evidence of colocalization; PP H4 > .5
GPC6 | 5850 | 0.00 | 0.00 | 0.00 | 0.38 | 0.62 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.964 | Moderate evidence of colocalization; PP H4 > .5
MMP16 | 1700 | 0.00 | 0.00 | 0.00 | 0.38 | 0.62 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.992 | Moderate evidence of colocalization; PP H4 > .5
SNRPC | 1380 | 0.00 | 0.00 | 0.00 | 0.39 | 0.61 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.976 | Moderate evidence of colocalization; PP H4 > .5
HIRIP3 | 357 | 0.00 | 0.00 | 0.00 | 0.39 | 0.61 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.867 | Moderate evidence of colocalization; PP H4 > .5
CRYAB | 4400 | 0.00 | 0.00 | 0.00 | 0.40 | 0.60 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.871 | Moderate evidence of colocalization; PP H4 > .5
ROM1 | 4610 | 0.00 | 0.00 | 0.00 | 0.41 | 0.59 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.842 | Moderate evidence of colocalization; PP H4 > .5
PLK1S1 | 4030 | 0.00 | 0.00 | 0.00 | 0.41 | 0.58 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.980 | Moderate evidence of colocalization; PP H4 > .5
FST | 6630 | 0.00 | 0.02 | 0.00 | 0.41 | 0.58 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.968 | Moderate evidence of colocalization; PP H4 > .5
ANKS6 | 1740 | 0.00 | 0.00 | 0.00 | 0.44 | 0.56 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.993 | Moderate evidence of colocalization; PP H4 > .5
ANAPC4 | 1640 | 0.00 | 0.00 | 0.00 | 0.45 | 0.55 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.921 | Moderate evidence of colocalization; PP H4 > .5
PRMT6 | 1860 | 0.00 | 0.00 | 0.00 | 0.46 | 0.54 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.991 | Moderate evidence of colocalization; PP H4 > .5
NELFE | 539 | 0.00 | 0.00 | 0.00 | 0.48 | 0.52 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.812 | Moderate evidence of colocalization; PP H4 > .5
CDK5RAP3 | 1220 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | same lead variant | Evidence that GWAS and eQTL signals differ; PP H3 > .8
SNX10 | 5690 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Evidence that GWAS and eQTL signals differ; PP H3 > .8
AAMP | 4340 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.802 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
TRIP4 | 1040 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.896 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
HEMK1 | 696 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.913 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
ZNF75A | 4880 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.977 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
AC013264.2 | 1260 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.918 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
NUDCD3 | 1450 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.964 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
HOXA11-AS | 5470 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.945 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP11-433P17.3 | 4950 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP11-3B7.1 | 645 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.915 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
ARIH2 | 598 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.997 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
C1orf213 | 4240 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Evidence that GWAS and eQTL signals differ; PP H3 > .8
CEBPA-AS1 | 6320 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.855 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
BRWD1-IT2 | 2330 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.833 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
PSMG1 | 2290 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.835 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
AP006621.6 | 5520 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.811 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
CCDC36 | 636 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.915 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
PIDD | 5590 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.990 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
DNASE1 | 1070 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.803 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP3-462E2.5 | 748 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.834 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
PRDX6 | 944 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 1.000 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
SCAPER | 1110 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.982 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
PACS1 | 701 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.995 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
AKTIP | 4330 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | T2D | Mahajan et al. (Nat. Genet. 2018) | 0.984 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP1-178F10.3 | 3000 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.818 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP11-755F10.1 | 631 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.885 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
NCKIPSD | 594 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.919 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
MANBA | 1460 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.818 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
DOCK3 | 666 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | same lead variant | Evidence that GWAS and eQTL signals differ; PP H3 > .8
YJEFN3 | 849 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.834 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
HLA-DRB6 | 2750 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.986 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
HOXA-AS4 | 5490 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.804 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RBL2 | 4180 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | T2D | Mahajan et al. (Nat. Genet. 2018) | 0.978 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP11-57H14.2 | 3550 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | T2D | Mahajan et al. (Nat. Genet. 2018) | 0.964 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
WDR6 | 607 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.997 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
NDUFAF1 | 1210 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 1.000 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
CBX3 | 5810 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP11-804H8.6 | 668 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.950 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
PIGC | 4740 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.807 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
C11orf49 | 847 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.999 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP11-493E12.2 | 1270 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.907 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
TSPY26P | 1170 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.971 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
MAST2 | 1010 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.904 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
MRPS7 | 5340 | 0.00 | 0.00 | 0.00 | 0.99 | 0.01 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.833 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
MAST4-AS1 | 5500 | 0.00 | 0.00 | 0.00 | 0.99 | 0.01 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.886 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
ATXN2L | 555 | 0.00 | 0.00 | 0.00 | 0.99 | 0.01 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.933 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
ELP6 | 2920 | 0.00 | 0.00 | 0.00 | 0.99 | 0.01 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.868 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
EIF3CL | 748 | 0.00 | 0.00 | 0.00 | 0.99 | 0.01 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.830 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP11-380L11.4 | 5180 | 0.00 | 0.02 | 0.00 | 0.98 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.946 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
C11orf52 | 4400 | 0.00 | 0.00 | 0.00 | 0.98 | 0.02 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.875 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
NET1 | 7850 | 0.00 | 0.01 | 0.00 | 0.97 | 0.02 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.975 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
PPIP5K2 | 5050 | 0.00 | 0.00 | 0.00 | 0.97 | 0.03 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.805 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
LAMB1 | 4960 | 0.00 | 0.00 | 0.00 | 0.97 | 0.03 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.924 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
CCDC144B | 2710 | 0.00 | 0.04 | 0.00 | 0.96 | 0.00 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 1.000 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP3-473L9.4 | 580 | 0.00 | 0.04 | 0.00 | 0.96 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.971 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
LMOD1 | 1660 | 0.00 | 0.00 | 0.00 | 0.95 | 0.05 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.987 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP11-867G23.1 | 627 | 0.00 | 0.08 | 0.00 | 0.92 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.980 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
PABPC4 | 3740 | 0.00 | 0.00 | 0.00 | 0.91 | 0.09 | T2D | Mahajan et al. (Nat. Genet. 2018) | same lead variant | Evidence that GWAS and eQTL signals differ; PP H3 > .8
SH2B1 | 526 | 0.00 | 0.00 | 0.00 | 0.90 | 0.11 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.930 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
FHOD1 | 694 | 0.00 | 0.00 | 0.12 | 0.88 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.932 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
MARCH5 | 4150 | 0.00 | 0.00 | 0.00 | 0.88 | 0.12 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.999 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
AC003090.1 | 6220 | 0.00 | 0.00 | 0.00 | 0.87 | 0.13 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.827 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
NRBF2 | 5540 | 0.00 | 0.00 | 0.00 | 0.87 | 0.14 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.833 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP5-899E9.1 | 1330 | 0.00 | 0.13 | 0.00 | 0.87 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | same lead variant | Evidence that GWAS and eQTL signals differ; PP H3 > .8
SETD2 | 3450 | 0.00 | 0.00 | 0.00 | 0.84 | 0.16 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.870 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
CTC-277H1.7 | 689 | 0.01 | 0.08 | 0.11 | 0.81 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.952 | Evidence that GWAS and eQTL signals differ; PP H3 > .8
RP11-95O2.5 | 4530 | 0.00 | 0.00 | 0.00 | 0.80 | 0.20 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.959 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
LMX1B | 1520 | 0.00 | 0.21 | 0.00 | 0.79 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.956 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
AC017074.2 | 4710 | 0.00 | 0.01 | 0.00 | 0.72 | 0.27 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.983 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
EIF4G2 | 6990 | 0.00 | 0.22 | 0.00 | 0.72 | 0.07 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | same lead variant | Evidence that GWAS and eQTL signals differ; PP H3 > .5
CCDC12 | 3880 | 0.00 | 0.00 | 0.00 | 0.72 | 0.28 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.912 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
IRS1 | 4620 | 0.00 | 0.00 | 0.00 | 0.72 | 0.28 | T2D | Mahajan et al. (Nat. Genet. 2018) | 0.925 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
TMEM60 | 6050 | 0.00 | 0.00 | 0.00 | 0.69 | 0.31 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.811 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
RP11-690D19.3 | 6430 | 0.00 | 0.00 | 0.00 | 0.66 | 0.34 | WHR | Pulit et al. (Hum Mol Genet 2019) | 0.983 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
FBXL18 | 961 | 0.00 | 0.00 | 0.00 | 0.65 | 0.35 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.977 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
UHRF1BP1 | 1330 | 0.00 | 0.00 | 0.00 | 0.65 | 0.35 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.993 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
PLCL1 | 1220 | 0.00 | 0.01 | 0.00 | 0.60 | 0.38 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.935 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
PHC2 | 1540 | 0.00 | 0.00 | 0.00 | 0.58 | 0.42 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.903 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
RP11-339B21.15 | 1110 | 0.00 | 0.00 | 0.00 | 0.55 | 0.45 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.822 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
SH3PXD2B | 5660 | 0.00 | 0.00 | 0.00 | 0.55 | 0.45 | WHRadjBMI | Pulit et al. (Hum Mol Genet 2019) | 0.991 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
FARSA | 3220 | 0.00 | 0.00 | 0.01 | 0.54 | 0.46 | T2D | Mahajan et al. (Nat. Genet. 2018) | 0.962 | Evidence that GWAS and eQTL signals differ; PP H3 > .5
PDE3A | 2239 | 0.02 | 0.19 | 0.05 | 0.46 | 0.28 | BMI | Yengo et al. (Hum Mol Genet 2018) | same lead variant | No coloc2 hypothesis with posterior probability > .5
HOXB7 | 1270 | 0.00 | 0.72 | 0.00 | 0.28 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | same lead variant | Evidence that the eQTL signal is not sufficiently strong
CCND2 | 4710 | 0.00 | 0.73 | 0.00 | 0.26 | 0.00 | T2D | Mahajan et al. (Nat. Genet. 2018) | same lead variant | Evidence that the eQTL signal is not sufficiently strong
RP11-53O19.3 | 2490 | 0.00 | 0.00 | 0.92 | 0.08 | 0.00 | T2D | Mahajan et al. (Nat. Genet. 2018) | 0.908 | Evidence that the GWAS signal is not sufficiently strong
XXbac-BPGBPG55C20.1 | 433 | 0.00 | 0.00 | 0.95 | 0.05 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.989 | Evidence that the GWAS signal is not sufficiently strong
LINC00680 | 400 | 0.00 | 0.00 | 0.95 | 0.05 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.998 | Evidence that the GWAS signal is not sufficiently strong
XXbac-BPGBPG55C20.2 | 400 | 0.00 | 0.00 | 0.95 | 0.05 | 0.00 | BMI | Yengo et al. (Hum Mol Genet 2018) | 0.990 | Evidence that the GWAS signal is not sufficiently strong
RFT1 | 3240 | 0.00 | 0.00 | 0.97 | 0.03 | 0.00 | T2D | Mahajan et al. (Nat. Genet. 2018) | 0.995 | Evidence that the GWAS signal is not sufficiently strong
SFMBT1 | 3140 | 0.00 | 0.00 | 0.97 | 0.02 | 0.00 | T2D | Mahajan et al. (Nat. Genet. 2018) | 0.862 | Evidence that the GWAS signal is not sufficiently strong
HSD17B12 | 4970 | 0.00 | 0.00 | 0.97 | 0.02 | 0.01 | T2D | Mahajan et al. (Nat. Genet. 2018) | same lead variant | Evidence that the GWAS signal is not sufficiently strong

Table 2.13: Coloc2 summaries of 173 genes found to be colocalized by conditional analysis. Comparison of colocalization of GWAS and eQTL signals based on LD and conditional analysis to coloc. This comparison was limited to GWAS publications reporting more than 10 eGenes found to be colocalized based on LD and conditional analysis. Coloc2 hypotheses are: H0: neither trait has a genetic association in the region, H1: only trait 1 (GWAS) has a genetic association in the region, H2: only trait 2 (eQTL) has a genetic association in the region, H3: both traits are associated, but with different causal variants, and H4: both traits are associated and share a single causal variant. We provide interpretations based on the recommendations of the coloc2 authors, where a posterior probability >0.8 is considered strong evidence of that hypothesis.
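The interpretation column of Table 2.13 can be read as a set of threshold rules on the five posterior probabilities. The sketch below is one possible encoding, not the procedure actually used to build the table: the function name `interpret_coloc`, the order in which hypotheses are checked, and the 0.5 cutoff applied to H1 and H2 are assumptions inferred from the rows shown; only the 0.8 and 0.5 labels for H3 and H4 appear in the table itself.

```python
from typing import Sequence


def interpret_coloc(pp: Sequence[float]) -> str:
    """Map posterior probabilities (H0..H4) to the labels used in Table 2.13.

    The check order and the 0.5 cutoff for H1/H2 are assumptions for illustration.
    """
    if len(pp) != 5:
        raise ValueError("expected five posterior probabilities, H0 through H4")
    _h0, h1, h2, h3, h4 = pp
    if h4 > 0.8:
        return "Strong evidence of colocalization; PP H4 > .8"
    if h4 > 0.5:
        return "Moderate evidence of colocalization; PP H4 > .5"
    if h3 > 0.8:
        return "Evidence that GWAS and eQTL signals differ; PP H3 > .8"
    if h3 > 0.5:
        return "Evidence that GWAS and eQTL signals differ; PP H3 > .5"
    if h1 > 0.5:
        # H1: only the GWAS trait is associated, so the eQTL signal is too weak
        return "Evidence that the eQTL signal is not sufficiently strong"
    if h2 > 0.5:
        # H2: only the eQTL is associated, so the GWAS signal is too weak
        return "Evidence that the GWAS signal is not sufficiently strong"
    return "No coloc2 hypothesis with posterior probability > .5"


# Rows taken from Table 2.13 (values as printed, rounded to two decimals)
label_c18orf8 = interpret_coloc([0.00, 0.00, 0.00, 0.20, 0.80])
label_pde3a = interpret_coloc([0.02, 0.19, 0.05, 0.46, 0.28])
```

Because the comparisons are strict, a printed PP H4 of 0.80 (C18orf8) is labeled moderate rather than strong, which agrees with the table.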

GWAS rsID | Risk Allele | eQTL gene | GWAS variant Pinitial | GWAS variant βinitial | eSNP rsID | Eff Allele | eSNP Pinitial | eSNP βinitial | Pcond | βcond | METSIM LD r2 | East Asian 1KG LD r2 | Interpretation of colocalization
rs860295 | G | DAP3 | 8.00E-14 | 0.51 | rs860295 | G | 8.00E-14 | 0.51 | 9.96E-01 | 2.58E+11 | - | - | Meets colocalization criteria
rs709400 | G | AL049840.1 | 5.09E-32 | 0.78 | rs55885592 | T | 5.03E-32 | 0.78 | 3.22E-01 | 6.43E+01 | 1.00 | 1.00 | Meets colocalization criteria
rs709400 | G | RP11-73M18.9 | 1.89E-50 | 0.94 | rs7359090 | G | 1.65E-50 | 0.94 | 5.67E-01 | 1.73E+00 | 1.00 | 0.93 | Meets colocalization criteria
rs11602339 | T | MTCH2 | 5.82E-31 | 0.76 | rs4752856 | A | 1.11E-32 | 0.79 | 4.79E-03 | 7.09E-01 | 0.95 | 0.92 | Meets colocalization criteria
rs62034325 | G | SULT1A2 | 1.03E-27 | 0.67 | rs28698667 | T | 9.58E-30 | 0.69 | 1.61E-03 | 9.28E-01 | 0.93 | 0.22 | Meets colocalization criteria
rs12680842 | G | RP11-267M23.4 | 5.81E-13 | 0.50 | rs12543207 | T | 4.37E-15 | 0.54 | 1.79E-03 | 4.83E-01 | 0.85 | 0.97 | Meets colocalization criteria
rs709400 | G | TRMT61A | 7.08E-13 | 0.50 | rs57447494 | A | 3.16E-15 | 0.55 | 1.11E-03 | 5.75E-01 | 0.84 | 0.80 | Meets colocalization criteria
rs709400 | G | APOPT1 | 1.40E-09 | 0.42 | rs2274269 | A | 7.58E-13 | 0.50 | 4.67E-05 | 7.27E-01 | 0.84 | 0.80 | Meets colocalization criteria
rs17826219 | A | AC005562.1 | 1.85E-07 | -0.66 | rs216403 | A | 8.46E-09 | -0.74 | 1.45E-02 | -7.61E-01 | 0.81 | 0.64 | Meets colocalization criteria
rs860295 | G | RIT1 | 1.83E-07 | 0.36 | rs11264369 | C | 2.66E-09 | 0.42 | 4.07E-03 | 3.92E-01 | 0.76 | 0.81 | Does not meet colocalization criteria
rs11602339 | T | MYBPC3 | 1.69E-09 | -0.42 | rs11039199 | C | 3.08E-13 | -0.53 | 4.09E-05 | -5.43E-01 | 0.75 | 0.61 | Does not meet colocalization criteria
rs2270204 | G | RP11-339B21.15 | 1.00E-10 | -0.49 | rs7847624 | G | 9.53E-14 | -0.60 | 1.72E-04 | -5.19E-01 | 0.74 | 0.34 | Does not meet colocalization criteria
rs17826219 | A | CRLF3 | 1.23E-07 | -0.67 | rs122899 | G | 3.92E-09 | 0.71 | 7.87E-03 | 5.98E-01 | 0.69 | 0.63 | Does not meet colocalization criteria
rs11602339 | T | PSMC3 | 1.66E-08 | 0.39 | rs7928419 | A | 4.00E-11 | -0.46 | 5.18E-04 | -4.09E-01 | 0.69 | 0.71 | Does not meet colocalization criteria
rs1749405 | A | DAP3 | 1.64E-10 | 0.44 | rs860295 | G | 8.00E-14 | 0.51 | 1.05E-04 | 4.73E-01 | 0.66 | 0.37 | Does not meet colocalization criteria

Table 2.14: Comparison of LD for loci with strongest GWAS signals for BMI identified in East Asians. This table shows cis-eQTL results for

which the GWAS-eQTL lead variant EUR LD r2 > 0.6 and the strongest GWAS p-value came from a study of East Asians (Akiyama M et al. Nat

Genet 2017). The results are sorted by GWAS-eQTL lead variant LD (r2) in all METSIM (~10k Finnish men) imputed to the Haplotype Reference

Consortium (HRC), then by the initial eQTL p-value of the GWAS variant. We matched the eQTL allele to the GWAS risk allele. The “Interpretation

of colocalization” column shows the interpretation of colocalization based on METSIM LD and conditional analysis criteria. LD r2 in East Asians

presented from 1000 Genomes, phase 3 (1KG) and LD r2 in Finns presented from METSIM 10K HRC imputation. Colocalization is defined as LD

r2 > 0.8 in Finns and PcondGWAS > 9.6 x 10^-6. An extended version of this table is available in Raulerson CK et al. (AJHG 2019).
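
The colocalization rule stated in the caption of Table 2.14 (LD r2 > 0.8 in Finns and PcondGWAS > 9.6 x 10^-6) can be illustrated with a minimal Python sketch. The function and argument names are hypothetical and are not taken from the analysis code; only the two thresholds and the example values (the AC005562.1 and RIT1 rows) come from the table.

```python
def meets_colocalization(r2_finns, p_cond, r2_min=0.8, p_cond_min=9.6e-6):
    """Apply the two caption criteria to one GWAS-eQTL pair.

    r2_finns: LD r2 between the GWAS and eQTL lead variants in Finns (METSIM).
    p_cond:   eQTL p-value of the lead eSNP after conditioning on the GWAS variant.
    Both names are illustrative assumptions.
    """
    return r2_finns > r2_min and p_cond > p_cond_min


# Two rows copied from Table 2.14: (gene, METSIM LD r2, Pcond)
rows = [
    ("AC005562.1", 0.81, 1.45e-2),
    ("RIT1", 0.76, 4.07e-3),
]
for gene, r2, p_cond in rows:
    label = "Meets" if meets_colocalization(r2, p_cond) else "Does not meet"
    print(gene, label, "colocalization criteria")
```

A pair with high LD can still fail if the eQTL signal remains significant after conditioning on the GWAS variant, which is why both conditions are checked together.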

Gene | BMI P | BMI β | BMI FDR | Fasting Free Fatty Acids P | Fasting Free Fatty Acids β | Fasting Free Fatty Acids FDR | Matsuda Index P | Matsuda Index β | Matsuda Index FDR
AOC1 | 4.1E-02 | 9.8E-02 | 9.7E-02 | 2.6E-01 | -5.4E-02 | 3.1E-01 | 2.3E-02 | -1.1E-01 | 6.5E-02
RBM6 | 3.9E-01 | -4.1E-02 | 3.9E-01 | 8.4E-01 | 9.6E-03 | 5.7E-01 | 2.0E-01 | -6.2E-02 | 2.7E-01
PKD1 | 3.7E-03 | -1.4E-01 | 1.8E-02 | 7.2E-02 | -8.6E-02 | 1.4E-01 | 1.5E-03 | 1.5E-01 | 8.5E-03
MED24 | 3.3E-02 | 1.0E-01 | 8.4E-02 | 2.1E-01 | 6.0E-02 | 2.8E-01 | 2.3E-03 | -1.5E-01 | 1.2E-02
MLXIPL | 4.8E-01 | 3.4E-02 | 4.3E-01 | 9.1E-01 | 5.2E-03 | 5.9E-01 | 9.7E-01 | 1.9E-03 | 6.1E-01
STAB1 | 1.6E-04 | -1.8E-01 | 1.4E-03 | 5.9E-01 | 2.6E-02 | 4.8E-01 | 6.3E-03 | 1.3E-01 | 2.6E-02
NPC1L1 | 4.5E-02 | -9.6E-02 | 1.0E-01 | 9.1E-01 | 5.2E-03 | 5.9E-01 | 2.4E-01 | 5.7E-02 | 3.0E-01
NUDCD3 | 8.0E-03 | 1.3E-01 | 3.1E-02 | 4.3E-01 | -3.8E-02 | 4.1E-01 | 7.7E-02 | -8.5E-02 | 1.5E-01
SLC7A9 | 1.1E-01 | 7.7E-02 | 1.9E-01 | 1.1E-01 | 7.8E-02 | 1.8E-01 | 5.5E-01 | -2.8E-02 | 4.7E-01
ANK1 | 1.6E-02 | 1.2E-01 | 5.0E-02 | 3.4E-01 | 4.6E-02 | 3.6E-01 | 8.2E-03 | -1.3E-01 | 3.1E-02
RCN1 | 4.2E-19 | 4.1E-01 | 2.8E-16 | 1.6E-01 | 6.8E-02 | 2.3E-01 | 6.8E-25 | -4.7E-01 | 2.8E-21
ANAPC4 | 9.6E-09 | 2.7E-01 | 3.2E-07 | 5.7E-03 | 1.3E-01 | 2.4E-02 | 5.1E-14 | -3.5E-01 | 1.0E-11
ATP2B4 | 5.4E-01 | 2.9E-02 | 4.6E-01 | 9.7E-01 | -1.8E-03 | 6.1E-01 | 5.0E-01 | -3.3E-02 | 4.4E-01
PIGV | 2.8E-08 | 2.6E-01 | 8.1E-07 | 6.0E-02 | 9.0E-02 | 1.2E-01 | 4.1E-06 | -2.2E-01 | 6.2E-05
EYA2 | 5.8E-02 | -9.1E-02 | 1.2E-01 | 9.5E-01 | -3.1E-03 | 6.0E-01 | 1.3E-01 | 7.4E-02 | 2.0E-01
NTHL1 | 5.0E-01 | 3.3E-02 | 4.4E-01 | 8.0E-01 | 1.2E-02 | 5.6E-01 | 9.5E-01 | -3.2E-03 | 6.0E-01
UHRF1BP1 | 1.6E-02 | 1.2E-01 | 5.1E-02 | 1.1E-01 | 7.7E-02 | 1.8E-01 | 1.6E-01 | -6.8E-02 | 2.4E-01
APOB | 5.4E-01 | -2.9E-02 | 4.6E-01 | 8.1E-01 | 1.2E-02 | 5.6E-01 | 6.9E-01 | -1.9E-02 | 5.2E-01
MAST2 | 9.2E-04 | 1.6E-01 | 5.9E-03 | 1.9E-02 | 1.1E-01 | 5.7E-02 | 6.9E-03 | -1.3E-01 | 2.8E-02
SNX10 | 5.6E-01 | 2.8E-02 | 4.7E-01 | 3.1E-01 | 4.9E-02 | 3.4E-01 | 3.8E-01 | -4.3E-02 | 3.8E-01
DOCK3 | 1.4E-02 | 1.2E-01 | 4.7E-02 | 5.7E-01 | 2.7E-02 | 4.8E-01 | 1.1E-01 | -7.7E-02 | 1.9E-01
PLK1S1 | 1.1E-08 | 2.7E-01 | 3.6E-07 | 8.4E-01 | 9.9E-03 | 5.7E-01 | 3.7E-10 | -3.0E-01 | 1.7E-08
PABPC4 | 2.1E-14 | 3.6E-01 | 4.9E-12 | 8.8E-02 | 8.2E-02 | 1.6E-01 | 4.0E-15 | -3.7E-01 | 1.2E-12
CERS4 | 3.3E-04 | 1.7E-01 | 2.5E-03 | 3.3E-01 | 4.7E-02 | 3.6E-01 | 2.8E-06 | -2.2E-01 | 4.6E-05
LAMB1 | 1.7E-02 | -1.1E-01 | 5.3E-02 | 4.4E-01 | -3.7E-02 | 4.2E-01 | 1.8E-03 | 1.5E-01 | 1.0E-02
CWF19L1 | 3.0E-04 | -1.7E-01 | 2.3E-03 | 8.4E-01 | -9.8E-03 | 5.7E-01 | 9.5E-04 | 1.6E-01 | 6.0E-03
SCD | 5.5E-02 | 9.2E-02 | 1.2E-01 | 8.4E-01 | -9.6E-03 | 5.7E-01 | 5.5E-03 | -1.3E-01 | 2.4E-02
PLTP | 8.7E-01 | 7.7E-03 | 5.8E-01 | 3.2E-01 | -4.8E-02 | 3.5E-01 | 1.4E-01 | -7.1E-02 | 2.2E-01
PROCR | 2.6E-11 | 3.1E-01 | 1.8E-09 | 5.2E-01 | 3.1E-02 | 4.5E-01 | 1.6E-10 | -3.0E-01 | 9.0E-09
POLR2C | 2.4E-07 | 2.4E-01 | 5.3E-06 | 7.0E-01 | 1.9E-02 | 5.3E-01 | 1.2E-04 | -1.8E-01 | 1.1E-03
PLA2G15 | 2.3E-06 | -2.2E-01 | 3.9E-05 | 6.9E-01 | -1.9E-02 | 5.2E-01 | 1.1E-04 | 1.8E-01 | 1.1E-03
RBL2 | 7.1E-01 | 1.8E-02 | 5.3E-01 | 3.1E-01 | 4.9E-02 | 3.4E-01 | 2.5E-01 | -5.6E-02 | 3.0E-01
KAT8 | 5.6E-11 | 3.1E-01 | 3.5E-09 | 7.9E-01 | -1.3E-02 | 5.6E-01 | 1.2E-12 | -3.3E-01 | 1.5E-10
LACTB | 9.4E-01 | -3.4E-03 | 6.0E-01 | 2.7E-01 | -5.3E-02 | 3.2E-01 | 7.5E-01 | -1.5E-02 | 5.5E-01
TRIP4 | 7.4E-02 | -8.6E-02 | 1.4E-01 | 4.6E-01 | -3.6E-02 | 4.2E-01 | 4.4E-02 | 9.7E-02 | 1.0E-01
EYA1 | 8.3E-01 | 1.0E-02 | 5.7E-01 | 6.7E-01 | 2.0E-02 | 5.2E-01 | 8.4E-01 | -9.5E-03 | 5.7E-01
CPED1 | 2.7E-09 | -2.8E-01 | 1.1E-07 | 2.0E-01 | -6.1E-02 | 2.8E-01 | 1.0E-11 | 3.2E-01 | 8.8E-10
BCL7B | 7.6E-18 | 4.0E-01 | 3.4E-15 | 9.9E-02 | 7.9E-02 | 1.7E-01 | 3.8E-20 | -4.2E-01 | 3.1E-17
AKNA | 4.3E-01 | 3.8E-02 | 4.1E-01 | 4.4E-02 | 9.7E-02 | 1.0E-01 | 1.4E-02 | -1.2E-01 | 4.7E-02
LIPA | 2.8E-01 | 5.1E-02 | 3.3E-01 | 2.8E-01 | 5.2E-02 | 3.2E-01 | 8.7E-01 | 7.8E-03 | 5.8E-01
OBFC1 | 6.7E-12 | 3.2E-01 | 6.7E-10 | 1.4E-01 | 7.2E-02 | 2.2E-01 | 2.9E-08 | -2.6E-01 | 8.3E-07
DHRS11 | 9.3E-04 | -1.6E-01 | 5.9E-03 | 2.2E-01 | -5.9E-02 | 2.9E-01 | 4.6E-05 | 1.9E-01 | 5.0E-04
ZNHIT3 | 1.1E-01 | -7.6E-02 | 1.9E-01 | 4.9E-01 | 3.4E-02 | 4.4E-01 | 2.7E-01 | 5.3E-02 | 3.2E-01
PSMD3 | 3.4E-02 | 1.0E-01 | 8.4E-02 | 9.4E-01 | 3.5E-03 | 6.0E-01 | 7.2E-02 | -8.7E-02 | 1.4E-01
CDK5RAP3 | 2.2E-01 | 5.9E-02 | 2.8E-01 | 2.7E-01 | 5.3E-02 | 3.2E-01 | 3.5E-01 | -4.5E-02 | 3.7E-01
MANBA | 2.2E-01 | -5.9E-02 | 2.9E-01 | 3.9E-01 | 4.1E-02 | 3.9E-01 | 3.1E-01 | 4.9E-02 | 3.4E-01
WFS1 | 5.4E-06 | -2.2E-01 | 7.9E-05 | 9.7E-01 | 1.6E-03 | 6.1E-01 | 1.2E-07 | 2.5E-01 | 2.9E-06
CRYAB | 2.2E-04 | 1.8E-01 | 1.8E-03 | 3.7E-01 | 4.3E-02 | 3.8E-01 | 3.4E-05 | -2.0E-01 | 4.0E-04
MTCH2 | 9.2E-01 | 4.7E-03 | 5.9E-01 | 4.8E-01 | 3.4E-02 | 4.3E-01 | 8.3E-01 | 1.0E-02 | 5.7E-01
EIF4G2 | 7.7E-02 | 8.5E-02 | 1.5E-01 | 2.2E-05 | 2.0E-01 | 2.7E-04 | 1.4E-04 | -1.8E-01 | 1.2E-03

Table 2.15: Gene expression - trait associations (unadjusted for BMI) for genes colocalized by LD. This table shows selected results from

the gene expression – clinical trait associations. Traits were adjusted for age, and genes were adjusted for TIN, sequencing batch, and age. Both traits

and genes were inverse normal transformed prior to linear regression. FDR was calculated using the Storey method. An extended version of this

table including all 287 genes and 20 traits is available in Raulerson CK et al. (AJHG 2019).
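
The transformation step described in this caption (inverse normal transformation of both trait and gene expression before linear regression) can be sketched as follows. This is an illustrative example only: the rank offset of 0.5, the function names, and the toy values are assumptions, and the covariate adjustment and Storey FDR steps are not shown.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Rank-based inverse normal transform; tied values share their average rank.

    The offset of 0.5 is an assumption; the source does not state which was used.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


def ols_slope(x, y):
    """Least-squares slope of y on x (simple regression, no covariates)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical expression and trait values for five individuals
expression = [5.1, 7.3, 6.2, 9.8, 4.4]
trait = [22.0, 27.5, 24.1, 31.0, 23.3]
beta = ols_slope(inverse_normal_transform(expression),
                 inverse_normal_transform(trait))
print(round(beta, 3))
```

Because both variables are placed on the same normal scale, the resulting β values are comparable across genes and traits, as in the table above.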


Gene | BMI P | BMI β | BMI FDR | Fasting Free Fatty Acids P | Fasting Free Fatty Acids β | Fasting Free Fatty Acids FDR | Matsuda Index P | Matsuda Index β | Matsuda Index FDR
AOC1 | 8.8E-01 | 7.6E-03 | 8.6E-01 | 1.7E-01 | -6.6E-02 | 5.6E-01 | 8.1E-02 | -8.4E-02 | 4.4E-01
RBM6 | 8.2E-01 | 1.1E-02 | 8.5E-01 | 6.7E-01 | 2.0E-02 | 8.1E-01 | 5.7E-02 | -9.1E-02 | 3.8E-01
PKD1 | 5.1E-01 | 3.1E-02 | 7.7E-01 | 1.9E-01 | -6.3E-02 | 5.8E-01 | 9.6E-03 | 1.2E-01 | 1.8E-01
MED24 | 8.7E-01 | -8.2E-03 | 8.5E-01 | 3.5E-01 | 4.5E-02 | 6.9E-01 | 2.5E-02 | -1.1E-01 | 2.8E-01
MLXIPL | 9.1E-01 | -5.2E-03 | 8.6E-01 | 9.6E-01 | 2.4E-03 | 8.7E-01 | 5.3E-01 | 3.0E-02 | 7.8E-01
STAB1 | 8.8E-01 | -7.1E-03 | 8.6E-01 | 2.5E-01 | 5.5E-02 | 6.3E-01 | 4.4E-01 | 3.7E-02 | 7.4E-01
NPC1L1 | 8.3E-01 | -1.1E-02 | 8.5E-01 | 6.7E-01 | 2.0E-02 | 8.1E-01 | 8.1E-01 | 1.2E-02 | 8.5E-01
NUDCD3 | 7.9E-01 | 1.3E-02 | 8.4E-01 | 2.3E-01 | -5.8E-02 | 6.1E-01 | 4.4E-01 | -3.7E-02 | 7.4E-01
SLC7A9 | 9.7E-01 | 1.5E-03 | 8.7E-01 | 1.5E-01 | 6.9E-02 | 5.3E-01 | 8.6E-01 | 8.2E-03 | 8.5E-01
ANK1 | 6.0E-01 | 2.5E-02 | 7.9E-01 | 5.2E-01 | 3.1E-02 | 7.7E-01 | 3.1E-02 | -1.0E-01 | 3.0E-01
RCN1 | 4.2E-01 | 3.8E-02 | 7.3E-01 | 8.5E-01 | 8.9E-03 | 8.5E-01 | 1.6E-13 | -3.4E-01 | 3.1E-10
ANAPC4 | 8.0E-01 | 1.2E-02 | 8.4E-01 | 2.9E-02 | 1.0E-01 | 3.0E-01 | 3.4E-08 | -2.6E-01 | 3.9E-05
ATP2B4 | 9.0E-01 | -5.9E-03 | 8.6E-01 | 9.3E-01 | -4.4E-03 | 8.6E-01 | 8.6E-01 | -8.7E-03 | 8.5E-01
PIGV | 9.7E-01 | -1.6E-03 | 8.7E-01 | 2.7E-01 | 5.3E-02 | 6.4E-01 | 5.2E-02 | -9.3E-02 | 3.7E-01
EYA2 | 7.7E-01 | -1.4E-02 | 8.3E-01 | 8.2E-01 | 1.1E-02 | 8.5E-01 | 3.2E-01 | 4.8E-02 | 6.8E-01
NTHL1 | 8.4E-01 | -9.5E-03 | 8.5E-01 | 8.6E-01 | 8.6E-03 | 8.5E-01 | 4.4E-01 | 3.7E-02 | 7.4E-01
UHRF1BP1 | 7.3E-01 | -1.7E-02 | 8.2E-01 | 2.0E-01 | 6.2E-02 | 5.8E-01 | 8.5E-01 | -9.3E-03 | 8.5E-01
APOB | 8.8E-01 | 7.0E-03 | 8.6E-01 | 7.5E-01 | 1.5E-02 | 8.3E-01 | 5.5E-01 | -2.9E-02 | 7.8E-01
MAST2 | 8.4E-01 | 9.5E-03 | 8.5E-01 | 4.4E-02 | 9.7E-02 | 3.4E-01 | 1.2E-01 | -7.5E-02 | 4.9E-01
SNX10 | 7.1E-01 | -1.8E-02 | 8.2E-01 | 3.0E-01 | 5.0E-02 | 6.6E-01 | 8.0E-01 | -1.3E-02 | 8.4E-01
DOCK3 | 7.7E-01 | 1.4E-02 | 8.4E-01 | 8.1E-01 | 1.2E-02 | 8.5E-01 | 4.3E-01 | -3.8E-02 | 7.3E-01
PLK1S1 | 7.1E-01 | 1.8E-02 | 8.2E-01 | 4.7E-01 | -3.5E-02 | 7.5E-01 | 1.0E-05 | -2.1E-01 | 2.3E-03
PABPC4 | 6.5E-01 | 2.2E-02 | 8.1E-01 | 4.9E-01 | 3.4E-02 | 7.6E-01 | 4.0E-07 | -2.4E-01 | 1.9E-04
CERS4 | 8.2E-01 | 1.1E-02 | 8.5E-01 | 6.4E-01 | 2.3E-02 | 8.0E-01 | 6.1E-04 | -1.6E-01 | 3.7E-02
LAMB1 | 8.2E-01 | 1.1E-02 | 8.5E-01 | 7.0E-01 | -1.8E-02 | 8.2E-01 | 7.1E-03 | 1.3E-01 | 1.5E-01
CWF19L1 | 7.3E-01 | -1.6E-02 | 8.2E-01 | 7.4E-01 | 1.6E-02 | 8.2E-01 | 1.1E-01 | 7.8E-02 | 4.7E-01
SCD | 9.7E-01 | -1.9E-03 | 8.7E-01 | 5.7E-01 | -2.7E-02 | 7.8E-01 | 3.2E-02 | -1.0E-01 | 3.0E-01
PLTP | 8.3E-01 | -1.0E-02 | 8.5E-01 | 3.1E-01 | -4.9E-02 | 6.7E-01 | 1.1E-01 | -7.6E-02 | 4.8E-01
PROCR | 7.8E-01 | -1.3E-02 | 8.4E-01 | 7.3E-01 | -1.7E-02 | 8.2E-01 | 8.8E-05 | -1.9E-01 | 1.1E-02
POLR2C | 5.5E-01 | 2.9E-02 | 7.8E-01 | 7.5E-01 | -1.5E-02 | 8.3E-01 | 3.1E-02 | -1.0E-01 | 3.0E-01
PLA2G15 | 8.0E-01 | -1.2E-02 | 8.4E-01 | 8.0E-01 | 1.2E-02 | 8.5E-01 | 1.5E-02 | 1.2E-01 | 2.2E-01
RBL2 | 4.8E-01 | -3.4E-02 | 7.5E-01 | 4.2E-01 | 3.9E-02 | 7.3E-01 | 6.3E-01 | -2.3E-02 | 8.0E-01
KAT8 | 5.8E-01 | 2.6E-02 | 7.9E-01 | 1.7E-01 | -6.6E-02 | 5.6E-01 | 2.2E-06 | -2.2E-01 | 7.9E-04
LACTB | 9.1E-01 | 5.7E-03 | 8.6E-01 | 2.9E-01 | -5.1E-02 | 6.5E-01 | 4.2E-01 | -3.9E-02 | 7.3E-01
TRIP4 | 7.4E-01 | -1.6E-02 | 8.3E-01 | 6.3E-01 | -2.3E-02 | 8.0E-01 | 5.9E-02 | 9.1E-02 | 3.8E-01
EYA1 | 5.0E-01 | -3.2E-02 | 7.6E-01 | 6.9E-01 | 1.9E-02 | 8.1E-01 | 8.0E-01 | 1.2E-02 | 8.4E-01
CPED1 | 6.1E-01 | -2.5E-02 | 7.9E-01 | 6.9E-01 | -1.9E-02 | 8.1E-01 | 5.8E-08 | 2.6E-01 | 5.6E-05
BCL7B | 5.7E-01 | 2.7E-02 | 7.8E-01 | 5.7E-01 | 2.7E-02 | 7.8E-01 | 8.7E-11 | -3.1E-01 | 1.3E-07
AKNA | 4.5E-01 | 3.7E-02 | 7.4E-01 | 4.3E-02 | 9.7E-02 | 3.4E-01 | 9.4E-03 | -1.2E-01 | 1.8E-01
LIPA | 9.7E-01 | -1.6E-03 | 8.7E-01 | 3.0E-01 | 5.0E-02 | 6.6E-01 | 5.7E-01 | 2.7E-02 | 7.8E-01
OBFC1 | 9.3E-01 | -4.0E-03 | 8.6E-01 | 4.3E-01 | 3.8E-02 | 7.3E-01 | 1.0E-02 | -1.2E-01 | 1.9E-01
DHRS11 | 2.7E-01 | -5.3E-02 | 6.4E-01 | 4.0E-01 | -4.1E-02 | 7.2E-01 | 4.9E-04 | 1.7E-01 | 3.2E-02
ZNHIT3 | 8.7E-01 | -8.0E-03 | 8.5E-01 | 2.8E-01 | 5.2E-02 | 6.5E-01 | 7.6E-01 | 1.5E-02 | 8.3E-01
PSMD3 | 9.0E-01 | -6.0E-03 | 8.6E-01 | 8.3E-01 | -1.0E-02 | 8.5E-01 | 4.8E-01 | -3.4E-02 | 7.5E-01
CDK5RAP3 | 6.0E-01 | -2.5E-02 | 7.9E-01 | 3.7E-01 | 4.3E-02 | 7.0E-01 | 7.5E-01 | -1.5E-02 | 8.3E-01
MANBA | 6.4E-01 | -2.2E-02 | 8.1E-01 | 2.8E-01 | 5.2E-02 | 6.5E-01 | 4.1E-01 | 4.0E-02 | 7.2E-01
WFS1 | 6.6E-01 | -2.1E-02 | 8.1E-01 | 5.1E-01 | 3.1E-02 | 7.7E-01 | 3.4E-06 | 2.2E-01 | 1.1E-03
CRYAB | 6.7E-01 | 2.0E-02 | 8.1E-01 | 7.4E-01 | 1.6E-02 | 8.2E-01 | 5.4E-03 | -1.3E-01 | 1.3E-01
MTCH2 | 7.3E-01 | -1.6E-02 | 8.2E-01 | 4.6E-01 | 3.6E-02 | 7.4E-01 | 7.5E-01 | 1.6E-02 | 8.3E-01
EIF4G2 | 4.3E-01 | 3.8E-02 | 7.3E-01 | 3.0E-05 | 2.0E-01 | 5.3E-03 | 7.2E-05 | -1.9E-01 | 1.0E-02

Table 2.16: Gene expression - trait associations (adjusted for BMI) for genes colocalized by LD. This table shows selected results from the

gene expression – clinical trait associations. Traits were adjusted for age and BMI, and genes were adjusted for TIN, sequencing batch, age, and BMI.

Both traits and genes were inverse normal transformed prior to linear regression. FDR was calculated using the Storey method. An extended

version of this table including all 287 genes and 20 traits is available in Raulerson CK et al. (AJHG 2019).


GWAS SNP | Gene | Trait Tested | Direct Effect Eff | Direct Effect SE | Direct Effect P | Total Effect Eff | Total Effect SE | Total Effect P | Mediation Eff | Mediation 95% CI Lower Limit | Mediation 95% CI Upper Limit | CI does not include 0
rs12051272 | CDH13 | Adiponectin | -1.75 | 0.36 | 1.3E-06 | -1.41 | 0.34 | 3.2E-05 | 0.34 | 0.09 | 0.63 | yes
rs11600990 | TEX40 | BMI | -0.87 | 0.36 | 1.5E-02 | -1.10 | 0.35 | 1.8E-03 | -0.23 | -0.44 | -0.07 | yes
rs7811265 | MLXIPL | Triglycerides | -0.10 | 0.09 | 2.5E-01 | -0.22 | 0.08 | 6.4E-03 | -0.12 | -0.21 | -0.04 | yes
rs1789882 | ADH1A | WHR | 0.00 | 0.00 | 8.6E-01 | -0.01 | 0.00 | 2.4E-02 | -0.01 | -0.02 | 0.00 | yes
rs72959041 | RSPO3 | Triglycerides | 0.40 | 0.14 | 4.2E-03 | 0.28 | 0.14 | 3.9E-02 | -0.12 | -0.21 | -0.04 | yes
rs1708302 | JAZF1 | BMI | 0.31 | 0.24 | 1.9E-01 | 0.48 | 0.24 | 4.6E-02 | 0.17 | 0.06 | 0.30 | yes
rs11602339 | MTCH2 | BMI | 0.02 | 0.24 | 9.5E-01 | 0.48 | 0.25 | 5.5E-02 | 0.46 | 0.27 | 0.69 | yes
rs9435732 | MFAP2 | WHR | 0.00 | 0.00 | 3.1E-01 | -0.01 | 0.00 | 6.1E-02 | 0.00 | 0.00 | 0.00 | yes
rs2972144 | IRS1 | Adiponectin | -0.17 | 0.24 | 4.7E-01 | -0.42 | 0.24 | 7.8E-02 | -0.25 | -0.41 | -0.12 | yes
rs76895963 | CCND2 | Matsuda Index | 2.37 | 0.77 | 2.3E-03 | 1.39 | 0.82 | 9.1E-02 | -0.97 | -1.65 | -0.35 | yes
rs10502148 | C11orf52 | WHR | -0.01 | 0.00 | 4.1E-02 | -0.01 | 0.00 | 9.7E-02 | 0.00 | 0.00 | 0.00 | yes
rs9931407 | CTC-277H1.7 | BMI | -0.80 | 0.40 | 4.4E-02 | -0.62 | 0.39 | 1.1E-01 | 0.18 | 0.02 | 0.39 | yes
rs4897182 | CENPW | Matsuda Index | -0.54 | 0.24 | 2.4E-02 | -0.36 | 0.23 | 1.2E-01 | 0.18 | 0.06 | 0.33 | yes
rs2820311 | LMOD1 | BMI | 0.97 | 0.26 | 2.4E-04 | 0.40 | 0.26 | 1.3E-01 | -0.57 | -0.81 | -0.35 | yes
rs3825061 | HMBS | WHR | -0.01 | 0.00 | 5.4E-02 | -0.01 | 0.00 | 1.3E-01 | 0.00 | 0.00 | 0.00 | yes
rs17145738 | BCL7B | BMI | 0.37 | 0.37 | 3.1E-01 | 0.55 | 0.36 | 1.3E-01 | 0.17 | 0.02 | 0.36 | yes
rs12489828 | NT5DC2 | WHR | 0.01 | 0.00 | 1.1E-02 | 0.00 | 0.00 | 1.4E-01 | 0.00 | -0.01 | 0.00 | yes
rs6843738 | MANBA | BMI | 0.45 | 0.25 | 7.1E-02 | 0.36 | 0.24 | 1.4E-01 | -0.09 | -0.20 | -0.01 | yes
rs7798002 | AC003090.1 | WHR | 0.01 | 0.00 | 6.0E-02 | 0.01 | 0.00 | 1.4E-01 | 0.00 | 0.00 | 0.00 | yes
rs950732 | MARCH5 | WHR | 0.00 | 0.00 | 3.6E-01 | 0.00 | 0.00 | 1.5E-01 | 0.00 | 0.00 | 0.00 | yes
rs8070454 | MED24 | BMI | 0.25 | 0.26 | 3.5E-01 | -0.34 | 0.24 | 1.5E-01 | -0.59 | -0.87 | -0.33 | yes
rs6063048 | EYA2 | HOMA beta | 5.54 | 2.89 | 5.6E-02 | 3.89 | 2.80 | 1.6E-01 | -1.65 | -3.42 | -0.13 | yes
rs2972144 | IRS1 | HDL | -0.01 | 0.03 | 6.0E-01 | -0.04 | 0.03 | 1.7E-01 | -0.02 | -0.04 | -0.01 | yes
rs4731702 | AC016831.7 | BMI | 0.54 | 0.24 | 2.3E-02 | 0.32 | 0.24 | 1.8E-01 | -0.22 | -0.37 | -0.09 | yes
rs4731702 | KLF14 | BMI | 0.05 | 0.25 | 8.4E-01 | 0.32 | 0.24 | 1.8E-01 | 0.27 | 0.07 | 0.48 | yes
rs6047259 | PLK1S1 | WHR | 0.00 | 0.00 | 4.4E-01 | 0.00 | 0.00 | 1.9E-01 | 0.00 | 0.00 | 0.00 | yes
rs6700838 | RP1-18D14.7 | BMI | -0.13 | 0.25 | 6.2E-01 | -0.34 | 0.26 | 1.9E-01 | -0.21 | -0.38 | -0.07 | yes
rs6700838 | TAL1 | BMI | -0.05 | 0.27 | 8.5E-01 | -0.34 | 0.26 | 1.9E-01 | -0.29 | -0.48 | -0.13 | yes
rs2276390 | CRYAB | WHR | -0.01 | 0.00 | 7.6E-02 | 0.00 | 0.00 | 1.9E-01 | 0.00 | 0.00 | 0.00 | yes
rs7871866 | SLC27A4 | BMI | -0.13 | 0.29 | 6.6E-01 | -0.38 | 0.30 | 2.0E-01 | -0.25 | -0.45 | -0.09 | yes
rs1308362 | MRAS | BMI | 0.51 | 0.29 | 7.4E-02 | 0.36 | 0.28 | 2.1E-01 | -0.16 | -0.31 | -0.04 | yes
rs740157 | RP5-899E9.1 | BMI | -0.57 | 0.22 | 1.1E-02 | -0.28 | 0.23 | 2.2E-01 | 0.28 | 0.14 | 0.46 | yes
rs1544474 | LAMB1 | WHR | -0.01 | 0.00 | 4.9E-02 | 0.00 | 0.00 | 2.3E-01 | 0.00 | 0.00 | 0.00 | yes
rs2972144 | IRS1 | BMI | -0.53 | 0.25 | 3.2E-02 | -0.29 | 0.25 | 2.4E-01 | 0.24 | 0.11 | 0.40 | yes

rs10779751 | MTOR | BMI | -0.16 | 0.27 | 5.4E-01 | -0.30 | 0.26 | 2.6E-01 | -0.14 | -0.27 | -0.04 | yes
rs9846123 | WDR6 | BMI | -0.71 | 0.32 | 2.5E-02 | -0.27 | 0.24 | 2.6E-01 | 0.43 | 0.04 | 0.84 | yes
rs223051 | RCN1 | BMI | -0.08 | 0.25 | 7.6E-01 | -0.28 | 0.25 | 2.7E-01 | -0.20 | -0.36 | -0.07 | yes
rs1708302 | JAZF1 | Matsuda Index | -0.11 | 0.23 | 6.5E-01 | -0.26 | 0.23 | 2.7E-01 | -0.15 | -0.28 | -0.05 | yes
rs926279 | LINC00680 | BMI | 0.86 | 0.35 | 1.5E-02 | 0.26 | 0.25 | 3.0E-01 | -0.60 | -1.09 | -0.11 | yes
rs174547 | FADS1 | Triglycerides | 0.04 | 0.06 | 5.5E-01 | 0.06 | 0.06 | 3.2E-01 | 0.02 | 0.00 | 0.05 | yes
rs4253772 | CDPF1 | Total Cholesterol | -0.16 | 0.12 | 1.7E-01 | -0.11 | 0.11 | 3.5E-01 | 0.05 | 0.01 | 0.11 | yes
rs2710323 | STAB1 | BMI | 0.01 | 0.24 | 9.8E-01 | -0.23 | 0.25 | 3.6E-01 | -0.24 | -0.41 | -0.09 | yes
rs2710323 | GNL3 | BMI | 0.02 | 0.26 | 9.4E-01 | -0.23 | 0.25 | 3.6E-01 | -0.25 | -0.43 | -0.10 | yes
rs2925979 | CMIP | Adiponectin | 0.09 | 0.26 | 7.3E-01 | 0.22 | 0.25 | 3.8E-01 | 0.13 | 0.02 | 0.27 | yes
rs12679556 | EYA1 | WHR | 0.01 | 0.00 | 4.5E-02 | 0.00 | 0.00 | 3.9E-01 | 0.00 | -0.01 | 0.00 | yes
rs17326656 | LHCGR | WHR | 0.01 | 0.00 | 1.1E-01 | 0.00 | 0.00 | 3.9E-01 | 0.00 | -0.01 | 0.00 | yes
rs2972144 | IRS1 | Matsuda Index | 0.17 | 0.24 | 4.8E-01 | -0.20 | 0.24 | 4.1E-01 | -0.37 | -0.56 | -0.20 | yes
rs2972144 | IRS1 | Triglycerides | -0.01 | 0.06 | 9.1E-01 | 0.05 | 0.06 | 4.3E-01 | 0.06 | 0.02 | 0.10 | yes
rs17145738 | BCL7B | HDL | 0.04 | 0.04 | 2.5E-01 | 0.03 | 0.04 | 4.5E-01 | -0.02 | -0.03 | 0.00 | yes
rs13333747 | PKD1 | WHR | 0.00 | 0.00 | 2.8E-01 | 0.00 | 0.00 | 4.5E-01 | 0.00 | 0.00 | 0.00 | yes
rs7187776 | SH2B1 | WHR | 0.00 | 0.00 | 8.1E-01 | 0.00 | 0.00 | 4.6E-01 | 0.00 | 0.00 | 0.00 | yes
rs2294120 | ZNF251 | Matsuda Index | 0.07 | 0.24 | 7.7E-01 | 0.17 | 0.23 | 4.6E-01 | 0.10 | 0.02 | 0.22 | yes
rs1700137 | MMP16 | BMI | -0.09 | 0.26 | 7.3E-01 | 0.19 | 0.26 | 4.8E-01 | 0.28 | 0.13 | 0.46 | yes
rs1075901 | ADORA2B | BMI | 0.74 | 0.28 | 8.2E-03 | 0.16 | 0.24 | 5.0E-01 | -0.58 | -0.91 | -0.28 | yes
rs2581787 | SFMBT1 | Matsuda Index | -0.22 | 0.23 | 3.3E-01 | -0.15 | 0.23 | 5.1E-01 | 0.07 | 0.00 | 0.17 | yes
rs2581787 | RFT1 | Matsuda Index | -0.33 | 0.25 | 1.8E-01 | -0.15 | 0.23 | 5.1E-01 | 0.18 | 0.02 | 0.36 | yes
rs3768321 | PABPC4 | HDL | 0.01 | 0.03 | 8.2E-01 | 0.02 | 0.03 | 5.2E-01 | 0.01 | 0.00 | 0.03 | yes
rs7134375 | PDE3A | BMI | -0.31 | 0.25 | 2.1E-01 | -0.15 | 0.25 | 5.5E-01 | 0.16 | 0.06 | 0.30 | yes
rs1549293 | KAT8 | BMI | -0.01 | 0.24 | 9.6E-01 | 0.12 | 0.23 | 6.0E-01 | 0.13 | 0.00 | 0.28 | yes
rs9988 | MRPS7 | WHR | 0.00 | 0.01 | 4.0E-01 | 0.00 | 0.01 | 6.0E-01 | 0.00 | 0.00 | 0.00 | yes

rs12145743 | HDGF | HDL | 0.00 | 0.03 | 9.2E-01 | 0.01 | 0.03 | 6.3E-01 | 0.01 | 0.00 | 0.02 | yes
rs12574668 | C11orf49 | BMI | 0.23 | 0.38 | 5.5E-01 | -0.19 | 0.40 | 6.3E-01 | -0.42 | -0.71 | -0.17 | yes
rs2972144 | IRS1 | WHR | 0.00 | 0.00 | 1.8E-01 | 0.00 | 0.00 | 7.2E-01 | 0.00 | 0.00 | 0.01 | yes
rs59774409 | RCN3 | Triglycerides | 0.07 | 0.17 | 7.0E-01 | -0.06 | 0.17 | 7.4E-01 | -0.12 | -0.23 | -0.04 | yes
rs7251505 | CEBPA-AS1 | WHR | 0.00 | 0.01 | 7.5E-01 | 0.00 | 0.01 | 7.4E-01 | -0.01 | -0.01 | 0.00 | yes
rs11688682 | INHBB | Matsuda Index | -0.13 | 0.25 | 6.0E-01 | 0.08 | 0.26 | 7.6E-01 | 0.21 | 0.08 | 0.37 | yes
rs76895963 | CCND2 | HOMA beta | -9.08 | 8.58 | 2.9E-01 | -2.57 | 8.72 | 7.7E-01 | 6.51 | 2.17 | 11.76 | yes
rs8126001 | OPRL1 | WHR | 0.00 | 0.00 | 4.2E-01 | 0.00 | 0.00 | 7.8E-01 | 0.00 | 0.00 | 0.01 | yes
rs8126001 | RGS19 | WHR | 0.00 | 0.00 | 8.3E-01 | 0.00 | 0.00 | 7.8E-01 | 0.00 | 0.00 | 0.00 | yes
rs2657879 | SPRYD4 | Triglycerides | 0.01 | 0.08 | 8.6E-01 | -0.02 | 0.07 | 7.8E-01 | -0.03 | -0.07 | 0.00 | yes
rs12369009 | RP3-462E2.5 | BMI | -0.13 | 0.33 | 6.9E-01 | 0.09 | 0.32 | 7.8E-01 | 0.22 | 0.01 | 0.46 | yes
rs10832778 | RP1-239B22.5 | BMI | -0.15 | 0.23 | 5.1E-01 | 0.06 | 0.23 | 7.8E-01 | 0.22 | 0.10 | 0.37 | yes
rs6063048 | EYA2 | Matsuda Index | -0.24 | 0.27 | 3.9E-01 | -0.06 | 0.27 | 8.2E-01 | 0.18 | 0.03 | 0.35 | yes
rs2972144 | IRS1 | HOMA beta | -1.67 | 2.59 | 5.2E-01 | 0.56 | 2.55 | 8.3E-01 | 2.23 | 0.90 | 3.85 | yes
rs9838283 | RP11-804H8.6 | BMI | -0.16 | 0.39 | 6.8E-01 | 0.08 | 0.38 | 8.4E-01 | 0.23 | 0.07 | 0.45 | yes
rs357438 | TRIM24 | WHR | 0.00 | 0.00 | 5.0E-01 | 0.00 | 0.00 | 8.5E-01 | 0.00 | 0.00 | 0.00 | yes
rs6058635 | TSPY26P | BMI | -0.41 | 0.25 | 1.1E-01 | -0.05 | 0.24 | 8.5E-01 | 0.36 | 0.18 | 0.57 | yes
rs7187776 | SH2B1 | Waist | 0.14 | 0.31 | 6.5E-01 | -0.05 | 0.31 | 8.6E-01 | -0.19 | -0.37 | -0.05 | yes
rs435306 | PLTP | HDL | -0.03 | 0.03 | 4.0E-01 | 0.00 | 0.03 | 8.7E-01 | 0.02 | 0.00 | 0.04 | yes
rs1063355 | HLA-DQA1 | Matsuda Index | -1.02 | 0.37 | 6.1E-03 | -0.04 | 0.24 | 8.7E-01 | 0.98 | 0.43 | 1.54 | yes
rs7187776 | RP11-1348G14.4 | BMI | 0.14 | 0.25 | 5.7E-01 | -0.04 | 0.23 | 8.8E-01 | -0.18 | -0.37 | 0.00 | yes
rs7102 | LITAF | WHR | -0.01 | 0.00 | 1.4E-01 | 0.00 | 0.00 | 8.8E-01 | 0.01 | 0.00 | 0.01 | yes
rs13209968 | HEY2 | BMI | -0.70 | 0.24 | 3.7E-03 | 0.03 | 0.24 | 9.0E-01 | 0.73 | 0.49 | 0.99 | yes
rs62034325 | SULT1A2 | BMI | -0.46 | 0.24 | 5.1E-02 | -0.03 | 0.23 | 9.1E-01 | 0.44 | 0.26 | 0.64 | yes
rs2972144 | IRS1 | Waist | -0.17 | 0.33 | 6.2E-01 | 0.03 | 0.33 | 9.4E-01 | 0.19 | 0.04 | 0.39 | yes
rs757608 | C17orf82 | BMI | -0.11 | 0.26 | 6.8E-01 | -0.02 | 0.26 | 9.4E-01 | 0.09 | 0.01 | 0.20 | yes
rs524281 | PACS1 | BMI | -0.16 | 0.29 | 5.9E-01 | 0.02 | 0.28 | 9.4E-01 | 0.18 | 0.02 | 0.36 | yes

rs1020548 | BEND6 | BMI | -0.62 | 0.30 | 4.1E-02 | -0.01 | 0.30 | 9.8E-01 | 0.61 | 0.37 | 0.88 | yes
rs4281707 | AKTIP | Matsuda Index | -0.08 | 0.25 | 7.5E-01 | -0.01 | 0.24 | 9.8E-01 | 0.07 | 0.01 | 0.17 | yes
rs13095652 | DOCK3 | BMI | 0.35 | 0.26 | 1.7E-01 | 0.00 | 0.26 | 1.0E+00 | -0.35 | -0.55 | -0.19 | yes
rs2928148 | INO80 | GFR | 0.00 | 0.00 | 6.2E-03 | 0.00 | 0.00 | 3.6E-03 | 0.00 | 0.00 | 0.00 | no
rs17145738 | BCL7B | Triglycerides | -0.24 | 0.09 | 9.3E-03 | -0.23 | 0.09 | 1.2E-02 | 0.01 | -0.02 | 0.06 | no
rs73858966 | COL8A1 | WHR | -0.01 | 0.01 | 2.3E-02 | -0.01 | 0.01 | 1.3E-02 | 0.00 | 0.00 | 0.00 | no
rs7932891 | EIF4G2 | WHR | 0.01 | 0.00 | 1.4E-02 | 0.01 | 0.00 | 1.4E-02 | 0.00 | 0.00 | 0.00 | no
rs7918400 | RP11-57H14.2 | HOMA beta | -4.04 | 2.76 | 1.4E-01 | -5.89 | 2.46 | 1.7E-02 | -1.85 | -4.41 | 0.58 | no
rs754523 | APOB | LDL | 0.13 | 0.06 | 4.5E-02 | 0.12 | 0.06 | 3.8E-02 | -0.01 | -0.06 | 0.04 | no
rs13391552 | ALMS1P | GFR | 0.00 | 0.00 | 1.4E-02 | 0.00 | 0.00 | 4.0E-02 | 0.00 | 0.00 | 0.00 | no
rs12776880 | C10orf112 | BMI | -0.48 | 0.25 | 5.3E-02 | -0.47 | 0.24 | 4.6E-02 | 0.00 | -0.13 | 0.14 | no
rs2925979 | CMIP | Matsuda Index | 0.42 | 0.26 | 1.0E-01 | 0.49 | 0.25 | 5.7E-02 | 0.06 | -0.05 | 0.19 | no
rs3111316 | FARSA | HOMA beta | 5.32 | 2.74 | 5.3E-02 | 4.97 | 2.71 | 6.7E-02 | -0.35 | -1.27 | 0.45 | no
rs35565646 | DISP2 | WHR | -0.01 | 0.00 | 2.1E-01 | -0.01 | 0.00 | 6.8E-02 | 0.00 | -0.01 | 0.00 | no
rs7918400 | RP11-57H14.2 | Matsuda Index | 0.57 | 0.26 | 2.9E-02 | 0.42 | 0.23 | 7.3E-02 | -0.15 | -0.39 | 0.08 | no
rs3814883 | INO80E | BMI | -0.51 | 0.25 | 4.1E-02 | -0.43 | 0.24 | 7.4E-02 | 0.08 | -0.05 | 0.23 | no
rs3814883 | HIRIP3 | BMI | -0.39 | 0.24 | 1.1E-01 | -0.43 | 0.24 | 7.4E-02 | -0.04 | -0.14 | 0.06 | no
rs2222543 | CPED1 | WHR | -0.01 | 0.00 | 3.7E-02 | -0.01 | 0.00 | 8.7E-02 | 0.00 | 0.00 | 0.00 | no
rs761423 | MST1L | BMI | 0.39 | 0.28 | 1.6E-01 | 0.45 | 0.26 | 9.0E-02 | 0.05 | -0.14 | 0.25 | no
rs4846914 | GALNT2 | HDL | 0.05 | 0.02 | 5.0E-02 | 0.04 | 0.02 | 9.4E-02 | -0.01 | -0.02 | 0.00 | no
rs10164099 | RP11-95O2.5 | WHR | 0.01 | 0.00 | 8.9E-02 | 0.01 | 0.00 | 9.5E-02 | 0.00 | 0.00 | 0.00 | no
rs672356 | CCDC144B | WHR | -0.01 | 0.00 | 1.2E-01 | -0.01 | 0.00 | 9.9E-02 | 0.00 | 0.00 | 0.00 | no
rs12680842 | RP11-267M23.4 | BMI | -0.34 | 0.27 | 2.0E-01 | -0.41 | 0.25 | 9.9E-02 | -0.07 | -0.26 | 0.11 | no
rs6463489 | FBXL18 | BMI | -0.88 | 0.50 | 8.0E-02 | -0.78 | 0.49 | 1.1E-01 | 0.10 | -0.12 | 0.34 | no
rs9931407 | FHOD1 | BMI | -0.60 | 0.40 | 1.3E-01 | -0.62 | 0.39 | 1.1E-01 | -0.03 | -0.16 | 0.10 | no
rs8103017 | SSC5D | WHR | 0.00 | 0.00 | 2.3E-01 | 0.01 | 0.00 | 1.1E-01 | 0.00 | 0.00 | 0.00 | no
rs1979527 | ABTB1 | WHR | 0.01 | 0.00 | 6.5E-02 | 0.01 | 0.00 | 1.2E-01 | 0.00 | 0.00 | 0.00 | no
rs6494481 | TRIP4 | BMI | 0.63 | 0.37 | 8.9E-02 | 0.54 | 0.35 | 1.2E-01 | -0.08 | -0.31 | 0.14 | no

rs4500751 | NPIPA2 | BMI | -0.38 | 0.26 | 1.6E-01 | -0.38 | 0.24 | 1.2E-01 | 0.00 | -0.20 | 0.20 | no
rs2820311 | IPO9 | BMI | 0.43 | 0.26 | 1.0E-01 | 0.40 | 0.26 | 1.3E-01 | -0.03 | -0.12 | 0.04 | no
rs11917361 | SETD2 | WHR | -0.01 | 0.00 | 1.2E-01 | 0.00 | 0.00 | 1.4E-01 | 0.00 | 0.00 | 0.00 | no
rs11917361 | CCDC12 | WHR | 0.00 | 0.00 | 1.8E-01 | 0.00 | 0.00 | 1.4E-01 | 0.00 | 0.00 | 0.00 | no
rs41279633 | NPC1L1 | LDL | 0.11 | 0.08 | 1.6E-01 | 0.09 | 0.06 | 1.4E-01 | -0.02 | -0.11 | 0.07 | no
rs7350438 | NET1 | WHR | 0.00 | 0.00 | 2.9E-01 | 0.00 | 0.00 | 1.5E-01 | 0.00 | 0.00 | 0.00 | no
rs3731861 | AAMP | WHR | -0.01 | 0.00 | 1.6E-01 | -0.01 | 0.00 | 1.5E-01 | 0.00 | 0.00 | 0.00 | no
rs8070454 | PSMD3 | BMI | -0.28 | 0.24 | 2.4E-01 | -0.34 | 0.24 | 1.5E-01 | -0.06 | -0.15 | 0.00 | no
rs1801214 | WFS1 | HOMA beta | -3.57 | 2.53 | 1.6E-01 | -3.42 | 2.44 | 1.6E-01 | 0.15 | -1.17 | 1.43 | no
rs7811265 | MLXIPL | WHR | 0.01 | 0.00 | 6.4E-02 | 0.01 | 0.00 | 1.6E-01 | 0.00 | -0.01 | 0.00 | no
rs450231 | ANKS6 | BMI | 0.43 | 0.27 | 1.1E-01 | 0.35 | 0.25 | 1.6E-01 | -0.08 | -0.26 | 0.10 | no
rs4731702 | RP11-138A9.1 | BMI | 0.33 | 0.24 | 1.8E-01 | 0.32 | 0.24 | 1.8E-01 | -0.01 | -0.12 | 0.10 | no
rs1045411 | HMGB1 | Matsuda Index | 0.32 | 0.25 | 2.0E-01 | 0.32 | 0.24 | 1.8E-01 | 0.00 | -0.13 | 0.14 | no
rs9299 | HOXB7 | WHR | 0.00 | 0.00 | 2.8E-01 | 0.00 | 0.00 | 1.8E-01 | 0.00 | 0.00 | 0.00 | no
rs7479183 | PIDD | WHR | -0.01 | 0.00 | 7.5E-02 | 0.00 | 0.00 | 1.9E-01 | 0.00 | 0.00 | 0.00 | no
rs7479183 | AP006621.6 | WHR | 0.00 | 0.00 | 2.9E-01 | 0.00 | 0.00 | 1.9E-01 | 0.00 | 0.00 | 0.00 | no
rs7595 | TSEN34 | WHR | 0.00 | 0.00 | 2.1E-01 | 0.00 | 0.00 | 2.0E-01 | 0.00 | 0.00 | 0.00 | no
rs7871866 | RP11-339B21.15 | BMI | -0.33 | 0.32 | 3.0E-01 | -0.38 | 0.30 | 2.0E-01 | -0.05 | -0.26 | 0.14 | no
rs7871866 | RP11-339B21.13 | BMI | -0.37 | 0.31 | 2.3E-01 | -0.38 | 0.30 | 2.0E-01 | -0.02 | -0.14 | 0.09 | no
rs79598313 | ZDHHC18 | LDL | 0.18 | 0.13 | 1.8E-01 | 0.16 | 0.13 | 2.1E-01 | -0.01 | -0.05 | 0.01 | no
rs7122539 | RP11-867G23.1 | BMI | 0.36 | 0.25 | 1.6E-01 | 0.30 | 0.25 | 2.2E-01 | -0.05 | -0.17 | 0.05 | no
rs886444 | CDK5RAP3 | BMI | 0.27 | 0.24 | 2.7E-01 | 0.29 | 0.24 | 2.3E-01 | 0.02 | -0.06 | 0.10 | no
rs3768321 | PABPC4 | HOMA beta | -3.97 | 3.37 | 2.4E-01 | -3.94 | 3.30 | 2.3E-01 | 0.03 | -1.30 | 1.41 | no
rs2275426 | MAST2 | BMI | 0.25 | 0.25 | 3.2E-01 | 0.29 | 0.25 | 2.4E-01 | 0.04 | -0.08 | 0.17 | no
rs2836961 | BRWD1-IT2 | BMI | 0.21 | 0.25 | 3.9E-01 | 0.28 | 0.24 | 2.4E-01 | 0.07 | -0.07 | 0.22 | no
rs10268050 | NUDCD3 | BMI | -0.29 | 0.29 | 3.2E-01 | -0.34 | 0.29 | 2.4E-01 | -0.05 | -0.15 | 0.02 | no
rs2854322 | EVI2A | Total Cholesterol | -0.06 | 0.06 | 3.7E-01 | -0.07 | 0.06 | 2.5E-01 | -0.02 | -0.04 | 0.01 | no

rs17154889 | PPIP5K2 | WHR | 0.00 | 0.00 | 2.6E-01 | 0.00 | 0.00 | 2.5E-01 | 0.00 | 0.00 | 0.00 | no
rs67536243 | ROM1 | WHR | 0.00 | 0.00 | 2.5E-01 | 0.00 | 0.00 | 2.6E-01 | 0.00 | 0.00 | 0.00 | no
rs67536243 | EEF1G | WHR | 0.00 | 0.00 | 2.5E-01 | 0.00 | 0.00 | 2.6E-01 | 0.00 | 0.00 | 0.00 | no
rs9846123 | RP11-3B7.1 | BMI | -0.37 | 0.25 | 1.4E-01 | -0.27 | 0.24 | 2.6E-01 | 0.10 | -0.01 | 0.22 | no
rs9846123 | ARIH2 | BMI | -0.26 | 0.25 | 2.8E-01 | -0.27 | 0.24 | 2.6E-01 | -0.01 | -0.07 | 0.05 | no
rs7307277 | RP11-380L11.4 | Matsuda Index | -0.28 | 0.27 | 3.0E-01 | -0.30 | 0.27 | 2.7E-01 | -0.02 | -0.11 | 0.07 | no
rs5112 | APOC1P1 | Triglycerides | 0.08 | 0.06 | 2.2E-01 | 0.06 | 0.06 | 2.7E-01 | -0.01 | -0.06 | 0.03 | no
rs11591741 | CWF19L1 | Total Cholesterol | 0.11 | 0.07 | 1.0E-01 | 0.07 | 0.07 | 2.7E-01 | -0.04 | -0.09 | 0.00 | no
rs8017377 | NYNRIN | LDL | -0.07 | 0.05 | 1.8E-01 | -0.06 | 0.05 | 2.8E-01 | 0.01 | 0.00 | 0.04 | no
rs78058190 | WNT6 | HDL | 0.01 | 0.06 | 9.1E-01 | -0.05 | 0.05 | 2.9E-01 | -0.06 | -0.12 | 0.01 | no
rs7307277 | RP11-380L11.4 | WHR | 0.00 | 0.00 | 4.1E-01 | 0.00 | 0.00 | 2.9E-01 | 0.00 | 0.00 | 0.00 | no
rs926279 | XXbac-BPGBPG55C20.2 | BMI | 0.18 | 0.27 | 5.0E-01 | 0.26 | 0.25 | 3.0E-01 | 0.08 | -0.13 | 0.30 | no
rs926279 | XXbac-BPGBPG55C20.1 | BMI | 0.11 | 0.33 | 7.3E-01 | 0.26 | 0.25 | 3.0E-01 | 0.15 | -0.26 | 0.56 | no
rs12448257 | DNASE1 | BMI | 0.39 | 0.33 | 2.3E-01 | 0.33 | 0.32 | 3.0E-01 | -0.06 | -0.20 | 0.06 | no
rs11688682 | INHBB | HOMA beta | 3.66 | 2.74 | 1.8E-01 | 2.74 | 2.70 | 3.1E-01 | -0.92 | -2.18 | 0.10 | no
rs10049088 | TIPARP | WHR | 0.00 | 0.00 | 7.5E-01 | 0.00 | 0.00 | 3.1E-01 | 0.00 | -0.01 | 0.00 | no
rs4886506 | SCAPER | BMI | 0.30 | 0.26 | 2.5E-01 | 0.25 | 0.25 | 3.1E-01 | -0.05 | -0.21 | 0.10 | no
rs4372913 | AC017074.2 | WHR | 0.00 | 0.00 | 2.7E-01 | 0.00 | 0.00 | 3.3E-01 | 0.00 | 0.00 | 0.00 | no
rs174547 | FADS1 | HDL | -0.02 | 0.02 | 3.6E-01 | -0.02 | 0.02 | 3.3E-01 | 0.00 | -0.01 | 0.01 | no
rs74551598 | AKNA | Total Cholesterol | 0.07 | 0.06 | 2.6E-01 | 0.06 | 0.06 | 3.3E-01 | -0.01 | -0.04 | 0.01 | no
rs3768321 | PABPC4 | Matsuda Index | 0.30 | 0.32 | 3.5E-01 | 0.30 | 0.31 | 3.3E-01 | 0.01 | -0.12 | 0.14 | no
rs333947 | CSF1 | HDL | -0.02 | 0.03 | 4.4E-01 | -0.03 | 0.03 | 3.3E-01 | -0.01 | -0.02 | 0.00 | no
rs2448 | FST | Adiponectin | -0.35 | 0.29 | 2.3E-01 | -0.28 | 0.29 | 3.4E-01 | 0.07 | 0.00 | 0.17 | no
rs1549293 | KAT8 | Waist | -0.18 | 0.32 | 5.8E-01 | -0.29 | 0.31 | 3.4E-01 | -0.12 | -0.30 | 0.06 | no
rs11591741 | CWF19L1 | Matsuda Index | -0.23 | 0.27 | 4.1E-01 | -0.23 | 0.25 | 3.6E-01 | -0.01 | -0.20 | 0.18 | no

rs7134375 | PDE3A | HDL | -0.02 | 0.03 | 3.9E-01 | -0.02 | 0.03 | 3.6E-01 | 0.00 | -0.01 | 0.01 | no
rs12325187 | ZNF75A | WHR | 0.00 | 0.00 | 2.2E-01 | 0.00 | 0.00 | 3.6E-01 | 0.00 | -0.01 | 0.00 | no
rs12325187 | RP11-433P17.3 | WHR | 0.00 | 0.00 | 3.2E-01 | 0.00 | 0.00 | 3.6E-01 | 0.00 | 0.00 | 0.00 | no
rs7902274 | C10orf88 | LDL | 0.04 | 0.05 | 4.4E-01 | 0.05 | 0.05 | 3.6E-01 | 0.01 | -0.01 | 0.02 | no
rs4660293 | OXCT2 | HDL | 0.03 | 0.03 | 2.7E-01 | 0.03 | 0.03 | 3.7E-01 | -0.01 | -0.03 | 0.01 | no
rs11784833 | GRINA | LDL | 0.04 | 0.05 | 4.0E-01 | 0.04 | 0.05 | 3.9E-01 | 0.00 | -0.01 | 0.01 | no
rs7941030 | UBASH3B | BMI | -0.28 | 0.24 | 2.4E-01 | -0.21 | 0.24 | 3.9E-01 | 0.08 | -0.01 | 0.19 | no
rs1933182 | SYPL2 | GFR | 0.00 | 0.00 | 6.1E-01 | 0.00 | 0.00 | 4.0E-01 | 0.00 | 0.00 | 0.00 | no
rs11672254 | FPR3 | LDL | -0.06 | 0.05 | 3.0E-01 | -0.05 | 0.05 | 4.0E-01 | 0.01 | 0.00 | 0.03 | no
rs11715915 | APEH | Fasting glucose | -0.03 | 0.03 | 3.9E-01 | -0.03 | 0.03 | 4.0E-01 | 0.00 | -0.01 | 0.02 | no
rs919433 | AC013264.2 | BMI | 0.28 | 0.25 | 2.5E-01 | 0.20 | 0.24 | 4.1E-01 | -0.09 | -0.22 | 0.03 | no
rs316611 | NDUFAF1 | BMI | -0.20 | 0.28 | 4.8E-01 | -0.21 | 0.26 | 4.2E-01 | -0.01 | -0.24 | 0.22 | no
rs3111316 | FARSA | Matsuda Index | 0.18 | 0.26 | 4.9E-01 | 0.21 | 0.26 | 4.2E-01 | 0.03 | -0.05 | 0.12 | no
rs6684514 | C1orf85 | HbA1c | -0.01 | 0.02 | 5.9E-01 | -0.02 | 0.02 | 4.3E-01 | -0.01 | -0.02 | 0.00 | no
rs6684514 | CCT3 | HbA1c | -0.01 | 0.02 | 5.7E-01 | -0.02 | 0.02 | 4.3E-01 | 0.00 | -0.02 | 0.01 | no
rs10190332 | RP11-493E12.2 | BMI | -0.29 | 0.26 | 2.7E-01 | -0.19 | 0.25 | 4.3E-01 | 0.09 | -0.06 | 0.26 | no
rs4731702 | KLF14 | HOMA beta | 2.92 | 2.65 | 2.7E-01 | 1.88 | 2.45 | 4.4E-01 | -1.03 | -3.12 | 0.94 | no
rs4731702 | RP11-138A9.1 | HOMA beta | 1.32 | 2.51 | 6.0E-01 | 1.88 | 2.45 | 4.4E-01 | 0.56 | -0.59 | 1.79 | no
rs4731702 | AC016831.7 | HOMA beta | 2.32 | 2.51 | 3.6E-01 | 1.88 | 2.45 | 4.4E-01 | -0.43 | -1.58 | 0.63 | no
rs3814883 | INO80E | Matsuda Index | 0.10 | 0.25 | 6.8E-01 | 0.18 | 0.24 | 4.5E-01 | 0.08 | -0.06 | 0.23 | no
rs3814883 | HIRIP3 | Matsuda Index | 0.15 | 0.24 | 5.4E-01 | 0.18 | 0.24 | 4.5E-01 | 0.03 | -0.06 | 0.14 | no
rs4731702 | AC016831.7 | Matsuda Index | -0.27 | 0.24 | 2.6E-01 | -0.17 | 0.23 | 4.5E-01 | 0.09 | -0.01 | 0.21 | no
rs4731702 | KLF14 | Matsuda Index | -0.28 | 0.25 | 2.6E-01 | -0.17 | 0.23 | 4.5E-01 | 0.11 | -0.08 | 0.30 | no
rs4731702 | RP11-138A9.1 | Matsuda Index | -0.12 | 0.24 | 6.1E-01 | -0.17 | 0.23 | 4.5E-01 | -0.05 | -0.17 | 0.05 | no
rs4253772 | CDPF1 | LDL | -0.10 | 0.10 | 3.4E-01 | -0.07 | 0.10 | 4.6E-01 | 0.02 | -0.02 | 0.07 | no
rs516946 | ANK1 | HOMA beta | -2.34 | 3.17 | 4.6E-01 | -2.30 | 3.10 | 4.6E-01 | 0.04 | -1.32 | 1.40 | no

rs7187776 | RP11-1348G14.4 | WHR | 0.00 | 0.00 | 7.6E-01 | 0.00 | 0.00 | 4.6E-01 | 0.00 | 0.00 | 0.00 | no
rs6980507 | SMIM19 | HbA1c | 0.01 | 0.02 | 7.3E-01 | 0.02 | 0.02 | 4.6E-01 | 0.01 | -0.01 | 0.03 | no
rs1045411 | HMGB1 | BMI | -0.12 | 0.26 | 6.4E-01 | -0.18 | 0.25 | 4.6E-01 | -0.06 | -0.21 | 0.07 | no
rs2294120 | ZNF34 | Matsuda Index | 0.25 | 0.24 | 2.9E-01 | 0.17 | 0.23 | 4.6E-01 | -0.08 | -0.20 | 0.02 | no
rs2294120 | RPL8 | Matsuda Index | 0.08 | 0.25 | 7.4E-01 | 0.17 | 0.23 | 4.6E-01 | 0.09 | -0.08 | 0.26 | no
rs3814883 | INO80E | HOMA beta | 2.02 | 2.59 | 4.4E-01 | 1.82 | 2.48 | 4.6E-01 | -0.20 | -1.70 | 1.28 | no
rs3814883 | HIRIP3 | HOMA beta | 1.78 | 2.53 | 4.8E-01 | 1.82 | 2.48 | 4.6E-01 | 0.04 | -0.90 | 1.02 | no
rs12748152 | PIGV | Triglycerides | -0.03 | 0.12 | 7.8E-01 | -0.08 | 0.11 | 4.6E-01 | -0.05 | -0.11 | 0.00 | no
rs12748152 | RP5-968P14.2 | Triglycerides | -0.09 | 0.12 | 4.4E-01 | -0.08 | 0.11 | 4.6E-01 | 0.01 | -0.07 | 0.09 | no
rs2814992 | SNRPC | BMI | 0.26 | 0.26 | 3.1E-01 | 0.18 | 0.25 | 4.7E-01 | -0.08 | -0.20 | 0.03 | no
rs2814992 | UHRF1BP1 | BMI | 0.18 | 0.27 | 5.2E-01 | 0.18 | 0.25 | 4.7E-01 | 0.01 | -0.22 | 0.23 | no
rs709400 | TRMT61A | BMI | -0.22 | 0.25 | 3.7E-01 | -0.18 | 0.25 | 4.7E-01 | 0.04 | -0.01 | 0.13 | no
rs709400 | AL049840.1 | BMI | -0.35 | 0.27 | 2.0E-01 | -0.18 | 0.25 | 4.7E-01 | 0.17 | -0.05 | 0.38 | no
rs709400 | RP11-73M18.9 | BMI | -0.36 | 0.28 | 2.0E-01 | -0.18 | 0.25 | 4.7E-01 | 0.18 | -0.09 | 0.46 | no
rs2243312 | RP11-217B7.2 | HDL | 0.00 | 0.03 | 8.9E-01 | -0.02 | 0.02 | 4.7E-01 | -0.02 | -0.05 | 0.01 | no
rs10497807 | PLCL1 | BMI | 0.24 | 0.26 | 3.4E-01 | 0.18 | 0.25 | 4.7E-01 | -0.06 | -0.19 | 0.05 | no
rs174547 | FADS1 | Total Cholesterol | -0.06 | 0.06 | 3.6E-01 | -0.04 | 0.06 | 4.7E-01 | 0.01 | -0.01 | 0.03 | no
rs6884702 | RP11-53O19.3 | HOMA beta | 1.24 | 2.39 | 6.0E-01 | 1.71 | 2.38 | 4.7E-01 | 0.47 | -0.14 | 1.33 | no
rs1730859 | PRMT6 | WHR | 0.00 | 0.00 | 4.0E-01 | 0.00 | 0.00 | 4.7E-01 | 0.00 | 0.00 | 0.00 | no
rs823074 | PM20D1 | BMI | -0.43 | 0.30 | 1.5E-01 | -0.17 | 0.24 | 4.8E-01 | 0.26 | -0.10 | 0.62 | no
rs75584603 | HOXA11-AS | WHR | 0.01 | 0.01 | 6.5E-01 | 0.01 | 0.01 | 4.8E-01 | 0.00 | 0.00 | 0.01 | no
rs8017377 | NYNRIN | Total Cholesterol | -0.05 | 0.06 | 4.5E-01 | -0.04 | 0.06 | 4.9E-01 | 0.00 | -0.02 | 0.03 | no
rs13139571 | RP11-588K22.2 | | 0.47 | 0.67 | 4.8E-01 | 0.45 | 0.66 | 4.9E-01 | -0.01 | -0.23 | 0.19 | no
rs11926707 | PTH1R | Matsuda Index | -0.11 | 0.26 | 6.8E-01 | -0.17 | 0.25 | 5.0E-01 | -0.06 | -0.20 | 0.07 | no
rs2453580 | SLC47A1 | GFR | 0.00 | 0.00 | 4.3E-01 | 0.00 | 0.00 | 5.0E-01 | 0.00 | 0.00 | 0.00 | no
rs1075901 | ZSWIM7 | BMI | -0.10 | 0.36 | 7.8E-01 | 0.16 | 0.24 | 5.0E-01 | 0.26 | -0.27 | 0.81 | no

rs3134353 | YWHAZ | BMI | 0.22 | 0.23 | 3.5E-01 | 0.15 | 0.23 | 5.1E-01 | -0.06 | -0.16 | 0.00 | no
rs12964689 | NPC1 | BMI | 0.10 | 0.24 | 6.7E-01 | 0.14 | 0.23 | 5.3E-01 | 0.04 | -0.08 | 0.18 | no
rs12964689 | C18orf8 | BMI | 0.13 | 0.23 | 5.7E-01 | 0.14 | 0.23 | 5.3E-01 | 0.01 | -0.06 | 0.09 | no
rs2898290 | FAM85A | sbp | -0.94 | 1.03 | 3.6E-01 | -0.63 | 1.00 | 5.3E-01 | 0.31 | -0.22 | 0.87 | no
rs2898290 | RP11-419I17.1 | sbp | -0.66 | 1.02 | 5.2E-01 | -0.63 | 1.00 | 5.3E-01 | 0.03 | -0.38 | 0.46 | no
rs17764730 | CTC-228N24.3 | WHR | 0.01 | 0.01 | 2.4E-01 | 0.00 | 0.00 | 5.3E-01 | 0.00 | -0.01 | 0.00 | no
rs2925979 | CMIP | HOMA beta | -1.59 | 2.75 | 5.6E-01 | -1.67 | 2.68 | 5.3E-01 | -0.08 | -1.30 | 1.13 | no
rs2925979 | CMIP | HDL | 0.02 | 0.03 | 4.5E-01 | 0.02 | 0.03 | 5.3E-01 | 0.00 | -0.02 | 0.01 | no
rs1061810 | HSD17B12 | Matsuda Index | 0.36 | 0.28 | 1.9E-01 | 0.16 | 0.25 | 5.4E-01 | -0.21 | -0.46 | 0.03 | no
rs2034088 | VPS53 | BMI | 0.13 | 0.25 | 6.0E-01 | 0.14 | 0.24 | 5.5E-01 | 0.01 | -0.08 | 0.12 | no
rs174547 | FADS1 | Fasting glucose | -0.02 | 0.03 | 5.4E-01 | -0.02 | 0.03 | 5.6E-01 | 0.00 | -0.01 | 0.01 | no
rs1045411 | HMGB1 | WHR | 0.00 | 0.00 | 4.7E-01 | 0.00 | 0.00 | 5.6E-01 | 0.00 | 0.00 | 0.00 | no
rs7184597 | RP11-1348G14.1 | WHR | 0.00 | 0.00 | 4.1E-01 | 0.00 | 0.00 | 5.7E-01 | 0.00 | 0.00 | 0.00 | no
rs7184597 | NPIPB6 | WHR | 0.00 | 0.00 | 7.0E-01 | 0.00 | 0.00 | 5.7E-01 | 0.00 | 0.00 | 0.00 | no
rs2516739 | NTHL1 | BMI | -0.22 | 0.28 | 4.5E-01 | -0.16 | 0.28 | 5.7E-01 | 0.06 | -0.01 | 0.15 | no
rs4653017 | PHC2 | BMI | 0.06 | 0.26 | 8.1E-01 | 0.15 | 0.26 | 5.7E-01 | 0.09 | 0.00 | 0.20 | no
rs7307277 | RP11-380L11.4 | BMI | -0.13 | 0.28 | 6.5E-01 | -0.16 | 0.28 | 5.7E-01 | -0.03 | -0.13 | 0.06 | no
rs1045241 | TNFAIP8 | WHR | 0.00 | 0.00 | 7.3E-01 | 0.00 | 0.00 | 5.8E-01 | 0.00 | 0.00 | 0.00 | no
rs11869286 | PGAP3 | HDL | 0.01 | 0.03 | 7.4E-01 | 0.01 | 0.03 | 5.8E-01 | 0.01 | 0.00 | 0.02 | no
rs6764533 | CTD-2002J20.1 | BMI | 0.17 | 0.27 | 5.3E-01 | 0.14 | 0.26 | 5.9E-01 | -0.02 | -0.13 | 0.08 | no
rs56271783 | VEGFB | Triglycerides | 0.05 | 0.13 | 7.1E-01 | 0.06 | 0.11 | 5.9E-01 | 0.01 | -0.09 | 0.12 | no
rs1045241 | TNFAIP8 | Triglycerides | 0.05 | 0.06 | 4.3E-01 | 0.04 | 0.07 | 5.9E-01 | -0.02 | -0.04 | 0.01 | no
rs912768 | PRDX6 | BMI | -0.35 | 0.23 | 1.3E-01 | -0.14 | 0.26 | 5.9E-01 | 0.21 | -0.01 | 0.44 | no
rs2448 | FST | WHR | 0.00 | 0.00 | 6.8E-01 | 0.00 | 0.00 | 5.9E-01 | 0.00 | 0.00 | 0.00 | no
rs516946 | ANK1 | Matsuda Index | 0.10 | 0.30 | 7.4E-01 | 0.16 | 0.29 | 5.9E-01 | 0.06 | -0.07 | 0.20 | no
rs174547 | FADS1 | LDL | -0.03 | 0.05 | 5.3E-01 | -0.03 | 0.05 | 5.9E-01 | 0.01 | -0.01 | 0.02 | no
rs12748152 | PIGV | LDL | -0.07 | 0.10 | 4.6E-01 | -0.05 | 0.10 | 6.0E-01 | 0.02 | -0.02 | 0.07 | no
rs12748152 | RP5-968P14.2 | LDL | -0.03 | 0.10 | 7.5E-01 | -0.05 | 0.10 | 6.0E-01 | -0.02 | -0.09 | 0.05 | no

rs13134327 | RP11-673E1.1 | HbA1c | 0.02 | 0.02 | 3.8E-01 | 0.01 | 0.02 | 6.0E-01 | -0.01 | -0.03 | 0.01 | no
rs13240600 | CPSF4 | BMI | 0.13 | 0.30 | 6.7E-01 | 0.15 | 0.29 | 6.1E-01 | 0.02 | -0.11 | 0.16 | no
rs860295 | DAP3 | BMI | 0.18 | 0.25 | 4.8E-01 | 0.12 | 0.25 | 6.2E-01 | -0.06 | -0.16 | 0.03 | no
rs1063355 | HLA-DQA1 | HOMA beta | 5.42 | 3.96 | 1.7E-01 | 1.26 | 2.57 | 6.2E-01 | -4.16 | -10.06 | 1.61 | no
rs12145743 | RRNAD1 | HDL | 0.01 | 0.03 | 6.0E-01 | 0.01 | 0.03 | 6.3E-01 | 0.00 | -0.01 | 0.01 | no
rs2581787 | SFMBT1 | HOMA beta | 1.57 | 2.45 | 5.2E-01 | 1.17 | 2.44 | 6.3E-01 | -0.40 | -1.18 | 0.12 | no
rs2581787 | RFT1 | HOMA beta | 1.80 | 2.60 | 4.9E-01 | 1.17 | 2.44 | 6.3E-01 | -0.63 | -2.47 | 1.12 | no
rs3829849 | LMX1B | BMI | -0.17 | 0.25 | 5.0E-01 | -0.11 | 0.24 | 6.4E-01 | 0.05 | -0.05 | 0.17 | no
rs72875462 | ABCG5 | Total Cholesterol | -0.06 | 0.10 | 5.9E-01 | -0.05 | 0.10 | 6.4E-01 | 0.01 | -0.05 | 0.07 | no
rs6859752 | SH3PXD2B | WHR | 0.00 | 0.00 | 6.6E-01 | 0.00 | 0.00 | 6.5E-01 | 0.00 | 0.00 | 0.00 | no
rs7307277 | RP11-380L11.4 | Triglycerides | 0.04 | 0.07 | 5.8E-01 | 0.03 | 0.07 | 6.5E-01 | -0.01 | -0.03 | 0.01 | no
rs4500751 | NPIPA2 | Hip | -0.04 | 0.32 | 9.1E-01 | -0.13 | 0.30 | 6.5E-01 | -0.10 | -0.35 | 0.15 | no
rs704 | TMEM199 | LDL | 0.03 | 0.05 | 5.3E-01 | 0.02 | 0.05 | 6.6E-01 | -0.01 | -0.03 | 0.00 | no
rs704 | CTD-2350C19.1 | LDL | 0.03 | 0.05 | 5.7E-01 | 0.02 | 0.05 | 6.6E-01 | -0.01 | -0.02 | 0.01 | no
rs6884702 | RP11-53O19.3 | Matsuda Index | 0.09 | 0.23 | 6.9E-01 | 0.10 | 0.23 | 6.6E-01 | 0.01 | -0.06 | 0.08 | no
rs9299 | HOXB7 | BMI | -0.16 | 0.27 | 5.6E-01 | -0.12 | 0.27 | 6.6E-01 | 0.04 | -0.05 | 0.15 | no
rs630337 | RHCE | Total Cholesterol | 0.03 | 0.07 | 6.3E-01 | 0.02 | 0.06 | 6.7E-01 | -0.01 | -0.06 | 0.05 | no
rs2230590 | MST1R | BMI | 0.07 | 0.24 | 7.7E-01 | 0.10 | 0.24 | 6.8E-01 | 0.03 | -0.05 | 0.12 | no
rs7739232 | KLHL31 | Hip | 0.11 | 0.57 | 8.5E-01 | -0.22 | 0.53 | 6.8E-01 | -0.33 | -0.79 | 0.10 | no
rs12748152 | RP5-968P14.2 | HDL | 0.00 | 0.05 | 9.5E-01 | 0.02 | 0.05 | 6.8E-01 | 0.02 | -0.02 | 0.05 | no
rs12748152 | PIGV | HDL | 0.02 | 0.05 | 7.3E-01 | 0.02 | 0.05 | 6.8E-01 | 0.00 | -0.02 | 0.02 | no
rs4897182 | CENPW | HOMA beta | 2.02 | 2.51 | 4.2E-01 | 0.98 | 2.44 | 6.9E-01 | -1.04 | -2.40 | 0.13 | no
rs6738445 | AC068039.4 | BMI | 0.10 | 0.32 | 7.6E-01 | -0.11 | 0.27 | 7.0E-01 | -0.20 | -0.54 | 0.12 | no
rs10948059 | PEX6 | HDL | -0.03 | 0.03 | 4.0E-01 | -0.01 | 0.02 | 7.0E-01 | 0.02 | -0.03 | 0.06 | no
rs4660293 | OXCT2 | Triglycerides | 0.05 | 0.07 | 5.2E-01 | 0.03 | 0.07 | 7.0E-01 | -0.02 | -0.06 | 0.02 | no
rs1708302 | JAZF1 | HOMA beta | 0.28 | 2.49 | 9.1E-01 | 0.96 | 2.47 | 7.0E-01 | 0.68 | -0.03 | 1.66 | no

rs1045411 HMGB1 HOMA beta -1.06 2.64 6.9E-01 -0.97 2.54 7.0E-01 0.09 -1.30 1.53 no

rs6671166 TPM3 Triglycerides 0.03 0.06 6.4E-01 0.02 0.06 7.1E-01 -0.01 -0.02 0.01 no rs11102965 CELSR2 LDL 0.03 0.05 5.6E-01 0.02 0.05 7.2E-01 -0.01 -0.04 0.01 no rs7941030 UBASH3B HDL -0.01 0.03 7.0E-01 -0.01 0.02 7.2E-01 0.00 -0.01 0.01 no rs524281 PACS1 WHR 0.00 0.00 9.6E-01 0.00 0.00 7.3E-01 0.00 0.00 0.00 no rs2306589 ZNHIT3 WHR 0.00 0.00 5.9E-01 0.00 0.00 7.4E-01 0.00 0.00 0.00 no rs4372836 AC097724.3 BMI 0.17 0.27 5.2E-01 0.08 0.26 7.4E-01 -0.09 -0.25 0.06 no rs181362 CCDC116 HDL 0.02 0.03 3.7E-01 0.01 0.02 7.4E-01 -0.02 -0.04 0.00 no rs1534696 SNX10 WHR 0.00 0.00 4.2E-01 0.00 0.00 7.4E-01 0.00 0.00 0.01 no rs1534696 CBX3 WHR 0.00 0.00 7.1E-01 0.00 0.00 7.4E-01 0.00 0.00 0.00 no rs11926707 PTH1R HOMA beta 0.31 2.70 9.1E-01 0.84 2.60 7.5E-01 0.52 -0.85 2.00 no rs56271783 VEGFB WHR 0.00 0.01 7.9E-01 0.00 0.01 7.6E-01 0.00 -0.01 0.00 no rs7307277 RP11-380L11.4 HOMA beta -0.68 2.88 8.1E-01 -0.88 2.84 7.6E-01 -0.20 -1.21 0.75 no rs7307277 RP11-380L11.4 HDL 0.01 0.03 6.1E-01 0.01 0.03 7.6E-01 -0.01 -0.02 0.00 no rs2034088 VPS53 Hip 0.19 0.29 5.1E-01 0.09 0.29 7.6E-01 -0.11 -0.25 0.01 no rs4731702 AC016831.7 Hip -0.19 0.29 5.2E-01 -0.09 0.29 7.6E-01 0.10 -0.02 0.25 no rs4731702 RP11-138A9.1 Hip -0.12 0.29 6.8E-01 -0.09 0.29 7.6E-01 0.04 -0.10 0.17 no rs4731702 KLF14 Hip -0.13 0.31 6.8E-01 -0.09 0.29 7.6E-01 0.04 -0.19 0.28 no

rs4846914 GALNT2 Triglycerides -0.02 0.06 8.0E-01 -0.02 0.06 7.6E-01 0.00 -0.03 0.03 no rs630337 RHCE LDL -0.02 0.06 7.2E-01 -0.01 0.05 7.7E-01 0.01 -0.04 0.05 no rs498240 NELFE BMI -0.04 0.62 9.5E-01 -0.18 0.61 7.7E-01 -0.14 -0.38 0.05 no rs12430764 GPC6 WHR 0.00 0.00 7.8E-01 0.00 0.00 7.8E-01 0.00 0.00 0.00 no rs12369009 RP3-473L9.4 BMI 0.17 0.33 6.0E-01 0.09 0.32 7.8E-01 -0.08 -0.24 0.05 no rs9341990 TBX18 WHR 0.00 0.00 9.8E-01 0.00 0.00 7.8E-01 0.00 0.00 0.00 no rs10463416 ABLIM3 WHR 0.00 0.00 9.3E-01 0.00 0.00 7.8E-01 0.00 0.00 0.00 no rs1106908 DHRS11 BMI -0.10 0.26 7.1E-01 -0.07 0.24 7.8E-01 0.03 -0.16 0.22 no rs638722 C1orf213 WHR 0.00 0.00 6.4E-01 0.00 0.00 7.9E-01 0.00 0.00 0.00 no rs4281707 AKTIP HOMA beta -0.81 2.60 7.6E-01 -0.65 2.58 8.0E-01 0.16 -0.53 0.95 no rs4281707 RBL2 HOMA beta -0.59 2.68 8.3E-01 -0.65 2.58 8.0E-01 -0.07 -1.49 1.34 no rs12946454 HEXIM1 sbp 0.35 1.10 7.5E-01 0.28 1.10 8.0E-01 -0.07 -0.34 0.12 no

rs9931407 FHOD1 HDL -0.02 0.04 6.7E-01 -0.01 0.04 8.0E-01 0.01 0.00 0.02 no

rs9931407 CTC-277H1.7 HDL -0.02 0.04 6.5E-01 -0.01 0.04 8.0E-01 0.01 -0.01 0.03 no rs4731702 KLF14 HDL 0.01 0.03 8.2E-01 -0.01 0.02 8.3E-01 -0.01 -0.03 0.01 no rs4731702 RP11-138A9.1 HDL 0.00 0.02 1.0E+00 -0.01 0.02 8.3E-01 -0.01 -0.02 0.01 no rs4731702 AC016831.7 HDL -0.01 0.02 7.7E-01 -0.01 0.02 8.3E-01 0.00 -0.01 0.01 no rs2513987 RP11-690D19.3 WHR 0.00 0.00 9.5E-01 0.00 0.00 8.4E-01 0.00 0.00 0.00 no rs9838283 HEMK1 BMI 0.14 0.39 7.1E-01 0.08 0.38 8.4E-01 -0.07 -0.24 0.09 no rs12681990 ZNF703 Matsuda 0.16 0.36 6.7E-01 0.07 0.34 8.5E-01 -0.09 -0.33 0.14 no Index rs72959041 RSPO3 WHR 0.01 0.01 4.9E-01 0.00 0.01 8.6E-01 0.00 -0.01 0.00 no rs17173637 AOC1 HDL 0.00 0.04 9.8E-01 -0.01 0.03 8.6E-01 -0.01 -0.03 0.02 no rs1801214 WFS1 Matsuda 0.02 0.24 9.2E-01 0.04 0.23 8.6E-01 0.02 -0.10 0.14 no Index rs72875462 ABCG5 LDL -0.01 0.09 9.0E-01 -0.02 0.09 8.6E-01 0.00 -0.06 0.05 no rs7187776 RP11-1348G14.4 Waist 0.11 0.33 7.3E-01 -0.05 0.31 8.6E-01 -0.17 -0.43 0.06 no rs28451064 MRPS6 WHR 0.00 0.00 7.5E-01 0.00 0.00 8.8E-01 0.00 0.00 0.00 no rs28451064 LINC00310 WHR 0.00 0.00 9.9E-01 0.00 0.00 8.8E-01 0.00 0.00 0.00 no rs7187776 SH2B1 BMI 0.03 0.24 9.1E-01 -0.04 0.23 8.8E-01 -0.06 -0.19 0.04 no

rs7184597 RP11-1348G14.1 Matsuda 0.09 0.25 7.2E-01 0.04 0.24 8.8E-01 -0.05 -0.17 0.05 no Index rs7184597 NPIPB6 Matsuda 0.08 0.25 7.5E-01 0.04 0.24 8.8E-01 -0.04 -0.20 0.10 no Index rs17826219 AC005562.1 BMI 0.17 0.46 7.2E-01 0.06 0.45 8.9E-01 -0.10 -0.32 0.09 no rs3814883 HIRIP3 WHR 0.00 0.00 7.2E-01 0.00 0.00 9.0E-01 0.00 0.00 0.00 no rs3814883 INO80E WHR 0.00 0.00 9.6E-01 0.00 0.00 9.0E-01 0.00 0.00 0.00 no rs2448 FST HDL 0.00 0.03 9.9E-01 0.00 0.03 9.1E-01 0.00 0.00 0.01 no rs7184597 RP11-1348G14.1 HOMA beta 0.78 2.61 7.6E-01 0.28 2.55 9.1E-01 -0.51 -1.71 0.55 no rs7184597 NPIPB6 HOMA beta -0.01 2.67 1.0E+00 0.28 2.55 9.1E-01 0.29 -1.24 1.85 no rs1061810 HSD17B12 HOMA beta -0.41 2.96 8.9E-01 0.28 2.65 9.1E-01 0.69 -1.89 3.31 no rs17207196 STAG3L1 BMI 0.08 0.25 7.7E-01 0.02 0.25 9.2E-01 -0.05 -0.18 0.07 no

rs17207196 AC018720.10 BMI -0.05 0.26 8.6E-01 0.02 0.25 9.2E-01 0.07 -0.10 0.25 no

rs17207196 POM121C BMI 0.04 0.25 8.7E-01 0.02 0.25 9.2E-01 -0.02 -0.10 0.07 no rs17207196 NSUN5P1 BMI 0.03 0.25 9.2E-01 0.02 0.25 9.2E-01 0.00 -0.09 0.09 no rs2110073 U47924.30 HbA1c 0.01 0.04 8.7E-01 0.00 0.04 9.3E-01 0.00 -0.02 0.02 no rs7941030 UBASH3B Total 0.00 0.06 9.4E-01 0.01 0.06 9.3E-01 0.00 -0.02 0.02 no Cholesterol rs4660293 OXCT2 WHR 0.00 0.00 6.8E-01 0.00 0.00 9.3E-01 0.00 0.00 0.00 no rs2294120 ZNF34 HOMA beta -0.97 2.53 7.0E-01 -0.21 2.48 9.3E-01 0.76 -0.26 2.02 no rs2294120 ZNF251 HOMA beta 0.26 2.51 9.2E-01 -0.21 2.48 9.3E-01 -0.47 -1.51 0.37 no rs2294120 RPL8 HOMA beta 0.37 2.63 8.9E-01 -0.21 2.48 9.3E-01 -0.58 -2.39 1.14 no rs1730859 PRMT6 BMI -0.06 0.29 8.4E-01 0.02 0.27 9.3E-01 0.08 -0.11 0.28 no rs12681990 ZNF703 HOMA beta -0.25 3.84 9.5E-01 0.22 3.63 9.5E-01 0.47 -2.03 2.96 no rs757608 C17orf82 WHR 0.00 0.00 8.7E-01 0.00 0.00 9.6E-01 0.00 0.00 0.00 no rs11277705 POLR2C HDL 0.00 0.04 9.5E-01 0.00 0.04 9.7E-01 0.00 -0.02 0.00 no 1 rs11591741 CWF19L1 HOMA beta 0.91 2.87 7.5E-01 -0.09 2.69 9.7E-01 -1.00 -3.07 0.95 no rs1730859 PRMT6 LDL 0.02 0.06 7.1E-01 0.00 0.06 9.8E-01 -0.02 -0.06 0.02 no rs4281707 RBL2 Matsuda 0.06 0.25 8.1E-01 -0.01 0.24 9.8E-01 -0.07 -0.21 0.06 no Index rs2049805 GBAP1 GFR 0.00 0.00 8.3E-01 0.00 0.00 9.8E-01 0.00 0.00 0.00 no

rs2925979 CMIP WHR 0.00 0.00 9.5E-01 0.00 0.00 9.9E-01 0.00 0.00 0.00 no rs6063048 EYA2 Triglycerides 0.01 0.07 8.6E-01 0.00 0.07 1.0E+00 -0.01 -0.05 0.03 no

Table 2.17: Mediation of cardiometabolic traits through gene expression. We tested the effect of the variant through gene expression level on the GWAS trait, or an equivalent trait measured in METSIM. We estimated the 95% confidence interval of the mediation effect to assess significance. This table is sorted first by whether the mediation effect CI excludes 0, and then by the total effect p-value. Abbreviations include “Eff” for effect and “CI” for confidence interval.
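The idea of a mediation effect with a 95% confidence interval can be illustrated with a minimal sketch. The software and estimator used for this table are not stated here, so the sketch assumes a difference-in-coefficients estimate (total effect minus direct effect) with a percentile bootstrap. The data are simulated, and every function and variable name is hypothetical.

```python
import numpy as np

def first_slope(y, predictors):
    """Least-squares coefficient on the first predictor, with an intercept."""
    design = np.column_stack([np.ones(len(y)), predictors])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]

def mediation_effect(genotype, expression, trait):
    """Assumed estimator: total effect minus direct effect of the variant."""
    total = first_slope(trait, genotype)  # trait ~ genotype
    direct = first_slope(trait, np.column_stack([genotype, expression]))  # trait ~ genotype + expression
    return total - direct

def bootstrap_ci(genotype, expression, trait, n_boot=500, seed=0):
    """Percentile bootstrap 95% CI for the mediation effect (an assumption)."""
    rng = np.random.default_rng(seed)
    n = len(trait)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample individuals with replacement
        draws.append(mediation_effect(genotype[idx], expression[idx], trait[idx]))
    low, high = np.percentile(draws, [2.5, 97.5])
    return low, high

# Simulated data only: 434 individuals, one variant, one transcript, one trait.
rng = np.random.default_rng(1)
n = 434
genotype = rng.binomial(2, 0.3, size=n).astype(float)
expression = 0.8 * genotype + rng.normal(size=n)
trait = 0.5 * expression + 0.2 * genotype + rng.normal(size=n)

estimate = mediation_effect(genotype, expression, trait)
ci_low, ci_high = bootstrap_ci(genotype, expression, trait)
ci_excludes_zero = not (ci_low <= 0.0 <= ci_high)
print(f"mediation effect {estimate:.2f}, 95% CI ({ci_low:.2f}, {ci_high:.2f}), "
      f"CI excludes 0: {ci_excludes_zero}")
```

In this sketch, a row would be marked as mediated only when the bootstrap interval excludes 0, mirroring the sort criterion described in the caption.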


                                       Combined Effect of Direct      Direct Effect of GWAS
                                       and Mediation Effects          variant on Trait               Mediation Effect
GWAS variant  Gene       Trait         Effect Size  SE     P-value    Effect Size  SE     P-value    Effect size (95% CI)
rs12051272    CDH13      Adiponectin   -1.75        0.36   1.25E-6    -1.41        0.34   3.20E-5    0.34 (0.086, 0.63)
rs11600990    CATSPERZ   BMI           -0.87        0.36   0.01       -1.10        0.35   1.80E-3    -0.23 (-0.44, -0.065)
rs7811265     MLXIPL     Triglycerides -0.10        0.09   0.25       -0.22        0.08   6.41E-3    -0.12 (-0.20, -0.040)
rs1789882     ADH1A      WHR           0.001        0.005  0.86       -0.009       0.004  0.024      -0.010 (-0.016, -0.005)
rs72959041    RSPO3      Triglycerides 0.40         0.14   4.22E-3    0.28         0.14   0.039      -0.12 (-0.21, -0.043)
rs1708302     JAZF1      BMI           0.31         0.24   0.19       0.48         0.24   0.046      0.17 (0.059, 0.30)

Table 2.18: Mediation analysis of GWAS variants. Mediation by adipose gene expression level of the effect of a GWAS variant on a cardiometabolic trait in 434 samples from the METSIM study. Results are shown when the combined effect of the direct and mediation effects was nominally significant (p < 0.05).

CHAPTER 3: SPLICE EQTL DETECTION REVEALS ISOFORM VARIATION IN CARDIOMETABOLIC TRAITS

Introduction

mRNA can be spliced differently to produce isoforms, greatly expanding the transcriptional diversity of cells. An estimated 92-94% of genes in the human genome are subject to alternative splicing129, and as many as 20-30% of disease-causing mutations may affect mRNA splicing130. In mice, a high-fat diet has been shown to alter adipose tissue splicing profiles, affecting adipose tissue thermogenesis and white adipose browning28. Alternative splicing has been associated with type 2 diabetes (T2D) variants at the TCF7L2 GWAS locus in a tissue-specific manner131. In human adipose tissue, differential splicing has been observed between fat depots and during weight loss132.

RNA-sequencing offers the opportunity to quantify the same pool of reads in different ways to identify the genes, exons, and splices present in the sample transcriptome, potentially highlighting differential isoform usage. Reads can be used to quantify genes or exons, from which gene- and exon-level eQTLs can be detected. Splices can be quantified by examining the reads that span the junctions of exons, excluding certain introns and potentially exons34. Both exon and splice detection provide the opportunity to detect changes in the composition of the transcript, such as removal of an exon. Depending on the detection method, novel splice junctions or exons may also be identified even if they were not previously annotated. The quantity of reads at splice junctions can then be used as the outcome in eQTL detection, resulting in splice eQTLs (sQTLs).

Previous studies have examined sQTLs in several different tissues, including whole blood, brain, and fetal retinal pigment epithelium35,37,40. Despite the hypothesis that variants within canonical splice sites themselves may be common sQTL variants, Saferali et al. noted that the distribution of eQTLs and sQTLs did not differ at splice donor and acceptor sites37. Lead sQTL variants are found in introns, exons, and intergenic regions, suggesting that variants may influence splice expression not merely by disrupting consensus splice sites and potentially implicating splice silencers and enhancers. Prior studies have shown sQTL colocalizations with GWAS loci for traits including chronic obstructive pulmonary disease (COPD), Alzheimer’s disease, age-related macular degeneration (AMD), and Parkinson’s disease in relevant tissues35,37,39,40. In several studies, sQTLs provided plausible mechanisms for the function of known candidate genes35,40.

Here, we present the analysis of subcutaneous adipose tissue expression levels from 426 participants in the METabolic Syndrome in Men (METSIM) study. We differentiated between the splice, exon, and previously described gene-level eQTLs by identifying genes that were detected by sQTLs but not by gene-level or exon-level eQTLs, and then identifying pathways enriched for those genes. We further identified the sQTL signals that colocalize with GWAS signals for BMI, WHRadjBMI, lipid traits, T2D, and cardiovascular disease. At the gene-dense MADD-NR1H3 locus, we describe a novel splice junction of NR1H3, which encodes LXRα, a transcription factor involved in cholesterol homeostasis, and show that alternative splicing may be responsible for an HDL cholesterol GWAS signal.

Methods

METSIM study participants, genotypes and sequencing

METSIM is a population-based cohort composed of 10,197 males of Finnish ancestry from Kuopio, Finland74. For this analysis, we examined a subset of 426 participants with subcutaneous adipose tissue collected from near the umbilicus via needle biopsy. The METSIM study was approved by the Ethics Committee of the University of Eastern Finland and Kuopio University Hospital in Kuopio, Finland and carried out in accordance with the Helsinki Declaration. Written informed consent was obtained from all participants.

We measured genotypes using the Illumina OmniExpress BeadChip array and the Illumina HumanCoreExome at the Center for Inherited Disease Research (Baltimore, MD). Genotyping quality control (QC) and imputation to the Haplotype Reference Consortium (HRC) panel85 have been previously described52. For this study, we filtered HRC-imputed genotypes to retain ~7.8 million variants with minor allele frequency (MAF) > 0.01 and imputation quality r² > 0.5.


Total mRNA was isolated using the Qiagen miRNeasy kit (Qiagen, Hilden, Germany) and was polyA+ selected using the Illumina TruSeq RNA Sample Preparation Kit v2 at the University of California Los Angeles Neuroscience Genomics Core (UNGC). RNA-sequencing and quality control of 434 samples from the METSIM study have been previously described133; here, we used a subset of 426 samples to avoid sample overlap with the FUSION study for future meta-analysis. The average sequencing depth was 45 million paired-end 50 bp reads.

Gene, exon, and splice junction quantification

We mapped reads using STAR (version 2.4.2a)86 and GENCODE v19 annotation (July 2013 freeze)29, employing a two-pass method that uses splice sites observed in the first pass as annotation for the second pass. This method allows for discovery of novel splice junctions from the sequencing data itself. We then quantified the reads into genes, exons, and two metrics for splices using distinct methods (Figure S3.1). (1) For genes, we quantified reads using Salmon (version 0.7.2)90 and collapsed isoforms to genes using tximport (version 1.0.3)91. We retained 25,623 genes with at least 5 reads in 25% of samples. We then calculated counts per million (CPM) adjusted for trimmed means of M values (CPMadjTMM)134. (2) For exons, we quantified reads using QTLtools (version 1.1)32, retained 160,899 exons with at least 5 reads in 25% of samples, and calculated CPMadjTMM. (3) For splice junctions, we quantified reads using Leafcutter (version 2.17.4)34, which does not use annotation for splice junction quantification, potentially allowing novel junctions to be discovered alongside previously annotated splices. Leafcutter produces raw read counts and read counts per cluster, suitable for transformation into a percent spliced in (PSI) metric, which accounts for gene-level variability by adjusting for the prevalence of other nearby splicing events that share a start or stop position. We filtered both read metrics by requiring at least 5 reads in 25% of samples, resulting in a set of 61,833 splice junctions.
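The expression filter and the PSI transformation described above can be sketched as follows. This is an illustrative reading of the text, not Leafcutter's own code: it assumes a features-by-samples count matrix, treats "25% of samples" as an inclusive threshold, and computes PSI as each junction's reads divided by the total reads of its cluster in the same sample. The function names and toy counts are hypothetical.

```python
import numpy as np

def keep_features(counts, min_reads=5, min_fraction=0.25):
    """Flag rows (features) with at least min_reads in at least min_fraction of samples."""
    counts = np.asarray(counts)
    fraction_expressed = (counts >= min_reads).mean(axis=1)
    return fraction_expressed >= min_fraction

def percent_spliced_in(junction_counts, cluster_ids):
    """Per-sample ratio of each junction's reads to its cluster's total reads."""
    junction_counts = np.asarray(junction_counts, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    psi = np.full(junction_counts.shape, np.nan)
    for cluster in np.unique(cluster_ids):
        rows = cluster_ids == cluster
        totals = junction_counts[rows].sum(axis=0)  # cluster reads per sample
        with np.errstate(divide="ignore", invalid="ignore"):
            ratios = junction_counts[rows] / totals
        # Samples with no reads in the cluster have undefined PSI.
        psi[rows] = np.where(totals > 0, ratios, np.nan)
    return psi

# Toy data: three junctions (rows) in two clusters, four samples (columns).
toy_counts = np.array([
    [30, 10, 0, 20],
    [10, 30, 0, 20],
    [4, 0, 0, 0],
])
toy_clusters = np.array(["cluster_1", "cluster_1", "cluster_2"])

kept = keep_features(toy_counts)
psi = percent_spliced_in(toy_counts, toy_clusters)
print(kept)
print(psi)
```

Dividing by the cluster total is what makes the metric relative: a junction's PSI changes only when its share of reads among competing junctions changes, not when the whole gene is expressed more or less.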

Gene eQTL, exon eQTL, and sQTL detection

For gene, exon, and both splice quantifications, we performed a separate nominal eQTL pass using QTLtools32, providing both BMI and different numbers of PEER factors as covariates. We explicitly included BMI as a covariate because ~51% of genes are correlated with BMI. To determine the number of PEER factors to include as covariates, we generated from 0 to 100 factors in increments of 10. We then determined the point at which adding ten more PEER factors did not increase the number of unique transcripts with at least one eQTL or sQTL association by at least 1%. For genes and exons, we included 40 PEER factors, and for splices measured as PSI, we included 20 PEER factors (Table 3.1).
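The stopping rule for choosing the number of PEER factors can be written as a short function. This is a sketch of one reading of the rule: the gain is measured relative to the count at the current number of factors, and the discovery counts below are invented for illustration rather than taken from Table 3.1. The function name is hypothetical.

```python
def choose_num_factors(discoveries, step=10, min_gain=0.01):
    """Return the smallest k at which adding `step` more factors raises
    the number of discoveries by less than `min_gain` (as a fraction).

    discoveries maps number of PEER factors -> number of unique
    transcripts with at least one eQTL or sQTL association.
    """
    tested = sorted(discoveries)
    for k in tested:
        next_k = k + step
        if next_k not in discoveries:
            break  # no larger setting was tested
        current = discoveries[k]
        if current == 0:
            continue  # relative gain is undefined, keep adding factors
        gain = (discoveries[next_k] - current) / current
        if gain < min_gain:
            return k
    return tested[-1]

# Hypothetical counts, for illustration only.
example_counts = {0: 9000, 10: 10500, 20: 11400, 30: 11900, 40: 12100, 50: 12150}
selected = choose_num_factors(example_counts)
print(selected)
```

With these made-up counts, the step from 40 to 50 factors adds fewer than 1% more transcripts, so 40 factors would be selected.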

After identifying the number of PEER factors to include for each analysis, we performed eQTL detection for genes, exons, and splices, including BMI and the selected number of PEER factors (k = 40, 40, and 20, respectively) as covariates in QTLtools. We applied a false discovery rate (FDR) threshold of 1% for each type of eQTL detection, calculated using the Storey and Tibshirani method135. The equivalent p-value threshold for FDR < 1% was P < 2.2 × 10⁻³ for gene-level eQTL detection, P < 7.1 × 10⁻⁴ for exon-level eQTL detection, and P < 2.03 × 10⁻⁵ for splice junctions.
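To illustrate how an FDR threshold maps onto p-values, the sketch below computes simplified q-values. It is not the implementation used in this study: the Storey and Tibshirani method estimates the proportion of true nulls by smoothing over a range of tuning values, whereas this sketch assumes a single fixed tuning value (`lam`, hypothetical) and toy p-values.

```python
import numpy as np

def q_values(p_values, lam=0.5):
    """Simplified q-values using a single-lambda estimate of pi0."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    # Estimated proportion of true null tests, capped at 1.
    pi0 = min(1.0, np.mean(p > lam) / (1.0 - lam))
    order = np.argsort(p)
    ranked = pi0 * m * p[order] / np.arange(1, m + 1)
    # Enforce monotonicity: q at rank i is the minimum over ranks >= i.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.minimum(ranked, 1.0)
    return q

toy_p = [0.001, 0.6, 0.002, 0.8]
toy_q = q_values(toy_p)
significant = toy_q < 0.01  # FDR < 1%
print(toy_q, significant)
```

The largest p-value whose q-value falls below 0.01 plays the role of the "equivalent p-value" threshold reported for each analysis.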

To obtain the gene names of transcripts to which splice junctions belong, we used the get_intron_meta() and map_clusters_to_genes() utilities from Leafcutter along with GENCODE v19 annotation. Splices that did not map to genes in the annotation provided were retained but marked with an “unknown” tag, since they might represent novel junctions.

Pathway analysis

To assess whether genes detected by sQTLs were in any relevant biological pathways, we used SIGORA (version 3.0.5)136. Rather than treating all genes as equally informative for identifying corresponding pathways, SIGORA uses genes or gene-pairs that are only found in specific pathways to test whether the pathway is overrepresented. We ran SIGORA using recommended settings of the Reactome Pathways and 4 levels, because the database is hierarchical. We reported pathways that were significant (Bonferroni-corrected P < 0.05).

Colocalization with cardiometabolic GWAS loci

We tested for colocalization between splice QTLs from the PSI set and 2,843 cardiometabolic GWAS variants representing 93 traits that have been previously described133. To test for colocalization, we first identified GWAS variants associated with the expression level of a splice junction (FDR < 1%). We then identified the most significant sQTL variant for each splice junction and calculated LD (in METSIM, n ≈ 10,000) between the GWAS variant and the lead sQTL variant. We then conditioned the lead sQTL variant on the GWAS variant by including the latter as a covariate in the QTLtools analysis, alongside BMI and PEER factors. We considered the signals to be colocalized if (i) the LD between the GWAS and sQTL variants was strong (r² ≥ 0.8) and (ii) the sQTL variant was no longer associated with the splice after conditioning on the GWAS variant (FDR > 1%).
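The two colocalization criteria amount to a simple decision rule, sketched below. The sketch assumes that the FDR > 1% criterion is applied through the equivalent splice-junction p-value threshold reported above (2.03 × 10⁻⁵). The function name and the example values are hypothetical.

```python
def is_colocalized(ld_r2, conditional_p, r2_threshold=0.8, p_threshold=2.03e-5):
    """Apply the two criteria: strong LD between the GWAS and lead sQTL
    variants, and loss of the sQTL association after conditioning on the
    GWAS variant."""
    strong_ld = ld_r2 >= r2_threshold
    # Assumption: "no longer associated" means the conditional p-value
    # no longer passes the FDR-equivalent threshold.
    association_removed = conditional_p > p_threshold
    return strong_ld and association_removed

# Hypothetical (LD r2, conditional p-value) pairs.
examples = {
    "strong LD, signal removed": (0.95, 0.40),
    "strong LD, signal remains": (0.95, 1e-8),
    "weak LD, signal removed": (0.50, 0.40),
}
for label, (r2, p_cond) in examples.items():
    print(label, is_colocalized(r2, p_cond))
```

Requiring both conditions guards against two failure modes: variants in strong LD that mark a distinct nearby signal, and weakly linked variants whose association merely drops below significance after conditioning.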

Validation of a novel NR1H3 splice junction

RNA was isolated from adipose tissue samples of individuals with each genotype of rs1449626 (Fatty Tissue RNA Purification Kit, Norgen Biotek). First strand cDNA was reverse transcribed according to manufacturer’s instructions (SuperScript III, Invitrogen) and amplified with primers located in exons 3 and 5 of NR1H3 (AGCAGCTGCATCCTCAGAGA/GAAGCCGGTCAGAAAAGGAG; hg19 chr11:47281404-47282992). PCR products were run on a 1% agarose gel and isoform-specific band intensity was quantified using ImageJ137.

Results

sQTL identification and characterization

We performed genome-wide sQTL analysis between 7.8M variants (MAF > 0.01) and 61,834 splice junctions detected in 426 individuals from the METSIM study. We found 294,361 variants associated with the level of at least one splice junction (FDR < 1%, P < 2.03 × 10⁻⁵; Table 3.2). We found 6,077 (9.8%) splice junctions to be associated with at least one variant within 1 Mb of the splice start (FDR < 1%, P < 2.03 × 10⁻⁵; Table 3.1). These 6,077 splice junctions correspond to 3,198 unique genes. Among these genes, 1,874 (59%) were identified by an sQTL for a single junction and 27 (0.4%) genes were associated with 10 or more junctions (Figure 3.1). The largest number of unique splice junctions in the same gene with at least one sQTL association was 15, found for members of the SAA gene family (SAA1, SAA2, SAA2-SAA4, SAA4), which encode serum amyloid A apolipoproteins. This family of proteins is involved in the pro-inflammatory response138 and HDL cholesterol homeostasis139. It is possible that the high number of splices in this region is due to difficulty differentiating between similar gene products, but this possibility is mitigated by our use of paired-end 50 bp reads and low tolerance for mismatches during alignment (m < 2). The abundance in this region may also reflect residual gene-level eQTL association that was not removed by the percent spliced in method of sQTL detection.

The most significant sQTL identified was the association between rs7216 and a novel splice junction in the 3’ UTR of ALDH3A2 (chr17:19,578,812:19,578,875; P = 2.17 × 10⁻¹⁰⁴). Notably, rs7216 is an A/T variant located in the consensus splice acceptor site itself, disrupting the intronic AG site and creating TG. This splice is present in a predicted isoform of ALDH3A2 with a much shorter 3’ UTR than other isoforms of the gene. ALDH3A2 is an aldehyde dehydrogenase involved in mitigating lipid peroxidation by detoxifying aldehyde products; in mice, decreased aldehyde dehydrogenase activity was associated with insulin resistance140. Mysore et al. showed that miR-192* targets ALDH3A2 in adipose tissue, suggesting a potential functional consequence of changes in the 3’ UTR of this gene141. Taken together, these results show that sQTL detection highlighted genes involved in important adipose tissue biological processes.

To identify differences between the new sQTL and exon eQTL datasets and our previously described gene-level eQTL detection133, we compared the genes found uniquely by each of these different quantification methods. To more directly compare these discovery methods, we summarized the exon eQTLs and sQTLs to the gene level and compared them with eQTL genes. We identified 2,280 genes found exclusively as sQTLs, exon eQTLs, or both but not as gene-level eQTLs (FDR < 1% in each analysis; Figure 3.2).
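The 2,280 genes can be cross-checked against the counts printed with Figure 3.2. In the sketch below, the assignment of each printed count to a region of the Venn diagram is an assumption, inferred from the fact that it is the only assignment that reproduces the reported set totals (16,937 exons, 17,370 genes, 3,198 splices). The names are hypothetical.

```python
# Region counts from Figure 3.2; the region labels are inferred, not stated.
region_counts = {
    frozenset({"exon"}): 1861,
    frozenset({"gene", "exon"}): 12035,
    frozenset({"gene"}): 2556,
    frozenset({"gene", "exon", "splice"}): 2734,
    frozenset({"exon", "splice"}): 307,
    frozenset({"gene", "splice"}): 45,
    frozenset({"splice"}): 112,
}

def total_for(level):
    """Number of genes detected at a given level (gene, exon, or splice)."""
    return sum(n for levels, n in region_counts.items() if level in levels)

# Genes detected by exon eQTLs, sQTLs, or both, but not by gene-level eQTLs.
not_gene_level = sum(n for levels, n in region_counts.items() if "gene" not in levels)

print(total_for("exon"), total_for("gene"), total_for("splice"), not_gene_level)
```

Under this assignment, the exon-only, exon-and-splice, and splice-only regions sum to the 2,280 genes described above.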

To find the biological processes corresponding to genes identified only as sQTLs, we performed pathway analysis. Among the 112 genes identified only as sQTLs and not detected as exon or gene-level eQTLs, we found three significant pathways (Table 3.3). One of the pathways identified was “COPI-mediated anterograde transport” (R-HSA-6807878). COPI-mediated trafficking has been implicated in lipid droplet formation and regulation of lipid homeostasis in adipose tissue142. This analysis also identified “Regulation of HSF1-mediated heat shock response” (R-HSA-3371453; Table 3.3). Changes in heat shock response proteins have been linked with inflammation in human adipose tissue143. These results suggest that sQTLs contribute to our detection of active pathways in adipose tissue.


Colocalization of sQTLs and cardiometabolic GWAS loci

To identify splices that are associated with GWAS signals and potential mechanisms underlying them, we tested for colocalization between 2,843 cardiometabolic GWAS signals (variants clumped at LD r² ≥ 0.8) and the sQTLs. We first identified GWAS lead variants associated with splice junctions; GWAS lead variants were associated (FDR < 1%) with levels of 479 splice junctions from 256 transcripts. After performing colocalization analyses using LD and conditional analysis, we found that 104 GWAS signals were colocalized with 132 splice junction QTLs. Each GWAS signal can be colocalized with sQTL signals from multiple splice junctions. These splice junctions represented 74 unique genes (Table 3.4). Each gene can be colocalized through multiple, unique splice junctions, possibly indicating the presence of a gene- or isoform-level association. Among the splice junctions, we identified 39 at BMI loci, 26 at WHRadjBMI loci, 20 at lipid loci, three at T2D loci, and two at CVD loci.

For example, we identified a colocalization between an HDL-cholesterol GWAS signal and an sQTL for a splice junction (chr11:47281530-47282792) in NR1H3, which removes exon 4 (Figure 3.3A). This splice junction shared the annotated edges of exons three and five, which matched a predicted isoform with only one supporting expressed sequence tag (EST), and our splicing analysis in adipose provided novel sequence-based support. We validated this novel splice junction by PCR amplification from cDNA derived from subcutaneous adipose tissue from individuals of differing genotypes. We found that a fragment matching the predicted length lacking exon 4 was detectable in adipose tissue, together with the isoform that includes exon 4 (Figure 3.3A and 3.3B), indicating that this splice is present in adipose tissue. Additionally, we found that the amount of the fragment differed according to the genotype of rs74948330, a proxy variant for the lead sQTL variant, rs1449626 (LD r² = 0.95; Figure 3.3C). NR1H3 encodes liver X receptor alpha (LXRα), a transcription factor that binds to retinoid response elements along with its cofactor, retinoid X receptor (RXR), and regulates lipid homeostasis144. Exon 4 encodes the zinc-finger binding domain needed for motif recognition, suggesting that isoforms lacking exon 4 produce a non-functional transcription factor.


Colocalizations at MADD-NR1H3 locus, by trait and quantification

The NR1H3 splice junction is located in a gene-dense region of extensive LD on chromosome 11 (chr11:46,720,000-47,700,000). To better understand the complex relationships between genetic variants at this locus, we evaluated all GWAS signals reported in this region, including extended data sets not included in the GWAS Catalog145. We calculated the pairwise LD between reported GWAS variants regardless of trait. Using this method, we identified nine distinct signals (LD r² < 0.8 between lead variants) for HDL-C, proinsulin, BMI, and WHRadjBMI, which we denote by the letters A-I (Table 3.5, 3.6). Two of these signals, rs10838686 (HDL signal C) and rs10838687 (proinsulin signal F), are in moderate LD with each other (r² = 0.82 in Finns and 0.77 in Europeans). We retained these signals as separate because they represent different traits and because Finnish LD differs from broader European LD for these variants. All other signals exhibit low or moderate pairwise LD between lead variants (r² < 0.7).

Using these nine GWAS signals, we further performed colocalization analyses on the gene level and identified two gene-level eQTL colocalizations in addition to the splice colocalization already described (Figure 3.4; Table 3.7, Table 3.4). The primary eQTL signal for NR1H3 (lead variant rs75393320) colocalized with HDL signal C (rs10838686; LD r² = 0.93). The secondary eQTL signal for NR1H3 (lead variant rs11039149) colocalized with proinsulin GWAS signal D (rs10501320; LD r² = 1.0). The sQTL signal for the lack of exon 4 of NR1H3 (lead variant rs1449626) colocalized with HDL signal C (rs10838686; LD r² = 1.0). The lead variant of the splice eQTL is non-coding and is located in the intergenic region 3’ of NR1H3. Among the proxy variants (r² > 0.8), none were located in the splice donor or acceptor site, but one is a synonymous mutation located in exon 4 of NR1H3 (rs2279238; Table 3.8; Figure 3.5). Xiong et al. found that exonic mutations that are not themselves disease-causing are five times more likely to affect splicing than other variants146, suggesting that rs2279238 may be the most likely variant responsible for the sQTL. Interestingly, the colocalization with HDL signal C is shared between the primary gene-level eQTL and the splice eQTL. This variant (rs10838686) is associated with increased expression of the gene and increased expression of the aberrant splice product. Taken together, these colocalizations suggest a role for NR1H3 adipose expression in both HDL-cholesterol and glycemic traits.


Discussion

We identified splice QTLs in subcutaneous adipose tissue from 426 Finnish men enrolled in the METSIM study. We found that 9.8% of expressed splice junctions were associated with at least one nearby variant. By examining the genes found to have an eQTL only on the splice or exon level, we identified 46 pathways that potentially affect the transcriptional program of adipose tissue. We colocalized these sQTLs with cardiometabolic GWAS signals and identified 104 variants associated with 132 splice junctions, found in 74 genes. One sQTL for a novel splice junction, which affects NR1H3, is colocalized with an HDL cholesterol GWAS signal, suggesting a splicing mechanism at this locus.

Splicing analysis found noteworthy genes, biological pathways, and novel splice junctions in adipose tissue. Our sQTL detection and subsequent pathway analysis of eQTL-associated transcripts not identified on the gene level found several pathways with links to important adipose tissue processes such as lipid droplet formation, inflammation, and cholesterol homeostasis. Further, we highlighted two novel splices (ALDH3A2 and NR1H3) with plausible mechanisms for altering the function of these genes. Splice-based analysis was critical to identifying these genes for potential functional testing.

Colocalization with sQTLs provides an opportunity to explore isoform differences at GWAS loci. In the same way that gene-level eQTLs can mark transcripts that are colocalized with GWAS traits and may act to influence diseases and phenotypes, splice eQTLs can be used to identify transcripts that were not identified on the gene level. However, sQTLs have the added benefit of potentially identifying exon inclusions or exclusions that are unique to specific isoforms. Since changes in isoform abundance can indicate the pathways at work in tissue37 and some isoform changes can lead to changes in the activity of the gene, these sQTLs may indicate the mechanisms by which these genes impact the GWAS trait.

At the MADD-NR1H3 locus, we identified a novel splice junction that removes exon 4 and remains in frame. This splice was colocalized with a GWAS signal for HDL-cholesterol, suggesting some role for this isoform in lipid traits. Exon 4 encodes the only zinc-finger binding domain of LXRα, the transcription factor that NR1H3 encodes. Since the transcript remains in frame, it is not likely that the mRNA is subject to nonsense-mediated decay. The loss of this exon may render the protein unable to bind retinoid response elements in DNA alongside its binding partner RXR. We hypothesize that, if translated, the protein produced cannot fulfill its role of regulating cholesterol homeostasis. This sQTL colocalization provides evidence of a potential mechanism for the effect of NR1H3 adipose expression on HDL levels.

Splicing analysis offers the opportunity to explore isoform differences based on the exclusion of certain exons. The analyses presented here show that splice QTLs can highlight pathways of interest in adipose tissue and identify potential mechanisms at colocalized cardiometabolic GWAS loci. As future studies explore splicing in adipose tissue, splice QTLs may help to uncover changes in the transcriptional program that influence human disease.

Acknowledgements

This work was made possible by the contributions of many other scientists. While the previous chapter used 434 samples, this chapter uses 426 samples to remove overlapping samples from other Finnish eQTL studies. For this reason, Sarah M Brotman is responsible for the gene-level eQTL data, as well as the exon quantification and exon-level eQTL data. The PCR amplification work was done in the Mohlke laboratory by Apoorva K Iyengar and Tamara S Roman. The METSIM study is run by Markku Laakso and Johanna Kuusisto, who are responsible for the tissue samples and the genotype collection in this study. The following people were involved in METSIM genotyping and imputation to the Haplotype Reference Consortium: Anne U Jackson, Heather M Stringham, Ryan P Welch, Christian Fuchsberger, Adam E Locke, Narisu Narisu, Francis S Collins, Michael Boehnke, Laura J Scott. Finally, this work was performed with the advice and direction of Terrence S Furey and Karen L Mohlke.


Figure 3.1: The number of significant splice QTLs observed per gene: A histogram of the number of significant sQTLs found per gene using the percent spliced in metric.


[Venn diagram. Set totals: Exons 16,937; Genes 17,370; Splices 3,198. Region counts: exons only 1,861; genes and exons 12,035; genes only 2,556; genes, exons, and splices 2,734; exons and splices 307; genes and splices 45; splices only 112.]

Figure 3.2: Overlap between genes identified by eQTLs on the gene-, exon- and splice-level: Counts of genes for which at least one variant was associated with the level of the gene, an exon of the gene, or a splice junction of the gene (FDR < 1%). This Venn diagram shows the overlap between genes (blue), exons (green), and splices (purple).


[Figure 3.3 panel labels. (A) Genome browser view, hg19 chr11:47,271,000-47,290,000 (scale bar 10 kb), Basic Gene Annotation Set from GENCODE Version 19, showing ACP2 and NR1H3 transcripts. (B) Gel band labels: Exon 3, Exon 4, Exon 5 for one product and Exon 3, Exon 5 for the other. (C) Gel image.]

Figure 3.3: Novel splice junction that removes exon 4 of NR1H3. (A) The gene model of NR1H3 is shown with exons 3-5 marked. (B) PCR product of amplification from cDNA shows the presence of both the three-exon product (top) and the two-exon product (bottom). (C) PCR product of amplification from cDNA shows a change in the intensity of the two-exon product according to genotype of rs74948330, which is in high LD with the lead variant at this locus, rs1449626 (LD r2 = 0.95). The rs74948330 A allele is correlated with the rs1449626 C allele. In this figure, panels B and C are images from different gels.


[Figure 3.4 image: four LocusZoom panels (HDL cholesterol, Proinsulin, NR1H3 sQTL, Gene-level eQTL). X-axis: position on chr11 (Mb), 47.1-47.6; left y-axis: -log10(p-value); right y-axis: recombination rate (cM/Mb). Labeled variants: rs10838692, rs10838686, rs10501320. Genes shown in the region: C11orf49, DDB2, MADD, SPI1, PSMC3, CELF1, ARFGAP2, ACP2, MYBPC3, SLC39A13, PTPMT1, PACSIN3, NR1H3, MIR4487, KBTBD4, MIR6745, LOC101928943, RAPSN, NDUFS3, FAM180B, C1QTNF4.]

Figure 3.4: Colocalization of multiple GWAS and eQTL signals at MADD-NR1H3: Lead variants of the associations for HDL cholesterol (top left), Proinsulin (top right), NR1H3 splice (bottom left), and NR1H3 gene eQTL (bottom right) are shown as diamonds in the above LocusZoom plots; for all other variants, LD with the lead variant is indicated by the intensity of the color and the shape of the point plotted. Variants without strong association with one or more lead variants are shown in grey. HDL cholesterol association is plotted with data from Teslovich et al (Nature 2010); HDL signal B is shown in blue triangles and signal C is shown in red circles. Proinsulin association is plotted using Strawbridge et al (Diabetes 2011) and is shown in purple squares. The splice in NR1H3 is plotted in red circles to show its colocalization with HDL signal C. The gene-level eQTLs for NR1H3 are shown in red circles (primary eQTL, colocalized with HDL signal C) and purple squares (secondary eQTL, colocalized with proinsulin signal D).


[Figure 3.5 image: two genome browser views (hg19) with the GENCODE Version 28lift37 (Ensembl 92) basic gene annotation. Top view (10 kb scale, around chr11:47,275,000-47,290,000; ACP2 and NR1H3), proxy variants for rs1449626: rs7118396, rs3758670, rs11039148, rs2279238, rs6144337, rs12574081, rs113681661, rs1449626, rs16938581, rs77532017. Bottom view (500-base scale, chr11:47,281,500-47,283,000; NR1H3), proxy variant shown: rs2279238.]

Figure 3.5: Proxy variants for rs1449626 in Finns. Shown are the variants in high LD (r2 > 0.8) with the lead sQTL variant (rs1449626). This region encompasses all known isoforms of NR1H3, and specifically the exon removed by the significant sQTL. A variant in exon 4, rs2279238, is in high LD (r2 = 0.95) with rs1449626.


PEER factors   Number of Unique Genes   Number of Unique Exons   Number of Unique Splice Junctions
k10            15,814                   54,037                   6,056
k20            16,703                   57,980                   6,077
k30            17,133                   59,241                   6,074
k40            17,370                   60,486                   6,030
k50            17,524                   61,170                   5,943

Table 3.1: Gene, exon, and splice eQTL discovery for PSI quantification with various PEER factors. The number of genes, exons, and splice junctions with QTLs detected when including BMI and a number of PEER factors (k = 10-50) in the gene-level, exon-level, and splice junction-level analyses.


Level                   Number of features tested   Number of features significantly associated with > 1 variant   Percent of total tested   Equivalent P-value (FDR 1%)
Gene-level              25,623                      17,370                                                         68%                       2.2 x 10^-3
Exon-level              160,899                     60,486                                                         38%                       7.1 x 10^-4
Splice junction-level   61,833                      6,077                                                          10%                       2.03 x 10^-5

Table 3.2: A summary of gene-, exon-, and splice junction eQTL discovery. This table shows the number of genes, exons, and splices tested for eQTLs and the number of these features found to have significant eQTLs at FDR 1%.


Reactome Pathway   Pathway description                                                                            P-value    Adjusted P   Pathway Size   Genes
R-HSA-83936        Transport of nucleosides and free purine and pyrimidine bases across the plasma membrane      2.05E-08   2.05E-05     62.6           ARL2BP; ARL2
R-HSA-6807878      COPI-mediated anterograde transport                                                            6.40E-06   6.40E-03     3153.1         ARFGAP3; ARFGAP2; TUBA1B; TUBA1A
R-HSA-3371453      Regulation of HSF1-mediated heat shock response                                                3.85E-05   3.85E-02     776.9          BAG1; RPS19BP1

Table 3.3: Pathway analysis of sQTL-specific genes. The Reactome pathway analysis of genes found only by sQTL analysis but not found in the gene- or exon-level eQTL analysis. Pathway size indicates the number of genes annotated with that pathway. The genes unique to each pathway from Sigora are listed (far right). Adjusted p-values are corrected using the Bonferroni method.

GWAS variant  Trait  Splice junction  Gene  GWAS variant P  Splice variant  Eff All  Initial P  Initial β  Conditioned P  LD r2
rs2222543  WHRadjBMI  chr7:120884392:120901686  CPED1  1.75E-19  rs10953934  A  1.75E-19  -0.60  1.00E+00  1.00
rs211718  Metabolic traits  chr1:76190502:76198329  ACADM  9.65E-17  rs138493988  C  9.37E-17  -0.61  9.98E-01  1.00
rs1783541  T2D  chr11:65307352:65307716  LTBP3  1.76E-16  rs1783541  T  1.76E-16  0.95  1.00E+00  1.00
rs2222543  WHRadjBMI  chr7:120884392:120906281  CPED1  2.73E-16  rs10953934  A  2.73E-16  0.55  1.00E+00  1.00
rs17154889  WHRadjBMI  chr5:102296933:102325976  PAM  1.37E-12  rs17154889  A  1.37E-12  -0.52  1.00E+00  1.00
rs4883201  TC  chr12:9099001:9102084  M6PR  1.90E-11  rs4883201  G  1.90E-11  -0.74  1.00E+00  1.00
rs435306  HDL cholesterol  chr20:44536543:44538581  PLTP  3.70E-11  rs378114  C  3.70E-11  -0.52  1.00E+00  1.00
rs1783541  T2D  chr11:65307624:65307716  LTBP3  1.06E-08  rs1144928  G  1.04E-08  -0.68  9.92E-01  1.00
rs3746429  BMI  chr20:33596550:33597990  TRPC4AP  1.68E-06  rs3746429  T  1.68E-06  -0.41  1.00E+00  1.00
rs4938303  Triglycerides  chr11:117090474:117093924  PCSK7  2.39E-06  rs4938303  T  2.39E-06  0.33  1.00E+00  1.00
rs12692738  WHRadjBMI  chr2:165557266:165560954  COBLL1  7.93E-06  rs12692738  C  7.93E-06  -0.37  1.00E+00  1.00
rs6606686  BMI  chr12:110874009:110874470  ARPC3  1.74E-05  rs6606686  C  1.74E-05  0.31  1.00E+00  1.00
rs10838686  HDL-C  chr11:47282226:47282792  NR1H3  2.63E-15  rs753992  A  2.56E-15  -0.56  9.97E-01  1.00
rs10138217  HDL  chr14:92516880:92524663  RP11-529H20.6, ATXN3  2.88E-41  rs910368  A  2.85E-41  0.97  1.00E+00  1.00
rs62034325  BMI  chr16:28604889:28606688  SULT1A2, SULT1A1  5.31E-28  rs62034322  A  5.29E-28  0.68  1.00E+00  1.00
rs17207196  BMI  chr7:75039743:75044163  NSUN5P1  4.55E-08  rs236671  G  3.87E-08  0.40  9.76E-01  1.00
rs757608  WHRadjBMI  chr17:59479312:59480422  TBX2  6.81E-07  rs9905385  G  6.23E-07  0.37  9.82E-01  1.00
rs314311  TC  chr7:100480711:100481691  SRRT  1.09E-05  rs314316  G  1.01E-05  0.34  9.84E-01  1.00
rs6767619  BMI  chr3:156867174:156867274  CCNL1  3.53E-07  rs10451916  T  3.50E-07  -0.39  9.99E-01  1.00
rs10838686  HDL-C  chr11:47281530:47282792  NR1H3  8.34E-44  rs1449626  C  1.76E-44  0.93  9.07E-01  1.00
rs6700838  BMI  chr1:47694194:47694868  TAL1  1.40E-06  rs61782665  G  1.34E-06  0.36  9.93E-01  1.00
rs62034325  BMI  chr16:28549476:28550117  NUPR1  6.18E-18  rs4787458  G  5.71E-18  0.55  9.92E-01  1.00
rs17826219  BMI  chr17:29061961:29062051  SUZ12P  2.19E-09  rs11656845  T  1.05E-09  -0.79  8.87E-01  1.00
rs11220462  LDL-C  chr11:126225737:126275991  ST3GAL4  1.75E-16  rs10893500  C  1.36E-16  -0.68  9.72E-01  1.00
rs9435732  WHRadjBMI  chr1:17313127:17313300  ATP13A2  3.61E-09  rs56265117  T  2.68E-09  -0.42  9.50E-01  1.00
rs2277862  TC  chr20:34130690:34135163  ERGIC3  7.00E-25  rs12625762  A  6.95E-25  -1.11  9.99E-01  1.00
rs2277862  TC  chr20:34130690:34131410  ERGIC3  3.48E-18  rs12625762  A  3.46E-18  0.95  9.99E-01  1.00
rs2277862  TC  chr20:34144519:34144744  ERGIC3  4.12E-13  rs12625762  A  4.10E-13  0.80  9.99E-01  1.00
rs2277862  TC  chr20:34144042:34144744  ERGIC3  7.29E-13  rs12625762  A  7.24E-13  -0.79  9.99E-01  1.00
rs34312154  WHRadjBMI  chr11:47288029:47289464  NR1H3  1.19E-07  rs113459934  A  1.09E-07  0.43  9.84E-01  1.00
rs9435732  WHRadjBMI  chr1:17312853:17312958  ATP13A2  1.80E-07  rs2311528  A  1.23E-07  0.38  9.46E-01  0.99
rs3825061  WHR  chr11:118952024:118952169  VPS11  2.26E-22  rs1168570  C  2.13E-22  -0.75  9.95E-01  0.99
rs17207196  BMI  chr7:75044076:75044163  NSUN5P1  2.54E-25  rs809439  A  4.85E-26  -0.71  8.17E-01  0.99
rs62034325  BMI  chr16:28603802:28604573  SULT1A2, SULT1A1  6.75E-23  rs56209193  T  2.00E-23  0.62  8.75E-01  0.99
rs17207196  BMI  chr7:75042210:75044163  NSUN5P1  2.37E-39  rs794373  C  1.39E-40  0.88  7.41E-01  0.99
rs1075901  BMI  chr17:15881227:15881358  ZSWIM7  1.89E-79  rs2302415  T  1.32E-79  -1.07  9.77E-01  0.99
rs1075901  BMI  chr17:15880406:15880893  ZSWIM7  7.51E-60  rs2302415  T  6.80E-60  0.97  9.87E-01  0.99
rs1075901  BMI  chr17:15881014:15881114  ZSWIM7  4.95E-38  rs2302415  T  3.33E-38  0.81  9.75E-01  0.99
rs9905991  BMI  chr17:80052063:80053196  FASN  9.46E-28  rs62078747  G  2.01E-28  -0.72  8.72E-01  0.99
rs9905991  BMI  chr17:80051647:80053196  FASN  9.46E-28  rs62078747  G  2.01E-28  0.72  8.72E-01  0.99
rs1308362  BMI  chr3:138091918:138116166  MRAS  1.10E-07  rs166195  C  4.00E-08  0.44  8.28E-01  0.99
rs7479183  WHRadjBMI  chr11:802396:802543  PIDD  2.50E-15  rs11246314  A  8.97E-16  0.55  8.53E-01  0.99
rs8070454  BMI  chr17:38184199:38185079  MED24  8.99E-06  rs3826331  C  5.14E-06  -0.31  8.89E-01  0.99
rs2291542  WHR  chr3:49738142:49738924  RNF123  2.27E-09  rs2276864  C  1.34E-09  -0.47  9.07E-01  0.99
rs62034325  BMI  chr16:28606996:28607104  SULT1A2, SULT1A1  2.19E-82  rs743590  A  6.44E-85  1.05  7.32E-01  0.99
rs1997243  TC  chr7:1095119:1095740  GPR146  2.48E-12  rs117800627  C  1.97E-12  0.72  9.32E-01  0.99
rs35576166  WHRadjBMI  chr12:124144629:124144713  GTF2H3  1.83E-05  rs4759370  A  1.79E-05  -0.58  9.95E-01  0.99
rs10838686  HDL-C  chr11:47281530:47281960  NR1H3  1.26E-14  rs4647725  C  4.60E-15  -0.56  8.07E-01  0.99
rs10138217  HDL  chr14:92530758:92537279  RP11-529H20.6, ATXN3  7.12E-32  rs59864685  G  1.62E-32  -0.88  7.97E-01  0.99
rs4500751  Phospholipid levels (plasma)  chr16:15138264:15141299  NTAN1  2.15E-15  rs3803573  T  3.66E-16  -0.56  8.10E-01  0.99
rs12024554  WHR  chr1:19943830:19948594  MINOS1-NBL1, MINOS1  2.38E-06  rs10429880  A  6.87E-07  -0.40  7.86E-01  0.99
rs78429050  HDL  chr3:122135097:122139833  RP11-299J3.8  1.81E-13  rs9843165  C  1.23E-13  -0.72  9.00E-01  0.98
rs1057452  BMI  chr16:29845387:29847025  MVP  1.71E-12  rs12932691  A  1.55E-12  -0.75  9.52E-01  0.98
rs1057452  BMI  chr16:29845758:29847025  MVP  1.71E-12  rs12932691  A  1.55E-12  0.75  9.52E-01  0.98
rs17154889  WHRadjBMI  chr5:102310140:102325976  PAM  1.20E-08  rs12659870  C  3.64E-09  0.43  8.17E-01  0.98
rs17145738  Triglycerides  chr7:73010299:73010483  MLXIPL  3.22E-15  rs13244268  C  2.11E-15  -0.81  9.45E-01  0.98
rs17145738  Triglycerides  chr7:73010245:73010483  MLXIPL  1.15E-13  rs13244268  C  7.13E-14  0.77  9.39E-01  0.98
rs1075901  BMI  chr17:15897092:15902833  ZSWIM7  3.27E-06  rs2779207  A  2.82E-06  -0.32  9.65E-01  0.98
rs2814992  BMI  chr6:34664774:34664936  RP11-140K17.3  5.87E-06  rs4513788  T  2.17E-06  0.34  7.91E-01  0.98
rs10138217  HDL  chr14:92516158:92516685  RP11-529H20.6, ATXN3  1.31E-37  rs58339937  C  1.24E-38  0.93  6.86E-01  0.97
rs10138217  HDL  chr14:92514433:92515947  RP11-529H20.6  4.16E-13  rs58339937  C  1.01E-13  -0.57  7.46E-01  0.97
rs905938  WHRadjBMI  chr1:154938993:154940147  SHC1  3.73E-06  rs1870940  A  1.73E-06  -0.38  8.00E-01  0.97
rs3803042  WHRadjBMI  chr12:54393051:54393493  HOXC-AS1  1.66E-05  rs11614913  T  6.28E-06  -0.30  8.09E-01  0.97
rs7222  WHR  chr12:2055452:2058252  DCP1B  2.04E-19  rs1860047  A  9.12E-20  0.59  8.10E-01  0.97
rs4969387  BMI  chr17:79084759:79089570  BAIAP2  8.20E-08  rs4072589  A  4.46E-08  0.39  8.90E-01  0.97
rs1019503  2hGlu  chr5:96245468:96248341  ERAP2  2.15E-12  rs2617434  G  5.45E-13  0.50  7.77E-01  0.97
rs10176176  Myocardial infarction  chr2:85806290:85808934  VAMP8  4.52E-10  rs6705839  T  1.32E-10  0.44  7.70E-01  0.97
rs6749646  WHR  chr2:25195101:25195495  DNAJC27-AS1  3.43E-08  rs2384078  C  1.54E-08  0.66  8.21E-01  0.96
rs174547  Lipid metabolism phenotypes  chr11:61557393:61557965  TMEM258  1.73E-10  rs174592  G  1.97E-11  -0.45  6.59E-01  0.96
rs174547  Lipid metabolism phenotypes  chr11:61557462:61557847  TMEM258  9.58E-08  rs174592  G  5.14E-08  0.37  8.27E-01  0.96
rs2814992  BMI  chr6:34725328:34730372  SNRPC  4.01E-27  rs1201874  C  1.56E-27  -0.73  6.26E-01  0.95
rs4969387  BMI  chr17:79082309:79089570  BAIAP2  1.57E-08  rs4969248  T  1.04E-08  -0.41  8.83E-01  0.95
rs926279  BMI  chr6:58273455:58275151  LINC00680  6.78E-08  rs4563744  G  1.38E-08  -0.42  6.78E-01  0.94
rs62034325  BMI  chr16:28603764:28606688  SULT1A2, SULT1A1  2.35E-18  rs149271  G  4.49E-20  0.58  5.17E-01  0.94
rs62034325  BMI  chr16:28604667:28604763  SULT1A2, SULT1A1  2.86E-07  rs149271  G  4.15E-09  0.38  3.86E-01  0.94
rs62034325  BMI  chr16:28620180:28621252  SULT1A2, SULT1A1  4.45E-11  rs56272201  G  1.57E-11  0.43  6.91E-01  0.94
rs1048775  BMI  chr17:79213743:79214787  C17orf89  2.20E-72  rs2255166  C  5.11E-77  -1.10  3.44E-01  0.94
rs1048775  BMI  chr17:79213478:79214787  C17orf89  1.85E-53  rs2255166  C  7.13E-59  1.00  3.16E-01  0.94
rs7187776  BMI  chr16:28846720:28846870  ATXN2L  2.26E-58  rs11863370  G  4.72E-63  0.97  2.81E-01  0.93
rs7187776  BMI  chr16:28846720:28846867  ATXN2L  7.66E-58  rs11863370  G  9.22E-63  -0.96  2.75E-01  0.93
rs9931407  BMI  chr16:67264694:67265333  FHOD1  6.56E-14  rs72798032  A  1.14E-14  0.81  4.82E-01  0.93
rs8396  Metabolic traits  chr4:159620282:159624575  ETFDH  2.03E-22  rs17843919  G  4.82E-23  -0.76  6.34E-01  0.93
rs62034325  BMI  chr16:28606785:28606871  SULT1A2, SULT1A1  4.99E-21  rs28698667  T  1.62E-23  0.62  4.38E-01  0.93
rs10176176  Myocardial infarction  chr2:85806290:85808699  VAMP8  6.56E-16  rs2366639  G  1.48E-16  -0.55  5.21E-01  0.93
rs12631248  BMI  chr3:49722815:49722904  MST1  1.88E-05  rs75729494  A  2.69E-06  -0.70  6.25E-01  0.93
rs3783347  FGlu  chr14:100835595:100841620  WARS  1.56E-18  rs4905957  T  7.64E-19  0.75  6.28E-01  0.93
rs10838686  HDL-C  chr11:47279680:47281342  NR1H3  1.08E-21  rs11039155  A  1.27E-23  0.70  4.13E-01  0.92
rs10105606  Triglycerides  chr8:19777483:19824493  UNKNOWN  1.60E-27  rs13702  C  1.28E-30  0.80  2.99E-01  0.92
rs1019503  2hGlu  chr5:96235893:96237210  ERAP2  2.15E-14  rs2927608  A  2.24E-15  0.55  7.17E-01  0.92
rs1019503  2hGlu  chr5:96232567:96235825  ERAP2  8.39E-13  rs2927608  A  1.41E-14  0.53  4.92E-01  0.92
rs1019503  2hGlu  chr5:96232208:96232436  ERAP2  3.56E-10  rs2927608  A  8.53E-11  0.45  6.94E-01  0.92
rs1019503  2hGlu  chr5:96238057:96239081  ERAP2  5.83E-07  rs2927608  A  1.04E-08  0.40  2.96E-01  0.92
rs1019503  2hGlu  chr5:96237385:96237978  ERAP2  7.97E-07  rs2927608  A  5.20E-07  0.35  7.09E-01  0.92
rs7479183  WHRadjBMI  chr11:778218:783530  AP006621.5  5.49E-20  rs28633403  A  8.91E-23  -0.66  3.06E-01  0.92
rs62034325  BMI  chr16:28619924:28620029  SULT1A2, SULT1A1  1.85E-78  rs11074907  G  1.29E-81  -1.04  3.45E-01  0.92
rs114031082  WHR  chr6:31237862:31237987  HLA-C, HLA-B  3.08E-37  rs115114271  C  2.19E-42  -0.94  2.32E-01  0.91
rs1105881  WHRadjBMI  chr15:42134789:42135874  JMJD7-PLA2G4B, PLA2G4B  9.22E-14  rs4924580  A  5.85E-15  -0.52  4.92E-01  0.91
rs6738445  BMI  chr2:172586363:172600559  DYNC1I2  4.03E-18  rs12692973  T  2.84E-20  -0.70  1.62E-01  0.91
rs11715915  FGlu  chr3:49721622:49721804  MST1  1.97E-49  rs9852529  A  2.86E-58  0.95  3.21E-02  0.90
rs11715915  FGlu  chr3:49721622:49721747  MST1  6.90E-49  rs9858213  T  1.04E-58  -0.96  2.70E-02  0.90
rs211718  Metabolic traits  chr1:76216231:76226807  ACADM  1.85E-05  rs1251075  G  8.54E-06  0.33  6.67E-01  0.89
rs62034325  BMI  chr16:28726755:28734485  EIF3C  1.44E-17  rs378421  A  6.03E-21  0.59  2.46E-01  0.88
rs62034325  BMI  chr16:28732353:28734485  EIF3C  5.31E-12  rs378421  A  1.65E-14  -0.49  2.77E-01  0.88
rs62034325  BMI  chr16:28617557:28618082  SULT1A2, SULT1A1  8.66E-92  rs3176926  G  1.46E-95  -1.10  9.90E-02  0.88
rs62034325  BMI  chr16:28603764:28604573  SULT1A2, SULT1A1  1.05E-85  rs3176926  G  8.02E-91  1.08  9.95E-02  0.88
rs9435732  WHRadjBMI  chr1:17301518:17301764  MFAP2  3.19E-06  rs55779591  T  8.83E-08  -0.38  3.30E-01  0.87
rs3814883  BMI  chr16:30012361:30012735  INO80E  2.17E-12  rs4787491  G  7.32E-15  0.51  2.48E-01  0.87
rs3814883  BMI  chr16:30014847:30014971  INO80E  8.46E-10  rs4787491  G  1.87E-10  -0.43  5.51E-01  0.87
rs3814883  BMI  chr16:30014991:30015889  INO80E  1.66E-09  rs4787491  G  1.47E-12  -0.47  1.73E-01  0.87
rs3814883  BMI  chr16:30015978:30016542  INO80E  3.55E-09  rs4787491  G  5.56E-12  -0.46  1.99E-01  0.87
rs3814883  BMI  chr16:30012851:30016542  INO80E  5.44E-09  rs4787491  G  2.47E-11  0.45  2.66E-01  0.87
rs3814883  BMI  chr16:30015978:30016071  INO80E  1.00E-06  rs4787491  G  2.92E-08  -0.37  3.46E-01  0.87
rs524281  BMI  chr11:66052437:66053172  YIF1A  3.04E-15  rs7951079  T  9.68E-17  0.66  3.50E-01  0.87
rs1105881  WHRadjBMI  chr15:42135421:42135874  JMJD7-PLA2G4B, PLA2G4B  3.70E-17  rs9672671  T  4.53E-21  0.63  1.20E-01  0.87
rs17826219  BMI  chr17:29226624:29227431  TEFM  7.08E-06  rs55811708  T  8.40E-07  -0.66  4.02E-01  0.86
rs4329  Metabolic traits  chr17:61574346:61574498  ACE  7.66E-08  rs4968651  T  9.77E-09  0.40  7.76E-01  0.85
chr6:32448398  WHRadjBMI  chr6:32548633:32549334  HLA-DRB5, HLA-DRB1  7.90E-43  chr6:32448561  G  1.85E-50  -0.98  3.82E-02  0.84
chr6:32448398  WHRadjBMI  chr6:32548047:32548523  HLA-DRB5, HLA-DRB1  3.26E-40  chr6:32448561  G  6.77E-46  0.95  6.68E-02  0.84
rs709400  BMI  chr14:104040507:104053611  RP11-73M18.2, APOPT1  3.35E-14  rs28513222  T  1.36E-17  -0.59  1.19E-01  0.84
rs2034088  HIPadjBMI  chr17:424978:436082  VPS53  2.06E-11  rs2289620  C  3.81E-14  0.51  1.11E-01  0.83
rs9905991  BMI  chr17:80115824:80121076  CCDC57  2.18E-08  rs4789678  T  1.60E-11  0.45  5.54E-02  0.83
rs386000  HDL-C  chr19:54802229:54802483  LILRB5, LILRB2, LILRA3  7.16E-16  rs434124  G  2.49E-18  -0.64  1.96E-01  0.83
rs7479183  WHRadjBMI  chr11:780381:783530  AP006621.5  2.85E-24  rs7104929  G  5.62E-31  0.74  1.44E-02  0.83
rs7479183  WHRadjBMI  chr11:777695:783530  AP006621.5  7.14E-22  rs7104929  G  2.82E-25  -0.68  7.08E-02  0.83
rs2034088  HIPadjBMI  chr17:424978:425234  VPS53  1.27E-10  rs4968124  C  4.99E-14  -0.51  6.83E-02  0.82
rs672356  WHRadjBMI  chr17:18236602:18243357  SHMT1  1.52E-47  rs2273028  A  2.66E-71  -1.11  2.09E-06  0.81
rs672356  WHRadjBMI  chr17:18236602:18238873  SHMT1  9.16E-23  rs3794772  T  8.65E-31  0.79  4.93E-03  0.81
rs7795371  WHRadjBMI  chr7:77313568:77314519  RSBN1L-AS1  5.51E-13  rs34160426  C  2.61E-15  -0.52  1.95E-01  0.80
rs672356  WHRadjBMI  chr17:18238989:18243357  SHMT1  1.20E-18  rs12946058  T  6.00E-26  0.73  5.26E-03  0.80
rs672356  WHRadjBMI  chr17:18233985:18243357  SHMT1  3.14E-14  rs12946058  T  1.68E-17  -0.61  6.57E-02  0.80
rs3772071  T2D  chr2:161114557:161126713  ITGB6  1.69E-10  rs764729  C  1.70E-13  -0.67  3.58E-02  0.80
rs12631248  BMI  chr3:49722317:49722445  MST1  4.02E-06  rs62260813  G  1.28E-06  0.74  4.95E-01  0.80
rs4307732  LDL cholesterol  chr11:126225737:126275991  ST3GAL4  3.67E-13  rs10893500  C  1.36E-16  -0.68  7.72E-02  0.80

Table 3.4: sQTL colocalizations with cardiometabolic traits. This table shows the results of the LD and conditional analysis method that met the criteria for colocalization (LD r2 > 0.8 and P conditioned on GWAS > 2.03 x 10^-5). LD r2 for this table was calculated using 10K Finnish men from the wider METSIM study imputed to HRC. The allele corresponding to the effect size (β) of the sQTL is shown in the “Eff All” column. Abbreviations include fasting glucose (FGlu) and 2hr glucose (2hGlu).
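The two colocalization criteria stated in the Table 3.4 caption can be written as a simple decision rule. The following is a minimal sketch and not the analysis code used in this work: the function and argument names are hypothetical, and only the two thresholds (LD r2 > 0.8 and a conditional P-value above 2.03 x 10^-5) come from the caption.

```python
def is_colocalized(ld_r2, p_cond_on_gwas, r2_min=0.8, p_min=2.03e-5):
    """Apply the two criteria from the Table 3.4 caption (hypothetical helper).

    ld_r2: LD r2 between the lead sQTL variant and the GWAS variant.
    p_cond_on_gwas: P-value of the lead sQTL variant after conditioning
        on the GWAS variant.
    """
    # Criterion 1: the two lead variants are in high LD.
    in_high_ld = ld_r2 > r2_min
    # Criterion 2: conditioning on the GWAS variant removes the sQTL signal,
    # i.e. the conditional P-value is no longer below the significance threshold.
    signal_removed = p_cond_on_gwas > p_min
    return in_high_ld and signal_removed


# Values from the NR1H3 row with lead sQTL variant rs1449626
# (LD r2 = 1.00, conditional P = 9.07E-01).
print(is_colocalized(1.00, 9.07e-01))
# A hypothetical pair of variants in weak LD, for contrast.
print(is_colocalized(0.45, 9.07e-01))
```

Keeping the thresholds as keyword arguments makes it easy to reuse the same rule with a different significance threshold, such as the gene-level or exon-level values listed in Table 3.2.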

Signal Designation   rsID-allele    Locus Name or Nearest Gene   MAF    GWAS P-value   GWAS β   GWAS trait   GWAS publication
A                    rs3136441-T    F2                           0.12   2.00E-08       -0.03    HDL-C        Hoffmann 2018
B                    rs10838692-T   MADD                         0.31   6.00E-20       -0.03    HDL-C        Hoffmann 2018
C                    rs10838686-T   MADD                         0.17   1.00E-22       -0.04    HDL-C        Hoffmann 2018
D                    rs10501320-G   MADD                         0.28   1.10E-88       0.08     proinsulin   Strawbridge 2011
E                    rs35233100-T   MADD                         0.04   8.00E-15       -0.10    proinsulin   Huyghe 2013
F                    rs10838687-T   MADD                         0.20   7.00E-12       0.03     proinsulin   Strawbridge 2011
G                    rs11039014-A   MTCH2                        0.23   3.28E-12       0.02     BMI          Yengo 2018
H                    rs7124681-A    MTCH2                        0.41   2.61E-87       0.04     BMI          Yengo 2018
I                    rs34312154-A   RAPSN                        0.09   3.69E-10       0.02     WHRadjBMI    Pulit 2018

Table 3.5: Distinct GWAS signals around the MADD-NR1H3 locus. A summary of the cardiometabolic GWAS signals in the MADD-NR1H3 region. Variants from the same papers were found to be distinct either by conditional analysis or by GCTA.

Signal   Variant      Trait        A      B      C      D      E      F      G      H      I
A        rs3136441    HDL-C        -      0.92   0.93   0.71   1.00   0.93   1.00   0.88   0.81
B        rs10838692   HDL-C        0.33   -      1.00   1.00   1.00   1.00   0.35   0.81   0.90
C        rs10838686   HDL-C        0.59   0.58   -      1.00   1.00   1.00   0.57   0.82   0.91
D        rs10501320   proinsulin   0.03   0.13   0.08   -      1.00   1.00   0.63   0.40   0.88
E        rs35233100   proinsulin   0.01   0.02   0.01   0.13   -      1.00   0.38   0.51   1.00
F        rs10838687   proinsulin   0.48   0.70   0.82   0.09   0.01   -      0.44   0.76   0.91
G        rs11039014   BMI          0.55   0.09   0.26   0.04   0.00   0.19   -      0.91   0.60
H        rs7124681    BMI          0.14   0.30   0.18   0.02   0.00   0.18   0.57   -      1.00
I        rs34312154   WHRadjBMI    0.53   0.39   0.69   0.05   0.01   0.57   0.24   0.22   -

Table 3.6: Pairwise LD between nine GWAS signals in the MADD-NR1H3 region. Pairwise LD between the cardiometabolic GWAS variants in the MADD-NR1H3 region was calculated using Finnish LD in 1000G, phase 3 data. The numbers below the diagonal are the pairwise LD r2 between the variants, and those above the diagonal are the pairwise D’ values.
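Several pairs in Table 3.6 combine a D’ of 1.00 with a low r2 (for example, signals D and E: D’ = 1.00, r2 = 0.13). The sketch below shows how both statistics follow from the standard definitions for two biallelic variants, and why they diverge when allele frequencies differ. It is an illustration under stated assumptions: the function name and the haplotype frequencies are hypothetical and are not taken from the 1000G data used for the table.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic variants (hypothetical helper).

    p_ab: frequency of the haplotype carrying allele a at variant 1
          and allele b at variant 2.
    p_a, p_b: frequencies of allele a and allele b.
    """
    # D is the deviation of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    # D' scales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r2 is the squared correlation between the two alleles.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical example: a rare allele (frequency 0.04) that always occurs
# on haplotypes carrying a more common allele (frequency 0.28).
d, d_prime, r2 = ld_stats(p_ab=0.04, p_a=0.04, p_b=0.28)
print(f"D'={d_prime:.2f}, r2={r2:.2f}")
```

In this example no recombinant haplotype is observed, so D’ reaches its maximum, while r2 stays low because the two alleles differ strongly in frequency. This is one reason the conditional analyses in this chapter rely on r2 rather than D’.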

GWAS variant   Trait        Gene    GWAS variant P   GWAS variant β   Lead eQTL variant   Conditional eQTL   eSNP P     eSNP β   Conditioned P   Conditioned β   LD r2
rs10838686     HDL-C        NR1H3   4.58E-25         0.72             rs75393320          Primary            2.80E-29   0.78     2.19E-01        0.09            0.93
rs10501320     proinsulin   NR1H3   4.19E-11         -0.55            rs11039149          Secondary          2.39E-11   -0.56    9.49E-01        -0.01           1.00

Table 3.7: Gene-level colocalizations at NR1H3. This table includes the conditional analysis of two significant eQTLs at NR1H3, which are the primary and secondary gene-level signals. The initial association between the GWAS variant and NR1H3 is shown, followed by the association of the lead eQTL variant before and after conditioning on the GWAS variant. LD r2 was calculated in 10K Finnish males from the larger METSIM study.

rsID chr:pos Alleles MAF R2 Correlated_Alleles RegulomeDB Function

rs1449626 chr11:47290759 (A/C) 0.28 1.00 A=A,C=C 4

rs59567949 chr11:47300155 (G/A) 0.28 1.00 A=G,C=A 5

rs4752977 chr11:47300429 (T/C) 0.28 1.00 A=T,C=C 6

rs10838685 chr11:47305176 (G/A) 0.28 1.00 A=G,C=A 7

rs1051006 chr11:47306585 (G/A) 0.28 1.00 A=G,C=A 1f miss

rs10838686 chr11:47311367 (G/T) 0.28 1.00 A=G,C=T 2b

rs58023804 chr11:47312374 (A/G) 0.28 1.00 A=A,C=G 5

rs5791755 chr11:47317625 (AAAAG/-) 0.28 1.00 A=AAAAG,C=- .

rs11493249 chr11:47323821 (G/C) 0.28 1.00 A=G,C=C 6

rs4752978 chr11:47324862 (G/A) 0.28 1.00 A=G,C=A 5

rs74236439 chr11:47333786 (T/G) 0.28 1.00 A=T,C=G 6

rs10838689 chr11:47334524 (G/A) 0.28 1.00 A=G,C=A 1f

rs2290148 chr11:47346145 (G/A) 0.28 1.00 A=G,C=A 1f

rs7124958 chr11:47347260 (C/T) 0.28 1.00 A=C,C=T 1f

rs11341424 chr11:47347688 (G/-) 0.28 1.00 A=G,C=- 5

rs753992 chr11:47349846 (G/A) 0.28 1.00 A=G,C=A 1f

rs11570124 chr11:47352082 (CT/-) 0.28 1.00 A=CT,C=- .

rs4752983 chr11:47352138 (C/T) 0.28 1.00 A=C,C=T 1b

rs3729805 chr11:47352757 (C/T) 0.28 1.00 A=C,C=T 6

rs3729804 chr11:47352862 (G/A) 0.28 1.00 A=G,C=A 5

rs2290146 chr11:47353498 (G/A) 0.28 1.00 A=G,C=A 2b

rs3729802 chr11:47354068 (G/A) 0.28 1.00 A=G,C=A 4

rs2254240 chr11:47357170 (G/A) 0.28 1.00 A=G,C=A 7

rs11039189 chr11:47358035 (C/T) 0.28 1.00 A=C,C=T 6

rs7928073 chr11:47361857 (C/T) 0.28 1.00 A=C,C=T 4

rs10769253 chr11:47362339 (G/A) 0.28 1.00 A=G,C=A 1f

rs10769255 chr11:47367371 (C/T) 0.28 1.00 A=C,C=T 1f

rs113681661 chr11:47289065 (-/T) 0.28 0.95 A=-,C=T .

rs12574081 chr11:47286414 (G/A) 0.28 0.95 A=G,C=A 3a

rs77532017 chr11:47284317 (C/T) 0.28 0.95 A=C,C=T 6

rs6144337 chr11:47284302 (CCATGTGCCTCA/-) 0.28 0.95 A=CCATGTGCCTCA,C=- .

rs2279238 chr11:47282024 (C/T) 0.28 0.95 A=C,C=T 6 syn

rs11039148 chr11:47274126 (T/A) 0.28 0.95 A=T,C=A 6

rs3758670 chr11:47272665 (G/A) 0.28 0.95 A=G,C=A 7

rs16938581 chr11:47271039 (G/A) 0.28 0.95 A=G,C=A 2b

rs7118396 chr11:47269936 (C/T) 0.28 0.95 A=C,C=T 1b

rs2242261 chr11:47266808 (T/G) 0.28 0.95 A=T,C=G 1f

rs2242262 chr11:47266540 (G/T) 0.28 0.95 A=G,C=T 1f

rs74948330 chr11:47265745 (C/A) 0.28 0.95 A=C,C=A 7

rs7120158 chr11:47265194 (G/C) 0.28 0.95 A=G,C=C 1b

rs11039144 chr11:47264044 (G/A) 0.28 0.95 A=G,C=A 4

rs4752972 chr11:47263060 (C/T) 0.28 0.95 A=C,C=T 6

rs7933246 chr11:47262648 (G/C) 0.28 0.95 A=G,C=C 4

rs3781620 chr11:47259264 (G/C) 0.28 0.95 A=G,C=C 6

rs3824866 chr11:47258853 (C/T) 0.28 0.95 A=C,C=T 1f

rs4647754 chr11:47257340 (C/T) 0.28 0.95 A=C,C=T 5

rs3781619 chr11:47255317 (G/A) 0.28 0.95 A=G,C=A 1f

rs830083 chr11:47254051 (G/C) 0.28 0.95 A=C,C=G 6

rs4647741 chr11:47254010 (T/C) 0.28 0.95 A=T,C=C 7

rs4647737 chr11:47253244 (T/G) 0.28 0.95 A=T,C=G 6

rs830084 chr11:47251652 (T/C) 0.28 0.95 A=C,C=T 6

rs2596398 chr11:47249463 (G/C) 0.28 0.95 A=C,C=G 7

rs2633083 chr11:47249420 (T/C) 0.28 0.95 A=C,C=T 7

rs2633080 chr11:47247560 (A/C) 0.28 0.95 A=C,C=A 5

rs76832686 chr11:47247287 (C/T) 0.28 0.95 A=C,C=T 2b

rs7395496 chr11:47245803 (A/G) 0.28 0.95 A=G,C=A 6

rs4647726 chr11:47245494 (A/T) 0.28 0.95 A=A,C=T 6

rs4647725 chr11:47245389 (T/C) 0.28 0.95 A=T,C=C 6

rs10769254 chr11:47362465 (G/C) 0.29 0.93 A=G,C=C 1f

rs2306353 chr11:47256708 (C/T) 0.28 0.93 A=C,C=T 2b

rs11039155 chr11:47280762 (G/A) 0.26 0.93 A=G,C=A 4

rs75393320 chr11:47266471 (G/C) 0.26 0.93 A=G,C=C 5

rs326224 chr11:47255598 (T/C) 0.27 0.90 A=C,C=T 4


rs59002769 chr11:47278205 (-/A) 0.26 0.88 A=-,C=A .

rs11305920 chr11:47339325 (T/-) 0.31 0.84 A=T,C=- .

rs10838687 chr11:47312892 (T/G) 0.32 0.82 A=T,C=G 1b

rs11530166 chr11:47321966 (C/T) 0.32 0.82 A=C,C=T 6

rs60562402 chr11:47322007 (G/-) 0.32 0.82 A=G,C=- .

rs6485751 chr11:47336442 (G/C) 0.32 0.82 A=G,C=C 6

rs4752979 chr11:47339180 (A/G) 0.32 0.82 A=A,C=G 1a

rs753993 chr11:47349969 (C/A) 0.32 0.82 A=C,C=A 1f

Table 3.8: Proxy variants for the lead sQTL variant at NR1H3 (rs1449626): This table shows the proxy variants for rs1449626 in Finns (r2>0.8). Functional variants are denoted miss for missense and syn for synonymous.

CHAPTER 4: DISCUSSION

One of the main challenges in the genetic analysis of complex traits is identifying the target genes affected by variants, especially noncoding variants, detected by genetic association studies. In this dissertation, I address this challenge for cardiometabolic traits through the use of genotypes, adipose gene expression data, and clinical trait measurements from the same individuals. After discovering and addressing cell-type heterogeneity in adipose tissue, I investigated two types of molecular mechanisms through which variants may affect genes and explored statistical approaches to colocalization with GWAS signals and association with clinical traits.

In previous chapters, I have described the analysis of subcutaneous adipose tissue expression levels from 434 Finnish men from the METSIM cohort. In Chapter 2, I first examined the distinctions between primary and secondary cis-eQTLs, showing that primary eQTLs are nearer to transcription start sites of genes and are more enriched in both promoters and enhancers than secondary eQTLs. I then described the results of colocalization between adipose eQTLs and cardiometabolic GWAS signals, including the identification of 318 candidate genes. When I examined these eQTLs to determine whether gene expression was more commonly increased or decreased when associated with stronger cardiometabolic risk in GWAS, I found that gene expression was decreased with risk approximately 53% of the time, but that the strongest eQTLs by significance level were expression-lowering. I identified 49 associations between genes and clinical traits measured within the same samples and evidence of six genes mediating the relationship between variant and trait at GWAS loci. In Chapter 3, I compared sQTL discovery with that of gene- and exon-level eQTLs, and found expression differences that indicate the activity of pathways related to lipid homeostasis and inflammation. I identified 104 variants associated with 132 splice junctions in 74 genes as colocalized with cardiometabolic GWAS signals. At the complex MADD-NR1H3 locus, I described a novel splice junction that removes the zinc-finger domain of NR1H3, potentially rendering it incapable of binding DNA, suggesting a specific mechanism underlying a variant association with HDL cholesterol levels.


One challenge I addressed through this work is the unexpectedly high cell-type heterogeneity of adipose tissue samples collected by needle biopsy. When compared with tissue from GTEx, some of the samples contained a high proportion of whole-blood-like expression. Initially, I attempted to diminish the effect of this variation by using high numbers of covariates from PEER (k > 100), but found that this was insufficient to address the problem of expression level differences between tissues. When I removed samples with a high percentage of blood-like expression (>50%), the cis-eQTLs identified were more numerous, stronger, and better matched to known adipose eQTLs, even with a lower number of covariates included in the model. In the future, studies may address differences in cell type by using single-cell sequencing to identify expression profiles for the many cell types native to adipose tissue. These data may then be used to more cleanly identify eQTLs for the tissue at large, and for specific cell types within it, such as infiltrating macrophages.

As GWAS continue to increase in sample size, many more loci, some with multiple distinct signals, are reported to be associated with clinical traits and diseases. However, the genes and mechanisms at hundreds of these signals remain unknown. One of several approaches to identifying the genes that may influence trait risk at GWAS loci is colocalization with eQTLs in tissues relevant to clinical traits and disease16,126,127. The utility of eQTLs to associate genes with GWAS loci originates from the hypothesis that variants are unlikely to be associated with GWAS trait risk and functionally impact expression of nearby genes by chance. However, it is not sufficient to simply verify that the GWAS variant is associated with the expression of a gene. The GWAS variant may be significantly associated with the eQTL gene due to moderate LD with the lead eQTL variant and still not be among the variants with the most significant effects on gene expression. Colocalization methods formally test for evidence that the GWAS and eQTL signals are part of the same association. These analyses offer the opportunity to identify the gene or genes associated with multiple GWAS signals by using conditional eQTLs and to identify potential mechanisms based on the inclusion or exclusion of different exons via sQTL analysis.

There are a number of potential obstacles to successful colocalization analyses. One difficulty is the difference in LD patterns across populations. Even within Europeans, allele frequencies differ between countries. The linkage patterns in our Finnish eQTL data can differ substantially from those in broader European or even cross-ancestry meta-analyses, but the eQTL population LD has been shown to impact the colocalization more than that of the GWAS data133. One potential explanation for this is that most GWAS studies are meta-analyses with differences in sample size between variants, whereas most eQTL studies are conducted within one study population, causing LD estimation to be comparatively cleaner. Another barrier to colocalization analyses is the presence of multiple signals in the GWAS data, the eQTL data, or both. Many colocalization methods are not equipped to detect multiple signals65. Using conditional analysis in primary data or joint conditional analysis methods like GCTA17 can separate these signals to better assess the distinct association signals.

Colocalization has been a very active area of methods development for several years. Some methods, such as conditional analysis and RTC67, require access to primary data for either the eQTL data or the GWAS data. The LD and conditional analysis method determines whether signals are colocalized using the LD between the lead variants from the GWAS and eQTL data and the significance of the remaining association of the lead eQTL variant after conditioning on the lead GWAS variant. This method is sensitive to the identification of the lead variants in the GWAS and the eQTL study, which can vary according to allele frequency and sample size133. The RTC method assesses the change in significance of the eQTL after including the GWAS SNP in the model, accounting for LD using recombination hotspots67.
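The LD and conditional analysis logic described above can be illustrated with a minimal sketch. It is not the pipeline used in this dissertation: the function names, the r² threshold of 0.8, and the |t| cutoff of 2.58 (roughly a two-sided P of 0.01 in large samples) are assumptions chosen for illustration, and a real analysis would also adjust for covariates such as PEER factors.

```python
import numpy as np

def ld_r2(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors."""
    r = np.corrcoef(g1, g2)[0, 1]
    return r * r

def ols_t_stat(y, X):
    """t statistic for the LAST column of X in an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[-1] / np.sqrt(cov[-1, -1])

def colocalized(expr, g_eqtl, g_gwas, r2_min=0.8, t_max=2.58):
    """Call a colocalization when the lead variants are in high LD and the
    lead eQTL variant has little residual association with expression after
    conditioning on the lead GWAS variant. Thresholds are illustrative.
    Assumes the two variants are not perfectly correlated (otherwise the
    conditional model is singular)."""
    r2 = ld_r2(g_eqtl, g_gwas)
    # The lead eQTL variant is the last column, so t_cond is its conditional t.
    t_cond = ols_t_stat(expr, np.column_stack([g_gwas, g_eqtl]))
    return bool(r2 >= r2_min and abs(t_cond) < t_max), r2, t_cond
```

Given an expression vector and dosage vectors for the two lead variants, `colocalized(expr, g_eqtl, g_gwas)` returns the call together with the pairwise r² and the conditional t statistic, so the two criteria can be inspected separately.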

Because it is not always possible to obtain the primary data or conduct conditional analysis, especially when the GWAS data are from a large-scale meta-analysis, several methods have been developed to conduct colocalization analysis using only summary data65,66,68–70. COLOC is a method that uses summary statistics to test five hypotheses, of which the most important for colocalization are (Hypothesis 3) that the eQTL and GWAS signals both exist, but are not colocalized and (Hypothesis 4) that both signals exist and are colocalized. COLOC has been shown to be sensitive to the initial Bayesian priors72, and the authors note that COLOC is not designed for the presence of multiple signals in either the GWAS or the eQTL data66. eCAVIAR is a Bayesian method that attempts to find the causal variant(s) shared between GWAS and eQTL summary statistics and can reportedly detect up to six signals at a given locus; however, the model assumes that the causal variant will not be in high LD with a block of variants with similar p-values65, which may not reflect the linkage patterns of complex loci. All of these methods are sensitive to the thresholds or priors that are used.
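To make the five-hypothesis framework concrete, the following sketch computes COLOC-style posterior probabilities from per-variant z-scores under the single-causal-variant assumption. It follows the published approximate Bayes factor formulation as I understand it, but it is a simplified illustration rather than the COLOC software itself; the function names are hypothetical, and the prior values (p1 = p2 = 1e-4, p12 = 1e-5) and the prior effect variance are assumptions that would need to be checked against the package documentation before any real use.

```python
import numpy as np

def log_abf(z, var_beta, prior_var):
    """Wakefield log approximate Bayes factor for each variant."""
    r = prior_var / (prior_var + var_beta)
    return 0.5 * (np.log(1.0 - r) + r * np.asarray(z) ** 2)

def logsumexp(x):
    """Numerically stable log(sum(exp(x)))."""
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

def logdiffexp(a, b):
    """Numerically stable log(exp(a) - exp(b)); requires a > b."""
    return a + np.log1p(-np.exp(b - a))

def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 for two traits over the same
    set of at least two variants, assuming one causal variant per trait."""
    s1 = logsumexp(labf1)           # evidence for trait 1 alone
    s2 = logsumexp(labf2)           # evidence for trait 2 alone
    s12 = logsumexp(labf1 + labf2)  # evidence for a shared variant
    log_h = np.array([
        0.0,                                              # H0: no association
        np.log(p1) + s1,                                  # H1: trait 1 only
        np.log(p2) + s2,                                  # H2: trait 2 only
        np.log(p1) + np.log(p2) + logdiffexp(s1 + s2, s12),  # H3: distinct variants
        np.log(p12) + s12,                                # H4: shared variant
    ])
    return np.exp(log_h - logsumexp(log_h))
```

Rerunning `coloc_posteriors` with different values of `p12` is a simple way to see the prior sensitivity noted above: the ratio of the Hypothesis 4 to Hypothesis 3 posteriors scales directly with `p12 / (p1 * p2)`.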

Due to the ever-growing catalogue of GWAS signals and the expanding set of tissues assayed for eQTLs, colocalization methods and analyses will continue to make important contributions to identifying candidate genes for functional testing and potentially pharmaceutical intervention. When primary data are accessible, assessment of LD between the eQTL and GWAS variants together with conditional analysis provides very strong evidence of colocalization. This method is fast to implement and provides a clear way to handle multiple signals in both the GWAS and eQTL data. When only summary statistics are available, COLOC using GCTA-conditioned statistics provides the best chance of success in teasing out the often-complex relationships between variants at loci with multiple identified signals147.

Chapters 2 and 3 focus on eQTLs for different aspects of adipose expression data. Notably, I found fewer sQTLs than gene-level eQTLs and, consequently, fewer colocalizations with GWAS signals. While it is possible that fewer regulatory effects of this type exist at the splicing level than at the promoter and enhancer level, the difference in the number of observed QTLs could be explained by differences in power to detect expression changes. Because only reads that span splice junctions between exons are used, the number of available reads in the splice analysis is greatly reduced compared with the gene-level analysis, resulting in a large difference in power. With greater sample size or greater sequencing depth, it is likely that more sQTLs would be detected and proportionally more sQTL-GWAS colocalizations would be discovered.

The analyses presented in this dissertation have limitations. Associations are not necessarily causal. I attempted to address this in Chapter 2 through use of mediation analyses, which test for the effect of a genetic variant through gene expression on clinical traits. Mediation analyses are a more direct test of the influence of gene expression on traits, and provide further support for the role of colocalized genes in changes in clinical trait values. These analyses were valuable to include, but will benefit in the future from increased sample size. Another potential limitation is that eQTLs might not always point to a true target gene. Since enhancers can be shared among multiple genes in a region, eQTL variants that fall in enhancers may affect multiple genes, which does not necessarily implicate all of these genes in the colocalization. It is also important to consider other factors at multi-gene loci, such as the biological functions of those genes, if known, and any evidence of mediation. Future methods development and improved understanding of mediation will allow for wider usage of this method to support colocalization efforts in implicating genes in GWAS traits. Strategies for interpretation of biological function and annotation of eQTL variants by additional methods, such as open chromatin, will enable eQTL data to be most useful for interpretation of genetic studies.
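As an illustration of what a mediation analysis estimates, the sketch below decomposes the total effect of a variant on a trait into a direct effect and an indirect effect transmitted through gene expression, using the classic product-of-coefficients approach with ordinary least squares. This is an assumed, simplified stand-in and not necessarily the mediation framework applied in Chapter 2; the function names are hypothetical, and covariates and significance testing (for example by bootstrap) are omitted.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients with an intercept in position 0."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def mediation_effects(snp, expr, trait):
    """Product-of-coefficients mediation: variant -> expression -> trait."""
    total = ols(trait, snp)[1]          # c: trait ~ snp
    a = ols(expr, snp)[1]               # a: expr ~ snp
    coef = ols(trait, np.column_stack([snp, expr]))
    direct, b = coef[1], coef[2]        # c': snp effect given expr; b: expr effect
    indirect = a * b                    # effect transmitted through expression
    return {
        "total": total,
        "direct": direct,
        "indirect": indirect,
        "proportion_mediated": indirect / total,
    }
```

With linear models fitted to the same samples, the total effect equals the direct effect plus the indirect effect, so the proportion mediated gives a rough sense of how much of the variant-trait association is accounted for by expression of the colocalized gene.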

As eQTL studies swell in sample size, both primary and conditional signals will grow more numerous and potentially lead to more complete annotation of candidate genes and mechanisms at GWAS signals through colocalization. Additionally, eQTL studies in more diverse ancestry groups will prove invaluable to connect genes with GWAS signals that are rare or non-existent in European ancestry data. Integration of multiple -omics strategies, such as open chromatin and chromatin conformation capture, will enhance the utility of eQTLs further by narrowing the candidate functional variants. In the future, eQTL studies will continue to provide evidence for association between genes and GWAS signals across many tissues and many disease states, prioritizing candidates for functional study and, potentially, pharmaceutical intervention.

REFERENCES

1. Roth, G. A. et al. Global, Regional, and National Burden of Cardiovascular Diseases for 10 Causes, 1990 to 2015. J. Am. Coll. Cardiol. 70, 1–25 (2017).

2. Narayan, K. M. V., Boyle, J. P., Geiss, L. S., Saaddine, J. B. & Thompson, T. J. Impact of Recent Increase in Incidence on Future Diabetes Burden: U.S., 2005-2050. Diabetes Care 29, 2114–2116 (2006).

3. Benjamin, E. J. et al. Heart Disease and Stroke Statistics—2019 Update: A Report From the American Heart Association. Circulation 139, e56–e528 (2019).

4. Mottillo, S. et al. The Metabolic Syndrome and Cardiovascular Risk. J. Am. Coll. Cardiol. 56, 1113–1132 (2010).

5. Sookoian, S. & Pirola, C. J. Metabolic Syndrome: From the Genetics to the Pathophysiology. Curr. Hypertens. Rep. 13, 149–157 (2011).

6. Cornier, M.-A. et al. The metabolic syndrome. Endocr. Rev. 29, 777–822 (2008).

7. Schousboe, K. et al. Twin study of genetic and environmental influences on adult body size, shape, and composition. Int. J. Obes. 28, 39–48 (2004).

8. Kathiresan, S. et al. A genome-wide association study for blood lipid phenotypes in the Framingham Heart Study. BMC Med. Genet. 8, S17 (2007).

9. DIAGRAM Consortium et al. New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk. Nat. Genet. 42, 105–116 (2010).

10. Visscher, P. M. et al. 10 Years of GWAS Discovery: Biology, Function, and Translation. Am. J. Hum. Genet. 101, 5–22 (2017).

11. Kessler, T., Vilne, B. & Schunkert, H. The impact of genome-wide association studies on the pathophysiology and therapy of cardiovascular disease. EMBO Mol. Med. 8, 688–701 (2016).

12. Mahajan, A. et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat. Genet. 50, 1505–1513 (2018).

13. Xue, A. et al. Genome-wide association analyses identify 143 risk variants and putative regulatory mechanisms for type 2 diabetes. Nat. Commun. 9, (2018).

14. Pulit, S. L. et al. Meta-analysis of genome-wide association studies for body fat distribution in 694,649 individuals of European ancestry. Hum. Mol. Genet. (2018) doi:10.1093/hmg/ddy327.

15. Hoffmann, T. J. et al. A large electronic-health-record-based genome-wide study of serum lipids. Nat. Genet. 50, 401–413 (2018).

16. Cannon, M. E. & Mohlke, K. L. Deciphering the Emerging Complexities of Molecular Mechanisms at GWAS Loci. Am. J. Hum. Genet. 103, 637–653 (2018).

17. Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: A Tool for Genome-wide Complex Trait Analysis. Am. J. Hum. Genet. 88, 76–82 (2011).

18. Spracklen, C. N. et al. Association analyses of East Asian individuals and trans-ancestry analyses with European individuals reveal new loci associated with cholesterol and triglyceride levels. Hum. Mol. Genet. 26, 1770–1784 (2017).

19. Dixon, A. L. et al. A genome-wide association study of global gene expression. Nat. Genet. 39, 1202–1207 (2007).

20. Visscher, P. M., Hill, W. G. & Wray, N. R. Heritability in the genomics era — concepts and misconceptions. Nat. Rev. Genet. 9, 255–266 (2008).

21. Cookson, W., Liang, L., Abecasis, G., Moffatt, M. & Lathrop, M. Mapping complex disease traits with global gene expression. Nat. Rev. Genet. 10, 184–194 (2009).

22. Gilad, Y., Rifkin, S. A. & Pritchard, J. K. Revealing the architecture of gene regulation: the promise of eQTL studies. Trends Genet. 24, 408–415 (2008).

23. Aguet, F. et al. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).

24. Dobbyn, A. et al. Landscape of Conditional eQTL in Dorsolateral Prefrontal Cortex and Co-localization with Schizophrenia GWAS. Am. J. Hum. Genet. 102, 1169–1184 (2018).

25. Wood, A. R. et al. Allelic heterogeneity and more detailed analyses of known loci explain additional phenotypic variation and reveal complex patterns of association. Hum. Mol. Genet. 20, 4082–4092 (2011).

26. Baralle, F. E. & Giudice, J. Alternative splicing as a regulator of development and tissue identity. Nat. Rev. Mol. Cell Biol. 18, 437–451 (2017).

27. Rodríguez, S. A. et al. Global genome splicing analysis reveals an increased number of alternatively spliced genes with aging. Aging Cell 15, 267–278 (2016).

28. Vernia, S. et al. An alternative splicing program promotes adipose tissue thermogenesis. eLife 5, e17672 (2016).

29. Frankish, A. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).

30. O’Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 44, D733–D745 (2016).

31. Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).

32. Delaneau, O. et al. A complete tool set for molecular QTL discovery and analysis. Nat. Commun. 8, 15452 (2017).

33. Odhams, C. A. et al. Mapping eQTLs with RNA-seq reveals novel susceptibility genes, non-coding RNAs and alternative-splicing events in systemic lupus erythematosus. Hum. Mol. Genet. 26, 1003–1017 (2017).

34. Li, Y. I. et al. Annotation-free quantification of RNA splicing using LeafCutter. Nat. Genet. 50, 151–158 (2018).

35. Liu, B. et al. Genetic analyses of human fetal retinal pigment epithelium gene expression suggest ocular disease mechanisms. Commun. Biol. 2, 186 (2019).

36. Battle, A. et al. Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Res. 24, 14–24 (2014).

37. Saferali, A. et al. Analysis of genetically driven alternative splicing identifies FBXO38 as a novel COPD susceptibility gene. PLOS Genet. 15, e1008229 (2019).

38. Ma, L., Jia, P. & Zhao, Z. Splicing QTL of human adipose-related traits. Sci. Rep. 8, 318 (2018).

39. Li, Y. I., Wong, G., Humphrey, J. & Raj, T. Prioritizing Parkinson’s disease genes using population-scale transcriptomic data. Nat. Commun. 10, 994 (2019).

40. Raj, T. et al. Integrative transcriptome analyses of the aging brain implicate altered splicing in Alzheimer’s disease susceptibility. Nat. Genet. 50, 1584–1592 (2018).

41. The GTEx Consortium et al. The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans. Science 348, 648–660 (2015).

42. Lemoine, A. Y., Ledoux, S. & Larger, E. Adipose tissue angiogenesis in obesity. Thromb. Haemost. 110, 661–669 (2013).

43. Massiéra, F. et al. Adipose angiotensinogen is involved in adipose tissue growth and blood pressure regulation. FASEB J. 15, 2727–2729 (2001).

44. Friedman, J. 20 YEARS OF LEPTIN: Leptin at 20: an overview. J. Endocrinol. 223, T1–T8 (2014).

45. Gustafson, B., Hedjazifar, S., Gogg, S., Hammarstedt, A. & Smith, U. Insulin resistance and impaired adipogenesis. Trends Endocrinol. Metab. 26, 193–200 (2015).

46. Smith, U. & Kahn, B. B. Adipose tissue regulates insulin sensitivity: role of adipogenesis, de novo lipogenesis and novel lipids. J. Intern. Med. 280, 465–475 (2016).

47. Coelho, M., Oliveira, T. & Fernandes, R. Biochemistry of adipose tissue: an endocrine organ. Arch. Med. Sci. 2, 191–200 (2013).

48. Al-Daghri, N. M. et al. Association of Type 2 Diabetes Mellitus related SNP genotypes with altered serum adipokine levels and metabolic syndrome phenotypes. Int. J. Clin. Exp. Med. 8, 4464–71 (2015).

49. Porter, S. A. et al. Abdominal Subcutaneous Adipose Tissue: A Protective Fat Depot? Diabetes Care 32, 1068–1075 (2009).

50. Ongen, H. et al. Estimating the causal tissues for complex traits and diseases. Nat. Genet. 49, 1676–1683 (2017).

51. The Multiple Tissue Human Expression Resource (MuTHER) Consortium et al. Mapping cis- and trans-regulatory effects across multiple tissues in twins. Nat. Genet. 44, 1084–1089 (2012).

52. Civelek, M. et al. Genetic Regulation of Adipose Gene Expression and Cardio-Metabolic Traits. Am. J. Hum. Genet. 100, 428–443 (2017).

53. Sajuthi, S. P. et al. Mapping adipose and muscle tissue expression quantitative trait loci in African Americans to identify genes for type 2 diabetes and obesity. Hum. Genet. 135, 869–880 (2016).

54. Franzen, O. et al. Cardiometabolic risk loci share downstream cis- and trans-gene regulation across tissues and diseases. Science 353, 827–830 (2016).

55. Gamazon, E. R. et al. Using an atlas of gene regulation across 44 human tissues to inform complex disease- and trait-associated variation. Nat. Genet. 50, 956–967 (2018).

56. The MuTHER Consortium. Identification of an imprinted master trans regulator at the KLF14 locus related to multiple metabolic phenotypes. Nat. Genet. 43, 561–564 (2011).

57. Small, K. S. et al. Regulatory variants at KLF14 influence type 2 diabetes risk via a female-specific effect on adipocyte size and body composition. Nat. Genet. 50, 572–580 (2018).

58. Esteve Ràfols, M. Adipose tissue: Cell heterogeneity and functional diversity. Endocrinol. Nutr. Engl. Ed. 61, 100–112 (2014).

59. Tchoukalova, Y. D., Sarr, M. G. & Jensen, M. D. Measuring committed preadipocytes in human adipose tissue from severely obese patients by using adipocyte fatty acid binding protein. Am. J. Physiol.-Regul. Integr. Comp. Physiol. 287, R1132–R1140 (2004).

60. Mutch, D. M. et al. Needle and surgical biopsy techniques differentially affect adipose tissue gene expression profiles. Am. J. Clin. Nutr. 89, 51–57 (2009).

61. Newman, A. M. et al. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods 12, 453–457 (2015).

62. Gong, T. & Szustakowski, J. D. DeconRNASeq: a statistical framework for deconvolution of heterogeneous tissue samples based on mRNA-Seq data. Bioinformatics 29, 1083–1085 (2013).

63. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, (2014).

64. Glastonbury, C. A., Couto Alves, A., El-Sayed Moustafa, J. S. & Small, K. S. Cell-Type Heterogeneity in Adipose Tissue Is Associated with Complex Traits and Reveals Disease-Relevant Cell-Specific eQTLs. Am. J. Hum. Genet. 104, 1013–1024 (2019).

65. Hormozdiari, F. et al. Colocalization of GWAS and eQTL Signals Detects Target Genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).

66. Giambartolomei, C. et al. Bayesian Test for Colocalisation between Pairs of Genetic Association Studies Using Summary Statistics. PLoS Genet. 10, e1004383 (2014).

67. Nica, A. C. et al. Candidate Causal Regulatory Effects by Integration of Expression QTLs with Complex Trait Genetic Associations. PLoS Genet. 6, e1000895 (2010).

68. He, X. et al. Sherlock: Detecting Gene-Disease Associations by Matching Patterns of Expression QTL and GWAS. Am. J. Hum. Genet. 92, 667–680 (2013).

69. Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).

70. Wen, X., Pique-Regi, R. & Luca, F. Integrating molecular QTL data into genome-wide genetic association analysis: Probabilistic assessment of enrichment and colocalization. PLOS Genet. 13, e1006646 (2017).

71. Zeng, B. et al. Comprehensive Multiple eQTL Detection and Its Application to GWAS Interpretation. Genetics 212, 905–918 (2019).

72. Guo, H. et al. Integration of disease association and eQTL data using a Bayesian colocalisation approach highlights six candidate causal genes in immune-mediated diseases. Hum. Mol. Genet. 24, 3305–3313 (2015).

73. Aschard, H., Vilhjálmsson, B. J., Joshi, A. D., Price, A. L. & Kraft, P. Adjusting for Heritable Covariates Can Bias Effect Estimates in Genome-Wide Association Studies. Am. J. Hum. Genet. 96, 329–339 (2015).

74. Laakso, M. et al. The Metabolic Syndrome in Men study: a resource for studies of metabolic and cardiovascular diseases. J. Lipid Res. 58, 481–493 (2017).

75. Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10, 57–63 (2009).

76. Wang, Y., Rimm, E. B., Stampfer, M. J., Willett, W. C. & Hu, F. B. Comparison of abdominal adiposity and overall obesity in predicting risk of type 2 diabetes among men. Am. J. Clin. Nutr. 81, 555–563 (2005).

77. Canoy, D. Distribution of body fat and risk of coronary heart disease in men and women. Curr. Opin. Cardiol. 23, 591–598 (2008).

78. Pischon, T. et al. General and Abdominal Adiposity and Risk of Death in Europe. N. Engl. J. Med. 359, 2105–2120 (2008).

79. Jansen, R. et al. Conditional eQTL analysis reveals allelic heterogeneity of gene expression. Hum. Mol. Genet. 26, 1444–1451 (2017).

80. Buil, A. et al. Gene-gene and gene-environment interactions detected by transcriptome sequence analysis in twins. Nat. Genet. 47, 88–91 (2015).

81. Emilsson, V. et al. Genetics of gene expression and its effect on disease. Nature 452, 423–428 (2008).

82. Locke, A. E. et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197–206 (2015).

83. Shungin, D. et al. New genetic loci link adipose and insulin biology to body fat distribution. Nature 518, 187–196 (2015).

84. Dastani, Z. et al. Novel Loci for Adiponectin Levels and Their Influence on Type 2 Diabetes and Metabolic Traits: A Multi-Ethnic Meta-Analysis of 45,891 Individuals. PLoS Genet. 8, e1002607 (2012).

85. McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283 (2016).

86. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

87. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).

88. Westra, H.-J. et al. MixupMapper: correcting sample mix-ups in genome-wide datasets increases power to detect small genetic effects. Bioinformatics 27, 2104–2111 (2011).

89. Jun, G. et al. Detecting and Estimating Contamination of Human DNA Samples in Sequencing and Array-Based Genotype Data. Am. J. Hum. Genet. 91, 839–848 (2012).

90. Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods 14, 417–419 (2017).

91. Soneson, C., Love, M. I. & Robinson, M. D. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences [version 2; peer review: 2 approved]. F1000Research 4, 1521.

92. Stegle, O., Parts, L., Piipari, M., Winn, J. & Durbin, R. Using probabilistic estimation of expression residuals (PEER) to obtain increased power and interpretability of gene expression analyses. Nat. Protoc. 7, 500–507 (2012).

93. Ongen, H., Buil, A., Brown, A. A., Dermitzakis, E. T. & Delaneau, O. Fast and efficient QTL mapper for thousands of molecular phenotypes. Bioinformatics 32, 1479–1485 (2016).

94. Karro, J. E. et al. Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation. Nucleic Acids Res. 35, D55–D60 (2007).

95. Wang, L., Wang, S. & Li, W. RSeQC: quality control of RNA-seq experiments. Bioinformatics 28, 2184–2185 (2012).

96. Wang, L. et al. Measure transcript integrity using RNA-seq data. BMC Bioinformatics 17, (2016).

97. Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).

98. Cannon, M. E. et al. Open Chromatin Profiling in Adipose Tissue Marks Genomic Regions with Functional Roles in Cardiometabolic Traits. G3 GenesGenomesGenetics g3.400294.2019 (2019) doi:10.1534/g3.119.400294.

99. Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).

100. Schmidt, E. M. et al. GREGOR: evaluating global enrichment of trait-associated variants in epigenomic features using a systematic, data-driven approach. Bioinformatics 31, 2601–2606 (2015).

101. Buniello, A., MacArthur, J. A. L., Cerezo, M. & Harris, L. W. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).

102. Yengo, L. et al. Meta-analysis of genome-wide association studies for height and body mass index in ∼700000 individuals of European ancestry. Hum. Mol. Genet. 27, 3641–3649 (2018).

103. Imai, K., Keele, L. & Yamamoto, T. Identification, Inference and Sensitivity Analysis for Causal Mediation Effects. Stat. Sci. 51–71 (2010).

104. Huang, Y.-T., VanderWeele, T. J. & Lin, X. Joint analysis of SNP and gene expression data in genetic association studies of complex diseases. Ann. Appl. Stat. 8, 352–376 (2014).

105. Titus, A. J., Gallimore, R. M., Salas, L. A. & Christensen, B. C. Cell-type deconvolution from DNA methylation: a review of recent applications. Hum. Mol. Genet. 26, R216–R224 (2017).

106. Jaffe, A. E. & Irizarry, R. A. Accounting for cellular heterogeneity is critical in epigenome-wide association studies. Genome Biol. 15, R31 (2014).

107. Lappalainen, T. & Greally, J. M. Associating cellular epigenetic models with human phenotypes. Nat. Rev. Genet. 18, 441–451 (2017).

108. Teslovich, T. M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713 (2010).

109. Voight, B. F. et al. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat. Genet. 42, 579–589 (2010).

110. Kong, A. et al. Parental origin of sequence variants associated with complex diseases. Nature 462, 868–874 (2009).

111. Lee, T. I. & Young, R. A. Transcriptional Regulation and Its Misregulation in Disease. Cell 152, 1237–1251 (2013).

112. Cai, K. & Sewer, M. B. Diacylglycerol kinase θ couples farnesoid X receptor-dependent signalling to Akt activation and glucose homoeostasis in hepatocytes. Biochem. J. 454, 267–274 (2013).

113. Saxton, R. A. & Sabatini, D. M. mTOR Signaling in Growth, Metabolism, and Disease. Cell 168, 960–976 (2017).

114. Akiyama, M. et al. Genome-wide association study identifies 112 new loci for body mass index in the Japanese population. Nat. Genet. 49, 1458–1467 (2017).

115. Kissil, J. L. et al. Isolation of DAP3, a Novel Mediator of Interferon-γ-induced Cell Death. J. Biol. Chem. 270, 27932–27936 (1995).

116. Engin, A. B. & Engin, A. The Interactions Between Kynurenine, Folate, Methionine and Pteridine Pathways in Obesity. Adv. Exp. Med. Biol. 960, 511–527 (2017).

117. Harris, R. M., Waring, R. H., Kirk, C. J. & Hughes, P. J. Sulfation of “Estrogenic” Alkylphenols and 17β-Estradiol by Human Platelet Phenol Sulfotransferases. J. Biol. Chem. 275, 159–166 (2000).

118. Emaus, A. et al. 17-β-Estradiol in relation to age at menarche and adult obesity in premenopausal women. Hum. Reprod. 23, 919–927 (2008).

119. Gutierrez-Aguilar, R., Kim, D.-H., Woods, S. C. & Seeley, R. J. Expression of New Loci Associated With Obesity in Diet-Induced Obese Rats: From Genetics to Physiology. Obesity 20, 306–312 (2012).

120. Matsuda, M. & DeFronzo, R. A. Insulin sensitivity indices obtained from oral glucose tolerance testing: comparison with the euglycemic insulin clamp. Diabetes Care 22, 1462–1470 (1999).

121. Levy-Strumpf, N., Deiss, L. P., Berissi, H. & Kimchi, A. DAP-5, a novel homolog of eukaryotic translation initiation factor 4G isolated as a putative modulator of gamma interferon-induced programmed cell death. Mol. Cell. Biol. 17, 1615–1625 (1997).

122. Hug, C. et al. T-cadherin is a receptor for hexameric and high-molecular-weight forms of Acrp30/adiponectin. Proc. Natl. Acad. Sci. 101, 10308–10313 (2004).

123. Herman, M. A. et al. A novel ChREBP isoform in adipose tissue regulates systemic glucose metabolism. Nature 484, 333–338 (2012).

124. Kazanskaya, O. et al. The Wnt signaling regulator R-spondin 3 promotes angioblast and vascular development. Development 135, 3655–3664 (2008).

125. Ming, G. et al. JAZF1 can regulate the expression of lipid metabolic genes and inhibit lipid accumulation in adipocytes. Biochem. Biophys. Res. Commun. 445, 673–680 (2014).

126. Tak, Y. G. & Farnham, P. J. Making sense of GWAS: using epigenomics and genome engineering to understand the functional relevance of SNPs in non-coding regions of the human genome. Epigenetics Chromatin 8, (2015).

127. Gallagher, M. D. & Chen-Plotkin, A. S. The Post-GWAS Era: From Association to Function. Am. J. Hum. Genet. 102, 717–730 (2018).

128. Pai, A. A., Pritchard, J. K. & Gilad, Y. The Genetic and Mechanistic Basis for Variation in Gene Regulation. PLoS Genet. 11, e1004857 (2015).

129. Wang, E. T. et al. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476 (2008).

130. Nissim-Rafinia, M. & Kerem, B. The splicing machinery is a genetic modifier of disease severity. Trends Genet. 21, 480–483 (2005).

131. Prokunina-Olsson, L. et al. Tissue-specific alternative splicing of TCF7L2. Hum. Mol. Genet. 18, 3795–3804 (2009).

132. Kaminska, D. et al. Regulation of alternative splicing in human obesity loci. Obesity 24, 2033–2037 (2016).

133. Raulerson, C. K. et al. Adipose Tissue Gene Expression Associations Reveal Hundreds of Candidate Genes for Cardiometabolic Traits. Am. J. Hum. Genet. 105, 773–787 (2019).

134. Robinson, M. D. & Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 11, R25 (2010).

135. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl. Acad. Sci. 100, 9440–9445 (2003).

136. Foroushani, A. B. K., Brinkman, F. S. L. & Lynn, D. J. Pathway-GPS and SIGORA: identifying relevant pathways based on the over-representation of their gene-pair signatures. PeerJ 1, e229 (2013).

137. Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 9, 671–675 (2012).

138. De Buck, M. et al. The cytokine-serum amyloid A-chemokine network. Cytokine Growth Factor Rev. 30, 55–69 (2016).

139. Ronsein, G. E. & Vaisar, T. Inflammation, remodeling, and other factors affecting HDL cholesterol efflux. Curr. Opin. Lipidol. 28, 52–59 (2016).

140. Demozay, D. et al. Fatty Aldehyde Dehydrogenase: POTENTIAL ROLE IN OXIDATIVE STRESS PROTECTION AND REGULATION OF ITS GENE EXPRESSION BY INSULIN. J. Biol. Chem. 279, 6261–6270 (2004).

141. Mysore, R. et al. MicroRNA-192* impairs adipocyte triglyceride storage. Biochim. Biophys. Acta BBA - Mol. Cell Biol. Lipids 1861, 342–351 (2016).

142. Takashima, K. et al. GBF1-Arf-COPI-ArfGAP–mediated Golgi-to-ER Transport Involved in Regulation of Lipid Homeostasis. Cell Struct. Funct. 36, 223–235 (2011).

143. Tiss, A. et al. Immunohistochemical profiling of the heat shock response in obese non-diabetic subjects revealed impaired expression of heat shock proteins in the adipose tissue. Lipids Health Dis. 13, 106 (2014).

144. Jakobsson, T. et al. GPS2 Is Required for Cholesterol Efflux by Triggering Histone Demethylation, LXR Recruitment, and Coregulator Assembly at the ABCG1 Locus. Mol. Cell 34, 510–518 (2009).

145. Strawbridge, R. J. et al. Genome-Wide Association Identifies Nine Common Variants Associated With Fasting Proinsulin Levels and Provides New Insights Into the Pathophysiology of Type 2 Diabetes. Diabetes 60, 2624–34 (2011).

146. Xiong, H. Y. et al. The human splicing code reveals new insights into the genetic determinants of disease. Science 347, 1254806 (2015).

147. Wu, Y. et al. Colocalization of GWAS and eQTL signals at loci with multiple signals identifies additional candidate genes for body fat distribution. Hum. Mol. Genet. In press.
