This electronic thesis or dissertation has been downloaded from the King’s Research Portal at https://kclpure.kcl.ac.uk/portal/

The Genetics of Expression and its relationship with Adiposity

Glastonbury, Craig Anthony

Awarding institution: King's College London

The copyright of this thesis rests with the author and no quotation from it or information derived from it may be published without proper acknowledgement.

END USER LICENCE AGREEMENT

Unless another licence is stated on the immediately following page this work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International licence. https://creativecommons.org/licenses/by-nc-nd/4.0/

You are free to copy, distribute and transmit the work under the following conditions:

- Attribution: You must attribute the work in the manner specified by the author (but not in any way that suggests that they endorse you or your use of the work).
- Non Commercial: You may not use this work for commercial purposes.
- No Derivative Works: You may not alter, transform, or build upon this work.

Any of these conditions can be waived if you receive permission from the author. Your fair dealings and other rights are in no way affected by the above.


Download date: 04. Oct. 2021

The Genetics of Gene Expression and its relationship with Adiposity

Craig A. Glastonbury King’s College London

A thesis submitted for the degree of Doctor of Philosophy

2016

This thesis is dedicated to my parents, Mark & Jacqueline. To my sister Natalie, nephew Mason, and my partner Rodrigo. Your daily support allows me to do what I love. Thank you.

Acknowledgements

Whilst this thesis is the product of my own work, there have been many collaborations, discussions, coffee chats and rants that have really shaped how I think about science and that have influenced my work substantially over the last three years. Many people at TwinsUK have been incredibly supportive in all aspects of my studies. I would first like to thank Ana Viñuela. From the start of my PhD to when you left to join Emmanouil Dermitzakis in Geneva, you were incredibly supportive, insightful and fun to talk to. As a stand-in supervisor whilst Kerrin was on maternity leave, you really guided me, and not just with a constant supply of M&M’s!

There are many current colleagues and friends I would like to thank here who have made my three years extremely enjoyable and stimulating. Alex Couto Alves, you are a fantastic person to work with, and have filled the role of a mentor to me. You have influenced my ability to think critically and our countless discussions over coffee have been a real source of joy in my academic life. I hope to see you one day leading your own group, as you’re a fantastic scientist with your heart in the right place. Antonino Zito and Abhishek Nag, you are two close friends and colleagues of mine who are extremely talented academics and have often helped in many aspects of both my PhD and life outside academia. Abhishek, you are a sporting force to be reckoned with! Never have I felt as exhausted as after playing squash with you. Antonino, you’re a strong minded and independent scientist. I know you will succeed in your future and that we will stay close friends. I would of course also like to thank my supervisor Kerrin Small. Kerrin, you are a kind, caring mentor and an excellent scientist. You have made my PhD a very enjoyable and stimulating experience and you have always been incredibly supportive and insightful. As your first PhD student, I am grateful for working with you and I am sure we will continue to work well into the future whilst I am at Oxford. I would also like to thank Tim Spector. Tim, you constantly surprise me with your ability to manage so much so seamlessly. I’ve never seen you flustered or too busy to stop and chat. Thank you for running such a successful and enjoyable place to work.

I would like to thank my partner Rodrigo Pracana. Both of us pursuing PhDs has been stressful with many of our weekends lost to genetics. However, it hasn’t tested us at all, and after five years in your company, you remain the closest and most loved person in my life. I would also like to thank my best friends, Carla Curtis-Tansley, Slavina Georgieva and Isha Puri. All of you have been incredibly supportive and understanding. You’re the kindest friends anyone could wish for and I know we will continue to be friends into the distant future.

For three years, every Monday, I would meet twins to conduct clinical visits and perform a wide range of tests. Whilst this was tiring and at times tested me, I have never met such generous people. So in part, this thesis is dedicated to all volunteers, as none of my work and many other pieces of research would be possible, if not for the Twins.

0.1 Publications

A list of publications to which I have contributed during my PhD:

Chapter 3:

Menni, C, Glastonbury, C, K Nikolaou, K Small, K Mahney, T Spector, and A.M Valdes. “Metabolomic profiling to dissect the role of visceral fat in cardio metabolic health”. In: Obesity

Bailey, et al.*. “Genome-wide association analysis identifies TXNRD2, ATXN2 and FOXC1 as susceptibility loci for primary open angle glaucoma”. In: Nature Genetics.

Small, K, L Quaye, A Hough, M Todorcevic, A Mahajan, M Horikoshi, A Buil, A Vinuela, Glastonbury, C, J Brown, A Bell, R Cox, Gloyn A, Karpe F, and McCarthy M. “Characterisation of the KLF14 trans-regulatory network”. In: Submitted (Nature).

J.S. El-Sayed Moustafa, J. Fernandez, M. Civelek, C. Glastonbury, M. Todorcevic, A. Mahajan, M. Horikoshi, I. Yet, M. Simon, G. Thorleifsson, U. Thorsteinsdottir, J. Bell, A. Gloyn, R. Cox, A. Lusis, F. Karpe, M. McCarthy, K. Small. “Mechanistic and functional properties of the KLF14 trans-eQTL network associated to risk of type 2 diabetes”. In: Preparation.

Kaul, S, H Xu, E Maruko, Glastonbury, C, K Small, G Dallinga-Thie, M Civelek, M Thomas, I Goldberg, and M Sorci-Thomas. “Procollagen C-endopeptidase enhancer 2 (PCPE2) Deficiency Profoundly Affects Adipose Distribution in Mice and Humans Linking HDL Metabolism to Adipocyte Biology”. In: Preparation.

Chapter 4:

Glastonbury, C, A. Vinuela, A. Buil, R. Durbin, E. Dermitzakis, T. Spector, and K. Small. “Adiposity-dependent regulatory effects on multi-tissue transcriptomes”. In: AJHG.

Other:

Tsai, PC*, Glastonbury, C*, A Vinuela, E Dermitzakis, T Spector, and K. Small. “Tobacco smoke modulates gene expression and DNA methylation via genetic variation in multiple human tissues”. In: Preparation.

0.2 Attributions

This page outlines the specific attributions per (research) chapter. Analysis performed by collaborators was used to perform additional and novel research presented in this thesis. Anything not listed on this page was performed by myself.

0.2.1 Chapter 3 - Pervasive effect of cardio-metabolic traits on peripheral tissue gene expression regulation

Exon level eQTLs, genotype QC and heritability calculations were performed and previously described in Buil et al. (2015).

0.2.2 Chapter 4 - BMI-dependent regulatory effects on multi-tissue transcriptomes

Exon level quantifications and genotype QC were performed and previously described in Buil et al. (2015).

0.2.3 Chapter 5 - Population level variability in adipose tissue cell-type composition and its link to obesity

Genotype QC was performed and previously described in Buil et al. (2015).

Abstract

The major focus of this thesis will concern the consequences and downstream impacts of obesity and adiposity-related traits on the human body by utilising RNA-sequencing measurements from three primary tissues (Subcutaneous adipose tissue, Whole blood & Skin) and one cell line, LCLs (Lymphoblastoid cell lines).

I will discuss how we can use gene expression and population genetic variation to understand the heterogeneous nature of obesity outcome in the population and to uncover the complex relationship between the effects of obesity on peripheral tissue biology, the environment and the consequences of obesity on gene regulation. First, I will examine the extent of gene expression association measured in peripheral tissues to multiple cardio-metabolic, hormonal and adiposity related measurements. I will characterise the heritability of gene expression in these four sources and discuss the tissue specificity of both trait associations and genetic effects on gene expression. Second, I will describe how BMI can act as a potent modifier of gene expression in adipose tissue by modelling BMI as an exposure/environment to detect and, for the first time, replicate BMI-dependent eQTLs (G × BMI) that are specific to adipose tissue. Lastly, I will explore the cell type heterogeneity of adipose tissue, a pertinent problem for many investigators performing gene expression based studies in bulk complex tissues. I will show how many BMI gene expression associations are driven by macrophage heterogeneity amongst samples, that cell type variability is heritable, and describe examples of cis-eQTLs driven by macrophage proportion in adipose tissue.

Contents

0.1 Publications
0.2 Attributions
0.2.1 Chapter 3 - Pervasive effect of cardio-metabolic traits on peripheral tissue gene expression regulation
0.2.2 Chapter 4 - BMI-dependent regulatory effects on multi-tissue transcriptomes
0.2.3 Chapter 5 - Population level variability in adipose tissue cell-type composition and its link to obesity

1 Introduction
1.0.1 Overview
1.1 Obesity biology and associated co-morbidities
1.1.1 Prevalence and impact
1.1.2 Heritability of obesity and related traits
1.1.3 Genetic etiology of Obesity
1.1.4 Systemic effects of obesity
1.1.5 Adipokines & their effect on peripheral metabolism
1.1.6 Obesity’s effect on Adipose tissue
1.1.7 Obesity biology and its consequences: - A summary
1.2 Regulation of Gene expression
1.2.1 Transcriptional control: The basal transcriptional complex
1.2.2 Regulatory elements
1.2.3 Cell specific gene expression regulation
1.3 The Genetic regulation of Gene Expression
1.3.1 Expression Quantitative Trait Loci (eQTL) mapping
1.3.2 Gene expression heritability
1.3.3 Tissue and cell-type specificity of eQTLs
1.3.4 Context: environment/treatment dependent eQTLs
1.3.5 Summary

2 Data, Samples & common analysis methods
2.1 Introduction
2.2 The TwinsUK cohort
2.2.1 TwinsUK data collection
2.2.2 MuTHER/EuroBATS study
2.2.3 Tissue biopsies
2.2.4 Demographics of EuroBATS study
2.3 RNA-sequencing
2.3.1 Meta-exon quantification
2.3.2 RNA-sequencing covariates
2.4 Genotypes
2.5 A note on PEER correction

3 Pervasive effect of cardio-metabolic traits on peripheral tissue gene expression regulation
3.1 Introduction
3.2 Methods
3.2.1 Phenotype collection
3.2.2 Quality control of phenotypic data
3.2.3 TWAS model
3.2.4 Tissue specificity of traits: π1 analysis
3.2.5 Heritability of cardio-metabolic associated exons
3.2.6 TWAS false discovery rate (FDR)
3.2.7 eQTL discovery
3.3 Results
3.3.1 Pervasive association of cardio-metabolic traits with peripheral tissue gene expression
3.3.2 enrichment analysis
3.3.3 Tissue specificity of cardio-metabolic expression associations
3.3.4 Gene expression heritability and cardio-metabolic trait associations
3.3.5 cis-eQTL regulation of expression associated cardio-metabolic traits
3.3.6 Discussion

4 BMI-dependent regulatory effects on multi-tissue transcriptomes
4.1 Introduction
4.1.1 Heterogeneous outcomes of metabolic complications amongst obese individuals
4.1.2 Gene-by-environment interactions (G × E) in GWA and eQTL studies
4.2 Methods
4.2.1 cis-G × BMI eQTL discovery
4.2.2 Permutations and FDR estimation
4.2.3 Discovery of trans G × BMI eQTLs
4.2.4 deCODE genetics cohort
4.2.5 deCODE replication
4.2.6 Overlap with GWAS loci
4.2.7 Gene ontology enrichment analysis
4.2.8 Statistical mediation analysis
4.2.9 ENCODE and Epigenome Roadmap enrichment analysis
4.2.10 DXA visceral fat collection
4.2.11 Interactive website
4.3 Results
4.3.1 BMI is associated to thousands of gene expression traits in multiple peripheral tissues
4.3.2 Detection of G × BMI cis-eQTL regulatory variants
4.3.3 Replication of G × BMI eQTLs
4.3.4 Robustness of G × BMI cis-eQTLs discovery
4.3.5 G × BMI eQTLs and their expression are highly adipose specific
4.3.6 G × BMI eQTLs and their link to related traits
4.3.7 trans G × BMI eQTL detection in adipose tissue
4.3.8 Properties of G × BMI eQTLs below genome-wide significance
4.3.9 Enrichment analysis of G × BMI eQTLs highlights both metabolic and immune processes
4.3.10 Visceral fat DXA measurements recapitulate G × BMI eQTL findings
4.3.11 Discussion

5 Population level variability in adipose tissue cell-type composition and its link to obesity
5.1 Introduction
5.1.1 Gene expression deconvolution methods
5.1.2 Accuracy of cell type estimation methods
5.1.3 Utility of gene expression-derived cell type proportions
5.2 Methods
5.2.1 Purified cell type data
5.2.2 GTEx RNA-seq quantification
5.2.3 RNA-seq alignment and gene quantification
5.2.4 Construction of an adipose signature matrix
5.2.5 In-Silico mixture simulations
5.2.6 Estimating cell types from adipose RNA-seq data
5.2.7 Heritability estimation
5.2.8 DXA phenotype collection
5.2.9 BMI TWAS
5.2.10 Weighted gene co-expression network analysis (WGCNA)
5.2.11 Gene-by-environment interaction modeling
5.2.12 Cis-eQTL analysis
5.3 Results
5.3.1 Construction of an adipose tissue signature matrix
5.3.2 Benchmarks and simulations to assess signature matrix accuracy
5.3.3 Estimation of relative cell type proportions using RNA-seq from primary subcutaneous adipose tissue biopsies
5.3.4 Cell type proportions are heritable, explaining major components of gene expression variance
5.3.5 Macrophage estimates are associated to whole-body obesity traits but not age
5.3.6 Adjusting for Macrophage heterogeneity accounts for 22% of all BMI TWAS associations
5.3.7 GTEx adipose tissue gene expression strongly influenced by ischemic time
5.3.8 Correcting for macrophage heterogeneity in adipose tissue increases cis-eQTL discovery yield
5.3.9 CECR1 - rs1807517 is a macrophage proportion-dependent cis-eQTL in adipose tissue
5.3.10 G × Macrophage eQTLs at relaxed threshold
5.4 Discussion

6 Concluding Remarks
6.0.1 Improvements to this study
6.0.2 Future work

Bibliography

Appendix A

List of Figures

1.1 Simple representation of the leptin-melanocortin pathway.
1.2 Schematic representation of a GWAS study design: Populations are genotyped using genotyping arrays and linear associations between a trait and genetic variant are performed.
1.3 Lean to obese adipose tissue transition. Redrawn and modified from Figure 2 of Dalmas et al. (2011).
1.4 Schematic of splicing and alternative splicing.
1.5 Local and distal regulatory elements. A) The core promoter consists of multiple elements that initiate transcription. B) Enhancer elements act distally (typically in the Mega Base (MB) range) to ’enhance’ basal transcriptional levels. C) Silencers, unlike enhancers, can dampen or repress transcription. D) Insulators act to block the interaction between an enhancer or a silencer with a promoter. E) control regions can consist of several distal regulatory elements such as enhancers, that influence the expression of more than one gene. (Colours do not correspond to specific elements). Adapted from Maston et al. (2006).
1.6 mRNA-sequencing protocol. A) mRNA obtained from a Poly-A enrichment protocol. B) mRNA is converted into double stranded cDNA and fragmented. C) Fragments have sequencing adapters ligated to both ends. D) Fragments undergo an (optional) PCR amplification step using primers as depicted and are sequenced.
1.7 eQTL study design. A) SNP association in cis. B) SNP-gene association in trans. C) Additive linear relationship between genotype dosage and gene expression. D) cis-eQTLs are typically enriched around TSS’s.
1.8 Types of G×E response: A) ’True interaction’: Exposure to an environment results in a linear relationship between allele dosage and gene expression. B) Both exposed and non-exposed react to the environment, but with different magnitudes of effect. C) ’crossover’ or qualitative-interaction: opposing directions of effect observed in either environment. Where blue = exposure, red = not exposed.

2.1 BMI and age distribution of EuroBATS samples (n=856).
2.2 Distribution of RNA seq GC content and insert size (n=856).
2.3 BMI’s correlation to the first five PEER factors.

3.1 π1 statistic calculated in blood, skin and LCLs, reflecting the proportion of shared cardio-metabolic trait associations found in adipose tissue.
3.2 π1 values of blood significant hits in adipose, skin and LCLs.
3.3 Exon expression h² estimates in adipose, skin, whole blood and LCLs.
3.4 BMI exon expression associations significantly enriched for highly heritable gene expression. “FDR5%”: Top 10% significantly associated exon-expression traits to BMI (best exon per gene). “Permuted”: Median of 1000 permutations of randomly sampled exon-expression heritabilities. “All”: Heritability for all exons expressed. P-values represent the result of t-test comparisons of associated vs permuted heritability estimates.
3.5 cis-eQTL enrichment for trait associations as compared to the background cis-eQTL rate per tissue. Each bar represents the significance of a hyper-geometric enrichment test comparing the number of associated genes with cis-eQTLs to the background tissue rate of cis-eQTLs.

4.1 Distribution of BMI TWAS association P-values. Enriched for significant P-values observed in adipose, skin and whole blood.
4.2 π1 P-value distribution from each tissue. π1 represents the proportion of trait-associations shared across tissues. For example, the top left histogram is the P-value distribution of all BMI associated adipose tissue genes matched in skin. 53% of BMI associated adipose tissue gene expression is also associated to BMI in skin tissue.
4.3 The three replicated G × BMI eQTLs. For example, in individuals homozygous for the effect allele (EA) of rs7143432 there is a positive association between CHURC1 gene expression and BMI. Individuals homozygous for the other allele (OA) show the opposite relationship.
4.4 A) Visualisation of global trans-associations of ALG9. B) Cardiovascular disease network which has an over-representation of ALG9-trans genes. C) Lead trans-G × BMI eQTL ZNF423.
4.5 A) Effect size (β) of TwinsUK G × BMI eQTL discovery and replication in deCODE genetics. B) 127 G × BMI eQTLs are proximal to the TSS. C) Visceral fat DXA measurements (G × Visceral Fat eQTLs) recapitulate G × BMI eQTLs.
4.6 π1 analysis of G × BMI eQTLs main effects in each tissue. Each histogram represents the P-value distribution of the main effect eQTL of each exon-SNP G × BMI matched in a different tissue (e.g. Top left histogram is the P-value distribution of Adipose G × BMI eQTLs matched to the main effect adipose eQTL results.) Approximately 10% of adipose G × BMI eQTLs have main effects.
4.7 BMI’s correlation to visceral fat area in expression individuals who also have DXA measurements (n = 682).

5.1 Hierarchical clustering of reference cells that are used to produce the signature matrix. Coloured by unsupervised k-means clustering (where k = 5). Biological hierarchy recapitulated: Non-immune (Adipocytes, MVEC) and immune cell fractions (Macrophage and CD4+ t-cells) cluster separately.
5.2 Proportion of estimated cell types using the adipose signature matrix from in-silico mixture simulations.
5.3 Estimated cell types using comparable ranges as those estimated from the TwinsUK Adipose tissue samples. All estimated cell types are highly correlated with ground truth simulations R = [0.988-0.995] and have small mean absolute differences (mAD) [0.003-0.01]. MVEC range of detection ≥ 2.5%.
5.4 In-silico cell type estimates with unknown content added. Additional cell types known to be present in adipose tissue (Smooth muscle, Neutrophils, Dendritic Cells) but that are not estimated by the adipose signature matrix, were included in simulated adipose tissue mixtures (Adipocytes, CD4+, MVEC, Macrophages) to assess estimation accuracy with varying amounts of unknown content. The adipose tissue signature matrix is robust to unknown cell types, with cell estimates maintaining a highly linear relationship with ground truth data. An unlikely scenario of 50% unaccounted for mixture content, resulted in systematic overestimation, yet a linear relationship was still maintained.
5.5 Cell type estimates from in-silico simulations with added scaled Gaussian noise (10, 50, 90% respectively).
5.6 Estimated Adipocyte proportion in TwinsUK.
5.7 Estimated Macrophage proportion in TwinsUK.
5.8 Adipose tissue RNA-seq PC2 captures macrophage proportion heterogeneity amongst samples.
5.9 (a) Macrophage proportion is highly correlated to the ‘green’ WGCNA co-expression module which is enriched for immune response processes. (b) Correlation of macrophage proportion and all WGCNA co-expression modules in TwinsUK adipose tissue.
5.10 Dotted lines: Bonferroni correction threshold (P-value = 4.2 × 10⁻⁷). Blue: significant only in unadjusted TWAS. Red: not significant in both. Teal: only significant in Macrophage adjusted TWAS. Green: significant in both TWAS.
5.11 Majority of Subcutaneous adipose tissue gene expression variance in GTEx dataset can be explained by ischaemic time (mins) variation.
5.12 Distribution of relative adipocyte estimates in TwinsUK and GTEx.
5.13 Distribution of ADIPOQ expression (CPM) in TwinsUK and GTEx samples.
5.14 Inverse normalized macrophage proportion vs CECR1 PEER expression residuals.

Chapter 1

Introduction

1.0.1 Overview

This thesis will focus on understanding adiposity and adiposity related traits using gene expression in peripheral tissues. I will explore adiposity related traits and the influence that both the environment and population genetic variation have on gene expression. I will therefore start this chapter with some background on obesity, its downstream consequences and its genetic basis. Finally, I will introduce the topic of gene expression regulation and how some genetic effects on gene expression are highly tissue, context, condition and environmentally specific.

1.1 Obesity biology and associated co-morbidities

1.1.1 Prevalence and impact

Obesity is defined as an excess of fat mass, and this excess adiposity is associated with a range of adverse complications. There are many measures of obesity, but the most commonly used metric is Body Mass Index (BMI), which has units of kg/m². In terms of BMI, being overweight or obese is defined as >25 kg/m² and >30 kg/m², respectively. In the United Kingdom alone, 24% of adults are obese and 36% are overweight (Baker & Bate 2016). The health-care related cost attributed to obesity is estimated to be £6.3bn, and obesity is responsible for 30,000 preventable deaths (6% of all mortality) (Baker & Bate 2016). From 1989 to 2013, prevalence of obesity increased by 41% (Baker & Bate 2016). This rise in obesity and its co-morbidities (such as Type-II diabetes, cardiovascular disease and insulin resistance) emphasises the need to understand all of the causes and consequences of obesity in order to develop strategies to prevent and treat disease.
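As a small worked illustration of the BMI definition and thresholds quoted above, the following sketch computes BMI from weight and height and applies the >25 and >30 kg/m² cut-offs. The function names and the "not overweight" label are illustrative assumptions and are not taken from the thesis.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index in kg/m^2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Apply the thresholds quoted in the text (>25 overweight, >30 obese).

    The label for values at or below 25 is an assumption made for this sketch.
    """
    if value > 30:
        return "obese"
    if value > 25:
        return "overweight"
    return "not overweight"


if __name__ == "__main__":
    # Hypothetical individual: 80 kg, 1.75 m tall.
    value = bmi(80, 1.75)
    print(round(value, 1), bmi_category(value))
```

The strict inequalities mirror the wording of the text; other sources may treat the boundary values differently.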

1.1.2 Heritability of obesity and related traits

Obesity is the result of an energy imbalance that can be combated by lifestyle choices such as increasing physical exercise and maintaining a healthy diet. However, it is clear that inherited factors influence adiposity, and obesity has been repeatedly demonstrated to be heritable. Heritability is commonly measured as narrow sense heritability (h²) (see Box 1.1). The heritability of obesity is estimated to be between 60% and 84% (Farooqi & O’Rahilly 2005), depending on the study and the population it is measured within. Obesity has been noted to be more heritable in early life than in adulthood, potentially because children that are obese are more likely to harbour monogenic forms of obesity, inflating common-obesity heritability estimates (Haworth et al. 2008).

Obesity comorbidities and other traits related to obesity are also heritable in a range of populations. These traits include: Type II-diabetes (T2D) (h² = [0.26-0.56]) (Hunt et al. 1989), blood pressure (h² = [0.15-0.42]) (North et al. 2003, Hsueh et al. 2000), fasting glucose (h² = [0.10-0.75]) (North et al. 2003, Poulsen et al. 1999), High Density Lipoprotein (HDL) (h² = [0.45-0.66]) (Mitchell et al. 1996) and Low Density Lipoprotein (LDL) (h² = [0.39-0.40]) (Mitchell et al. 1996). Many of these traits are correlated with and/or causally dependent on obesity (e.g. increased BMI decreases HDL cholesterol), as demonstrated using Mendelian randomisation (Holmes et al. 2014).

Box 1.1: Narrow-sense heritability (h²)

Narrow sense heritability is the proportion of total phenotypic variability (P) in a population that can be attributed to additive-only genetic variation. h2 can be defined as:

$$h^2 = \frac{\mathrm{Var}(A)}{\mathrm{Var}(P)} \qquad (1.1)$$

where:

Var(A) = Additive genetic variance
Var(P) = Total phenotypic variance

A commonly used and intuitive method for calculating heritability is Falconer's equation. Falconer's equation compares the correlation of a trait between Monozygotic (MZ) and Dizygotic (DZ) twin pairs:

$$h^2 = 2(r_{MZ} - r_{DZ}) \qquad (1.2)$$

As MZ twins are 100% genetically identical and DZ twins only share half of their genome, it is possible to attribute the difference in a trait to genetics, with the explicit assumption that MZ and DZ twins share the same environment. Heritability estimates conducted in twin studies today typically utilise more complex variance decomposition methods in which the phenotypic variance is broken down into additive genetic (A), common environment (C) and unique environment (E) components:

$$V_P = V_A + V_C + V_E \qquad (1.3)$$

Briefly, ACE calculations use structural equation models (SEMs) to estimate unknown latent factors (the model parameters A, C and E) that best fit the observed twin covariances of both twin types (MZ/DZ) using maximum likelihood (see Rijsdijk & Sham 2002 for detail on using SEMs for heritability modelling).
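To make the twin-based calculation in Box 1.1 concrete, the sketch below applies Falconer's equation and the corresponding moment-based split of standardised variance into A, C and E. This is a simplification for illustration only: it is not the SEM and maximum likelihood approach described above, and the function names and example correlations are hypothetical.

```python
def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Falconer's estimate of narrow-sense heritability: h^2 = 2(r_MZ - r_DZ)."""
    return 2.0 * (r_mz - r_dz)


def ace_from_correlations(r_mz: float, r_dz: float) -> tuple[float, float, float]:
    """Moment-based ACE estimates on a standardised scale (V_P = 1).

    Assumes r_MZ = A + C and r_DZ = A/2 + C, which gives:
      A = 2(r_MZ - r_DZ), C = 2 r_DZ - r_MZ, E = 1 - r_MZ.
    Estimates can fall outside [0, 1] when these assumptions are violated.
    """
    a = falconer_h2(r_mz, r_dz)
    c = 2.0 * r_dz - r_mz
    e = 1.0 - r_mz
    return a, c, e


if __name__ == "__main__":
    # Hypothetical twin correlations for some trait.
    a, c, e = ace_from_correlations(r_mz=0.8, r_dz=0.5)
    print(f"A={a:.2f} C={c:.2f} E={e:.2f}")
```

By construction the three components sum to one, matching the decomposition of total phenotypic variance into additive genetic, common environment and unique environment parts.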

1.1.3 Genetic etiology of Obesity

Understanding the genetic cause of obesity can help to understand the biology of obesity. Whilst obesity prevalence is high in the UK, monogenic forms of obesity are much less common, with an approximate incidence of 5% in the population (Farooqi & O’Rahilly 2005, Jimenez et al. 2012). Although uncommon, these monogenic forms of obesity provided some of the first insights into the biological mechanisms of obesity. The first characterisation of an inherited form of obesity was published in 1997 (Montague et al. 1997). This study examined a consanguineous family of Pakistani origin with children who had a severe form of early onset obesity and no detectable levels of circulating leptin, a hormone that circulates in the blood in proportion to body fat and regulates appetite (Farooqi & O’Rahilly 2005). The determined cause of this case of monogenic obesity was a frame shift mutation (∆G133) in the Ob gene, which produces leptin. Symptoms of this monogenic form of obesity were predominantly extreme hyperphagia, hyperinsulinemia and advanced bone age (Allison & Jr. 2014).

This study was the first to link obesity in humans to a disruption in the leptin-melanocortin pathway. The leptin-melanocortin pathway controls energy balance homeostasis via signaling between peripheral endocrine organs (such as the pancreas, the stomach and adipose tissue) and the brain (Box 1.2). Subsequent studies have shown that many other monogenic forms of obesity are caused by the disruption of this pathway. These studies suggest that obesity involves a number of signaling processes and point to the brain as the central causal organ for obesity (Bell et al. 2005).

Box 1.2: The Leptin-Melanocortin pathway

A fundamental pathway of energy homeostasis and of central importance to the biology of obesity is the Leptin-Melanocortin pathway, which is mediated through the gut-brain axis (Oswal & Yeo 2007). Multiple peripheral endocrine organs (Stomach, Adipose tissue and Pancreas) act on the brain to regulate short and long term appetite and energy needs (Walley et al. 2009). Several signals from the gut are received by the hypothalamus, including Ghrelin, Peptide YY and Cholecystokinin (CCK), as well as feedback from mechanosensitive receptors in the stomach that measure distension (Walley et al. 2009). These hormones can have activatory or inhibitory roles in the brain, depending on the neuronal system they act on (proopiomelanocortin (POMC) and agouti-related protein (AGRP) neurons respectively).

Figure 1.1: Simple representation of the leptin-melanocortin pathway

Are the genetic/biological determinants of monogenic obesity representative of common obesity? The advent of genotyping microarrays helped answer this question, as it allowed for massive-scale rapid genotyping of thousands of individuals in population-based studies, so-called Genome-Wide Association Studies (GWAS) (Box 1.3). The first GWAS variant associated with obesity was found in the intron of the Fat-mass and obesity related gene (FTO) (Frayling et al. 2007). Fto knockout mice highlighted the possible role the FTO locus plays in obesity, although the causal gene remains the topic of many lines of investigation (Smemo et al. 2014, Claussnitzer et al. 2015). Since the FTO locus discovery, power to detect associations has improved due to the availability of large sample sizes. The latest obesity GWAS meta-analysis, based on 339,224 unrelated individuals, discovered 97 independent genetic loci associated with BMI (Locke et al. 2015). All genes previously discussed in this chapter that were implicated in monogenic obesity harboured common variants also associated with BMI, demonstrating that monogenic and common forms of obesity share overlapping biology. This fact is recapitulated in a gene-set enrichment analysis performed by Locke et al. (2015), which demonstrates that genes harbouring common variants associated with BMI are active in the Central Nervous System (CNS) and the brain (P-value < 5 × 10⁻⁴). The second most enriched category was the haemic and immune system. Whilst not statistically significant, this enrichment is interesting in the light of adipose tissue biology and obesity related complications that will be discussed in later sections.

Box 1.3: Genome-Wide Association Study (GWAS)

A Genome-wide association study (GWAS) is a popular study design to identify genetic variants that are associated with a disease or complex trait of interest. GWAS utilise large populations and/or cohorts, genotyping thousands of individuals using a genotyping chip or whole genome sequencing to identify millions of common variants. Results of GWAS are usually visualised in the form of a Manhattan plot, which plots SNP coordinates vs -log10(P-value) of each SNP-trait association (Figure 1.2).

Imputation is commonly used to increase the number of SNPs available for investigation, a method that leverages haplotype reference panels to impute variants that were not directly genotyped (Loh et al. 2016). GWAS are sensitive to a number of confounding effects, an example of which is population structure. Population structure is when genetic differentiation between populations is associated with phenotypic differences, regardless of whether the variant itself has a phenotypic effect. This has led to the use of more sophisticated modelling such as linear mixed models, in which the population genetic covariance can be explicitly modelled as a random effect (Price et al. 2010).

Figure 1.2: Schematic representation of a GWAS study design: Populations are genotyped using genotyping arrays and linear associations between a trait and genetic variant are performed.
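The single-variant test at the core of a GWAS can be sketched as follows. This is a minimal illustration on simulated data, not the pipeline used in any of the cited studies: the function names `snp_association` and `simulate`, the allele frequency and the effect size are all assumptions made for the example, and the P-value uses a normal approximation that is only reasonable for large samples. It also ignores covariates and the population structure correction discussed above.

```python
import math
import random


def snp_association(dosages, phenotype):
    """Regress a quantitative trait on genotype dosage (0, 1 or 2 copies).

    Returns the per-allele effect, its standard error and a two-sided
    P-value from a normal approximation (adequate only for large samples).
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotype))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    rss = sum((y - intercept - beta * x) ** 2 for x, y in zip(dosages, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))
    return beta, se, p_value


def simulate(n, maf, effect, seed=1):
    """Simulate dosages under Hardy-Weinberg and a trait with unit noise."""
    rng = random.Random(seed)
    # Each individual carries two alleles, each the minor allele with probability maf.
    dosages = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
    phenotype = [effect * x + rng.gauss(0, 1) for x in dosages]
    return dosages, phenotype


dosages, phenotype = simulate(n=5000, maf=0.3, effect=0.3)
beta, se, p_value = snp_association(dosages, phenotype)
# -log10(P) is the height this SNP would have on a Manhattan plot.
print(f"beta={beta:.3f} se={se:.3f} -log10(P)={-math.log10(p_value):.1f}")
```

Repeating this test for every genotyped or imputed SNP and plotting -log10(P) against genomic coordinate gives the Manhattan plot described in the box.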

Many measures of adiposity have also been studied under a GWAS framework. For instance, 49 independent genetic loci were found to be associated with Waist-Hip Ratio (WHR), in a model adjusted for BMI (Shungin et al. 2015). These loci were enriched for adipocyte, adipose-tissue and insulin regulation biology, substantially different to BMI loci enrichment (Locke et al. 2015). This distinction between overall adiposity (measured as BMI), body-fat distribution (WHR) and their respective biological and genetic architectures, is becoming increasingly recognised. In addition to this, body fat distribution loci show significant sexual dimorphism, with 39% of WHR associated loci displaying larger effect sizes in women. These findings are important when planning follow up studies to investigate the mode of action of the identified genetic loci.

GWAS has been extensively applied to many additional cardio-metabolic traits with substantial loci discovery and loci-sharing (Shungin et al. 2015, Fuchsberger et al. 2016, Willer et al. 2013, Dastani et al. 2012). Most GWAS have focused on common Single Nucleotide Polymorphisms (SNPs) with Minor Allele Frequency (MAF) of either >1% or >5% (Fritsche et al. 2016, Kato et al. 2015, Locke et al. 2015, Shungin et al. 2015). However, Rare Variant Association Studies (RVAS) will become more commonplace as whole-genome sequencing (WGS) becomes cheaper and more widely utilised (Walter et al. 2015). GWAS have been and will continue to be critical in understanding the genetic basis of common obesity and many other complex traits and diseases. However, whilst many genetic loci have been found to be associated with these traits, the proportion of phenotypic variance explained remains small, making prediction or risk-stratification based on common genotypes difficult. For example, the 97 loci associated with BMI together explain only 2.7% of the variance in BMI, whereas all HapMap 3 SNPs together explain 21.6 ± 2.2% (Locke et al. 2015). This has led to the coining of the term ‘missing heritability’ (Box: 1.4).
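To see why individual common variants account for so little variance, it can help to write down the variance explained by a single locus. The expression below is the standard additive-model result, assuming Hardy-Weinberg equilibrium. The symbols p (effect allele frequency), β (per-allele effect) and σ_y² (phenotypic variance), and the numbers in the worked line, are illustrative assumptions and are not values taken from Locke et al. (2015).

```latex
\mathrm{Var}(X) = 2p(1-p), \qquad
R^2_{\mathrm{SNP}} = \frac{2p(1-p)\,\beta^2}{\sigma_y^2}

% Illustrative values: p = 0.4, \beta = 0.08 phenotypic standard deviations, \sigma_y^2 = 1
R^2_{\mathrm{SNP}} = 2 \times 0.4 \times 0.6 \times 0.08^2 \approx 0.0031
```

Under these assumed values a single variant explains roughly 0.3% of trait variance, which illustrates why loci of small effect add up slowly.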

Box 1.4: What is missing heritability?

Missing heritability is the discrepancy between the phenotypic variance explained by GWAS loci and the estimated heritability of the trait under investigation (Eichler et al. 2010). Many arguments have been put forward to explain the apparent missing heritability. As GWAS are largely restricted to common, additive effect variants, some proponents believe that much of the phenotypic variance will be explained by non-additive effects and/or rare variants (Manolio et al. 2009). Non-additive effects include gene-gene interactions (G×G) and gene-environment interactions (G×E). Due to power issues, there are few replicated examples of non-additive effects, with most contributing a small fraction of additional variance explained (Young et al. 2016, Moutsianas et al. 2015, Wood et al. 2016). The role that non-additive variation plays in complex traits can be addressed when much larger sample sizes are available. RVAS suffer from the same limitations and their ability to detect novel variation will be revealed with larger sample sizes (Walter et al. 2015). Other explanations for missing heritability include not testing all variant types. For example, copy number and structural variants are largely ignored if they are not tagged by common variants.

Fisher’s infinitesimal model predicts that as sample size (N) increases, variants of smaller and smaller effect size will be discovered, each contributing a diminishing share of the variance explained (Fisher 1930). Many of these variants are likely to be present in GWAS performed today, but due to conservative multiple testing correction (P-value < 5 × 10⁻⁸), they are largely ignored. This explains why chip heritability, the variance explained by all genotyped SNPs, is much higher than the variance explained by genome-wide significant SNPs alone (Yang et al. 2015). Other proponents argue that heritability estimates from twin and family studies are grossly upwardly biased, with very little missing heritability in reality. Finally, measurement error or disease misdiagnoses would bias meta-analysed GWAS effect size estimates towards zero, resulting in underestimation of the variance explained. It is likely that all of the explanations highlighted contribute to the missing heritability problem.

1.1.4 Systemic effects of obesity

Whilst the genetics of obesity has elucidated the role of the hypothalamus and the CNS in obesity development, the downstream consequences of obesity are systemic. Obesity has been shown to increase the risk of type-2 diabetes (T2D), heart disease, stroke, several cancers and liver disease, all of which dramatically impact quality of life (Holmes et al. 2014, Thrift et al. 2014). The primary immediate consequence of obesity is increased fat-mass as excess fat is stored in adipose tissue compartments throughout the body. Until relatively recently, the storage of fat as adipose tissue was deemed a passive consequence of obesity, with adipocytes only participating as fat storing cells (Box: 1.5).

Box 1.5: Types of adipose tissue and adipocytes

Two main types of adipose tissue exist, white and brown adipose tissue (WAT/BAT), which are composed of different types of adipocyte and arise from distinct embryological origins. Omental BAT is located throughout the body around major organs such as the pancreas and kidney whilst subcutaneous BAT surrounds the anterior neck muscles and is present under the clavicle (Sacks & Symonds 2013). WAT is largely subcutaneous but also surrounds the majority of major organs. WAT is composed of two types of adipocyte, white and beige. Beige adipocytes are white adipocytes that have differentiated into brown adipocytes, a process known as browning (Lo & Sun 2013).

Brown adipocytes are found in BAT (Harms & Seale 2013). BAT adipocytes contain a larger number of mitochondria than WAT adipocytes, as their primary function is thermo-regulation, whereas WAT’s primary function is lipid storage (Harms & Seale 2013, Saely et al. 2012, Enerbäck 2009). As well as by type of adipose tissue, fat can also be separated into compartments or depots, such as subcutaneous or omental adipose tissue. Subcutaneous adipose tissue occurs just beneath the skin, whilst omental adipose tissue, commonly known as visceral fat, surrounds major vital organs. Both excessive visceral and subcutaneous adipose tissue have been independently associated with adverse metabolic outcomes, but the amounts of the two depots are highly correlated (Goodpaster et al. 1997).

This simplistic interpretation of adipose as a mere storage organ has subsequently been overturned, as extensive evidence has demonstrated that adipose tissue is a complex organ with autocrine, paracrine and endocrine hormone secretion, metabolism and inflammatory properties (Galic et al. 2010, Trayhurn & Beattie 2001, Grant & Dixit 2015). Whilst many hormones secreted from adipose tissue have been described, Adiponectin and Leptin remain the best understood (Kawano & Arora 2009, Kelesidis et al. 2010).

1.1.5 Adipokines & their effect on peripheral metabolism

Adipokines are cytokines that are released from adipose tissue and bear many endocrine functions. The two most well understood adipokines to date are adiponectin and leptin. Adiponectin is also known as a complement-related protein as it is structurally similar to Complement 1Q (C1Q). It is specific to adipocytes, where it is highly expressed (Kawano & Arora 2009). Adiponectin has an insulin sensitivity enhancing effect on both muscle and liver tissue and increases free fatty acid (FFA) oxidation in several highly metabolically active tissues (Lihn et al. 2005). In mice, administering adiponectin to lean animals fed a high fat diet (HFD) reduced the post-prandial increase in glucose, FFA and triglyceride concentrations (Peake et al. 2003). This effect was attributed to an increased clearance rate rather than reduced intestinal absorption (Peake et al. 2003). In humans, adiponectin concentration is negatively correlated with BMI and reduced levels of adiponectin are associated with insulin resistance and hyperinsulinaemia. Adiponectin exhibits potent anti-inflammatory effects and it is therefore worth noting that in conditions in which adiponectin concentration is low, such as obesity, type 2 diabetes and Coronary Heart Disease (CHD), individuals exhibit systemic low-grade inflammation (Matsubara et al. 2002, Greenberg & Obin 2006). Adipokines exhibit many immune modulating effects and the topic of systemic inflammation and adipose tissue inflammation is one I will return to.

In addition to mediating the adaptive response to starvation and regulating appetite, as discussed previously, leptin has several important peripheral metabolic roles. In mice engineered to have no adipose tissue (a congenital lipodystrophy mouse model), leptin therapy reversed insulin resistance and diabetes (Shimomura et al. 1999). Similarly, in humans with lipoatrophic diabetes, a condition in which patients have little to no fat mass and elevated triglyceride concentrations requiring regular plasmapheresis, leptin therapy reduced plasma triglyceride levels and reduced the dependence on anti-diabetic medication (Oral et al. 2002). It is clear that leptin and adiponectin both play important roles in energy balance and peripheral metabolism. It has been shown that the reduction in triglyceride concentration and increased insulin sensitivity achieved by leptin and adiponectin is through increasing the rate of FFA oxidation, similar to the effect of exercise (Fruebis et al. 2001, Minokoshi et al. 2002). Other adipokines have subsequently been characterised, including resistin, which increases LDL production in human liver cells, visfatin, which matures B cells and prevents the apoptosis of neutrophils, and asprosin, a recently discovered glucogenic hormone (Steppan et al. 2001, Fukuhara et al. 2005, Romere et al. 2016).

1.1.6 Obesity’s effect on Adipose tissue

As an individual becomes obese, their average adipocyte volume increases (hypertrophy) whilst evidence suggests the abundance of adipocytes remains constant throughout adult life (Blüher 2009, Spalding et al. 2008). Adipocyte hypertrophy quickly causes the diffusion limit of oxygen to be reached in obese adipose tissue, which alters the local tissue micro-environment. Despite this increase in adipocyte volume and therefore greater storage potential, FFA release increases, which is known to promote insulin resistance in muscle tissue (Boden 1997, Lewis et al. 2002, McGarry 1998). Additional changes in obese adipose tissue occur, such as hypoxia, fibrosis, unilocular lipid droplet formation and profound immune cell recruitment and pro-inflammatory cytokine release (Trayhurn 2013, Berg & Scherer 2005, Rutkowski et al. 2015, Bai & Sun 2015).

Adipose tissue can be characterised as an immunological organ (Grant & Dixit 2015). This is supported evolutionarily, as invertebrates combine both their innate immune system and metabolism in a single organ known as a ‘fat body’ (Azeez et al. 2014). The fat body contains toll-like receptors (TLR) that trigger the NF-κB pathway, resulting in the release of anti-microbial peptides. The fat body also processes and stores lipids (Azeez et al. 2014). Vertebrates, at some point in their evolutionary history, split these immune and metabolic functions into separate organs: the liver and adipose tissue. One trigger of adipose inflammation is the increased rate of adipocyte necrosis during obesity, which has been observed to be thirty times greater than in lean controls (Cinti et al. 2005). This is thought to be the primary cause of macrophage infiltration. Macrophages form distinct crown-like structures in obese adipose tissue, surrounding adipocytes that are undergoing necrosis (Box: 1.6) (Boutens & Stienstra 2016). Macrophages start to form multi-nucleated cells that are typical of chronic inflammation, taking on the role of adipocytes to phagocytose excess lipids (Cinti et al. 2005).

Box 1.6: Adipose tissue inflammation in obesity

Lean, metabolically healthy adipose tissue has little inflammation and is characterised by an anti-inflammatory adipokine/cytokine milieu, such as high levels of adiponectin, SFRP5, IL-10 and TGFβ secretion (Dalmas et al. 2011). Whilst multiple immune subtypes are present in the Stromal Vascular Fraction (SVF), they are relatively infrequent in the adipose tissue of lean individuals compared to obese subjects (Berg & Scherer 2005). The immune cell repertoire and function change dramatically with sustained weight gain.

Figure 1.3: Lean to obese adipose tissue transition. Redrawn and modified from Figure 2 of Dalmas et al.(2011).

In obesity, additional immune cells infiltrate adipose tissue, such as CD8+ T-cells and M1-polarised macrophages. Adipose tissue macrophages (ATMs) are characteristically pro-inflammatory and form crown-like structures that surround necrotic and apoptotic adipocytes. There is increased vascularisation of obese adipose tissue and extracellular matrix deposition (fibrosis). Obese tissue is characterised by high levels of pro-inflammatory adipokines/cytokines, such as IL-6, CCL2, CXCL5, Resistin and Leptin (Berg & Scherer 2005).

Macrophages, whilst not the only immune cell present in adipose tissue, are responsible for the majority of cytokine release in obese adipose tissue, such as Tumor Necrosis Factor α (TNFα) and IL-6. In a study using mice, Xu et al. (2003) were able to show that many immune specific genes expressed in macrophages are expressed in adipose tissue obtained from obese mice and that this gene expression change preceded extensive production of insulin. Treating the mice with an insulin-sensitising drug (rosiglitazone) decreased the expression of these macrophage-specific genes, suggesting that macrophage infiltration and subsequent low-grade inflammation, particularly in adipose tissue, is a major contributor to obesity induced insulin resistance. Other work supports the idea that inflammation due to obesity causes insulin resistance. Salicylate, a potent anti-inflammatory drug, has been shown to alleviate insulin resistance in diabetics, providing more evidence that obesity-associated inflammation is causal (Yuan et al. 2001). This vast remodelling and expansion of adipose tissue takes place in most obese fat pads, but not all adipose tissue expansion is associated with pathology. Interestingly, there are obese individuals that can preserve insulin sensitivity. The ‘Metabolically healthy’ hypothesis suggests that these individuals have healthy adipose tissue expansion (Klöting et al. 2010). The differences in co-morbidity prevalence amongst the obese and the cellular and inflammatory properties of adipose tissue will be explored in chapters 4 and 5 respectively.

1.1.7 Obesity biology and its consequences: A summary

I have described how obesity is a complex trait caused by a mixture of environment, lifestyle choice, and common and rare (monogenic) genetic variation. Whilst the mechanism of obesity is largely thought to centre around the hypothalamus, CNS and the control of appetite, the downstream consequences of obesity are peripheral and systemic. Obesity results in dysfunctional adipose tissue expansion and systemic low-grade inflammation that leads to co-morbidity development such as insulin resistance, T2D and CVD. Whilst genetic studies are fruitful for the discovery of new variation that influences all of these comorbidities, characterisation of how these genetic variants interact with obesity and confer risk is limited. In the next introductory section, I will detail the utility of gene expression studies in understanding GWAS loci mechanisms, namely through performing genetics of gene expression studies (eQTL analysis).

1.2 Regulation of Gene expression

1.2.1 Transcriptional control: The basal transcriptional complex

Whilst every cell in the human body contains the same DNA sequence (except for somatic mutational differences), the protein and RNA output of each cell can be vastly different. The transcriptional output of a cell is what determines cell-type development and how a cell responds to internal and external stimuli. In Eukaryotes, gene expression is controlled by the binding of specific transcription factors to euchromatin regions of the genome, whose combination determines the specificity of a given gene’s expression and the level of transcription performed by RNA polymerase II (Lee & Young 2013). Several constitutively expressed transcription factors (TFs) (TFIID, TFIIA, TFIIE & TFIIF) and RNA polymerase II itself bind to a functional unit known as the promoter, which is immediately upstream of the transcription start site (TSS) of any given gene (Juven-Gershon & Kadonaga 2010). This complex of transcription factors forms the stable basal transcriptional complex.

Once transcription initiation has occurred, the next stage is transcriptional elongation, in which a single stranded mRNA molecule is produced from one of the two strands of DNA (the template strand) by RNA polymerase II. Transcription occurs in the 5’ to 3’ direction, producing a pre-mRNA which is 5’ capped and 3’ poly(A) tailed. The cap regulates the mRNA’s export from the nucleus to the cytosol and the poly(A) tail determines the rate of mRNA decay (stability) (Latchman 2011). The pre-mRNA contains both exons and introns. Exons are the units of pre-mRNA which are translated into protein whilst the introns are spliced out (Box 1.7). 40-60% of genes undergo splicing (Matlin et al. 2005). Splicing and alternative splicing are heavily regulated processes that allow the complete coding repertoire of the approximately 25,000 genes in the human genome to have a profound functional diversity and responsiveness to external stimuli (Matlin et al. 2005).

Box 1.7: Splicing and Alternative splicing

RNA splicing is the process of removing introns from a pre-mRNA transcript. All exons are then ligated together to form a mature mRNA. This process most commonly occurs co-transcriptionally but can also occur post-transcriptionally (Tilgner et al. 2012). Splicing is catalyzed by the spliceosome, a multi-part complex of small nuclear ribonucleoproteins (snRNPs). The spliceosome recognises a donor (GU), acceptor (AG) and branch point site present in introns (Matlin et al. 2005). This process is known as the lariat pathway. The lariat pathway accounts for the majority of splicing performed by the spliceosome, although rare introns with different recognition sequences are spliced out using the less frequently used minor pathway.

Figure 1.4: Schematic of splicing and alternative splicing

Alternative splicing allows for diversity in mature mRNA and protein production: combinatorial ligation of specific exons and exclusion of others increases mRNA diversity (Figure: 1.4). Intron retention can also take place, in which certain introns are not removed and form part of the mature mRNA (Wong et al. 2016). Proteomic experiments confirm that most genes have primary transcripts, in which there is a dominantly expressed and translated mRNA isoform (Gonzàlez-Porta et al. 2013). Alternative splicing has been shown to be under genetic control, as many alternative splicing QTLs (asQTLs) have been described (Ongen & Dermitzakis 2015, Zhao et al. 2013, Li et al. 2016). Alternative splicing also acts as a dynamic way to regulate gene expression that is responsive to the environment (Pai et al. 2016, Pleiss et al. 2007).
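Intron removal and exon skipping can be made concrete with a toy sketch. The sequences, coordinates and the function name `splice` below are hypothetical and chosen only for illustration. The check uses the canonical major-pathway boundaries, in which an intron begins with the GU donor dinucleotide and ends with the AG acceptor dinucleotide; the branch point and all regulatory factors are ignored.

```python
def splice(pre_mrna, introns):
    """Remove introns from a pre-mRNA and ligate the remaining exons.

    `introns` is a list of (start, end) intervals, 0-based and end-exclusive.
    Each removed interval must begin with the GU donor dinucleotide and
    finish with the AG acceptor dinucleotide.
    """
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"interval {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[cursor:start])
        cursor = end
    exons.append(pre_mrna[cursor:])
    return "".join(exons)


# Hypothetical three-exon transcript.
exon1, intron1, exon2 = "AUGGCC", "GUAAGUCUAG", "UUUGGA"
intron2, exon3 = "GUGAGCAG", "CACUAA"
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3

a = len(exon1)
b = a + len(intron1)
c = b + len(exon2)
d = c + len(intron2)

# Constitutive splicing removes both introns and keeps all three exons.
constitutive = splice(pre_mrna, [(a, b), (c, d)])
# Exon skipping: the donor of intron 1 is joined to the acceptor of intron 2,
# so exon 2 is removed along with both introns.
skipped = splice(pre_mrna, [(a, d)])
```

The two calls return different mature transcripts from the same pre-mRNA, which is the sense in which alternative splicing increases mRNA diversity.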

After splicing, a mature mRNA transcript is formed and it is exported from the nucleus to the cytosol. This is not a necessary step for all RNAs transcribed, as many RNA species exist whose final form is RNA, not protein (Blignaut 2012), but I will focus on mRNA for the remainder of this chapter. Post-transcriptionally, transcript decay rate, stability and spatial position can all be differentially regulated. Mature mRNAs in the cytosol are translated by the ribosome into a polypeptide. The mRNA is typically bound and stabilised by the ribosome at the 3’ and 5’ UnTranslated Regions (UTRs), whilst the coding sequence (composed of exons) is translated into protein by the recruitment of transfer RNA (tRNA) molecules. Each tRNA binds a specific amino acid, allowing the ribosome to produce a polypeptide whose sequence is determined by each successive group of three nucleotides in the mRNA (codon/triplet code). Polypeptide elongation terminates when the ribosome reaches a stop codon, and the final polypeptide is released. Additional post-translational modifications then take place, including the modification of certain amino acid residues, addition of carbohydrates and phosphate groups, and proteolysis.
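The triplet code can be illustrated with a short sketch that reads an mRNA codon by codon until a stop codon is reached. The codon table is the standard genetic code written in compact form. Starting at the first AUG is a deliberate simplification of translation initiation, and the example sequence and the function name `translate` are hypothetical.

```python
BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    # Step through the coding sequence three nucleotides at a time.
    for pos in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical transcript: 5' UTR "GG", then AUG GCC UUU UAA, then 3' UTR "GG".
peptide = translate("GGAUGGCCUUUUAAGG")
```

Here the untranslated flanks are skipped, the three sense codons give Met-Ala-Phe (`"MAF"`), and elongation stops at UAA.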

1.2.2 Regulatory elements

The transcriptional rate of the basal transcriptional complex is small but can be enhanced by several orders of magnitude by the binding of additional factors at (local) upstream promoter elements (UPEs), regulatory elements, and many other (distal) features (enhancers, silencers, insulators & Locus Control Regions (LCRs)) (Shlyueva et al. 2014) (Box: 1.8).

Box 1.8: Local and distal regulatory elements

Regulation of gene expression through regulatory elements takes place both locally and distally. Locally, many regulatory elements exist as part of the core promoter (Figure: 1.5A). Distal regulation is controlled by enhancers, silencers, insulators and locus control regions, with the latter having the potential to regulate multiple genes in a locus (Maston et al. 2006) (Figure: 1.5B-E).

Figure 1.5: Local and distal regulatory elements. A) The core promoter consists of multiple elements that initiate transcription. B) Enhancer elements act distally (typically in the Mega Base (MB) range) to ‘enhance’ basal transcriptional levels. C) Silencers, unlike enhancers, can dampen or repress transcription. D) Insulators act to block the interaction between an enhancer or a silencer and a gene’s promoter. E) Locus control regions can consist of several distal regulatory elements, such as enhancers, that influence the expression of more than one gene. (Colours do not correspond to specific elements.) Adapted from Maston et al. (2006).

One mechanism to control the specificity of transcription is through combinatorial binding of transcription factors that are expressed in specific settings (cell type, environment, ligand/co-factor binding, intra- and inter-cellular signalling response) to a combination of these regulatory elements. Upstream promoter elements (UPEs) regulate transcription in response to specific cues (Maston et al. 2006). A common and well studied UPE is the Heat Shock Element (HSE), which is found in the promoter of genes whose expression is regulated by changes in temperature. Experimental introduction of HSEs into promoters of non-heat responsive genes makes them heat inducible (Pelham 1982). Many other inducible regulatory elements have been subsequently described, such as elements responsive to cyclic Adenosine Monophosphate (cAMP) and glucocorticoids, controlling specific transcriptional programmes.

The understanding and extent of gene expression regulation via regulatory elements is still far from complete. For example, core common regulatory motifs found in the promoter include the TATA-box, initiator element (Inr), downstream promoter element (DPE) & B-recognition element (BRE) (Maston et al. 2006). However, a study that performed computational and statistical analysis of over 10,000 promoter sequences concluded that only 50% of promoters contain an Inr, 25% contain a DPE & BRE motif and only 12.5% contain a TATA-box (Gershenzon & Ioshikhes 2005). This suggests many additional novel regulatory motifs may exist and that the fundamental role of the promoter for many genes is still poorly understood.

1.2.3 Cell specific gene expression regulation

Many transcription factors are constitutively and universally expressed in all cell-types, whilst others are regulated in a cell-type specific manner. Cell type fate determination is due to the expression of specific master transcriptional regulators. For example, the constitutive over-expression of MyoD, a transcription factor specific to myocytes, is sufficient to transform undifferentiated fibroblasts into differentiated muscle cells (Davis et al. 1987). Other cell type differentiation programs are more complex, such as that of adipocytes. Mesenchymal Stromal Cells (MSCs) express S6K1 to differentiate into pre-adipocytes. Adipocytes derive from pre-adipocytes that express the transcription factors C/EBPs and PPARγ (Carnevalli et al. 2010, Rosen & MacDougald 2006).

Cellular differentiation usually results in a specialised cell type, as defined by a restricted set of genes being expressed compared to its precursor. Whilst the DNA sequence is the same in each cell of the human body, the conformation of DNA is vastly different, allowing control of transcription factor binding accessibility to different regulatory elements. As transcription factors are commonly dependent on co-factors, the precise subset of genes that can be expressed in a cell defines what transcriptional programmes can be activated and therefore defines the cell type through a unique global gene expression pattern.

High throughput applications to understand genome-wide gene expression regulation and its specificity have made significant progress since the advent of microarrays and RNA-sequencing (RNA-seq) applied in projects such as The Encyclopedia of DNA Elements (ENCODE) (ENCODE Consortium 2012) (Box: 1.9). ENCODE has contributed to the understanding of global cell type specific gene expression regulation significantly by mapping chromatin accessibility (DNase-seq), gene expression (RNA-seq), transcription factor binding (ChIP-seq) and long range chromatin interactions (5C) in a total of ∼150 different cell types (Neph et al. 2012, Djebali et al. 2012, Wang et al. 2012, Sanyal et al. 2012).

Box 1.9: High throughput gene expression analysis: RNA-seq

mRNA-sequencing is a type of RNA-sequencing experimental procedure that aims to sequence all mRNA species present in a given sample (Mortazavi et al. 2008). The transcription of a gene results in multiple transcripts being present at a certain abundance. By sequencing all mRNA transcripts in a given sample, it is possible to assess the abundance of each mRNA. This measure is a relative abundance, given that there is a dependence on sequencing depth.

Figure 1.6: mRNA-sequencing protocol. A) mRNA is obtained from a Poly-A enrichment protocol. B) mRNA is converted into double stranded cDNA and fragmented. C) Fragments have sequencing adapters ligated to both ends. D) Fragments undergo an (optional) PCR amplification step using primers as depicted and are sequenced.

The sequenced cDNA fragments obtained from Figure: 1.6 are then aligned to a reference genome. The most simplistic, non-normalised measure of gene expression is a read count for each gene in a given sample. Several RNA-seq abundance units now exist, such as Counts Per Million (CPM) and Reads per kilobase of transcript Per million reads Mapped (RPKM) (Li et al. 2015).
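The two abundance units named above can be computed directly from a table of read counts. The sketch below is a minimal illustration with made-up counts and transcript lengths; the gene names and the function names `cpm` and `rpkm` are hypothetical, and library size is taken as the sum of counts over the genes shown.

```python
def cpm(counts):
    """Counts Per Million: scale each gene's reads by library size."""
    library_size = sum(counts.values())
    return {gene: n / library_size * 1e6 for gene, n in counts.items()}


def rpkm(counts, lengths_bp):
    """Reads Per Kilobase of transcript per Million mapped reads."""
    library_size = sum(counts.values())
    return {
        gene: n / (lengths_bp[gene] / 1e3) / (library_size / 1e6)
        for gene, n in counts.items()
    }


# Hypothetical read counts and transcript lengths for one sample.
counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
lengths_bp = {"geneA": 2000, "geneB": 1000, "geneC": 4000}

sample_cpm = cpm(counts)
sample_rpkm = rpkm(counts, lengths_bp)
```

In this toy sample geneB has three times the reads of geneA, so its CPM is three times higher, but because its transcript is half as long its RPKM is six times higher. CPM corrects only for sequencing depth, whereas RPKM also corrects for transcript length.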

ENCODE has shown that the activity of many regulatory elements is cell type specific (ENCODE Consortium 2012, Heinz et al. 2015). Cell type specific gene expression regulation can be controlled at many different levels, for example: chromatin accessibility, transcription dynamics, isoform switching and DNA methylation. All of these processes can be determined either genetically, for example by variants that influence chromatin conformation, or environmentally, through external cell signalling and transduction. As cell type gene regulation has so far been studied in populations of cells, future work will focus on the ability to study the chromatin and RNA dynamics of single cells at different developmental stages, treatments and time points, allowing us to fully characterise the extent and breadth of cell types that exist and the specificity of gene expression regulation.

1.3 The Genetic regulation of Gene Expression

Whilst GWAS for all complex traits have been a successful strategy in discovering risk loci, identifying the mechanism by which these loci confer risk is a difficult challenge (Visscher et al. 2012). GWAS loci associated with complex traits are often composed of many variants in tight Linkage Disequilibrium (LD). As gene regulation can often occur over MB scales, identifying both the causal variant of the locus and the gene the causal variant acts upon is non-trivial. This challenge is compounded by the fact that most GWAS signals are located in the non-coding genome, with an abundance of recent evidence showing that most GWAS variants act through different types of gene regulation, such as gene expression (Zhu et al. 2016, Li et al. 2016).

1.3.1 Expression Quantitative Trait Loci (eQTL) mapping

Genotyping microarrays and subsequent development of sensitive gene expression profiling technologies such as RNA-sequencing applied to human genetics led to the rapid discovery of hundreds of thousands of genetic variants that affect almost every gene expression trait measured (Aguet et al. 2016), termed expression quantitative trait loci (eQTL) (Box: 1.10)(Lappalainen et al. 2013).

Box 1.10: eQTL mapping

Gene-SNP associations from eQTL mapping can be either local (cis) or distal (trans) acting. Distances for cis and trans have no strict definitions, but a commonly used threshold for cis is within a 1MB window of a given gene’s TSS (Figure 1.7A). Trans associations are any SNP-gene associations that occur at a distance of 5MB or greater from the TSS, and also include regulation of gene expression from other chromosomes (Figure 1.7B).

[Figure 1.7 panels: A) cis; B) trans; C) gene expression against genotype (AA, AG, GG); D) -log10(P-value) against position relative to the TSS (-1MB to 1MB).]

Figure 1.7: eQTL study design. A) SNP-gene association in cis. B) SNP-gene association in trans. C) Additive linear relationship between genotype dosage and gene expression. D) cis-eQTLs are typically enriched around TSSs.

eQTLs are typically modelled as an additive linear relationship between genotype dosage and normalised gene expression (Equation 1.4).

$$y = \beta_0 + \beta_1 X + \epsilon \tag{1.4}$$

where y is a gene expression value measured across samples, X is the genotype dosage and ε is the residual error. A cis-eQTL is typically visualised as in Figure 1.7C. cis-eQTLs with larger effects and, as previously mentioned, those shared across tissues tend to be tightly clustered around the TSS (Figure 1.7D).
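The two steps described in this box, labelling a SNP-gene pair as cis or trans and then fitting the additive model, can be sketched as follows. This is an illustrative toy and not the mapping software used in this thesis: the positions, dosages and expression values are invented, the function names are hypothetical, and the `"unclassified"` label for pairs between 1MB and 5MB is an assumption, since no class is defined for that range.

```python
CIS_WINDOW_BP = 1_000_000   # cis: within 1MB of the TSS
TRANS_MIN_BP = 5_000_000    # trans: 5MB or more from the TSS, or another chromosome


def classify_pair(snp_chrom, snp_pos, gene_chrom, tss_pos):
    """Label a SNP-gene pair as cis or trans by distance to the TSS."""
    if snp_chrom != gene_chrom:
        return "trans"
    distance = abs(snp_pos - tss_pos)
    if distance <= CIS_WINDOW_BP:
        return "cis"
    if distance >= TRANS_MIN_BP:
        return "trans"
    # No class is defined between the two thresholds; this label is an assumption.
    return "unclassified"


def fit_additive(dosages, expression):
    """Least-squares estimates of beta0 and beta1 in y = beta0 + beta1 * X + e."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    beta1 = sxy / sxx
    return mean_y - beta1 * mean_x, beta1


# Hypothetical SNP 250kb from a hypothetical TSS, with toy data for six samples.
pair_type = classify_pair("chr1", 1_250_000, "chr1", 1_000_000)
dosages = [0, 0, 1, 1, 2, 2]                 # copies of the G allele: AA, AG, GG
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]  # normalised expression values
beta0, beta1 = fit_additive(dosages, expression)
```

With these toy values the pair is labelled cis and each additional G allele raises expression by one unit (β1 = 1.0, β0 = 1.1), which is the straight-line pattern across AA, AG and GG shown in Figure 1.7C.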

Gene expression can be thought of as an endophenotype, in which each gene expression measurement from a genome-wide assay (microarray/RNA-seq) can be treated as a single quantitative trait. ‘Genetical genomics’ was first used in yeast cross experiments, in which linkage analysis identified expression traits that were linked to one or more genetic loci (Brem et al. 2002). These associations could be broadly divided into two categories, local (cis) acting and distal (trans) acting genetic regulation of gene expression. This pioneered further studies into the genetic basis and heritability of natural population level variation in gene expression, which has subsequently been widely explored in several cell types, tissues and organs of many organisms including humans, worms, fruit flies, plants and many other species (Grundberg et al. 2012, Viñuela et al. 2010, Kölling 2016).

1.3.2 Gene expression heritability

Studies in model organisms, human twins and populations have established that gene expression is a highly heritable quantitative trait (Wright et al. 2014). A wide range of narrow-sense heritability estimates has been reported across the transcriptome, reflecting that many genes are under genetic control. Gene expression heritability has been estimated in over 40 tissues to date using multiple methods (GCTA, ACE & family models), providing comparable results but also highlighting the tissue-specific nature of gene expression heritability (Wheeler et al. 2016). Gene expression heritability can be partitioned into cis and trans components, in which the heritability of a gene's expression is calculated using all local genetic variants (e.g. within 1MB, cis) and all other genome-wide variants (trans). This method therefore does not require the specific cis or trans-eQTL variant to be known (Wheeler et al. 2016). However, sample sizes are currently not large enough to obtain robust estimates without large confidence intervals. trans effects have been found to account for the majority of gene expression variance (65-75%) (Grundberg et al. 2012). In yeast, it has been demonstrated that the effects of trans-eQTLs are more dependent on the environment than cis effects, with trans effects being more variable between treatments and exposures (Smith & Kruglyak 2008). Larger sample sizes from multiple cell types, environments and contexts should therefore resolve the extent of trans-regulation of gene expression, the extent of trans-eQTL tissue sharing and whether many complex trait loci act through such distal regulation.

1.3.3 Tissue and cell-type specificity of eQTLs

As previously discussed, some gene expression is highly tissue specific. The genetic regulation of gene expression is also partially cell type/tissue specific. Let us consider three scenarios in which the effect of a genetic variant would only be observed in a specific cell type. 1) A given gene's expression is restricted to cell type A, and it is not expressed at all in cell type B. 2) A genetic variant alters the binding motif of a cell-type-specific transcription factor. If this occurs, the variant will only be detected as an eQTL in cell types that express the transcription factor, and only where that factor influences the expression of the eQTL gene. 3) A genetic variant falls within an enhancer that drives increased expression of a gene. If the enhancer is active in a specific cell type, the variant could impact the enhancer's efficacy in driving gene expression. However, if the enhancer is in an inactive (condensed heterochromatin) state in other cell types, the genetic effect will not be present and therefore no eQTL will be detected.

Large human multi-tissue gene expression projects such as MuTHER, EuroBATS, GTEx and ENCODE have shown that the majority of protein-coding genes have at least one cis-eQTL and that the majority of cis-eQTLs are tissue shared (Grundberg et al. 2012, Buil et al. 2016, Aguet et al. 2016, ENCODE Consortium 2012). A recent study showed that the average genetic correlation (a measure of shared genetic effects) across tissues for cis-eQTL discovery is approximately 78% (Liu et al. 2016). cis-eQTLs are enriched around the transcription start site (TSS). Recent evidence using partitioned LD-score regression (pLDSC) on eQTL discovery results in 15 tissues suggests cis-eQTLs are enriched in super enhancers (5.2×), conserved regions (2.3×) and coding regions (3.7×) (Liu et al. 2016). Tissue-specific cis-eQTLs show no particular enrichment when separated from all cis-eQTLs, but it is possible that the enhancers they fall within are tissue specific. Complex trait loci from GWAS show the same overall category enrichments (i.e. SNPs fall within enhancer regions), but complex trait associated SNPs show stronger enrichments for each category than cis-eQTLs (for example, coding regions: 7.1× vs 3.7×, respectively) (Liu et al. 2016).

Whilst cis-eQTLs tend to be tissue-shared, trans-acting eQTLs are highly tissue specific (Jo et al. 2016). This could reflect several aspects of trans-eQTLs, such as a high False Discovery Rate (FDR) and low replicability, or an underlying biological difference between cis and trans-acting genetic regulation of gene expression. It is likely that the behaviour of trans-eQTLs, and the extent of gene regulation that takes place in trans, will be elucidated when eQTL discovery samples are much larger. A recently published trans-eQTL study uncovered 590 trans-eQTLs from 40 tissues, with a strong proportionality observed between sample size and the number of trans-eQTLs detected (Jo et al. 2016). Replicability of these trans-eQTLs was low, consistent with other publications on trans-eQTLs that fail to replicate the specific trans-eQTL network but demonstrate an overall broad effect on expression, observed through a global enrichment of low P-values. Very few replicable trans-eQTLs have been described in humans; one example is rs4731702-KLF14 (MuTHER Consortium 2011). In yeast, by contrast, many trans-eQTL hotspots have been discovered that affect the expression of hundreds of genes. This difference between experimental yeast crosses and human populations likely reflects the difference in power due to allele frequency (AF), LD structure and fixed environment (Yvert et al. 2003).

1.3.4 Context: environment/treatment dependent eQTLs

Gene expression is under both genetic and environmental regulation. However, within a genetically diverse population it has been observed that an individual's genotype can affect gene expression differently under certain environments, contexts, conditions or treatments: so-called 'Gene-by-Environment interactions on expression' (G×E eQTLs) (Box 1.11) (Fairfax et al. 2014, Francesconi & Lehner 2014, Glass et al. 2013, Glastonbury et al. 2016). Identifying the environment in which eQTLs act holds promise for understanding the mechanism of variation in transcript levels within a population and, ultimately, variation in whole-body complex traits/phenotypes.

Box 1.11: Types of Gene-by-environment interaction

A G×E, when fitted in a linear regression framework, is simply expressed as the product of two independent variables of interest (an interaction term). For example, if Y is gene expression, a G×E can be interpreted as the change in the per-dose effect of a given genetic variant (G) on gene expression (Y) as the environment (E) changes by one unit:

$$E[Y \mid G, E] = \beta_0 + \beta_1 G + \beta_2 E + \beta_3 GE + \epsilon \tag{1.5}$$

G×E by definition therefore cannot be expressed as a purely additive relationship (functions η0 and η1) between a genetic variant G and an environment E:

$$E[Y \mid G, E] = \eta_0(G) + \eta_1(E) \tag{1.6}$$

Figure 1.8: Types of G×E response: A) ’True interaction’: Exposure to an environment results in a linear relationship between allele dosage and gene expression. B) Both exposed and non-exposed react to the environment, but with different magnitudes of effect. C) ’crossover’ or qualitative-interaction - opposing directions of effect observed in either environment. Where blue = exposure, red = not exposed.

If it is important to interpret the main effects (G, E) in an interaction model, independent variables should be centered. When independent variables are centered, the effect β1 represents a unit difference of Y with respect to G when E is fixed at its mean value. If the variables are not centered, E would be fixed at zero, which in some scenarios makes the main effects (β1, β2) uninterpretable. If main effect interpretability does not matter, centering is unimportant, as the interaction coefficient (β3) does not change. These are important considerations for G×E study design.
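The effect of centering can be checked numerically. The sketch below fits the interaction model of Equation (1.5) by ordinary least squares on simulated data, once with raw and once with mean-centered predictors. The helper `fit_gxe`, the simulated effect sizes and the BMI-like environment are assumptions made purely for illustration.

```python
import numpy as np

def fit_gxe(G, E, Y):
    """OLS fit of Y = b0 + b1*G + b2*E + b3*G*E; returns [b0, b1, b2, b3]."""
    X = np.column_stack([np.ones(len(G)), G, E, G * E])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 1000
G = rng.integers(0, 3, size=n).astype(float)  # genotype dosage
E = rng.normal(27.0, 4.0, size=n)             # hypothetical BMI-like environment
Y = 1.0 + 0.3 * G + 0.1 * E + 0.05 * G * E + rng.normal(0, 1, size=n)

raw = fit_gxe(G, E, Y)
centred = fit_gxe(G - G.mean(), E - E.mean(), Y)

print("interaction (raw, centred):", raw[3], centred[3])
print("main effect of G (raw, centred):", raw[1], centred[1])
```

The interaction coefficient is the same in both fits, whereas the main effect of G in the centered fit equals the raw main effect plus the interaction coefficient multiplied by the mean of E, i.e. the effect of G evaluated at the mean environment.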

One of the first experiments to discover G×E eQTLs was by Li et al. (2006), who performed a mapping experiment in C. elegans recombinant inbred lines (RILs) grown at two different temperatures (16°C & 24°C). Two strains, Hawaii and Bristol, were used, representing two different extremes in ecology and genetic background. By measuring the effect of different temperatures on the same individual (paired measurements), experimental noise and the possibility of confounding were reduced. Li et al. (2006) found that most environmentally responsive eQTLs acted in trans (59%), whilst only 8% of cis-acting loci differed between environments. This suggests that the majority of heritable gene expression variation that is responsive to the environment acts in trans.

G×E eQTLs have also been discovered in yeast, worms and humans (Smith & Kruglyak 2008, Francesconi & Lehner 2014, Kukurba et al. 2016). Model organisms such as yeast provide a convenient system in which to detect G×E eQTLs because they can be grown in controlled conditions. Furthermore, they can be crossed to produce segregants with extensive LD, which results in high power to detect effects. Smith & Kruglyak (2008) used two fixed growth environments (glucose or ethanol carbon sources) to discover interaction effects on yeast gene expression. Using 100 yeast segregants derived from two strains of yeast (BY and RM), 47% of measured transcripts demonstrated a strain-condition interaction, with many transcripts being responsive to several environments. Strain-condition interactions explained 9% of the total gene expression variation, whilst strain and condition accounted for much more gene expression variance (21% & 36%, respectively). To map G×E eQTLs, 109 yeast segregants were genotyped. 31% of transcripts showed at least one G×E eQTL (a total of 1,555 linkages) at a False Discovery Rate (FDR) of 5%. Notably, Smith & Kruglyak (2008) confirmed that G×E eQTLs are more likely to act in trans than in cis, with only 11% of G×E eQTLs acting in cis. trans effects tended to be active only in specific conditions, whilst cis interactions were more likely to be present in both conditions but to differ in magnitude.

G×E eQTL mapping in humans has been most successful in isolated cell lines, such as monocytes, macrophages, LCLs & osteoclasts subjected to different treatments. Immune cells are particularly good models in which to study context-specific eQTLs, as aspects of inflammation are well understood and known to contribute to certain disease risk. Fairfax et al. (2014) subjected CD14+ monocytes from 228 individuals to two different treatments, mapping eQTLs in a naive, IFNγ and LPS (2-hour & 24-hour) treated state. 11,476 gene expression microarray probes had at least one eQTL, with 43% demonstrating an eQTL only after treatment. 54% of eQTLs found to be significant in naive untreated cells disappeared after treatment. Similar to the findings of both Li et al. (2006) and Kukurba et al. (2016), Fairfax et al. (2014) found that G×E trans-eQTL effects were highly context specific, with associations being enriched for cytokine release, enzymes and transcription factors. Importantly, eQTL studies have utility in understanding the mechanism of complex disease GWAS loci. Whilst GWAS loci are enriched for 'baseline' non-context-specific eQTLs, it is possible that environmentally dependent gene expression regulation could explain how some GWAS loci are likely to act. For example, G×E eQTLs specific to IFNγ could provide a mechanism for diseases involving an inflammatory response. An example is rs609261-ATM, a cis-eQTL unique to the IFNγ response, which is in perfect LD with a 'metformin response in diabetes' locus (Fairfax et al. 2014). It is therefore plausible that some GWAS loci that regulate gene expression to confer disease risk will only be discovered when mapping G×E eQTLs.

Many other environments and conditions have been used to successfully map G×E eQTLs, including ionizing radiation, lead exposure, geography, oxidative stress, glucocorticoids, cytokine treatment and infection (Smirnov et al. 2009, Ruden et al. 2009, Idaghdour et al. 2010, Romanoski et al. 2010, Maranville et al. 2011, Barreiro et al. 2012, Fairfax et al. 2014). Whilst treatment experiments are extremely useful for learning about genetic regulation and the general properties of G×E eQTLs, I expect that they may represent extremes compared to the more complex, continuous or subtle environmental exposures important for human health.

1.3.5 Summary

In Chapter 3 I start by exploring the relationship between several cardio-metabolic traits and their impact on peripheral tissue gene expression. By performing Transcriptome-Wide Association Studies (TWAS) I detail both causal and non-causal gene expression regulation that spans both environmentally and genetically induced gene expression. I use heritability estimates of gene expression and cis-eQTL analysis to attribute the proportion of genetic and environmental regulation of gene expression on cardio-metabolic phenotypes. I find that approximately 50% of trait-associated gene expression is explained by local genetic effects. I find a number of traits that show adipose gene expression specificity, but also highlight traits such as BMI that have a pervasive role on peripheral tissue gene expression. Building on what I find in Chapter 3 and on previous work by others detailing the heterogeneous nature of co-morbidity development in the obese population, Chapter 4 explores work I have done in identifying primary tissue G×E eQTLs by using BMI as a physiological environment. I identify 16 cis and 1 trans G×E eQTL and demonstrate, for the first time, the robust replication of cis-acting G×E variants in an independent cohort. By intersecting these findings with the GWAS literature, I describe one such variant that is an esophageal cancer risk locus. In Chapter 5, I explore whether cell type heterogeneity can confound any of the analyses presented in this thesis, but I also explore how knowing the cell type composition of a tissue can answer biological questions. To estimate cell type composition, I implement CIBERSORT, a state-of-the-art method for estimating cell type proportions from whole tissues. I find that it is possible to obtain robust estimates of cell type proportions in whole adipose tissue biopsies and go on to demonstrate that macrophage content is correlated with obesity traits. I show that both macrophage and adipocyte proportion estimates are heritable and demonstrate that adjusting for cell type estimates in TWAS and cis-eQTL analysis achieves greater sensitivity and power, respectively. Finally, I rule out that G×BMI eQTLs are due to cell type composition differences and use macrophage estimates to identify macrophage-driven cis-eQTLs.

Chapter 2

Data, Samples & common analysis methods

2.1 Introduction

This short chapter aims to provide a background to TwinsUK and to characterise all data and samples that are commonly used across each of the three research chapters (3-5) presented in this thesis. Any samples, data or experimental methods specific to an individual research chapter will be detailed independently in that chapter's methods section.

2.2 The TwinsUK cohort

TwinsUK consists of approximately 12,000 monozygotic and dizygotic healthy ageing twins who live dispersed throughout the United Kingdom (UK). The ratio of identical to non-identical twins is approximately 50:50. Twin zygosity is determined either through the use of a zygosity questionnaire or by genotyping. Recruitment to TwinsUK is not inherently biased by ascertainment, as no individual is recruited for a specific disease or trait. The TwinsUK cohort has previously been shown to be representative of the general UK population (Moayyeri et al. 2013). The majority of participants are female. Studies with assays or measurements including RNA-sequencing used a sub-sample of female-only participants (see MuTHER/EuroBATS study) (Buil et al. 2016).

2.2.1 TwinsUK data collection

TwinsUK phenotypic measurements are obtained through standardised questionnaires and through regular 3-6 hour visits to the Clinical Research Facility (CRF) based in St Thomas' Hospital. Approximately 30% of participants have had multiple visits and therefore repeated longitudinal phenotypic measurements have been recorded. Samples collected at each visit include urine, faeces, saliva, blood, hair and multiple sources of DNA. Examples of phenotypic measurements include blood pressure, height and weight, balance, hearing tests, pain threshold tests, grip strength and Dual X-ray absorptiometry measurements of body fat, muscle and bone mineral density composition.

2.2.2 MuTHER/EuroBATS study

This thesis will primarily focus on a subset of twins (n = 856) that have measures of cardio-metabolic and anthropometric traits that relate to obesity, as obtained through clinical visits and/or quantitative blood sample assays. These individuals are part of the Multiple Tissue Human Expression Resource (MuTHER), a study that set out to analyse gene expression variation in two primary human tissues (subcutaneous adipose tissue and skin) and lymphoblastoid cell lines (LCLs) in order to characterise its genetic basis and relationship to complex traits. The collection period for the MuTHER study was 2007-2009. The MuTHER Consortium generated gene expression microarrays from the two primary tissues and the cell line mentioned above. The MuTHER Consortium also used new or previously generated genotyping arrays (Grundberg et al. 2012). Subsequent to this study, the European Identification of Biomarkers of Ageing using whole Transcriptome Sequencing (EuroBATS) project was established, which generated transcriptomic data (RNA-seq) from the previously mentioned three tissues as well as whole blood RNA-seq samples collected from the same individuals.

2.2.3 Tissue biopsies

For all tissue biopsies, volunteers provided informed consent and signed additional consent forms upon visiting the CRF. All procedures performed were in accordance with St Thomas' Hospital's ethics committee (Ref: 07/H0802/84). Subcutaneous adipose tissue biopsies were taken via a punch biopsy procedure from the abdominal region inferior and adjacent to the umbilicus. Clinicians performed a dissection step to separate skin from the adipose tissue biopsies, and both samples were then weighed, washed and frozen with liquid nitrogen. At each visit, peripheral blood samples were taken. B-lymphocytes were separated from the whole blood samples and cultured after Epstein-Barr virus (EBV) transformation was performed at the European Collection of Cell Cultures Agency.

2.2.4 Demographics of EuroBATS study

Participants of the EuroBATS study (n=856) have an age range of 38-84 (median age = 60) and a BMI range of 16-47 (median = 25) (Figure 2.1). Whilst the median BMI is typical of the wider UK population, this study represents an older, healthy ageing population (median UK age = 40).

Figure 2.1: BMI and age distribution of EuroBATs samples (n=856).

Extremes of the BMI distribution (BMI > 35), shown in Figure 2.1 as a skewed right tail, were confirmed by validating measurements against Dual X-ray absorptiometry (DXA) measurements of visceral fat. All of these individuals were in the right tail of the visceral fat (VF) distribution (3SD > $\bar{x}_{VF}$).

2.3 RNA-sequencing

All RNA-sequencing and genotyping data used throughout this thesis were generated and used as part of several publications (Vinuela et al. 2016, Buil et al. 2015, Brown et al. 2014, Buil et al. 2016, Glastonbury et al. 2016). All RNA-seq libraries were prepared using a TruSeq protocol and sequenced on a HiSeq 2000. The resulting 49bp paired-end unstranded reads were aligned to the human reference genome (GRCh37) using BWA, allowing for 2 mismatches in the seed (first 32 bases). Only properly paired reads (samtools -f 3) with a mapping quality of MAPQ > 10 were retained, as in Brown et al. (2014). Samples were excluded if the total number of mapped reads was less than 10 million. Allele specific expression (ASE) analysis was also used to identify possible sample swaps in which RNA-sequencing data did not correspond to genotyping data.
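The read and sample filters described above can be expressed as simple predicates. The following is a sketch only: it mirrors the logic of the `-f 3` flag filter (SAM flag bits 0x1 and 0x2 both set) and the stated thresholds, and the function names are hypothetical rather than taken from the actual processing scripts.

```python
PAIRED = 0x1        # SAM flag bit: read is part of a pair
PROPER_PAIR = 0x2   # SAM flag bit: read is mapped in a proper pair

def keep_read(flag, mapq, min_mapq=10):
    """True if both bits required by `-f 3` are set and MAPQ exceeds min_mapq."""
    required = PAIRED | PROPER_PAIR  # equals 3
    return (flag & required) == required and mapq > min_mapq

def keep_sample(total_mapped_reads, min_reads=10_000_000):
    """Samples with fewer than min_reads mapped reads are excluded."""
    return total_mapped_reads >= min_reads

# Flag 99 = paired, proper pair, mate reverse strand, first in pair
print(keep_read(flag=99, mapq=60))
# Flag 97 lacks the proper-pair bit
print(keep_read(flag=97, mapq=60))
```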

2.3.1 Meta-exon quantification

We used the GENCODE v10 gene annotation to construct meta-exon units in which overlapping exons are merged into a ’consensus’ exon. Reads whose start or end coordinates overlapped a meta-exon boundary were counted towards that feature. All meta-exon count data were normalised to the total number of mapped reads and only genes that were expressed in at least 90% of samples were taken forward. Meta-exon normalised counts were then inverse-normal rank transformed to minimise outlier effects. Table 2.1 breaks down the RNA-seq quantification summary statistics per tissue.
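A minimal sketch of the meta-exon construction and read counting is given below. It assumes closed (start, end) coordinates on a single chromosome and strand, and it interprets the counting rule as "a read is counted if its start or its end coordinate lies within the meta-exon"; both the interpretation and the function names are assumptions made for illustration.

```python
def merge_exons(exons):
    """Merge overlapping (start, end) exon intervals into meta-exons."""
    merged = []
    for start, end in sorted(exons):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous meta-exon: extend it
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(m) for m in merged]

def count_reads(meta_exons, reads):
    """Count reads whose start or end coordinate falls within each meta-exon."""
    counts = [0] * len(meta_exons)
    for r_start, r_end in reads:
        for i, (m_start, m_end) in enumerate(meta_exons):
            if m_start <= r_start <= m_end or m_start <= r_end <= m_end:
                counts[i] += 1
    return counts

meta = merge_exons([(100, 200), (150, 300), (400, 500)])
counts = count_reads(meta, [(90, 139), (280, 329), (350, 399), (450, 499)])
print(meta, counts)
```

In this toy example the first two exons overlap and collapse into a single meta-exon, and the read lying entirely between the two meta-exons is not counted.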

Tissue     Total reads (Million)    Exonic reads (Million)
Adipose    36.34                    22.43
LCL        43.64                    22.37
Skin       33.72                    19.27
Blood      30.0                     15.8

Table 2.1: RNA-seq summary across tissues

As total read depth, mappability and sample size vary across tissues, the total number of exons/genes quantified differs per tissue, as specified in Table 2.2.

Tissue     No of samples    No of meta-exons    No of genes
Adipose    766              118,643             19,111
LCL        814              116,529             18,229
Skin       716              114,377             19,901
Blood      384              85,811              16,149

Table 2.2: Exon and gene quantification across tissues

2.3.2 RNA-sequencing covariates

Throughout this thesis, models of gene expression, genetic variation and phenotypic data are adjusted for a number of covariates. RNA-sequencing has some inherent biases that are important to correct for. These include GC content, insert size, PCR primer index from multiplexing (fat = 16, skin = 24, blood = 33, LCLs = 24) and library size. Whilst library size is corrected for as counts are normalised by the total number of reads mapped, the remaining covariates are included as fixed effects in any linear mixed effects modelling performed. The distributions of insert size mode and mean GC content are plotted for each tissue in Figure 2.2.

[Figure 2.2 panels: histograms (count) of mean GC content and insert size mode for Adipose, Skin, LCL and Blood samples.]

Figure 2.2: Distribution of RNA seq GC content and insert size (n=856).

Additional covariates that are always adjusted for are family, zygosity, date of sequencing and batch. These variables are modelled as random effects. Zygosity and family are used to capture the variance attributed to the twin structure/relatedness of the data and are coded as the following indicator variables: for twins (i, j) of the same twin pair, family_i = family_j; if they are monozygotic, zygosity_i = zygosity_j; if they are dizygotic, zygosity_i ≠ zygosity_j. For unrelated twins, i.e. from different families, family_i ≠ family_j and zygosity_i ≠ zygosity_j.
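The coding scheme for the family and zygosity grouping factors can be sketched as follows. The function `code_twin_structure` and the input layout (sample ID, twin-pair ID, zygosity label) are hypothetical; only the equality rules are taken from the description above.

```python
def code_twin_structure(samples):
    """samples: list of (sample_id, pair_id, zygosity), zygosity 'MZ' or 'DZ'.
    Returns {sample_id: (family_code, zygosity_code)}."""
    codes = {}
    for sample_id, pair_id, zygosity in samples:
        family_code = ("F", pair_id)  # shared by both twins of a pair
        if zygosity == "MZ":
            zygosity_code = ("Z", pair_id)             # shared within an MZ pair
        else:
            zygosity_code = ("Z", pair_id, sample_id)  # unique to each DZ twin
        codes[sample_id] = (family_code, zygosity_code)
    return codes

codes = code_twin_structure([
    ("s1", 1, "MZ"), ("s2", 1, "MZ"),  # MZ pair: same family, same zygosity code
    ("s3", 2, "DZ"), ("s4", 2, "DZ"),  # DZ pair: same family, different zygosity code
])
print(codes)
```

Used as grouping factors for random intercepts, the family code captures variance shared by any co-twins, and the zygosity code captures the additional sharing specific to monozygotic pairs.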

2.4 Genotypes

Twins were genotyped on three different arrays (1M-Duo, HumanHap610Q and HumanHap300 Illumina arrays). Quality control, processing and merging of the arrays, and their use, have been previously described extensively. All data presented in this thesis were imputed with IMPUTE2 using the 1000 Genomes Phase 1 reference panel (Interim freeze November 10, 2010). Both sample and SNP outliers were removed based on misreported sex, ethnicity, missingness call rate, heterozygosity rates, Hardy-Weinberg equilibrium (HWE) violations and outliers from Principal Components Analysis (PCA). The TwinsUK study population has previously been shown not to suffer from population structure, as determined by the use of STRUCTURE (Richards et al. 2008). All SNPs with INFO score ≥ 0.8 and MAF ≥ 5% were retained for analysis.
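The final SNP retention rule can be written as a short filter. This is an illustrative sketch: the INFO score is taken as given (as reported by the imputation software), the minor allele frequency is computed from per-sample allele dosages, and the function names are assumptions.

```python
def minor_allele_frequency(dosages):
    """MAF from per-sample alternative-allele dosages in the range [0, 2]."""
    alt_freq = sum(dosages) / (2 * len(dosages))
    return min(alt_freq, 1 - alt_freq)

def keep_snp(info, dosages, min_info=0.8, min_maf=0.05):
    """Retain SNPs with INFO >= min_info and MAF >= min_maf."""
    return info >= min_info and minor_allele_frequency(dosages) >= min_maf

print(keep_snp(0.93, [0, 1, 2, 2]))      # well imputed and common
print(keep_snp(0.95, [0] * 19 + [1]))    # well imputed but rare (MAF = 0.025)
```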

2.5 A note on PEER correction

It has become common practice to adjust for gene expression principal components, PEER factors or surrogate variables. Methods such as PEER aim to identify latent variables that capture unwanted batch effects and confounding variables that have broad effects on gene expression. It has been demonstrated for cis-eQTL studies that correction for PEER factors can increase cis-eQTL yield three-fold. Additionally, for mapping G×E interactions on expression, PEER has also been shown to increase discovery ability (Westra et al. 2015). In this study, I similarly chose to adjust for the number of PEER factors that maximises cis-eQTL yield. Several PEER factors correlated strongly with BMI (maximum |r| = 0.52) (Figure 2.3).

[Figure 2.3 panels: scatter plots of PEER factors 1-5 against BMI. PEER factor 1 vs BMI: cor = −0.13; factor 2: cor = −0.52; factor 3: cor = −0.36; factor 4: cor = 0.17; factor 5: cor = 0.03.]

Figure 2.3: BMI’s correlation to the first five PEER factors.

Like others (see Westra et al. 2015), I found both cis-eQTL and cis-G×E eQTL discovery yield to be significantly improved after correcting for PEER factors, as this maximises the ability to distinguish genetic effects and non-main effect terms, which PEER does not capture. In Chapter 5 I show that cell composition heterogeneity between samples is well captured by PEER, potentially reducing confounding with BMI, and I believe PEER will capture many additional variables I have not measured.

Chapter 3

Pervasive effect of cardio-metabolic traits on peripheral tissue gene expression regulation

3.1 Introduction

Gene expression is a dynamic and responsive process that is regulated by both genetic variation and the environment (physiology, exposure and lifestyle). Gene expression can be both causal to a disease/trait, or responsive. Causal genes can inform us about the underlying etiology of a disease and can represent potential drug targets for treatment or prevention. On the other hand, gene expression that responds to a disease or trait is useful for understanding the biological mechanism of disease progression and co-morbidity development. Differential gene expression analysis is a method able to compare the expression of genes across conditions or continuous ‘exposures’ to make such insights.

Transcriptome-Wide Association Studies (TWAS)¹ have been widely applied for multiple traits on both gene expression microarrays and RNA-seq from multiple human tissues and cell lines (Vinuela et al. 2016, Keildson et al. 2013, Emilsson et al. 2008, Huan et al. 2015, Fadista et al. 2014, Greenawalt et al. 2011). For example, Emilsson et al. (2008) analysed gene expression microarray data from 673 unrelated Icelandic individuals and showed that Body Mass Index (BMI) was associated with 72% of genes expressed in adipose tissue. They demonstrated that BMI-associated genes were enriched for macrophage infiltration, extensive changes in adipocyte morphology, tissue remodelling and hypoxia processes. Emilsson et al. (2008) also found that 2,784 adipose expression traits could explain up to 20% of the BMI variance, whereas associations between BMI and blood gene expression explained a substantially smaller fraction. This finding suggests that adipose tissue is a more 'disease-relevant' tissue than blood for obesity comorbidity development. Emilsson et al. (2008) identified an inflammatory network called the 'Macrophage-Enriched Metabolic Network' (MEMN), and Chen et al. (2008) were able to demonstrate that the MEMN is likely to be causal of traits associated with metabolic syndrome. TWAS have therefore been shown to have utility in identifying gene expression changes in disease-relevant tissues for phenotypes of interest. TWAS enable the association of a trait or disease with previously unknown biological pathways and the potential discovery of modifiable gene expression targets.

¹ TWAS in this context should not be confused with recent studies which use the term TWAS to describe the process of predicting gene expression from large GWAS cohorts that only have genotyping data (Gusev et al. 2016).

This chapter examines the extent of association between cardio-metabolic phenotypes and peripheral tissue gene expression regulation. I analysed a total of 2,680 RNA expression profiles from 856 twins in three primary tissues and one cell line (LCLs) and conducted a TWAS across 23 traits spanning five biological categories (adipokines, glycaemic, lipids, anthropometric, adiposity, and diet traits). These analyses were able to identify gene expression associated with multiple cardio-metabolic traits influenced by both the environment and underlying genetic regulatory variation. To assess the interplay between cardio-metabolic traits, gene expression and genetics, several lines of analysis were conducted. First, I calculated gene expression heritability estimates. Next, I demonstrated that genes whose expression is associated with a cardio-metabolic trait tended to be more heritable than the background tissue heritability rate. Finally, in an attempt to examine and explain this heritability enrichment, I conducted a cis-eQTL analysis and intersected the results with the TWAS findings. I show that approximately half of all genes associated with cardio-metabolic traits are regulated by at least one cis-eQTL, and a significant cis-eQTL enrichment was observed for several traits compared to the background tissue eQTL discovery rate. Whilst trait-associated genes are significantly enriched for local cis-eQTL genetic effects, many of the genes seem to be environmentally (non-genetically) regulated. I also demonstrated that BMI has a profound impact on gene expression in multiple peripheral tissues compared to every other trait examined. These results form the basis for Chapter 4, where I examine the ability of BMI to modify the genetic regulation of expression.

3.2 Methods

3.2.1 Phenotype collection

All phenotypes were collected concurrently at clinical visits or determined via questionnaire prior to clinical visits (diet questionnaires). Appendix Table A.1 lists the number of individuals with each phenotypic measurement.

3.2.2 Quality control of phenotypic data

All phenotypic data outliers were removed (|x − µ| > 3σ). Diabetics and individuals reported to be 'non-fasting' were also discarded. Whilst linear models make no assumption that independent variables are normally distributed, data transformation does have the property of reducing any remaining outlier effects and of encouraging normality of residuals, a fundamental linear modelling assumption. All phenotypes were therefore inverse normal transformed:

$$y_i = \frac{\mathrm{rank}(x_i - 0.5)}{\sum(x)} \tag{3.1}$$

Where:

y_i = normally distributed (transformed) phenotype
x_i = phenotype measured in the i-th individual
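For illustration, a commonly used form of the rank-based inverse normal transform maps each value's rank r_i among n observations to the standard normal quantile Φ⁻¹((r_i − 0.5)/n). The sketch below implements that common form using only the Python standard library; it is an assumption that this matches the exact variant used for Equation 3.1, and the function name is hypothetical.

```python
from statistics import NormalDist

def inverse_normal_transform(values):
    """Rank-based inverse normal transform: y_i = Phi^-1((rank_i - 0.5) / n).
    Tied values receive their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]

transformed = inverse_normal_transform([3.2, 1.1, 2.5, 40.0, 2.9])
print(transformed)
```

Because only ranks are used, the extreme value 40.0 in the toy input is mapped to the same quantile it would receive if it were only marginally the largest observation, which is how the transform limits outlier effects.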

The five dietary phenotypes used in this study have been previously described and shown to capture major variance components of the participants' dietary habits (Teucher et al. 2007). Each trait was derived from multiple Food Frequency Questionnaires (FFQ). Principal Components Analysis (PCA) was then applied to extract five diets (the first five PCs) that explained the majority of the FFQ variance. Interpretation of the PC loadings allowed the assignment of names for certain diet characteristics, such as 'Fruit and vegetable' and 'high alcohol' (Teucher et al. 2007). The diet data were collected over a five-year period and reflect up to three combined FFQs. Dietary habits in TwinsUK have previously been shown to be stable over this length of time (Teucher et al. 2007).

3.2.3 TWAS model

The expression of each exon was treated as a quantitative trait and fitted as a dependent variable in a linear mixed effects model using lme4 (Bates et al. 2014). The 'full model' (Equation 3.2) included the cardio-metabolic phenotype as an independent variable and was compared to a 'null model' (Equation 3.3) excluding the cardio-metabolic phenotype. Model fit was assessed using a one degree of freedom analysis of variance (ANOVA).

yi = β0 + β1P + β2V + ε (3.2)

yi = β0 + β1V + ε (3.3)

Where:

yi = expression of the ith exon
P = cardio-metabolic phenotype of interest
V = matrix of covariates
ε = residual error

Covariates (V) conditioned on in both the ‘full’ and ‘null’ models include age, age², mean GC content and median primer insert size per sample, date of sequencing, batch and twin structure.
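To make the full versus null comparison concrete, the sketch below computes a one degree of freedom P-value from the log-likelihoods of the two fitted models. It assumes the ANOVA comparison is carried out as a likelihood-ratio test, which is how nested mixed models are usually compared. The function name and its inputs are hypothetical, and the mixed models themselves are assumed to have been fitted elsewhere.

```python
from math import erfc, sqrt


def lrt_p_value_1df(loglik_null, loglik_full):
    """P-value of a likelihood-ratio test between nested models differing by one parameter."""
    # Test statistic: twice the gain in log-likelihood from adding the phenotype term.
    stat = max(2.0 * (loglik_full - loglik_null), 0.0)
    # Upper tail of a chi-square distribution with one degree of freedom.
    return erfc(sqrt(stat / 2.0))
```

For example, a gain of about 1.92 log-likelihood units from adding the phenotype corresponds to a nominal P-value of 0.05 before any multiple-testing correction.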

3.2.4 Tissue specificity of traits: π1 analysis

To assess the degree of tissue sharing between gene-expression trait associations across tissues, I used π1 estimates. π0 is a metric first proposed by Storey & Tibshirani (2003) as a measure of the proportion of false positive associations. π0 is estimated from the TWAS P-value distribution to ascertain the likely False Positive Rate (FPR). π1 = 1 − π0 then represents the proportion of True Positive Associations (TPA) in a replication sample (in this case, another tissue). Because the estimate uses the whole P-value distribution, it avoids any hard thresholding effects.
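A simplified version of the π1 calculation is sketched below. It uses Storey's estimator at a single tuning value λ, whereas the QVALUE package by default smooths the estimate over a range of λ values, so the numbers will not match QVALUE exactly. The function name and the default λ = 0.5 are assumptions made for illustration. The input is the list of P-values, in the replication tissue, for the associations that were significant in the discovery tissue.

```python
def pi1_estimate(p_values, lam=0.5):
    """Estimate pi1 = 1 - pi0 with Storey's single-lambda estimator."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    m = len(p_values)
    if m == 0:
        raise ValueError("no P-values supplied")
    # P-values above lam are assumed to come almost entirely from true nulls,
    # which are uniformly distributed, so their density estimates pi0.
    n_above = sum(1 for p in p_values if p > lam)
    pi0 = min(1.0, n_above / (m * (1.0 - lam)))
    return 1.0 - pi0
```

A uniform P-value distribution therefore gives π1 close to 0 (no sharing), whilst a distribution concentrated near zero gives π1 close to 1 (complete sharing).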

3.2.5 Heritability of cardio-metabolic associated exons

Heritability estimates were calculated using a variance components methodology (Visscher et al. 2004, Buil et al. 2015). All FDR5% significant cardio-metabolic trait associated exon expression heritability estimates were compared against a randomly permuted sample of exons of the same sample size. A Student's t-test was performed to assess whether the significant trait-expression associated exons were more heritable than a set of randomly sampled exon expression traits from the same tissue.

3.2.6 TWAS false discovery rate (FDR)

To determine which trait-exon expression associations were genome-wide significant and to account for multiple testing, I applied a 5% False Discovery Rate (FDR) using QVALUE.

3.2.7 eQTL discovery

To discover eQTLs, I took the following approach. Gene expression residuals were calculated by fitting a linear mixed effects model in GenABEL. Using GenABEL, I calculated a kinship matrix to account for the relatedness within and between twin pairs. Expression was additionally adjusted for RNA-seq technical covariates and 50 PEER components (detailed in Chapter 2). To determine genome-wide significance, 1000 permutations were generated in which sample labels were randomly permuted within each gene expression vector, and eQTLs were calculated as in the discovery stage. Observed (discovery) test statistics for each exon-SNP pair were then compared to the generated empirical null distribution. Any permutation-corrected P-value ≤ 0.01 was determined to be significant, corresponding to a 1% FDR. To determine whether any significant cardio-metabolic associated genes had an eQTL, the gene ID of each gene was matched to the corresponding FDR1% significant eQTL. To test for significant enrichment of eQTLs amongst cardio-metabolic associated genes, each set was compared to the background eQTL rate per tissue and a hypergeometric test was performed.
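The enrichment test described at the end of this section can be written down directly from the hypergeometric distribution. The sketch below is illustrative only: the function and argument names are hypothetical, and it assumes the test is one-sided (over-representation of eQTL genes in the trait-associated set relative to all genes tested in the tissue).

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_background_eqtl, n_set, n_set_eqtl):
    """Upper-tail probability P(X >= n_set_eqtl) for a hypergeometric draw.

    n_background      -- genes tested in the tissue
    n_background_eqtl -- of those, genes with at least one eQTL
    n_set             -- trait-associated genes
    n_set_eqtl        -- trait-associated genes with at least one eQTL
    """
    upper = min(n_set, n_background_eqtl)
    total = comb(n_background, n_set)
    tail = sum(
        comb(n_background_eqtl, k) * comb(n_background - n_background_eqtl, n_set - k)
        for k in range(n_set_eqtl, upper + 1)
    )
    return tail / total
```

Exact integer arithmetic is used until the final division, so the result stays stable even for the large gene counts involved.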

3.3 Results

3.3.1 Pervasive association of cardio-metabolic traits with peripheral tissue gene expression

To understand the extent of gene expression association to 23 different cardio-metabolic traits spanning four major categories (Adipokines/Hormones, Lipids, Adiposity (both anthropometric measures and DXA) and Diet), I performed Transcriptome-Wide Association Studies (TWAS) in a total of 856 individuals from TwinsUK in three primary tissues (subcutaneous adipose tissue, skin, whole blood) and one transformed cell line (LCLs). Each exon-expression value was modelled as a dependent variable (y) in a linear mixed effects model (see Equations 3.2 and 3.3 in methods) and cardio-metabolic traits were fitted as independent variables, conditioned on BMI and age, two variables that have previously been shown to have strong effects on peripheral tissue gene expression (Emilsson et al. 2008, Vinuela et al. 2016).

To determine the number of statistically significant associations, a 5% false discovery rate (FDR) was estimated per TWAS using QVALUE. Table 3.1 shows all TWAS results across each tissue. Several trends are apparent in the distribution of TWAS associations. First, BMI is the only trait with pervasive association across all primary tissues. BMI also shows the largest number of associations compared to any other trait. The large number of gene expression traits associated to BMI in adipose tissue is consistent with previous work using microarrays (Emilsson et al. 2008). Many tissues undergo systemic inflammation during obesity and therefore the distribution of BMI associations across tissues could reflect this low-grade inflammation (Greenberg & Obin 2006, Emilsson et al. 2008). Second, many traits show exclusive association to genes expressed in single tissues. For example, Leptin, Adiponectin, Insulin, Glucose, HDL, Triglycerides, WHR, all DXA and dietary components show almost exclusive association with adipose tissue gene expression, whilst Total cholesterol is specific to whole blood. Finally, LCLs show almost no association with any trait, potentially reflecting how LCLs are cultured and derived (see discussion). Whilst there were many cardio-metabolic trait associations in whole blood that are shared with adipose tissue, it is important to consider that the sample size for whole blood is approximately half that of the other tissues. The number of gene-trait associations detected in whole blood is therefore likely to reflect, in part, this lower power for discovery.

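| Trait category | Phenotype | # Adipose associated genes | # Blood associated genes | # Skin associated genes | # LCL associated genes |
|---|---|---|---|---|---|
| Adipokines | Adiponectin | 11,268 | — | — | — |
| Adipokines | Leptin | 607 | — | 1 | 1 |
| Glycaemic | Insulin | 8,289 | 6 | 19 | — |
| Glycaemic | Glucose | 1,275 | 2 | 5 | — |
| Lipids | LDL | 4,272 | 3,409 | 4 | — |
| Lipids | HDL | 13,002 | 5 | 50 | 2 |
| Lipids | Total Cholesterol | 9 | 1,095 | 10 | 1 |
| Lipids | Triglycerides | 11,268 | 180 | 6 | 1 |
| Anthropometric | BMI | 16,817 | 6,640 | 9,216 | — |
| Anthropometric | Waist | 3,897 | 2,420 | 5 | — |
| Anthropometric | Hip | 26 | — | — | — |
| Anthropometric | WHR | 7,232 | 8 | 1 | — |
| Adiposity (DXA) | Trunk fat | 10,922 | 378 | 2 | — |
| Adiposity (DXA) | Lean mass | 566 | 25 | — | — |
| Adiposity (DXA) | Trunk % fat | 13,078 | 378 | 2 | — |
| Adiposity (DXA) | Whole-body fat | 1,622 | — | 2 | — |
| Adiposity (DXA) | Whole-body lean | 50 | — | 1 | — |
| Adiposity (DXA) | Whole-body % fat | 7,619 | — | 2 | — |
| Diet | Fruit and Vegetable | 496 | 1 | 2 | 1 |
| Diet | High alcohol | 136 | — | 28 | — |
| Diet | Traditional English | 17 | — | 7 | — |
| Diet | Dieting | 30 | — | 1 | — |
| Diet | Low meat | 25 | — | 1 | — |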

Table 3.1: Number of gene expression traits associated to individual phenotypes across all tissues (FDR5%).

3.3.2 Gene ontology enrichment analysis

To understand what classes of genes are associated with each cardio-metabolic trait, Gene Ontology (GO) enrichment analysis was performed using TopGO (Alexa & Rahnenfuhrer 2010). TopGO utilises 5505 individual gene ontology categories to test for enrichment of user-supplied genes amongst each GO term using a Fisher's exact test. Additionally, I applied a strict Bonferroni correction to account for multiple GO term testing (P-value ≤ 9.1 × 10⁻⁶, i.e. 0.05/5505). As many GO categories were enriched across multiple phenotypes and tissues, I counted how many GO terms were shared or were exclusive to particular traits. Neurotrophin TRK receptor signaling pathway (GO:0048011) was significant across all traits that had at least 1000 gene expression associations (P-value range = [3.7 × 10⁻⁶ to 1 × 10⁻³⁰]). Interestingly, TrkB agonists in mice have been shown to ameliorate obesity and metabolic syndrome associated conditions (Tsao et al. 2008). Other over-represented GO terms specific to adipose tissue traits were vascular endothelial growth factor receptor signaling pathway (GO:0048010) (P-value range = [3.3 × 10⁻⁶ to 1 × 10⁻³⁰]), innate immune response (GO:0045087) (P-value range = [2.5 × 10⁻⁷ to 1 × 10⁻³⁰]), small molecule metabolic process (GO:0044281) (P-value range = [3.0 × 10⁻⁶ to 1 × 10⁻³⁰]) and positive regulation of GTPase activity (GO:0043547) (P-value range = [2.9 × 10⁻⁶ to 1 × 10⁻³⁰]). Many of these GO enrichments highlight the downstream inflammatory and expansive impact obesity has on peripheral tissue biology.

3.3.3 Tissue specificity of cardio-metabolic expression associations

Gene expression has been shown to have a large tissue specific component. To determine the extent of tissue sharing between gene-expression trait associations, I estimated π1 values for each cardio-metabolic trait across each tissue (traits with ≥ 500 gene expression associations) (see methods). For this purpose, π1 = 1 represents complete tissue shared effects, whilst π1 = 0 represents tissue specificity. This method has been used extensively to assess shared statistical associations across multiple tissues (Lappalainen et al. 2013, Buil et al. 2016, Stranger et al. 2012).

Of all traits, fifteen had significant association with gene expression in both adipose and other tissues. These traits had low π1 estimates (majority of π1 estimates ≤ 0.50), suggesting adipose-tissue specific gene regulation (Figure 3.1). Of these fifteen traits, ten were shared most significantly with whole blood. Conversely, traits that had associations in blood tended to have higher π1 estimates (Figure 3.2), with the largest degree of sharing being with adipose tissue. In skin, only BMI had a significant number of gene expression associations (Table 3.1), with large π1 estimates for both adipose and blood.

BMI shows the greatest degree of gene expression association shared across tissues (Adipose: π1max = 0.53, Blood: π1max = 0.78, Skin: π1max = 0.78). This recapitulates previously published work showing that BMI has a systemic effect on peripheral tissue gene expression. In comparison, leptin's gene expression associations were exclusive to adipose tissue (Blood π1 = 0.10; Skin π1 = 0.13). In total, 607 genes were associated to leptin in adipose tissue, a relatively low number compared to other traits presented here. Leptin is an adipokine that is exclusively secreted from adipocytes, circulates in the body in proportion to total fat mass (R² = 0.645) and acts on the hypothalamus as part of the Leptin-Melanocortin pathway to regulate appetite (Mahabir et al. 2007). In light of this, it was apparent that gene expression associated to leptin could be dependent on BMI. In an analysis conducted without adjusting for BMI, 12,597 genes were significantly associated to leptin in adipose tissue (FDR5%). Of these genes, 55% were also significantly associated to BMI (π1 = 0.55).

These findings demonstrate a strong asymmetric distribution of association in which some traits are highly tissue specific (leptin) whilst other traits (BMI) have systemic effects on peripheral tissue gene expression. By understanding the environmental and genetic regulation of expression using TWAS, it is possible to identify important tissues in which phenotypes such as lipids, insulin and body-fat distribution act. From these analyses, it is clear that adipose is an important tissue for cardio-metabolic trait gene expression regulation and, whilst measures of whole body adiposity such as BMI show systemic gene expression association in many tissues, body-fat distribution traits such as percentage trunk fat are more adipose tissue specific.

Figure 3.1: π1 statistic calculated in blood, skin and LCLs, reflecting the proportion of shared cardio-metabolic trait associations found in adipose tissue.

Figure 3.2: π1 values of blood significant hits in adipose, skin and LCLs.

3.3.4 Gene expression heritability and cardio-metabolic trait associations

I was interested in investigating whether genes with association between expression and any of the traits tended to have high gene expression heritability. Narrow-sense heritability (h²) estimation of gene expression has previously been utilised as a method to derive how much gene expression variation is driven by genetic, common environment and unique environment components. As part of the EuroBATs consortium, Buil et al. (2015) exploited the twin design to estimate heritability across all four tissues using a Restricted Maximum Likelihood (REML) linear mixed effects model. Heritability of exon-expression across all four tissues is presented in Figure 3.3. Median h² was broadly comparable across skin and adipose exon-expression (adipose, skin h² = 0.07), with blood and LCLs showing slightly higher h² estimates (Blood h² = 0.12, LCL h² = 0.11). I compared the top 10% of gene expression trait-associations to a null distribution (see methods) and found that, for all TWAS performed, exon expression associated with a given trait tends to be highly heritable (BMI for example, Figure 3.4). In addition to this, cardio-metabolic trait associated exons tended to be more heritable in adipose tissue (Figure 3.4).

3.3.5 cis-eQTL regulation of expression associated cardio-metabolic traits

To understand why trait-expression associations are more heritable on average than non-trait associated exon-expression levels, I utilised a cis-eQTL study design to reveal how much local genetic control contributes to gene expression heritability (see methods). In total, 9165, 8730, 5312 and 9550 cis-eQTLs were discovered in adipose, skin, blood and LCLs respectively (FDR1%) (Buil et al. 2015). Approximately 50% of all trait-associated exons were found to be regulated by at least one cis-eQTL.

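| Trait category | Phenotype | # Adipose eQTLs | # Blood eQTLs | # Skin eQTLs | # LCL eQTLs |
|---|---|---|---|---|---|
| Adipokines | Adiponectin | 5728 | — | — | — |
| Adipokines | Leptin | 369 | — | 1 | 1 |
| Glycaemic | Insulin | 4485 | 3 | 6 | — |
| Glycaemic | Glucose | 710 | 1 | — | — |
| Lipids | LDL | 2340 | 1259 | 2 | — |
| Lipids | HDL | 6747 | 1 | 31 | — |
| Lipids | Total Cholesterol | 4 | 371 | 4 | — |
| Lipids | Triglycerides | 5939 | 68 | 6 | — |
| Anthropometric | BMI | 8492 | 2552 | 4494 | — |
| Anthropometric | Waist | 2156 | 865 | 3 | — |
| Anthropometric | Hip | 20 | 1 | — | — |
| Anthropometric | WHR | 3924 | 3 | — | — |
| Adiposity (DXA) | Trunk fat | 5780 | 171 | 2 | — |
| Adiposity (DXA) | Trunk Lean | 268 | 9 | — | — |
| Adiposity (DXA) | Trunk % fat | 6838 | 12 | 2 | — |
| Adiposity (DXA) | Whole-body fat | 971 | — | 2 | — |
| Adiposity (DXA) | Whole-body lean | 24 | — | 1 | — |
| Adiposity (DXA) | Whole-body % fat | 4175 | — | 1 | — |
| Diet | Fruit and Vegetable | 333 | — | — | — |
| Diet | High alcohol | 73 | — | 16 | — |
| Diet | Traditional English | 8 | — | 3 | — |
| Diet | Dieting | 17 | — | — | — |
| Diet | Low meat | 9 | — | — | — |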

Table 3.2: cis-eQTLs with gene expression from each tissue that is associated to a cardio-metabolic trait at a FDR1%.

To test whether trait associated exons were enriched for eQTLs compared to the background tissue rate, I performed hypergeometric tests per trait across all tissues. The majority of traits investigated were enriched for local cis-acting variants (Figure 3.5). Overall, enrichment for adipose cis-eQTLs was more significant than for the other tissues. Notably, BMI displayed a significant enrichment for cis-eQTLs across all tissues, whilst body fat distribution traits had an adipose specific cis-eQTL enrichment. These results demonstrate that genes associated to a range of cardio-metabolic traits are under both significant genetic and environmental regulation.

Figure 3.3: Exon expression h² estimates in adipose, skin, whole blood and LCLs.

Figure 3.4: BMI exon expression associations significantly enriched for highly heritable gene expression. “FDR5%”: Top 10% significantly associated exon-expression traits to BMI (best exon per gene). “Permuted”: Median of 1000 permutations of randomly sampled exon-expression heritabilities. “All”: Heritability for all exons expressed. P-values represent the result of t-test comparisons of associated vs permuted heritability estimates.


Figure 3.5: cis-eQTL enrichment for trait associations as compared to the background cis-eQTL rate per tissue. Each bar represents the significance of a hypergeometric enrichment test comparing the number of associated genes with cis-eQTLs to the background tissue rate of cis-eQTLs.

3.3.6 Discussion

Here I have described the tissue specificity of genetic and environmental control of gene expression (as explored using TWAS) in three peripheral tissues and one cell line, and how gene expression is associated to a wide range of cardio-metabolic traits. The distribution of gene expression association across tissues is asymmetric, with some traits showing exclusive association to one tissue (Insulin - Adipose tissue) whilst other traits have associations shared across all tissues (BMI - Adipose, Skin, Blood). For the majority of traits analysed, a specific adipose enriched gene expression profile was present, suggesting adipose tissue as the most functionally relevant tissue profiled for the majority of these traits. BMI shows the most pervasive effect on peripheral tissue gene expression, whilst total cholesterol associations were confined primarily to whole blood (1,095 genes). This tissue specificity highlights the importance of interrogating the most disease relevant tissue when dissecting disease mechanisms.

By utilising cis-eQTL mapping, I show that approximately 50% of trait-associated gene expression traits are under local genetic control. Although cardio-metabolic trait associated exons do show a significant enrichment for cis-eQTLs, it is unlikely that cis-eQTLs on their own account for the total difference in heritability between associated and non-associated exons. Grundberg et al. (2012) demonstrated that approximately 40% of gene expression heritability is explained by cis genetic variation and that 60% of gene expression heritability is explained in trans. This suggests that trans-eQTLs could contribute a significant proportion of gene expression heritability. Finally, BMI was shown to have the largest effect on gene expression, particularly in adipose tissue. High BMI results in an increased risk for many co-morbidities and has been shown to causally influence several traits analysed in this study (e.g. fasting insulin) (Holmes et al. 2014). Epidemiological evidence suggests that not all obese individuals develop co-morbidities, the so-called ‘metabolically healthy obese’ (McLaughlin et al. 2007). In Chapter 4, I explore the possibility that obesity can act as a modifier, interacting with genetically controlled gene expression. Under this hypothesis, individuals with certain alleles are either protected or at an increased risk of disease only when they are also overweight or obese.

Several caveats with this chapter should be considered. First, I focus the analysis on tissues that are easy to collect and are available. Therefore the full extent of tissue specificity will only be realised when many more human tissues are profiled with deep phenotyping of the individuals studied. Multi-tissue datasets are becoming much more widely available, with efforts such as the Genotype-Tissue Expression resource (GTEx) collecting 8,555 samples spanning 53 separate tissues. However, studies such as GTEx tend to lack detailed phenotypic data. Second, I do not make any causality claims; I do not assume whether gene expression is causative of the cardio-metabolic trait, or reactive. Causality is not possible to test using a small observational, non-longitudinal design and Mendelian randomisation would be under-powered. Third, LCLs have previously been shown to have utility in pharmacogenetic studies, but the genetic regulation of gene expression in LCLs is largely independent of whole blood. Additionally, donor-specific phenotype effects are not maintained in cultured LCLs (Wheeler & Dolan 2012, Powell et al. 2012, Thomas et al. 2015). It is therefore worth considering that the in-vitro culturing and transformation process with Epstein-Barr Virus (EBV) affects transcription globally, abrogating any utility in studying the non-genetic effects of transcription in LCLs (Çalışkan et al. 2011). Fourth, sample size differences amongst the tissues presented here could limit the ability to call associations tissue specific and I am therefore likely to have overestimated tissue specificity. Finally, TWAS have been shown to overestimate the number of associations due to confounding causing inflation. Whilst I have tried to account for this by adjusting for genetic relatedness, I cannot rule out all latent variables. Confounding is not a unique property of TWAS but a general concern for any association analysis that does not utilise genetic variants, which are fairly robust to confounding. This topic is addressed further in Chapter 5, where I explore the impact of cell type heterogeneity on both TWAS and cis-eQTL discovery.

Chapter 4

BMI-dependent regulatory effects on multi-tissue transcriptomes

4.1 Introduction

4.1.1 Heterogeneous outcomes of metabolic complications amongst obese individuals

Obesity has been shown to be causally associated to a wide range of cardio-metabolic traits and co-morbidities (Holmes et al. 2014, Fall et al. 2013). Many of these cardio-metabolic traits and co-morbidities manifest in peripheral tissues throughout the body, such as adipose tissue, muscle, liver and pancreas. Genetic variants that contribute to obesity seem to be functionally important in the brain and Central Nervous System (CNS), whilst genetic variants that contribute to body-fat distribution and obesity co-morbidities such as insulin resistance are functionally enriched in adipose tissue (Locke et al. 2015, Shungin et al. 2015, Lotta et al. 2016). However, there are several lines of evidence that suggest not all obese individuals develop metabolic complications such as insulin resistance (McLaughlin et al. 2007).

It has previously been shown that both metabolically healthy obese and metabolically unhealthy lean individuals exist in the population. For example, Yip et al. (1998) demonstrated that insulin mediated glucose uptake (IMGU), a commonly used measure of insulin resistance, can vary six-fold amongst healthy glucose tolerant lean subjects. In a fourteen-year follow-up study, subjects with the highest IMGU at the start of the study had an eight-fold increase in diabetes incidence and a two-fold increase in cardiovascular disease at the study's conclusion (Zavaroni et al. 1999). Similar observations have been made in obese individuals, in which individuals with the lowest IMGU risk had much better metabolic profiles (lower blood pressure, higher HDL cholesterol, lower triglycerides) compared to their high IMGU counterparts (McLaughlin et al. 2007). Despite these observations, it is clear that increased weight gain in susceptible subjects leads to increased risk of metabolic complications, and that metabolic disorders can be ameliorated upon weight loss. McLaughlin et al. (2006) showed that when insulin-sensitive and insulin-resistant subjects lose an equal amount of body weight, only the insulin-resistant group experiences statistically significant changes in metabolic markers (such as hsCRP). Obesity can therefore act to influence metabolic risk. This observation has been demonstrated in twin heritability studies, in which BMI can modify the heritability of blood pressure and insulin sensitivity (Simino et al. 2013, Wang et al. 2009). These studies provide indirect evidence of underlying gene-by-obesity interactions in which individuals that are obese differ in risk of developing co-morbid complications, potentially due to their genotype.

4.1.2 Gene-by-environment interactions (G × E) in GWA and eQTL studies

Detection of G × E interactions on whole-body phenotypes in genome-wide association study designs has had limited success. Very few examples of G × E have been reported in the literature and even fewer have robust replication. Whilst many tens of thousands of samples are required to detect small genetic main effects, it is estimated that to detect G × E interactions of the same magnitude, at least four times as many samples are required (Smith & Day 1984). In contrast, eQTL studies have high power to detect association, even in 100 samples (80% power). This is because a SNP has a more direct effect on gene expression than on a whole-body trait (SNP → gene expression trait vs SNP → whole-body trait). This suggests that sample sizes ≥ 400 are sufficient to detect interaction effects on expression of similar magnitude to that of main effect cis-eQTLs.

As discussed in the introduction (Chapter 1), eQTL discovery has been extensive, with the majority of genes shown to be under some degree of genetic control from common genetic variation. Tissue and cellular specificity of eQTLs demonstrates that genetic regulation of gene expression can be context specific. Many studies have now demonstrated that eQTLs can be dependent on age, sex, development and treatment (Vinuela et al. 2016, Kukurba et al. 2016, Francesconi & Lehner 2014, Fairfax et al. 2014). These context specific eQTLs (termed G × E eQTLs) have utility in explaining why certain diseases are more prevalent in certain conditions, as well as in identifying the mechanism (gene expression) through which some GWAS susceptibility loci act, for example autoimmune disease prevalence in females as compared to males (G × Sex eQTLs) (Kukurba et al. 2016).

As I demonstrated in Chapter 3, BMI has a pervasive effect on global gene expression in multiple tissues. BMI has been shown to modify the heritability of traits and is associated with pervasive changes in peripheral tissue gene expression. This could explain the heterogeneous development of obesity-associated co-morbidities. Therefore, BMI could act to modify the genetic regulation of gene expression. Because of this, I expect to find eQTLs that are differentially active in obese and lean individuals (referred to here as G × BMI eQTLs). In this chapter, I consider BMI as a physiological environment that interacts with common genetic variation, and assess how it regulates gene expression in four previously described tissue RNA-seq datasets in 856 female twins. By performing a cis-eQTL interaction analysis genome-wide, I am able to detect 16 FDR5% significant G × BMI eQTLs that are specific to adipose tissue. I provide replicated examples of these G × BMI eQTLs in an independent adipose tissue cohort, deCODE genetics (n = 754). I demonstrate that these G × BMI eQTLs are enriched for adipocyte enhancers, and for both metabolic and inflammatory GO terms. I use colocalisation analysis to demonstrate that one G × BMI eQTL tags a GWAS locus for esophageal cancer risk that has previously been shown to interact with alcohol intake. I extend the eQTL analysis to look for trans-acting G × BMI eQTLs, as the majority of G × E effects on expression in model organisms have previously been shown to act in trans. I detect one trans G × BMI effect that regulates 53 genes across the adipose transcriptome. To aid other researchers in exploring the results, I developed an online resource¹ called PhenoExpress, in which all TWAS association summary statistics are available and real-time gene expression-trait association and G × BMI analysis can be performed.
As the number of available samples and measured exposures grows, I predict G × E eQTLs will have an increasingly important role in characterizing the functionality and mechanism of GWAS loci discovered now and in the future.

¹ http://expression.kcl.ac.uk

4.2 Methods

4.2.1 cis-G × BMI eQTL discovery

As discussed in previous chapters, I used gene expression residuals corrected for family structure, zygosity, the RNA-seq technical cofactors described in Chapter 2, and 50 PEER factors. Each exon was tested separately for an interaction by fitting a linear model using the ‘modelLINEAR_CROSS’ model in MatrixEQTL:

yi = β0 + β1A + β2BMI + β3SNP + β4BMI × SNP (4.1)

Where:

yi = the ith exon expression vector
A = age
BMI = BMI
SNP = dosages [0 - 2]
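For a single exon-SNP pair, the interaction model of Equation 4.1 amounts to an ordinary least-squares fit with a BMI × SNP product column. The sketch below is a plain-Python illustration of that one fit. It is not how MatrixEQTL is implemented (MatrixEQTL evaluates all exon-SNP pairs with fast matrix operations), and the function names and the use of the normal equations are assumptions made for the example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_interaction_model(expression, age, bmi, dosage):
    """Least-squares fit of y = b0 + b1*age + b2*BMI + b3*SNP + b4*BMI*SNP.

    Returns [b0, b1, b2, b3, b4]; b4 is the G x BMI interaction effect.
    """
    design = [[1.0, a, b, g, b * g] for a, b, g in zip(age, bmi, dosage)]
    p = 5
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, expression)) for i in range(p)]
    return solve_linear_system(xtx, xty)
```

The last coefficient, β4, is the quantity tested in the discovery analysis: a non-zero value means the effect of the SNP on expression changes with BMI.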

4.2.2 Permutations and FDR estimation

To assess statistical significance of G × BMI eQTLs, I implemented an approximate permutation strategy for interactions. For each exon residual (corrected for age, SNP and BMI), I randomly permuted sample IDs and tested for a G × BMI eQTL as in the discovery analysis. Each exon was permuted 100 times and P-values stored and ordered by rank. As genes differ in the number of exons they contain, separate FDRs were calculated based on the distribution of number of exons per gene, per tissue. A 5% FDR was estimated by calculating the ratio of permuted test statistics that were more significant than the observed test statistic, divided by the number of permutations performed. Any G × BMI eQTL with an FDR5% corrected P-value ≤ 0.05 was deemed genome-wide significant.
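The permutation logic can be summarised with two small helpers. This is a sketch under stated assumptions: the function names are hypothetical, the interaction test itself is assumed to be run elsewhere on each permuted dataset, and the empirical P-value follows the description above (the number of permutations more significant than the observed result, divided by the number of permutations).

```python
import random


def permute_labels(values, rng):
    """Return a shuffled copy of the values, mimicking permutation of sample IDs."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def empirical_p_value(observed_p, permuted_ps):
    """Fraction of permutations that gave a smaller P-value than the observed test."""
    n_more_significant = sum(1 for p in permuted_ps if p < observed_p)
    return n_more_significant / len(permuted_ps)
```

With 100 permutations per exon, the smallest non-zero empirical P-value obtainable this way is 0.01, which bounds the resolution of the resulting significance estimates.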

4.2.3 Discovery of trans G × BMI eQTLs

Because the analysis was under-powered to detect trans G × BMI eQTLs genome-wide, I used a two-step strategy. First, I mapped cis-G × BMI eQTLs and discovered four significant SNPs (FDR5%) (rs113368712, rs1464171, rs3851570, rs35662778). For each cis-acting variant, I tested for trans-effects using all genes at a distance greater than 5 Mb from each SNP and across chromosomes. A Bonferroni correction strategy was used to assess genome-wide significance (P-value = 1.1 × 10⁻⁷).

4.2.4 deCODE genetics cohort

The deCODE replication cohort consisted of RNA-sequencing samples obtained from subcutaneous adipose tissue biopsies from 421 females and 333 males from Iceland (n = 754). All individuals had imputed genotypes that have been previously described. The mean age of individuals in the deCODE replication set was 44 ± 14 years (mean ± SD) and the mean BMI was 30 ± 6.6, slightly higher than in TwinsUK. RNA-seq reads were aligned using Tophat (v2.0.12) to Build 38 (UCSC genome browser) of the human reference genome and the RefSeq transcriptome annotation. Tophat was configured such that reads were first aligned to the transcriptome and then, if no alignment was possible, to the reference genome. All overlapping exons were merged into meta-exon features. Meta-exon features that had a zero read count in ≥ 90% of individuals were not considered. RPKM values were calculated, normalising counts by both total mapped reads and gene length. Each exon's expression was rank normal transformed. Several covariates were modelled, including GC content, average fragment length, the percentage of reads originating from coding bases (PCT), number of genes detected and total number of mapped reads, as well as 50 PEER factors (see Chapter 2). Picard v1.79 and RNA-SeQC v1.1.6 were used to obtain these metrics.
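Two of the quantification steps above, the zero-count filter and the RPKM normalisation, are simple enough to state as code. The sketch below uses the standard RPKM definition (reads per kilobase of feature per million mapped reads); the function names are hypothetical and this is not the deCODE pipeline itself.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


def keep_feature(counts, max_zero_fraction=0.9):
    """Keep a meta-exon unless it has a zero read count in >= 90% of individuals."""
    zero_fraction = sum(1 for c in counts if c == 0) / len(counts)
    return zero_fraction < max_zero_fraction
```

For instance, 500 reads on a 1,000 bp meta-exon in a library of 10 million mapped reads corresponds to an RPKM of 50.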

4.2.5 deCODE replication

For replication, I took forward the 16 G × BMI eQTLs that were genome-wide significant at a 5% FDR. Of these, 13 genes were available in the deCODE quantification. Because different genome builds were used for the two datasets, I used liftOver to map coordinates from hg19 to GRCh38, focusing on meta-exons that had at least 90% overlap. This strategy identified the following eight genes: IFNAR1, CIDEA, ZNF117, PEPD, CHURC1, HLA-DQB2, SCFD2 and ANXA5. Three genes (ADH1A, SPAG17 and ERV-1) had a partial overlap (Table 4.1). Two genes (PHACTR3, CAST) had poor overlap with deCODE, as the TwinsUK gene annotation had intron retention events whilst deCODE did not. As multiple exons in PHACTR3 were significant, I tested the second most significant PHACTR3 exon for replication, which had a 100% overlap with a deCODE meta-exon. As the overlap at the CAST meta-exon was low (4%), I tested all CAST exons in deCODE for replication and corrected for the number of tests performed (30 exons). The three remaining genes were unavailable in deCODE: RP11-71E19.1, which is not present in RefSeq; POU6F2, which was filtered out by deCODE for having low expression; and SIK1, which was available but whose SNP (rs12482956) and proxies were not available in the deCODE imputation.

4.2.6 Overlap with GWAS loci

I intersected the 16 G × BMI eQTL lead SNPs and their proxies (r² ≥ 0.6) with the NHGRI GWAS catalog, focusing on only genome-wide significant GWAS loci. For the overlap at the ADH1A locus, I used the Regulatory Trait Concordance (RTC) colocalisation method to test whether the G × BMI eQTL shares the same causal SNP as the GWAS locus. By conditioning on the lead GWAS signal and retesting the eQTL association, it is possible to see if the GWAS variant abrogates the effect of the eQTL. RTC achieves this by ranking the effect of conditioning on the GWAS variant against the effect of conditioning on all other variants in the interval, thus accounting for the LD structure of the region. All common SNPs in a 1 Mb window centred on the index SNP were used. For each variant (1,129 in total) I fitted both an interaction model and a main effect model, conditioned on each SNP:

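$$y_i = \mathrm{BMI} \times \mathrm{SNP}_n + \mathrm{BMI} \times \mathrm{rs1693457} \tag{4.2}$$

$$y_i = \mathrm{BMI} + \mathrm{SNP}_n + \mathrm{rs1693457} \tag{4.3}$$

Where:

$y_i$ = ADH1A expression

$\mathrm{SNP}_n$ = $\mathrm{SNP}_1, \ldots, \mathrm{SNP}_{1129}$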

The results are then ranked by test-statistic significance so if the GWAS variant abrogates the eQTL effect more than any other SNP in the region, it would rank first and have an RTC score of 1:

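$$\mathrm{RTC} = \frac{N_{\mathrm{SNPs}} - \mathrm{Rank}_{\mathrm{GWAS\ SNP}}}{N_{\mathrm{SNPs}}} \tag{4.4}$$

Where:

RTC = value in [0, 1]; 0 = independent, 1 = tags the same causal variant

$N_{\mathrm{SNPs}}$ = total number of SNPs in the region (1,129)

$\mathrm{Rank}_{\mathrm{GWAS\ SNP}}$ = rank of the GWAS variant conditioning result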
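As a rough illustration of equation 4.4, the sketch below ranks SNPs by how much eQTL signal remains after conditioning on each of them and converts the rank of the GWAS SNP into an RTC score. It assumes a zero-based rank (the number of SNPs that abrogate the eQTL more strongly than the GWAS SNP), so that the top-ranked SNP scores exactly 1; the function names and the toy statistics are hypothetical.

```python
def gwas_rank(residual_stat, gwas_snp):
    """Count SNPs whose conditioning leaves a weaker eQTL signal than the GWAS SNP.

    residual_stat maps each SNP to the eQTL test statistic that remains
    after conditioning on that SNP (smaller = stronger abrogation).
    """
    target = residual_stat[gwas_snp]
    return sum(1 for stat in residual_stat.values() if stat < target)


def rtc_score(n_snps, rank):
    """RTC = (N_SNPs - Rank_GWAS SNP) / N_SNPs, following equation 4.4."""
    if n_snps <= 0 or not 0 <= rank < n_snps:
        raise ValueError("rank must satisfy 0 <= rank < n_snps")
    return (n_snps - rank) / n_snps


# Toy values for four SNPs (assumed, not taken from the ADH1A analysis).
residual_stat = {"snp_a": 5.1, "snp_b": 0.4, "gwas_snp": 0.2, "snp_c": 3.3}
rank = gwas_rank(residual_stat, "gwas_snp")
score = rtc_score(len(residual_stat), rank)
print(rank, score)
```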

4.2.7 Gene ontology enrichment analysis

Gene enrichment analysis was performed using QIAGEN's Ingenuity IPA platform. Genes were tested for enrichment against a background set of genes from each tissue. All enrichment P-values were corrected using the Benjamini-Hochberg (B-H) method.
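For readers unfamiliar with the B-H correction, a minimal standard-library sketch is given below. It is not the IPA implementation; it simply applies the usual step-up adjustment (P-value × m / rank, made monotone from the largest P-value downwards) to a list of hypothetical P-values.

```python
def benjamini_hochberg(p_values):
    """Return B-H adjusted P-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical enrichment P-values.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f"raw = {p:.3f}  B-H adjusted = {q:.5f}")
```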

4.2.8 Statistical mediation analysis

To test whether ALG9 expression mediates the observed trans-G × BMI eQTL network, I performed mediation analysis as follows:

Unconditioned model:

$$y_i = \beta_0 + \beta_1 A + \beta_2 \mathrm{BMI} + \beta_3 \mathrm{SNP} + \beta_4 \mathrm{BMI} \times \mathrm{SNP} + \epsilon \tag{4.5}$$

Conditioned model:

$$y_i = \beta_0 + \beta_1 E + \beta_2 A + \beta_3 \mathrm{BMI} + \beta_4 \mathrm{SNP} + \beta_5 \mathrm{BMI} \times \mathrm{SNP} + \epsilon \tag{4.6}$$

$$\text{Mediation score} = \frac{\beta_4 - \beta_5}{\beta_4} \tag{4.7}$$

Where:

$y_i$ = expression of the $i$th trans-gene

$\beta_4$ = $\beta_4$ from model 4.5

$E$ = mediator (ALG9) expression

$\beta_5$ = $\beta_5$ from model 4.6

A mediation score of zero represents no mediation, as the conditional model does not differ from the unconditional model. A mediation score of one would represent full mediation. Mediation significance can be assessed by using Sobel's test statistic:

$$Z = \frac{\beta_4 \times \beta_5}{\sqrt{\beta_4^2 \times S_4^2 + \beta_5^2 \times S_5^2}} \tag{4.8}$$

Where:

$\beta_4$ = $\beta_4$ from model 4.5

$\beta_5$ = $\beta_5$ from model 4.6

$S_4^2$ = variance of the $\beta_4$ coefficient from model 4.5

$S_5^2$ = variance of the $\beta_5$ coefficient from model 4.6
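The two summary quantities above are simple to compute once both models have been fitted. The sketch below is a minimal illustration that mirrors equations 4.7 and 4.8 as written in this section, with a two-sided P-value taken from the standard normal distribution. The coefficient and variance values are hypothetical, and the function names are not from any particular library.

```python
import math


def mediation_score(beta4_unconditioned, beta5_conditioned):
    """(beta4 - beta5) / beta4, following equation 4.7."""
    return (beta4_unconditioned - beta5_conditioned) / beta4_unconditioned


def sobel_z(beta4, var4, beta5, var5):
    """Z statistic following equation 4.8; var4 and var5 are the S^2 terms."""
    return (beta4 * beta5) / math.sqrt(beta4 ** 2 * var4 + beta5 ** 2 * var5)


def two_sided_p(z):
    """Two-sided P-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical interaction coefficients and their variances.
beta4, var4 = -0.050, 1.0e-4  # BMI x SNP term, unconditioned model (4.5)
beta5, var5 = -0.010, 1.0e-4  # BMI x SNP term, conditioned model (4.6)

score = mediation_score(beta4, beta5)
z = sobel_z(beta4, var4, beta5, var5)
print(f"mediation score = {score:.2f}, Z = {z:.2f}, P = {two_sided_p(z):.3f}")
```

With these made-up inputs the interaction coefficient shrinks by 80% after conditioning on the mediator, which is the pattern a strong mediator would produce.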

4.2.9 ENCODE and Epigenome Roadmap enrichment analysis

Enhancer, promoter and other functional element enrichment analysis was conducted using the Epigenome Roadmap and ENCODE data, available on HaploReg v4. HaploReg v4 uses a list of user-supplied SNPs, as well as any SNPs that are in high LD with them (r² ≥ 0.8), to test for overlap enrichment. A background set of 1000G common variants is used to test against. All enhancer annotations were derived from the 15-state ChromHMM model.

4.2.10 DXA visceral fat collection

Each twin underwent a Dual X-ray absorptiometry (DXA) scan (QDR 4500 Plus) at the time of biopsy. DXA data were processed in accordance with the manufacturer's protocol (Hologic).

4.2.11 Interactive website

To facilitate discovery and exploration of these results I implemented a website (PhenoExpress) that can perform real-time modelling of BMI-gene expression association in all four tissues, and G × BMI eQTL analysis in adipose. The website allows users to specify genes and SNPs of interest, change covariates for their own model fitting and also to assess the effect of PEER correction. I supply all TWAS summary statistics as flat-files for download: http://expression.kcl.ac.uk

4.3 Results

4.3.1 BMI is associated to thousands of gene expression traits in multiple peripheral tissues

As shown in Chapter 3, I performed a transcriptome-wide association study (TWAS) for BMI. BMI was found to be associated with 16,817, 9,216, 6,640, and zero genes in adipose, skin, blood, and LCLs respectively (Figure: 4.1).

Figure 4.1: Distribution of BMI TWAS association P-values. An enrichment of significant P-values is observed in adipose, skin and whole blood.

Tissue specificity of associations, as assessed using π1 analysis, demonstrated that adipose tissue captured the majority of the associations between gene expression and BMI found in skin and blood (skin π1 = 0.78, blood π1 = 0.76), whereas in blood and skin only around half of the BMI associations significant in adipose tissue were present (skin π1 = 0.53, blood π1 = 0.54) (Figure: 4.2). This adipose specificity has been demonstrated previously using gene expression microarray data obtained from subcutaneous adipose tissue biopsies and whole blood in an Icelandic population (Emilsson et al. 2008). RNA-sequencing and a large sample size improved the resolution to detect more differentially expressed genes. All BMI TWAS summary statistics are available on the ‘PhenoExpress’ web service (see methods).

[Figure 4.2 panels: histograms of BMI association P-values (x-axis: P-value; y-axis: Frequency) for each pair of tissues. Adipose vs Skin, π1 = 0.53; Adipose vs Blood, π1 = 0.54; Adipose vs LCLs, π1 = 0. Skin vs Adipose, π1 = 0.78; Skin vs Blood, π1 = 0.53; Skin vs LCLs, π1 = 0. Blood vs Adipose, π1 = 0.76; Blood vs Skin, π1 = 0.48; Blood vs LCLs, π1 = 0.]

Figure 4.2: π1 P-value distribution from each tissue. π1 represents the proportion of trait-associations shared across tissues. For example, the top left histogram is the P-value distribution of all BMI associated adipose tissue genes matched in skin. 53% of BMI associated adipose tissue gene expression is also associated to BMI in skin tissue.
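The π1 statistic reported here can be approximated directly from a P-value distribution. The sketch below is a simplified Storey-type estimator with a single fixed tuning value λ, not necessarily the procedure used to produce Figure 4.2 (which may rely on a smoothed estimate across many λ values). It assumes that null P-values are uniform, so the density of P-values above λ estimates the null proportion π0, and π1 = 1 − π0. The function name and the toy P-values are hypothetical.

```python
def pi1_estimate(p_values, lam=0.5):
    """Estimate the proportion of true associations from a list of P-values.

    pi0 = #{p > lam} / (m * (1 - lam)); pi1 = 1 - pi0.
    """
    if not p_values:
        raise ValueError("need at least one P-value")
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    m = len(p_values)
    n_above = sum(1 for p in p_values if p > lam)
    pi0 = min(1.0, n_above / (m * (1.0 - lam)))
    return 1.0 - pi0


# Toy data: 60 strong signals plus 40 evenly spread "null" P-values.
signals = [0.001] * 60
nulls = [(i + 0.5) / 40 for i in range(40)]
pi1 = pi1_estimate(signals + nulls)
print(f"estimated pi1 = {pi1:.2f}")
```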

4.3.2 Detection of G × BMI cis-eQTL regulatory variants

I performed a genome-wide G × BMI eQTL scan to detect BMI-dependent regulatory effects in each tissue. The analysis was restricted to common variants (MAF ≥ 5%) from the 1000G phase one imputation. I applied a 5% FDR threshold non-parametrically using permutations (see methods) to call significant G × BMI cis-eQTLs. I identified 16 G × BMI cis-eQTLs in adipose, and no significant hits in the other two tissues (skin, whole blood) or in the cell line (LCLs) (Table 4.1). Twelve of the sixteen FDR5% G × BMI cis-eQTLs also had a significant main effect in at least one other tissue, with five G × BMI cis-eQTLs displaying multi-tissue cis-eQTL effects. All main-effect eQTLs were computed using the same dataset, but were modelled without the interaction term.

4.3.3 Replication of G × BMI eQTLs

I took forward the 16 FDR5% G × BMI eQTLs for replication in an independent adipose tissue RNA-seq dataset (deCODE genetics, n = 754). Due to the use of different genome builds (hg19 vs GRCh38), gene annotations (GENCODE v19 vs RefSeq) and aligners (BWA vs TopHat), only 13 of the 16 G × BMI eQTLs were available to test. The coordinates of the quantified exons differed between studies. I identified nine exons in the deCODE data that corresponded to TwinsUK exons with at least 90% overlap. Three of these exons replicated and eight showed the same effect direction (Table: 4.2). The replicated genes were CHURC1 (TwinsUK P-value = 2.0 × 10⁻¹², deCODE P-value = 8.5 × 10⁻⁴), PEPD (TwinsUK P-value = 4.8 × 10⁻¹⁰, deCODE P-value = 4.2 × 10⁻⁶) and PHACTR3 (TwinsUK P-value = 1.6 × 10⁻⁸, deCODE P-value = 1.1 × 10⁻⁴) (Figure 4.3). Four other genes showed partial overlap with deCODE: ERV3-1 (85%), SPAG17 (69%), ADH1A (33%) and CAST (4%). Whilst none of them replicated, CAST did show a consistent direction of effect with P-value = 0.05. I tested all other exons at CAST and found that an exon in the range 96,076,448 - 96,076,487 was significantly associated after correcting for the number of CAST exons tested (30) (P-value = 0.001, corrected P-value = 0.03). This is the first example, to my knowledge, of replicated G × BMI eQTLs.

| Gene | SNP | EA | EAF | TwinsUK β | TwinsUK P-value | deCODE β | deCODE P-value | Main effect eQTL (FDR1%) | Enhancer |
|---|---|---|---|---|---|---|---|---|---|
| CHURC1 | rs7143432 | A | 0.78 | 0.026 | 2.0 × 10⁻¹² | 0.011 | 8.5 × 10⁻⁴ | Adipose, Skin, Blood, LCLs | – |
| CAST | rs13160562 | G | 0.69 | -0.032 | 3.9 × 10⁻¹² | -0.004 | 0.053* | Adipose, Skin, Blood, LCLs | Adipocyte |
| CIDEA | rs7505859 | C | 0.62 | -0.028 | 3.1 × 10⁻¹¹ | -0.004 | 0.21 | Adipose and Skin | Adipocyte |
| ZNF117 | rs6948760 | T | 0.4 | 0.039 | 4.4 × 10⁻¹¹ | 0.0009 | 0.79 | Adipose, Skin, Blood, LCLs | – |
| ADH1A | rs1693457 | C | 0.18 | 0.034 | 5.9 × 10⁻¹¹ | -0.0015 | 0.85* | Adipose | Adipocyte |
| RP11-71E19.1 | rs1980140 | A | 0.79 | -0.058 | 6.1 × 10⁻¹¹ | NA | NA | Adipose | Adipocyte |
| PEPD | rs10415555 | A | 0.81 | -0.044 | 4.8 × 10⁻¹⁰ | -0.014 | 4.2 × 10⁻⁶ | Adipose, Skin | Adipocyte |
| ANXA5 | rs2306420 | G | 0.71 | 0.022 | 1.4 × 10⁻⁹ | 0.001 | 0.52 | Adipose, Skin, Blood, LCLs | Adipocyte |
| SIK1 | rs12482956 | A | 0.71 | 0.058 | 3.0 × 10⁻⁹ | NA | NA | – | Blood |
| HLA-DQB2 | rs114370295 | T | 0.27 | -0.05 | 3.5 × 10⁻⁹ | -0.004 | 0.45 | Adipose | – |
| ERV3-1 | rs11979998 | C | 0.52 | 0.032 | 8.4 × 10⁻⁹ | -0.0008 | 0.84* | Adipose, Skin, Blood, LCLs | Blood |
| POU6F2 | rs34792397 | G | 0.75 | -0.041 | 9.9 × 10⁻⁹ | NA | NA | Adipose | – |
| IFNAR1 | rs2834098 | C | 0.78 | -0.047 | 1.4 × 10⁻⁸ | 0.002 | 0.58 | – | Stem cells |
| SCFD2 | rs7687982 | A | 0.75 | -0.059 | 1.5 × 10⁻⁸ | -0.006 | 0.26 | – | aMSC |
| PHACTR3 | rs6070866 | G | 0.51 | -0.044 | 1.7 × 10⁻⁸ | -0.02 | 1.1 × 10⁻⁴ | Adipose | Brain |
| SPAG17 | rs9661038 | G | 0.64 | 0.043 | 2.8 × 10⁻⁸ | -0.004 | 0.083* | – | – |

Table 4.1: 16 FDR5% significant G × BMI eQTLs discovered in TwinsUK, with replication tested in deCODE. The main effect eQTL column specifies FDR1% eQTLs from Chapter 3. HaploReg v4 was used to determine whether a SNP or its proxy (r² ≥ 0.8) fell within an active enhancer. EA = effect allele, EAF = effect allele frequency. Dash = no eQTL or no enhancer overlap. * denotes less than 90% overlap with a deCODE exon due to gene annotation and genome build differences (see methods and Table: 4.2).

| Gene | deCODE exon start (hg38) | deCODE exon end (hg38) | Beta | deCODE P-value | % coverage TwinsUK exon |
|---|---|---|---|---|---|
| SPAG17 | 117,971,862 | 117,972,047 | 0.010 | 0.083 | 69% |
| CHURC1 | 64,932,137 | 64,935,366 | 0.011 | 8.5 × 10⁻⁴ | 95% |
| CIDEA | 12,262,824 | 12,262,969 | -0.004 | 0.21 | 100% |
| PEPD | 33,386,948 | 33,387,481 | -0.014 | 4.2 × 10⁻⁶ | 100% |
| PHACTR3* | 59,845,188 | 59,845,265 | -0.014 | 0.005 | 4% |
| PHACTR3* | 59,774,242 | 59,774,490 | -0.020 | 0.0001 | 100% |
| IFNAR1 | 33,349,388 | 33,349,543 | 0.002 | 0.58 | 100% |
| SCFD2 | 53,273,825 | 53,274,001 | -0.006 | 0.26 | 100% |
| ADH1A | 99,282,345 | 99,282,606 | -0.0015 | 0.85 | 33% |
| ANXA5 | 121,667,996 | 121,668,527 | 0.001 | 0.52 | 93% |
| CAST | 96,770,530 | 96,770,602 | -0.004 | 0.053 | 4% |
| HLA-DQB2 | 32,758,849 | 32,759,131 | -0.004 | 0.45 | 100% |
| ZNF117 | 64,989,946 | 64,991,036 | 0.0009 | 0.79 | 100% |
| ERV3-1 | 64,990,354 | 64,993,414 | -0.0008 | 0.84 | 85% |

Table 4.2: Meta-exon coordinates in the deCODE study. The length of each exon was compared to that in TwinsUK and expressed as a percentage. *Two PHACTR3 exons were significant G × BMI eQTLs, and as the lead exon only overlapped a deCODE meta-exon by 4%, I tested the second exon, as it overlapped the corresponding deCODE meta-exon by 100%.

Figure 4.3: The three replicated G × BMI eQTLs. For example, in individuals homozygous for the effect allele (EA) of rs7143432 there is a positive association between CHURC1 gene expression and BMI. Individuals homozygous for the other allele (OA) show the opposite relationship.

4.3.4 Robustness of G × BMI cis-eQTLs discovery

To evaluate the robustness of the interaction findings, I performed a range of analyses to assess the impact of scale, normalisation, modelling assumptions and confounding (namely population structure) on G × BMI discovery. Gene-by-environment interactions, and statistical interactions in general, are known to be dependent on scale. Two types of interaction exist, additive and multiplicative. Log transformation of the dependent variable (in this case, each gene expression vector) would allow us to investigate multiplicative gene-by-environment interactions on an additive (log-linear) scale. However, crossover interactions, also known as qualitative interactions, do not depend on the scale being used (de González et al. 2007). All G × BMI cis-eQTLs discussed in this chapter demonstrate opposite effect directions when homozygote allele classes are compared, making them scale-independent crossover interactions.

Another potential concern when fitting statistical interactions is the effect of outliers, in both the dependent and independent variables. Whilst the dependent variable (gene expression) consists of normally distributed residuals, BMI, as seen in Figure 2.1 in Chapter 2, has a heavy tail (morbidly obese individuals). I tested for G × BMI cis-eQTLs after transforming BMI to a normal distribution with mean zero and variance one, using an inverse rank-normal transformation (IVT) (X ∼ N(0, 1)). Table 4.3 demonstrates that all 16 FDR5% G × BMI cis-eQTLs remain significant, with consistent directions of effect. I am therefore confident of the robustness of the results against tails/slight non-normality. I chose not to inverse rank normalise BMI for this study, as excessive use of IVT has been demonstrated to induce inflated type-1 error rates (Beasley et al. 2009).

| Gene | SNP | β | P-value |
|---|---|---|---|
| CAST | rs13160562 | -0.17 | 2.88 × 10⁻¹⁴ |
| ADH1A | rs1693457 | 0.19 | 2.24 × 10⁻¹³ |
| ZNF117 | rs6948760 | 0.19 | 1.75 × 10⁻¹¹ |
| PHACTR3 | rs6070866 | -0.26 | 1.92 × 10⁻¹¹ |
| CHURC1 | rs7143432 | 0.12 | 4.93 × 10⁻¹¹ |
| RP11-71E19.1 | rs1980140 | -0.28 | 6.27 × 10⁻¹¹ |
| CIDEA | rs7505859 | -0.14 | 9.79 × 10⁻¹¹ |
| PEPD | rs10415555 | -0.21 | 2.35 × 10⁻⁹ |
| ANXA5 | rs2306420 | 0.10 | 6.17 × 10⁻⁹ |
| HLA-DQB2 | rs114370295 | -0.24 | 7.61 × 10⁻⁹ |
| IFNAR1 | rs2834098 | -0.23 | 9.10 × 10⁻⁹ |
| POU6F2 | rs34792397 | -0.20 | 1.12 × 10⁻⁸ |
| SIK1 | rs12482956 | 0.27 | 1.22 × 10⁻⁸ |
| ERV3-1 | rs11979998 | 0.15 | 1.35 × 10⁻⁸ |
| SCFD2 | rs7687982 | -0.28 | 3.13 × 10⁻⁸ |
| SPAG17 | rs9661038 | 0.20 | 2.98 × 10⁻⁷ |

Table 4.3: G × BMI cis-eQTL coefficients (β) and P-values after rank normal transforming BMI.
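The inverse rank-normal transformation used for this sensitivity analysis can be sketched in a few lines. The version below is one common variant, not necessarily the exact one applied here: it assumes average ranks for ties and a rank offset of 0.5 (other offsets, such as Blom's 3/8, are also in use). The function name and the BMI values are hypothetical.

```python
from statistics import NormalDist


def inverse_rank_normal(values, offset=0.5):
    """Map values to standard normal quantiles by rank; ties share an average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - offset) / n) for r in ranks]


# Hypothetical BMI values with one extreme observation in the upper tail.
bmi = [21.3, 24.8, 27.1, 30.2, 52.6]
transformed = inverse_rank_normal(bmi)
print([round(v, 2) for v in transformed])
```

Because only ranks are used, the extreme value of 52.6 is pulled in to the same distance from the centre as the smallest value, which is how the transformation removes the influence of the heavy tail.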

Three G × BMI cis-eQTL genes (ZNF117, ANXA5, HLA-DQB2) presented here have previously been detected as variance eQTLs (v-eQTLs) in LCLs by Brown et al. (2014). A v-eQTL gene is one whose expression variance changes as a function of genotype dosage. Because of this, I investigated whether heteroskedasticity was present. Heteroskedastic errors bias standard error estimation in linear models, whilst coefficient estimates remain unbiased. Thus, hypothesis tests conducted using biased standard errors can yield either deflated or inflated test statistics. To test for heteroskedastic errors, I implemented the Breusch–Pagan test (BP test). In standard ordinary least squares (OLS) using a linear model:

$$y = \beta_0 + \beta_1 x + \epsilon \tag{4.9}$$

linear model residuals have a mean of zero, and therefore an estimate of the variance is the average squared value of the residuals. If there is a linear dependence of the variance on the independent variable ($x$), a regression of the estimated squared residuals ($\hat{\mu}^2$) on the independent variable will reveal any heteroskedastic relationship:

$$\hat{\mu}^2 = \beta_0 + \beta_1 x + \epsilon \tag{4.10}$$
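The auxiliary regression in equation 4.10 can be turned into a test with very little code. The sketch below is a minimal single-predictor illustration using the studentised (n × R²) form of the Breusch–Pagan statistic, compared against a chi-square distribution with one degree of freedom. It is an assumption-laden toy: the models in this chapter contain several terms, so the actual test has more degrees of freedom, and the data and function names here are hypothetical.

```python
import math


def ols_simple(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def breusch_pagan(x, y):
    """Return (LM statistic, P-value) for variance depending linearly on x."""
    n = len(x)
    b0, b1 = ols_simple(x, y)
    squared_resid = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
    # Auxiliary regression of squared residuals on x (equation 4.10).
    a0, a1 = ols_simple(x, squared_resid)
    mean_sq = sum(squared_resid) / n
    ss_total = sum((u - mean_sq) ** 2 for u in squared_resid)
    ss_resid = sum((u - a0 - a1 * xi) ** 2 for xi, u in zip(x, squared_resid))
    lm = max(0.0, n * (1.0 - ss_resid / ss_total))
    # Chi-square survival function with one degree of freedom.
    return lm, math.erfc(math.sqrt(lm / 2.0))


# Deterministic toy data: error amplitude grows with x in the first series only.
x = [float(i) for i in range(1, 41)]
y_het = [2.0 + 0.5 * xi + ((-1) ** i) * 0.05 * xi for i, xi in enumerate(x)]
y_hom = [2.0 + 0.5 * xi + ((-1) ** i) * 1.0 for i, xi in enumerate(x)]

lm_het, p_het = breusch_pagan(x, y_het)
lm_hom, p_hom = breusch_pagan(x, y_hom)
print(f"heteroskedastic: LM = {lm_het:.1f}, P = {p_het:.2g}")
print(f"homoskedastic:   LM = {lm_hom:.1f}, P = {p_hom:.2g}")
```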

| Gene | SNP | Breusch–Pagan (BP) test | BP P-value | Robust G × BMI SE | Robust G × BMI t-value | Robust G × BMI P-value |
|---|---|---|---|---|---|---|
| CHURC1 | rs7143432 | 24.63 | 1.85 × 10⁻⁵ | 0.0076 | 3.37 | 7.8 × 10⁻⁴ |
| PEPD | rs10415555 | 6.04 | 0.11 | – | – | – |
| CAST | rs13160562 | 7.19 | 0.07 | – | – | – |
| CIDEA | rs7505859 | 19.98 | 1.7 × 10⁻⁴ | 0.0037 | -7.65 | 6.54 × 10⁻¹⁴ |
| ZNF117 | rs6948760 | 2.43 | 0.49 | – | – | – |
| ADH1A | rs1693457 | 50.11 | 7.56 × 10⁻¹¹ | 0.0079 | 4.28 | 2.17 × 10⁻⁵ |
| RP11-71E19.1 | rs1980140 | 2.53 | 0.47 | – | – | – |
| PHACTR3 | rs6070866 | 3.67 | 0.30 | – | – | – |
| ANXA5 | rs2306420 | 10.17 | 0.017 | 0.0034 | 6.55 | 1.09 × 10⁻¹⁰ |
| SIK1 | rs12482956 | 5.35 | 0.15 | – | – | – |
| HLA-DQB2 | rs114370295 | 4.49 | 0.21 | – | – | – |
| ERV3-1 | rs11979998 | 10.52 | 0.015 | 0.0047 | 6.76 | 2.79 × 10⁻¹¹ |
| POU6F2 | rs34792397 | 1.81 | 0.61 | – | – | – |
| IFNAR1 | rs2834098 | 0.14 | 0.99 | – | – | – |
| SCFD2 | rs7687982 | 4.03 | 0.26 | – | – | – |
| SPAG17 | rs9661038 | 1.63 | 0.65 | – | – | – |

Table 4.4: FDR5% significant G × BMI eQTLs tested for heteroskedasticity using the Breusch–Pagan test statistic. Where significant heteroskedasticity was observed (BP P-value ≤ 0.05), heteroskedasticity-corrected (HC1) robust standard errors, t-values and P-values were computed for each G × BMI eQTL interaction term.
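To show what an HC1 correction does, the sketch below computes classical and HC1 robust standard errors for the slope of a single-predictor regression. It is a simplified stand-in for the multi-term interaction models summarised in Table 4.4: for a simple regression the sandwich estimator reduces to a sum of squared residuals weighted by squared deviations of x, scaled by n / (n − 2) for HC1. The data and function name are hypothetical.

```python
import math


def slope_standard_errors(x, y):
    """Return (slope, classical SE, HC1 robust SE) for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - mean_y) for d, yi in zip(dx, y)) / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical variance assumes one common error variance.
    sigma2 = sum(e * e for e in resid) / (n - 2)
    classical_se = math.sqrt(sigma2 / sxx)
    # Sandwich "meat": each squared residual weighted by its squared deviation.
    meat = sum((d * e) ** 2 for d, e in zip(dx, resid))
    hc1_se = math.sqrt(meat / sxx ** 2 * n / (n - 2))
    return slope, classical_se, hc1_se


# Deterministic toy data whose error amplitude grows with x.
x = [float(i) for i in range(1, 41)]
y = [2.0 + 0.5 * xi + ((-1) ** i) * 0.05 * xi for i, xi in enumerate(x)]

slope, classical_se, hc1_se = slope_standard_errors(x, y)
print(f"slope = {slope:.3f}, classical SE = {classical_se:.4f}, HC1 SE = {hc1_se:.4f}")
```

The coefficient estimate is identical under both approaches; only the standard error, and hence the t-value and P-value, changes.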

I show that the majority of the findings are not affected by heteroskedasticity. As can be seen from Table 4.4, five of the sixteen reported G × BMI eQTLs show signs of significant heteroskedasticity (BP P-value ≤ 0.05). Upon calculating robust standard errors for each G × BMI term using HC1 (heteroskedasticity-consistent standard errors) (Long & Ervin 2000), ANXA5's G × BMI eQTL effect is increased in magnitude. Only one G × BMI eQTL becomes significantly weaker (CHURC1-rs7143432), but not sufficiently to remove all evidence of an interaction effect. Correcting for heteroskedasticity in standard eQTL analysis has been shown to be an important consideration for the field, as Daye et al. (2012) demonstrated that up to 40% of eQTLs can have non-constant error variances.

Finally, it has been thoroughly documented that population stratification can lead to inflated test statistics in GWAS and gene-by-environment interaction studies, and to overestimation of the genetic variance (Sul et al. 2016, Kruijer 2016). Whilst TwinsUK individuals have been shown not to exhibit population structure (analysis using STRUCTURE) (Richards et al. 2008), I aimed to test this explicitly in terms of modelling G × BMI eQTLs. I used Eigenstrat to calculate principal components from the genotype matrix of non-imputed SNPs and adjusted each G × BMI eQTL model accordingly (see methods). This analysis shows that population stratification does not explain the presence of G × BMI eQTLs (Table: 4.5).

| Gene | SNP | G × BMI β corrected | G × BMI P-value corrected |
|---|---|---|---|
| CHURC1 | rs7143432 | 0.025 | 1.70 × 10⁻¹¹ |
| CAST | rs13160562 | -0.034 | 5.67 × 10⁻¹³ |
| CIDEA | rs7505859 | -0.027 | 2.04 × 10⁻¹⁰ |
| ZNF117 | rs6948760 | 0.038 | 2.90 × 10⁻¹⁰ |
| ADH1A | rs1693457 | 0.035 | 2.51 × 10⁻¹¹ |
| RP11-71E19.1 | rs1980140 | -0.055 | 1.42 × 10⁻⁹ |
| PEPD | rs10415555 | -0.044 | 1.47 × 10⁻⁹ |
| PHACTR3 | rs6070866 | -0.052 | 8.08 × 10⁻¹⁰ |
| ANXA5 | rs2306420 | 0.022 | 4.59 × 10⁻⁹ |
| SIK1 | rs12482956 | 0.054 | 5.59 × 10⁻⁸ |
| HLA-DQB2 | rs114370295 | -0.052 | 2.38 × 10⁻⁹ |
| ERV3-1 | rs11979998 | 0.032 | 1.21 × 10⁻⁸ |
| POU6F2 | rs34792397 | -0.040 | 2.91 × 10⁻⁸ |
| IFNAR1 | rs2834098 | -0.047 | 2.79 × 10⁻⁸ |
| SCFD2 | rs7687982 | -0.060 | 2.21 × 10⁻⁸ |
| SPAG17 | rs9661038 | 0.043 | 6.65 × 10⁻⁸ |

Table 4.5: G × BMI eQTL coefficients and P-values after adjusting for 5 genotype principal components (PCs) to account for population structure, computed using Eigenstrat.

4.3.5 G × BMI eQTLs and their expression are highly adipose specific

I sought to investigate the tissue specificity and functional properties of all genome-wide significant G × BMI eQTLs. All 16 G × BMI eQTLs discovered are specific to adipose tissue. No nominally significant effects were observed in any other tissue (Table: A.2), whilst many G × BMI eQTL main-effects were significant across a range of tissues (Appendix table: A.3). Five of the sixteen G × BMI genes (PHACTR3, SPAG17, POU6F2, RP11-71E19.1 and ADH1A) had gene expression restricted to adipose tissue (Appendix table: A.2). To investigate any functional element enrichment for G × BMI eQTL SNPs, I used HaploReg to test for significant enrichment in enhancers, promoters and conserved genomic sites, and to obtain the predicted effect of each allele. HaploReg utilises 127 cell type regulatory tracks from ENCODE. Overall, the 16 G × BMI eQTLs were enriched specifically for active adipocyte enhancers (mesenchymal-stem-cell-derived adipocytes, P-value = 0.028). Eight SNPs had significant evolutionary GERP scores and nine SNPs disrupted specific or multiple Transcription Factor Binding Sites (TFBSs) (Table: 4.6). With several G × BMI eQTL genes and regulatory variants being specific to adipose tissue, this reinforces mounting evidence of the importance of ‘disease specific’ tissues of interest.

Gene SNP Enhancer Promoter Binding motif GERP AP-3, GATA HDAC2, HMGN3 CHURC1 rs7143432 NA NA rs4902333 hLTF, lMO2-COMPLEX CAST rs13160562 Adipocyte enhancer NA TATA rs7063 CIDEA rs7505859 Adipocyte ( rs56850047) NA NA NA ZNF117 rs6948760 NA NA GATA, Smad3 NA bbx, hbp1 ADH1A rs1693457 Adipocyte/Liver/Lung NA NA Pax-8, Sox RP11-71E19.1 rs1980140 Adipocyte + 5 other cells NA NA rs73205039 PEPD rs10415555 Adipocyte ( rs34258884) NA NA rs12986227 PHACTR3 rs6070866 Brain NA NA NA GATA,Pou3f2 ANXA5 rs2306420 Adipocyte + 6 other cells NA rs13145977 Sox,Tef SIK1 rs12482956 Blood NA Inf NA HLA-DQB2 rs114370295 NA NA NA NA ERV3-1 rs11979998 Blood NA NA NA POU6F2 rs34792397 rs36052669 Pou5f1 rs12674262 HNF1 IFNAR1 rs2834098 Stem cells NA rs2834098 Pou2f2, Pou3f2 HMG-IY, Pou1f1 SCFD2 rs7687982 aMSC (rs6857264) NA rs6857264 Sox SPAG17 rs9661038 NA NA TATA NA

Table 4.6: G × BMI eQTL SNPs and their proxy SNPs (r² ≥ 0.8), how they overlap with enhancers and promoters, and the TF motifs they disrupt.

4.3.6 G × BMI eQTLs and their link to related traits

Next, I explored the genes that G × BMI eQTLs regulate. Peptidase D (PEPD) is essential for collagen production and the recycling of proline. Several GWAS have identified the PEPD locus as being associated with Adiponectin levels, Type-2 Diabetes, Triglyceride levels, and Fasting Insulin (Dastani et al. 2012, Manning et al. 2012, Global Lipids Genetics Consortium 2013, Cho et al. 2012). The lead GWAS SNP is in low LD with the lead G × BMI eQTL SNP (r² = 0.15, D' = 1). The PEPD G × BMI eQTL is therefore interesting even if it is likely to be independent of the GWAS loci mentioned above.

Calpastatin (CAST) is an endogenous and ubiquitously expressed inhibitor of calpain, a calcium-dependent cysteine protease. Interestingly, CAST has been implicated in a range of immune cell functions. CAST regulates the inhibition of macrophage hyperactivation via the Toll-like receptor (TLR) pathway and dampens the immune response. These functions could be linked to the heterogeneous nature of low grade systemic inflammation in obesity (Huang et al. 2011, Rose et al. 2013).

Finally, cell death activator (CIDEA) has been linked to a number of metabolic traits in model organisms (Zhou et al. 2003). CIDEA knockout mice display higher rates of lipolysis, higher basal metabolic rates and elevated body temperatures (Wu et al. 2014, Abreu-Vieira et al. 2015). Double knockout mice for CIDEA are resistant to diet-induced type II diabetes as they do not become obese. Similar evidence has been collected in human adipocyte experiments (Puri et al. 2008). In all TwinsUK individuals, CIDEA gene expression shows a negative association with BMI (P-value = 8.1 × 10⁻⁵⁴). When stratified on individuals carrying the minor allele of rs7505859, the relationship flips, with rs7505859 risk-allele carriers showing a positive association with BMI. Further work will be necessary to identify the precise role CIDEA plays in BMI and T2D development.

To systematically compare the variants underlying each of the 16 G × BMI eQTLs with published GWAS loci, I overlapped G × BMI eQTL SNPs and their proxies (r² ≥ 0.6) with SNPs present in the NHGRI GWAS catalogue (restricted to P-value ≤ 5.0 × 10⁻⁸). A proxy SNP of the alcohol dehydrogenase 1A G × BMI eQTL ADH1A-rs1693457 (rs1229977, r² = 0.63) is associated with a replicated esophageal cancer risk locus (Wu et al. 2012). This locus covers several ADH family genes and, as well as exhibiting a strong G × BMI eQTL, ADH1A also has a strong main effect eQTL (β = 0.89, P-value = 2.9 × 10⁻⁴³). As many eQTLs and GWAS loci colocalise by chance, I formally tested whether both SNPs were tagging the same causal variant or were independent (Nica et al. 2010). Using Regulatory Trait Concordance (RTC) analysis, I demonstrate that rs1229977, the esophageal cancer GWAS SNP, has an RTC score of 0.97, where an RTC score of 1 indicates that it tags the same causal variant (see methods). As well as the G × BMI effect I detect at this SNP, prior gene-by-environment (G × E) interaction evidence has been described for this locus (Wu et al. 2012): the effect of the esophageal cancer risk locus (rs1229977) is modified by alcohol consumption. BMI, alcohol consumption, esophageal cancer risk and smoking have a complicated relationship and unresolved causality, thus it is possible that BMI is acting as an external proxy environment to mediate the ADH1A G × E.

4.3.7 trans G × BMI eQTL detection in adipose tissue

As cis-eQTLs are enriched for trans-eQTL effects, I took a two-step approach to identifying trans G × BMI eQTLs. First, as trans-eQTL discovery is particularly sensitive to PCA/PEER correction strategies, I used non-PEER corrected gene expression residuals to map cis-acting G × BMI eQTLs. This approach identified four genome-wide significant G × BMI eQTLs (FDR5%) (Table 4.7). Secondly, I mapped these four variants against all genes expressed in adipose tissue and identified a single trans-acting G × BMI effect at ALG9-rs3851570 regulating the expression of 53 genes across the genome (Bonferroni significant: P-value ≤ 1.1 × 10⁻⁷) (Figure: 4.4, Table: A.4).

| Gene | SNP | non-PEER β | non-PEER P-value | PEER β | PEER P-value |
|---|---|---|---|---|---|
| HACL1 | rs1464171 | + | 4.77 × 10⁻⁹ | + | 6.20 × 10⁻⁸ |
| ALG9 | rs3851570 | - | 2.35 × 10⁻⁸ | - | 0.016 |
| SMG6 | rs113368712 | - | 4.84 × 10⁻⁷ | - | 4.2 × 10⁻⁴ |
| GAA | rs35662778 | + | 6.64 × 10⁻⁷ | + | 3.46 × 10⁻⁷ |

Table 4.7: Four FDR5% non-PEER cis-G × BMI eQTLs, with the direction of effect (β) and P-value shown before and after PEER correction.

I sought to replicate the ALG9 trans G × BMI eQTL in deCODE genetics (n = 754). I tested rs3851570 against all gene expression traits in deCODE's adipose RNA-seq dataset. Whilst the individual trans-genes did not replicate, there was a global enrichment for low P-values (π1 = 0.14), suggesting ALG9 is a broad regulator of adipose tissue gene expression. This observation is consistent with previously published trans-eQTL analysis studies, in which trans-genes themselves replicate poorly, but genome-wide regulatory effects are observed (Grundberg et al. 2012).

To investigate the ALG9 regulatory network, I performed gene ontology enrichment analysis using Ingenuity IPA. The 53 trans genes were enriched for ‘inhibition of matrix metalloproteases’ (Benjamini-Hochberg (B-H) P-value = 3.6 × 10⁻⁸), oxidative phosphorylation (B-H P-value = 3.1 × 10⁻⁴) and a cardiovascular disease gene network (Figure: 4.4). ALG9 itself is not a known transcription factor, and functions as an enzyme in lipid-linked oligosaccharide assembly. However, given the cis-genetic effect of rs3851570 on ALG9 and its trans-effect on 53 genes, I formally tested whether ALG9 is mediating this network or whether they are independent events. Statistical mediation analysis (see methods) supports the role of ALG9 expression as a mediator for all 53 trans genes (Sobel's P-value ≤ 10⁻³) (Appendix Table: A.5). It is likely that ALG9 regulates these genes indirectly, for example through a signalling cascade, given it is not a known transcription factor (TF). The lead trans gene, ZNF423 (P-value = 8.2 × 10⁻¹³), has been shown to be an important master regulator of pre-adipocyte differentiation and determination by regulating PPARγ expression (Gupta et al. 2010). It is possible that ZNF423 is responsible for mediating the widespread trans-effects noted here, though this would require further investigation.

Figure 4.4: A) Visualisation of global trans-associations of ALG9. B) Cardiovascular disease network which has an over-representation of ALG9-trans genes. C) Lead trans-G × BMI eQTL ZNF423.

4.3.8 Properties of G × BMI eQTLs below genome-wide significance

As detailed previously, to obtain genome-wide significant G × BMI eQTLs, I used a conservative permutation strategy. However, to assess whether G × BMI eQTLs share any general properties or characteristics, I relaxed the significance threshold to P-value ≤ 1.0 × 10⁻⁶ (Appendix Table: A.6). All 127 G × BMI eQTLs at this level of significance are cross-over interactions, with homozygote classes demonstrating opposing directions of effect between BMI and expression. Like main-effect eQTLs, G × BMI eQTLs were proximal to the transcription start site (TSS) (median distance = 38kb). This was most noticeable for genome-wide significant G × BMI eQTLs (Figure: 4.5). Of the 127 G × BMI eQTLs, many were also main-effect eQTLs in multiple tissues, with adipose tissue eQTLs being twice as numerous (number of significant FDR1% eQTLs: Adipose = 20, Skin = 10, Blood = 8 and LCLs = 10) (Figure 4.6).

Whilst G × BMI eQTLs had main-effects across multiple tissues, the G × BMI effect itself was strictly adipose specific, with little tissue sharing observed (Skin π1 = 0.038, Blood π1 = 0 and LCLs π1 = 0). None of the SNPs I report to have G × BMI eQTL effects are associated directly with BMI (BMI GWAS n = 339,224; overall π1 = 0), ruling out the possibility of genotype-environment dependence, which has been shown to have the potential to create spurious interactions (Dudbridge & Fletcher 2014).

Figure 4.5: A) Effect size (β) of TwinsUK G × BMI eQTL discovery and replication in deCODE genetics. B) 127 G × BMI eQTLs are proximal to the TSS. C) Visceral fat DXA measurements (G × Visceral Fat eQTLs) recapitulate G × BMI eQTLs.

4.3.9 Enrichment analysis of G × BMI eQTLs highlights both metabolic and immune processes

To investigate biological properties of the 127 G × BMI eQTLs, I performed gene enrichment analysis using Ingenuity IPA. Key metabolic processes were over-represented, such as ‘the uptake of cholesterol’ (P-value = 9.1 × 10⁻⁵) and ‘RXR activation’ (B-H P-value = 6.1 × 10⁻³), which is a pathway targeted by several T2D treatments. Several immune processes were also over-represented, including the ‘antigen presentation pathway’ (P-value = 2.0 × 10⁻⁴) and ‘Quantity of macrophages’ (P-value = 5.8 × 10⁻³). Given the role macrophages play in obesity co-morbidity development, I investigated whether G × BMI eQTLs are enriched for genes present in the Macrophage Enriched Metabolic Network (MEMN). There was neither a significant overlap between the 127 G × BMI genes and the MEMN (Binomial P-value = 0.78) nor a substantial enrichment (π1 = 0.021). However, two of the four G × BMI genes that are in the MEMN (CIDEA, ADH1A, ICAM3 and TBX3) also show genome-wide significant interaction effects (ADH1A, CIDEA). Given the known relationship between obesity and macrophage infiltration, I explore the possibility that G × BMI eQTLs are driven by differences in macrophage proportion further in Chapter 5.

4.3.10 Visceral fat DXA measurements recapitulate G × BMI eQTL findings

[Figure 4.6 panels: histograms of main effect eQTL P-values (x-axis: eQTL P-value in the matched tissue; y-axis: Frequency) for each pair of tissues. Adipose vs Adipose, π1 = 0.096; Adipose vs Skin, π1 = 0.045; Adipose vs Blood, π1 = 0.0075; Adipose vs LCL, π1 = 0.112. Skin vs Adipose, π1 = 0.063; Skin vs Skin, π1 = 0.068; Skin vs Blood, π1 = 0.0069; Skin vs LCLs, π1 = 0.064. Blood vs Adipose, π1 = 0.055; Blood vs Skin, π1 = 0.045; Blood vs Blood, π1 = 0.050; Blood vs LCLs, π1 = 0.068. LCLs vs Adipose, π1 = 0.053; LCLs vs Skin, π1 = 0.10; LCLs vs Blood, π1 = 0.04; LCLs vs LCLs, π1 = 0.062.]

Figure 4.6: π1 analysis of G × BMI eQTL main effects in each tissue. Each histogram represents the P-value distribution of the main effect eQTL of each exon-SNP G × BMI pair matched in a different tissue (e.g. the top left histogram is the P-value distribution of adipose G × BMI eQTLs matched to the main effect adipose eQTL results). Approximately 10% of adipose G × BMI eQTLs have main effects.

Whilst BMI has been used successfully as a proxy for adiposity, its limitations have also been examined, particularly for individuals with a high lean mass proportion (for example, rugby players would typically be classified as obese, BMI ≥ 30). Additionally, BMI does not capture differences in body fat distribution, and visceral fat area and other portal fat measurements have been associated with several obesity co-morbid traits independently of BMI (Bjorntorp 1990). Whilst TwinsUK is not over-represented with individuals with high lean mass, I validated the findings by implementing the same interaction analysis but replacing BMI with a highly accurate measure of visceral fat, obtained using Dual X-ray absorptiometry (DXA) on a subset of individuals (n = 682) (correlation with BMI r = 0.78) (Figure: 4.7).

[Scatter plot: "Relationship between BMI and Visceral Fat", r = 0.78. x-axis: Visceral Fat Volume; y-axis: BMI.]

Figure 4.7: BMI’s correlation to visceral fat area in expression individuals who also have DXA measurements (n = 682).

Visceral fat in TwinsUK subjects spanned two orders of magnitude (78 to 1,542 g), representing substantial variability in visceral fat deposition. G × Visceral fat eQTLs recapitulate G × BMI eQTLs, with several G × Visceral fat eQTLs increasing in significance relative to G × BMI eQTLs by more than three orders of magnitude (PEPD, CAST) (Figure: 4.5). This suggests that the increased sensitivity of this phenotype outweighed the modest drop in sample size. Indeed, it has been shown that imprecise measurement of the environment when conducting G × E studies will tend to bias effect sizes towards the null (Greenwood et al. 2006).

4.3.11 Discussion

This chapter describes the first demonstration of replicated, robust G × BMI eQTLs, in which BMI modifies the genetic effect of gene expression regulation in adipose tissue. I show that these G × BMI eQTLs are enriched for adipose main-effect eQTLs. Furthermore, they are highly tissue specific and overlap active adipocyte enhancers. In addition, G × BMI eQTLs are enriched for both metabolic and immune processes, two well known biological pathways involved in the development of obesity co-morbidities. I extend the analysis to discover trans-acting G × BMI eQTLs and describe a regulatory network regulating 53 genes across the genome. As trans-effects have been estimated to account for up to 60% of gene expression heritability, and model organisms have demonstrated that trans-eQTLs are highly context/environment dependent, accounting for interaction effects in trans-eQTL discovery may be fruitful in future, larger studies. Finally, due to the enrichment for metabolic and immune response pathways, G × BMI eQTLs could represent differential lipid storage potential, energy expenditure, adipocyte size, vascularisation or propensity for developing an inflammatory state. Genetic predisposition to inflammation in an obese state is an intriguing possibility that should be investigated further given the heterogeneity of metabolic complication development in the obese population and the known links to inflammation. I believe G × BMI eQTLs will aid in dissecting GWAS loci mechanisms in the future and help to explain the heterogeneity of obesity-related metabolic complications observed in the population.

By using PEER, I ensured that the G × BMI eQTLs I describe here are adjusted for hidden covariates. Furthermore, I show that these G × BMI eQTLs are robust against scaling, normalization, heteroskedasticity and population structure. However, several caveats with this chapter should be addressed. First, this study and others highlight the benefit of using an eQTL study design over traditional GWAS to identify interaction effects with substantially smaller sample sizes. However, G × E discovery is thought to increase linearly with sample size (Westra et al. 2015). With 720 individuals, this study may be under-powered. Second, it is inherently difficult to replicate results based on RNA-sequencing. This is due to the use of RNA-seq pipelines that have different gene annotations, reference genome builds, quantification methods, quantification units and alignment strategies. However, I have shown it is possible to replicate across disparate studies. Thus, this study demonstrates that greater replicative ability will be achieved if data processing is harmonized across discovery and replication cohorts. Additional differences between cohorts, such as differences in the environment, are not so easily addressed. This is the case with diet, in particular. It is also the case with sex: adiposity is sexually dimorphic, so the fact that the deCODE cohort is 44% male while TwinsUK is 100% female could have hampered the replication effort. Finally, I cannot rule out the possibility that BMI is acting as a proxy for another endogenous variable that I have not accounted for. An example is cell type composition heterogeneity, which could be one of the variables mediating the effect of BMI. I examine this hypothesis in chapter 5.

Chapter 5

Population level variability in adipose tissue cell-type composition and its link to obesity

5.1 Introduction

Adipose tissue is the largest endocrine organ in the human body, having a role in insulin resistance, obesity, type 2 diabetes and many other cardiometabolic complications. It is therefore of interest to understand the complex cellularity of adipose tissue, its variability in the population and how it affects health and disease. Adipose tissue is an extremely heterogeneous tissue, containing adipocytes, endothelial cells, and a variety of immune cell subtypes (Ràfols 2014). Whilst whole blood is an extremely well characterized multicellular tissue, the cellular composition of subcutaneous adipose tissue remains to be fully explored. The few studies that have investigated adipose tissue report contradictory proportions of cell types, which is perhaps not surprising given that flow sorting is usually performed in small samples of heterogeneous subjects (Travers et al. 2015, Zimmerlin et al. 2010, Van Harmelen et al. 2003). Separating the cell types present in subcutaneous adipose tissue presents a number of difficulties, such as adipocyte rupturing, shared cell type specific surface markers, and the expense of low-throughput flow sorting of hundreds of samples.

5.1.1 Gene expression deconvolution methods

Gene expression deconvolution is an established method for computationally estimating cell type proportions from expression profiles gathered from whole tissues and has been shown to accurately recapitulate gold standard flow cytometry measurements (Gong & Szustakowski 2013, Newman et al. 2015, Shen-Orr et al. 2010, Gaujoux & Seoighe 2012, Abbas et al. 2009). Most cell type estimation methods have been applied to blood gene expression arrays, owing to the ease of obtaining such data, and more recently to RNA-seq in solid tissues and tumours (Abbas et al. 2009, Newman et al. 2015). Several strategies exist for cell type composition estimation: complete/full deconvolution, unsupervised learning based on factor analysis and signature-based cell type proportion inference. Full deconvolution involves estimating the cell type gene expression profile from a group of mixtures (tissues), as well as the proportion of each cell type present (Wang et al. 2016). Unsupervised methods use matrix decomposition to estimate factors that can act as proxies for cell types, without estimating the relative proportion of each cell type explicitly (Houseman et al. 2014). Finally, signature gene based methods focus on estimating the relative proportion of each constituent cell type present by using a set of marker genes, either known via experimentation, or inferred through differential gene expression comparison between purified cell type expression profiles (Newman et al. 2015, Gong & Szustakowski 2013). A signature/basis matrix is a collection of many cell-type specific or enriched gene expression markers for each cell type to be estimated from a mixture, obtained by performing differential gene expression analysis between reference profiles of purified cells.

Whilst I will focus on cell type proportion estimation using a signature/basis matrix, all deconvolution methods rely on the same underlying assumption, that gene expression measured in a tissue is a mixture sample that can be modelled as a linear combination of gene expression contributions from each constituent cell type present:

$$y_i = \sum_{k} w_{ik} h_{kj} + \epsilon \quad (5.1)$$

Where:

$y_i$ = gene expression of the $i$th gene in the mixture/tissue
$w_{ik}$ = expression of the $i$th gene in the $k$th purified cell type
$h_{kj}$ = proportion of the $k$th cell type in the $j$th sample

This problem is then posed as a matrix decomposition problem:

$$X = WH \quad (5.2)$$

Where:

$X$ = tissue gene expression matrix (e.g. Adipose tissue)
$W$ = purified cells signature-gene matrix
$H$ = cell type proportion matrix

By regressing the tissue gene expression matrix (X) onto the purified cell type signature-gene matrix (W), it is possible to estimate the relative cell-type proportions (H). The solution is then usually constrained so that the proportions are non-negative and sum to one, typically through non-negative matrix factorisation (NMF) (Gaujoux & Seoighe 2012).
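The regression described above can be illustrated with a minimal sketch. It assumes a toy signature matrix and uses ordinary least squares followed by clipping and renormalisation, which is a simplified stand-in for NMF or ν-SVR rather than the method used in this chapter. The function name and all values are hypothetical.

```python
import numpy as np

def estimate_proportions(X, W):
    """Estimate H in X = W H by least squares, then apply the constraints.

    X: genes x samples tissue expression matrix.
    W: genes x cell types signature matrix.
    Returns a cell types x samples matrix of relative proportions.
    """
    H, *_ = np.linalg.lstsq(W, X, rcond=None)   # unconstrained fit
    H = np.clip(H, 0.0, None)                   # enforce non-negativity
    return H / H.sum(axis=0, keepdims=True)     # each sample sums to one

# Toy data (illustrative only): 5 signature genes, 3 cell types, 2 mixtures.
W = np.array([[10.0, 0.0, 1.0],
              [8.0, 1.0, 0.0],
              [0.0, 9.0, 1.0],
              [1.0, 0.0, 7.0],
              [0.0, 2.0, 6.0]])
H_true = np.array([[0.7, 0.2],
                   [0.2, 0.5],
                   [0.1, 0.3]])
X = W @ H_true                                  # noiseless mixtures
H_hat = estimate_proportions(X, W)
```

With noiseless mixtures and a full-rank signature matrix the known proportions are recovered; with real data the constraints and the choice of regression method matter far more.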

5.1.2 Accuracy of cell type estimation methods

Cell type estimation accuracy is measured against either in-silico simulated mixtures of cell types or flow-cytometry measurements, where cell type proportions are known and form a gold standard benchmark to validate against. Abbas et al. (2009) demonstrated that by using cell-type signatures obtained from purified cell gene expression profiles, they could achieve both low bias and high precision in estimating constituent cell types from Peripheral Blood Mononuclear Cell (PBMC) samples (bias 2.4% ± 1.4%, precision 0.78% ± 0.52%). Abbas et al. (2009) found that cell type proportion estimation accuracy was closely related to κ, an estimate of the signature matrix's condition number. A small κ signifies a better fit and lower co-linearity, leading to a more stable decomposition. Newman et al. (2015) subsequently made improvements by proposing CIBERSORT, a method that utilises support vector regression (ν-SVR) to obtain a well-conditioned signature matrix and an optimal set of signature genes (support vectors) for each mixture/tissue (Newman et al. 2015). CIBERSORT was demonstrated to outperform other cell type estimation methods when estimating 22 leukocyte proportions from PBMCs, lung tissue and lymph nodes from follicular lymphoma patients, both on sensitivity/resolution of detection metrics, and overall accuracy (Newman et al. 2015). CIBERSORT was shown to perform particularly well compared to other methods in complex noisy tissues such as tumours, where there are many unknown cell types present (Newman et al. 2015). However, a caveat of all cell type estimation methods is that proportions are relative, not absolute, so unless all cell types of a tissue are estimated, it is difficult to compare to absolute cell counts.
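To make the role of κ concrete, the sketch below compares the 2-norm condition number of two toy signature matrices, one with well-separated cell type profiles and one in which two profiles are nearly co-linear. The matrices are invented for illustration, and the assumption that κ is the 2-norm condition number is mine.

```python
import numpy as np

# Rows are genes, columns are cell types (illustrative values only).
W_distinct = np.array([[10.0, 0.0, 0.0],
                       [0.0, 10.0, 0.0],
                       [0.0, 0.0, 10.0],
                       [1.0, 1.0, 1.0]])

# Columns 1 and 2 are almost identical, as for closely related cell types.
W_collinear = np.array([[10.0, 9.9, 0.0],
                        [0.0, 0.1, 0.0],
                        [0.0, 0.0, 10.0],
                        [1.0, 1.0, 1.0]])

# Ratio of the largest to the smallest singular value.
kappa_distinct = np.linalg.cond(W_distinct)
kappa_collinear = np.linalg.cond(W_collinear)
```

The near-duplicate columns inflate κ substantially, which is the instability that a well-conditioned signature matrix is meant to avoid.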

5.1.3 Utility of gene expression-derived cell type proportions

Cell types have been estimated using gene expression in several tissues, including whole blood, PBMCs, lung tissue and lymph nodes (Gong & Szustakowski 2013, Newman et al. 2015, Shen-Orr et al. 2010, Gaujoux & Seoighe 2012, Abbas et al. 2009). Cell type proportions in blood have been shown to vary with season and age (Dopico et al. 2015, Jaffe & Irizarry 2014). Jaffe & Irizarry (2014) demonstrated that many age-associated differentially methylated regions (DMRs) reported in several DNA methylation studies of age are confounded by t-cell heterogeneity. By correcting for the relative cell type proportions, Jaffe & Irizarry (2014) were able to detect gene expression and methylation changes with age that are not dependent on changes in cell type composition. Similarly, Houseman et al. (2015) demonstrated that associations found in a Rheumatoid Arthritis (RA) Epigenome-Wide Association Study (EWAS) were due to immune cell proportion differences. Global inflation of test statistics was drastically reduced when conditioning on cell type estimates. It is therefore important, depending on the biological question, to account for cell type proportion differences in the analysis of gene expression.

Recent work has incorporated a genetics approach to tease apart eQTLs derived from whole blood into cell-type specific eQTLs by estimating cell type proportions from expression data (Zhernakova et al. 2015, Westra et al. 2015). Westra et al. (2015) used a TWAS strategy to find gene expression traits associated with measured neutrophil proportion and then used these neutrophil marker genes to estimate neutrophil proportion in datasets that do not have complete blood count information. By fitting an interaction model between SNP and inferred neutrophil count, Westra et al. (2015) were able to detect and describe neutrophil-dependent eQTLs from whole blood. This approach is promising because it allows vast whole tissue expression resources to be used to map cell-type dependent eQTLs without the need to flow-sort, a particularly enticing strategy for complex tissues such as adipose. Here, I utilize CIBERSORT to obtain the relative proportions of four distinct cell types (adipocytes, macrophages, CD4+ t-cells and Micro-Vascular Endothelial Cells (MVEC)) from subcutaneous adipose tissue RNA-seq data obtained in 766 females from TwinsUK. I conduct extensive simulations to investigate whether the adipose signature matrix can accurately identify the purified cell types I want to estimate, the range of cell type detection possible, and whether the signature matrix is robust to varying levels of noise and unknown cell content (contamination).

By exploiting the twin structure of the data I fit structural equation models and estimate the heritability of relative adipocyte and macrophage proportion to be 18% and 25% respectively. I also recapitulate a well-known hallmark of obesity, finding a significant association between macrophage infiltration abundance and Body Mass Index (BMI), and go on to demonstrate that similar relationships hold for DXA derived visceral fat and android/gynoid ratio traits. In previous chapters I have shown BMI to have a large effect on adipose tissue gene expression. I estimate that approximately 22% of these associations are driven by differences in macrophage infiltration and inflammation amongst the study population. I fit interaction models to detect cis-eQTLs driven by cell type heterogeneity (cell-type specific eQTLs) and demonstrate that cell type heterogeneity represents an important variable that can confound TWAS and eQTL analysis and is also useful for gaining insight into complex tissue biology.

5.2 Methods

5.2.1 Purified cell type data

I estimated cell type proportions from TwinsUK RNA-sequencing data obtained from bulk tissues, as previously described. To create the adipose signature matrix, I used purified cell type RNA-seq datasets that were obtained from the Sequence Read Archive (SRA) as raw fasta files (see Table: 5.1). One independent set of experiments was used to construct the adipose tissue signature matrix, and another independent set to construct in-silico simulated mixtures to test deconvolution accuracy. Purified cell type data were aligned and quantified using the same pipeline as TwinsUK to ensure comparability.

5.2.2 GTEx RNA-seq quantification

All GTEx v6.1 RNA-seq data obtained from subcutaneous adipose tissue samples were aligned using the same pipeline as TwinsUK to ensure comparability. Samples labeled as ‘USE ME’ in the manifest file and with a library size ≥ 10 million reads were retained for analysis.

5.2.3 RNA-seq alignment and gene quantification

All aligned BAMs (adipose tissue and purified cell data) were filtered to retain only reads with a mapping quality greater than 10 that were properly paired and had two or fewer mismatches. Samples were excluded if fewer than 10 million reads mapped to known exons or if the sequence data did not correspond to actual genotype data, as previously identified through allele specific expression (ASE) (Buil et al. 2015). GENCODE annotation v19 gene counts were calculated using only protein coding genes without retained intron transcripts. All gene counts were transformed into Counts Per Million (CPM), a unit previously shown to be well suited for deconvolution tasks and to account for library size differences (Gong & Szustakowski 2013). All protein-coding genes (20,345 genes) were used for cell type estimation, as filtering lowly expressed genes could bias estimates towards highly abundant cell types. For transcriptome-wide association (TWAS) and eQTL analysis, genes with at least 0.5 CPM in 90% of samples (13,201 genes) were retained.
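The CPM transformation and the expression filter can be sketched as follows. The function names and toy counts are hypothetical, and the filter is read here as "at least 0.5 CPM in at least 90% of samples", which is my interpretation of the threshold.

```python
import numpy as np

def counts_to_cpm(counts):
    """Convert a genes x samples raw count matrix to counts per million."""
    library_sizes = counts.sum(axis=0, keepdims=True)
    return counts / library_sizes * 1e6

def expressed_genes(cpm, min_cpm=0.5, min_fraction=0.9):
    """Boolean mask of genes with >= min_cpm in >= min_fraction of samples."""
    return (cpm >= min_cpm).mean(axis=1) >= min_fraction

# Toy counts: 3 genes x 3 samples with library sizes 0.5M, 1M and 1.5M reads.
counts = np.array([[500.0, 1000.0, 1500.0],
                   [0.0, 0.0, 1.0],
                   [499500.0, 999000.0, 1498499.0]])
cpm = counts_to_cpm(counts)
keep = expressed_genes(cpm)
```

The first gene has the same CPM in every sample despite threefold differences in raw counts, which illustrates how CPM accounts for library size. The second gene is removed by the filter.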

5.2.4 Construction of an adipose signature matrix

RNA-seq profiles obtained from purified cell types and their biological replicates were assembled into a purified cell type matrix with n rows (genes) and m columns (cell types). A class file, as described in Newman et al. (2015), was also constructed to describe the pairwise comparisons to perform between cell types to produce the signature matrix. The signature matrix contains all genes differentially expressed between cell types at a specified FDR (q = 0.30, default). CIBERSORT has the additional benefit that each tissue/mixture is deconvolved with potentially different signature genes, because the algorithm implements a ν-Support Vector Regression (ν-SVR) step in which only the maximally separating support vectors are retained for the linear regression. SVR also aids in minimizing co-linearity as measured through the matrix condition number (κ), which is desirable when estimating cell types that are biologically closely related.

5.2.5 In-Silico mixture simulations

Purified cell types were combined at random proportions to generate 1000 in-silico simulated cell mixtures, termed “the ground truth” (S). A mixture matrix (M) was generated by drawing values (one for each cell type forming a mixture) from a random uniform distribution, normalizing them to sum to one, and multiplying by the purified cell matrix (C):

$$S = CM^{\top} \quad (5.3)$$

Where:

$S$ = truth (known simulated proportions)
$C$ = matrix of purified cell expression profiles
$M$ = mixture matrix specifying the amount of each cell type [0-1]
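The construction of the simulated mixtures in equation 5.3 can be sketched as below. The purified profiles in `C` are random placeholders rather than real expression data, and the dimensions and seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_cell_types, n_mixtures = 200, 4, 1000

# Placeholder purified cell expression profiles (genes x cell types).
C = rng.gamma(shape=2.0, scale=50.0, size=(n_genes, n_cell_types))

# One uniform draw per cell type per mixture, normalised to sum to one.
M = rng.uniform(size=(n_mixtures, n_cell_types))
M = M / M.sum(axis=1, keepdims=True)

# Simulated mixtures (genes x mixtures), as in S = C M^T.
S = C @ M.T
```

Each row of `M` holds the known proportions for one simulated mixture, so deconvolution estimates can later be compared against it directly.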

A natural amount of noise is introduced into this problem because the purified cell types are obtained from different laboratories, using different sequencing chemistries. This is ideal as the same problem is present for the deconvolution of the real Adipose tissue mixtures, making the simulated data more realistic. However, to make the problem more challenging and to assess the signature matrix’s limit and ability to deal with noise in mixture profiles, I added scaled randomly distributed Gaussian noise to each simulated sample from 10% to 100%:

$$y_1 = y_0 + x + y_1 S \quad (5.4)$$

Where:

$x$ = a random normal variable with $X \sim N(0, 1)$
$y_1$ = simulated in-silico mixture with added noise
$S$ = scale factor [0–1]

5.2.6 Estimating cell types from adipose RNA-seq data

CIBERSORT was used to estimate cell type proportions from adipose tissue RNA-seq samples, both from TwinsUK and GTEx (Newman et al. 2015). For signature matrix construction in CIBERSORT, I used the default FDR value of q = 0.30. Because CIBERSORT's support vector regression step selects the genes that best fit each adipose tissue mixture, it is preferable to have a lower false negative rate when detecting the initial set of signature genes. CIBERSORT also reports the condition number (κ) of the signature matrix, a measure of co-linearity and matrix stability. The signature matrix has a low condition number (κ = 3.22), suggesting that a well-conditioned matrix was achieved.

5.2.7 Heritability estimation

Heritability calculations were performed using OpenMx (Boker et al. 2011). A standard ACE model was fitted in which additive genetic, common and unique environment variance components were estimated for macrophage and adipocyte proportion between twin pairs.
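As a rough illustration of what the ACE components represent, the sketch below uses Falconer's formulas on monozygotic and dizygotic twin-pair correlations. This is a swapped-in back-of-the-envelope technique, not the structural equation model fitted in OpenMx, and the example correlations are invented.

```python
def falconer_ace(r_mz, r_dz):
    """Approximate ACE variance components from twin-pair correlations.

    r_mz: trait correlation between monozygotic twin pairs.
    r_dz: trait correlation between dizygotic twin pairs.
    Returns (a2, c2, e2): additive genetic, common environment and
    unique environment proportions of variance.
    """
    a2 = 2.0 * (r_mz - r_dz)   # additive genetic (narrow-sense heritability)
    c2 = 2.0 * r_dz - r_mz     # common (shared) environment
    e2 = 1.0 - r_mz            # unique environment and measurement error
    return a2, c2, e2

# Invented correlations, for illustration only.
a2, c2, e2 = falconer_ace(r_mz=0.5, r_dz=0.375)
```

A maximum likelihood ACE model additionally provides confidence intervals and formal model comparison, which these point estimates do not.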

5.2.8 DXA phenotype collection

Android, gynoid and visceral fat volume were measured (n = 652) at the time of biopsy using dual-energy X-ray absorptiometry (DXA; Hologic QDR 4500 plus) with the standard manufacturer’s protocol.

5.2.9 BMI TWAS

Each gene expression measurement (CPM) was tested as a dependent variable in a linear mixed effects model as previously described in detail (Glastonbury et al. 2016). Independent variables in addition to BMI and macrophage proportion included technical covariates that are well known to have strong effects on RNA-seq gene expression studies (Fixed effects: insert size mode, mean GC content, primer index) (Random effects: date of sequencing). I compared the model fit adjusted for macrophage proportion with the null model, not adjusted for macrophages, using a single degree of freedom ANOVA.

5.2.10 Weighted gene co-expression network analysis (WGCNA)

Signed weighted gene co-expression network analysis was carried out. Gene networks have been shown to follow a scale-free topology. WGCNA finds modules/clusters of highly correlated co-expressed genes using soft thresholding. The overall process has been described previously. Briefly, the gene expression matrix is represented as a weighted adjacency matrix:

$$A = [a_{ij}] \quad (5.5)$$

Where the entries $a_{ij}$ represent the connection strength between gene $i$ and gene $j$. This measure is a signed correlation and is a measure of co-expression of genes using the following power adjacency function:

$$a_{ij} = |0.5 + 0.5\rho(x_i, x_j)|^{\beta} \quad (5.6)$$

Where β = 12 was chosen by inspecting the scale free topology model criterion. This determines at what β scale free topology is satisfied. The adjacency matrix is then converted into a topological overlap matrix:

$$\mathrm{TOM}_{ij} = \frac{\sum_{u} a_{iu} a_{uj} + a_{ij}}{\min(k_i, k_j) + 1 - a_{ij}} \quad (5.7)$$

Where $k_i$ is the sum of row $i$ of the adjacency matrix.
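Equations 5.6 and 5.7 can be sketched in a few lines. This is an illustrative reimplementation rather than the WGCNA package code; setting the diagonal of the adjacency matrix to zero before computing connectivity, and the diagonal of the overlap matrix to one, are conventions I am assuming.

```python
import numpy as np

def signed_adjacency(expr, beta=12):
    """expr: samples x genes. Returns a_ij = |0.5 + 0.5 * cor(x_i, x_j)|^beta."""
    rho = np.corrcoef(expr, rowvar=False)
    return np.abs(0.5 + 0.5 * rho) ** beta

def topological_overlap(adjacency):
    """Topological overlap matrix computed from a symmetric adjacency matrix."""
    a = adjacency.copy()
    np.fill_diagonal(a, 0.0)                 # ignore self-connections
    k = a.sum(axis=1)                        # connectivity: row sums
    shared = a @ a                           # sum over u of a_iu * a_uj
    tom = (shared + a) / (np.minimum.outer(k, k) + 1.0 - a)
    np.fill_diagonal(tom, 1.0)
    return tom

# Random placeholder expression: 30 samples x 8 genes.
rng = np.random.default_rng(1)
expr = rng.normal(size=(30, 8))
A = signed_adjacency(expr, beta=12)
tom = topological_overlap(A)
dissimilarity = 1.0 - tom                    # input to hierarchical clustering
```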

I determined co-expression modules by hierarchical clustering of 1-TOM (the dissimilarity matrix) using the cutreeDynamic function in WGCNA. Modules are then represented as ‘module eigengenes’. A module eigengene is the first principal component of the module's standardised gene expression matrix. All of this analysis was carried out using R and the WGCNA package.
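A module eigengene can be sketched with a singular value decomposition of the standardised module expression. The simulated module and the sign convention (orienting the eigengene to correlate positively with average expression) are assumptions made for illustration.

```python
import numpy as np

def module_eigengene(expr_module):
    """expr_module: samples x genes for one module.

    Returns the first principal component across samples and the
    proportion of variance it explains.
    """
    z = (expr_module - expr_module.mean(axis=0)) / expr_module.std(axis=0, ddof=1)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0]
    # Orient the component so it tracks the module's average expression.
    if np.corrcoef(eigengene, z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene, s[0] ** 2 / np.sum(s ** 2)

# Simulated module: 10 genes driven by one shared factor plus noise.
rng = np.random.default_rng(2)
factor = rng.normal(size=50)
expr_module = factor[:, None] + 0.3 * rng.normal(size=(50, 10))
eigengene, var_explained = module_eigengene(expr_module)
factor_correlation = np.corrcoef(eigengene, factor)[0, 1]
```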

5.2.11 Gene-by-environment interaction modeling

Interaction models were fitted using the ‘modelLINEAR_CROSS’ model in MatrixEQTL (Shabalin 2012). To maximize the power to detect cis-eQTLs that are dependent on cell type proportion, I inferred 50 PEER factors using inverse-rank normalized gene expression residuals corrected for sequencing date, zygosity and family structure. No PEER factors were associated with any genotypes. Interaction models for relative macrophage proportion were adjusted for the following covariates: mean GC content, insert size, BMI and age. Macrophage proportion was also inverse normalized to ensure normally distributed errors.
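The form of the interaction test can be sketched with ordinary least squares on simulated data. This is a hypothetical illustration for a single SNP and gene, without the covariates, PEER factors or family structure used in the actual analysis; the effect sizes and function names are invented.

```python
import numpy as np
from statistics import NormalDist

def inverse_rank_normalize(x):
    """Map values to standard normal quantiles by rank (ties by order)."""
    ranks = np.argsort(np.argsort(x)) + 1
    quantiles = (ranks - 0.5) / len(x)
    return np.array([NormalDist().inv_cdf(q) for q in quantiles])

def fit_interaction(expr, genotype, env):
    """Fit expr ~ 1 + genotype + env + genotype:env; return (beta, t) for the interaction."""
    design = np.column_stack([np.ones(len(expr)), genotype, env, genotype * env])
    beta, *_ = np.linalg.lstsq(design, expr, rcond=None)
    resid = expr - design @ beta
    sigma2 = resid @ resid / (len(expr) - design.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(design.T @ design)))
    return beta[3], beta[3] / se[3]

# Simulated data with a true interaction effect of 0.5.
rng = np.random.default_rng(3)
n = 1000
genotype = rng.integers(0, 3, size=n).astype(float)      # 0, 1 or 2 alleles
macrophage = inverse_rank_normalize(rng.gamma(2.0, 1.0, size=n))
expr = 0.2 * genotype + 0.1 * macrophage + 0.5 * genotype * macrophage + rng.normal(size=n)
beta_gxe, t_gxe = fit_interaction(expr, genotype, macrophage)
```

The coefficient on the product term is the quantity of interest: it measures how the genotype effect on expression changes with cell type proportion.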

5.2.12 Cis-eQTL analysis

For global cis-eQTL analysis, I defined each cis-window as a 1 Mb region around the TSS of each gene. SNPs with a MAF ≥ 5% from the 1000 Genomes phase 1 imputation were analysed, and eigenMT was used to determine significant associations (Davis et al. 2016). EigenMT calculates the number of effective tests per cis-window by performing eigenvalue decomposition and taking the effective number of tests to be the number of eigenvalues needed to explain 99% of the variance. This procedure has been shown empirically to control the FDR similarly to permutations. All analysis was performed using gene expression residuals and the MatrixEQTL package (Shabalin 2012).
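The idea behind the effective number of tests can be sketched as below. It uses the plain sample correlation matrix of simulated genotypes, which is a simplification of the published eigenMT procedure, and the function name is hypothetical.

```python
import numpy as np

def effective_tests(genotypes, var_explained=0.99):
    """genotypes: samples x SNPs within one cis-window.

    Returns the number of leading eigenvalues of the SNP correlation
    matrix needed to explain the requested fraction of variance.
    """
    corr = np.corrcoef(genotypes, rowvar=False)
    eigvals = np.clip(np.linalg.eigvalsh(corr)[::-1], 0.0, None)  # descending
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(cumulative, var_explained) + 1)

# Three independent SNPs, each duplicated to mimic perfect LD.
rng = np.random.default_rng(4)
base = rng.integers(0, 3, size=(200, 3)).astype(float)
genotypes = np.hstack([base, base])
m_eff = effective_tests(genotypes)
```

Six SNPs are tested but only three carry independent information, so a Bonferroni-style correction would use three tests for this window.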

5.3 Results

5.3.1 Construction of an adipose tissue signature matrix

To construct the adipose tissue signature matrix, I utilized previously published RNA-seq datasets obtained from biological replicates of purified cells that are present in subcutaneous adipose tissue (Adipocytes, Macrophages, CD4+ T-cells and micro-vascular endothelial cells (MVEC)) (Table: 5.1).

| Cell type | Citation | SRA accession |
|---|---|---|
| White adipocyte | (Moisan et al. 2015) | SRR1296133, SRR1296134, SRR1296135 |
| T-cell (CD4+) | (Weinstein et al. 2014) | SRR1422906, SRR1422907, SRR1422908, SRR1422909 |
| MVEC | (DiMaio et al. 2016) | SRR2776477, SRR2776478, SRR2776479 |
| M1/M2 macrophages | (Zhang et al. 2015) | SRR2910670, SRR2910671, SRR2939145, SRR2939146, SRR2939148, SRR2939149, SRR2939150, SRR2939151, SRR2939152 |

Table 5.1: Purified cell RNA-seq data from SRA used to construct the adipose tissue signature matrix

To test for differential gene expression amongst purified cells and so identify the genes that make up the adipose tissue signature/basis matrix, I used CIBERSORT, a ν-support vector regression (ν-SVR) method that is capable of both identifying cell-type specific marker genes and performing gene expression deconvolution in solid tissues (Newman et al. 2015). CIBERSORT provides a deconvolution P-value per sample, calculated from 1000 bootstrapped permutations (Newman et al. 2015). All adipose RNA-seq samples were successfully deconvolved at an FDR of 1%. Hierarchical clustering of the purified cell types used to construct the adipose signature matrix formed two distinct clusters, recapitulating biological relationships such as the separation of non-immune (adipocytes/MVEC) and immune cells (M1/M2 macrophages, CD4+ t-cells) (Figure: 5.1).

[Dendrogram: "Reference matrix clustering". Leaves: Adipocytes, HMVEC, CD4, M1 and M2 reference samples; height axis 0 to 300.]

Figure 5.1: Hierarchical clustering of reference cells that are used to produce the signature matrix. Coloured by unsupervised k-means clustering (where k = 5). Biological hierarchy recapitulated: Non-immune fractions (Adipocytes, MVEC) and immune cell fractions (Macrophage and CD4+ t-cells) cluster separately.

Several top genes identified in the adipose tissue signature matrix are well known cell-type specific markers used as gold standards in Adipocytes, Macrophages, endothelial cells and t-cells. The genes named are ADIPOQ, SCD, CTSC, SPP1, F13A1, VWF, SERPINE1, MMP1, COL1A1, TCF7, FOS and CD3.

5.3.2 Benchmarks and simulations to assess signature matrix accuracy

To test deconvolution ability, accuracy and performance when noise is introduced, I performed a benchmark and several simulations. First, I show that the adipose tissue signature matrix can accurately identify the four cell types when it is applied to a set of independent, purified cell type RNA-seq datasets (Table: 5.2). All cell types were estimated robustly, with three out of four cell types attaining ≥ 99% accuracy in prediction. Pure Macrophages were estimated at 93.8%. Macrophages are particularly difficult to purify, so it is possible that the 6% CD4+ t-cells I estimate are present in the original purified Macrophage sample.

| Cell type | Adipocytes (%) | CD4+ t-cell (%) | MVEC (%) | M1 Macrophage (%) | M2 Macrophage (%) |
|---|---|---|---|---|---|
| Adipocytes | 99.1 | 0.4 | 0 | 0.2 | 0.4 |
| CD4+ t-cell | 0 | 100 | 0 | 0 | 0 |
| MVEC | 0.9 | 0 | 99.1 | 0 | 0 |
| Macrophage | 0 | 6.2 | 0 | 0.1 | 92.8 |

Table 5.2: Cell type (%) estimates when applying the adipose tissue signature matrix to four independent samples of purified cells. Top row represents cells present in adipose tissue signature matrix. Left most column represents independent set of purified cell RNA-seq profiles.

I next assessed how well I could estimate the constituent cell proportions of a mixture of known cell types. I created 1000 in-silico mixtures of known proportions of each of the four cell types, and estimated them with CIBERSORT. Estimates for these four cell types were highly accurate, with the mean absolute deviation (mAD) of estimated proportions from ground truth values ranging from 0.02 to 0.06 (Figure: 5.2). I performed an additional simulation, assuming adipocytes are the dominant cell type [0.75-1] and all other cells make up between [0-0.25] of each mixture. Again, cell estimates matched ground truth proportions extremely well. This additional simulation highlighted that the signature matrix's sensitivity to detect MVEC is limited to mixtures where MVEC presence is ≥ 2.5% (Figure: 5.3).
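The two accuracy summaries used for these simulations, the mean absolute deviation and the correlation with ground truth, can be computed per cell type as in the sketch below. The proportions are invented and the function names are hypothetical.

```python
import numpy as np

def mean_absolute_deviation(estimated, truth):
    """Per cell type mAD; inputs are mixtures x cell types."""
    return np.mean(np.abs(estimated - truth), axis=0)

def pearson_per_cell_type(estimated, truth):
    """Per cell type Pearson correlation between estimates and truth."""
    return np.array([np.corrcoef(estimated[:, j], truth[:, j])[0, 1]
                     for j in range(truth.shape[1])])

# Invented proportions for three mixtures of two cell types.
truth = np.array([[0.5, 0.5],
                  [0.2, 0.8],
                  [0.9, 0.1]])
error = np.array([[0.02, -0.02],
                  [-0.02, 0.02],
                  [0.02, -0.02]])
estimated = truth + error
mad = mean_absolute_deviation(estimated, truth)
r = pearson_per_cell_type(estimated, truth)
```

The mAD captures systematic over- or underestimation that a high correlation alone can hide, which is why both are reported.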

[Figure 5.2 panels: ground truth proportion (y-axis, 0 to 1) against estimated proportion (x-axis, 0 to 1) for Adipocytes, MVEC, Macrophages and CD4+ T-cells.]

Figure 5.2: Proportion of estimated cell types using the adipose signature matrix from in-silico mixture simulations.

[Figure 5.3 panels: ground truth proportion (y-axis) against estimated proportion (x-axis) for Adipocytes (0.75 to 1.00) and for MVEC, Macrophages and CD4+ T-cells (0.00 to 0.25).]

Figure 5.3: Estimated cell types using ranges comparable to those estimated from the TwinsUK Adipose tissue samples. All estimated cell types are highly correlated with ground truth simulations (R = [0.988-0.995]) and have small mean absolute differences (mAD) [0.003-0.01]. MVEC range of detection ≥ 2.5%.

In adipose tissue, there are many more cell types present than I am estimating, due to either unavailable purified RNA-seq datasets for a given cell type, or a lack of replicates to ensure stable construction of the signature matrix. I reasoned that if a substantial amount of an unaccounted-for cell type that shares marker genes with any of the four cell types I am estimating is present in adipose tissue, I would overestimate that cell type. To test this, I added proportions of smooth muscle cells, dendritic cells and neutrophils to the mixtures. These cells have been shown to be present in adipose tissue and therefore reflect realistic ‘contaminant cells’. Neutrophils in particular could be troublesome, as they make up 60-70% of white blood cells in whole blood and would therefore be introduced by any blood contamination of the TwinsUK adipose biopsies. I retain accuracy in estimating the cell types present in the signature matrix when up to 10% of a given sample is composed of other cell types I am not estimating (Figure: 5.4). As it is likely that the content of unknown cells in the samples is ≤ 5% given previous cell type estimates from adipose tissue, the adipose tissue signature matrix is robust in estimating cell types from mixtures with some unknown content.

Figure 5.4: In-silico cell type estimates with unknown content added. Additional cell types known to be present in adipose tissue (Smooth muscle, Neutrophils, Dendritic Cells) but that are not estimated by the adipose signature matrix, were included in simulated adipose tissue mixtures (Adipocytes, CD4+, MVEC, Macrophages) to assess estimation accuracy with varying amounts of unknown content. The adipose tissue signature matrix is robust to unknown cell types, with cell estimates maintaining a highly linear relationship with ground truth data. An unlikely scenario of 50% unaccounted-for mixture content resulted in systematic overestimation, yet a linear relationship was still maintained.

Given that RNA-seq experiments are noisy due to technical factors during library preparation and sequencing, I tested how much noise I could introduce into the simulations and still accurately predict cell type proportions, similar to the analysis performed in Newman et al. (2015). I added Gaussian noise at 10, 50 and 90%. The estimates are robust when up to 10% of the mixture is distorted with noise, and a linear relationship between ground truth and predicted estimates still holds when large amounts of noise are introduced (Figure: 5.5).
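A rough sketch of such a noise experiment follows. The noise model is an assumption (zero-mean Gaussian noise whose standard deviation is a fraction of each gene's mean expression), since the exact scaling is not specified here, and NNLS again stands in for CIBERSORT. All data are simulated.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
signature = rng.gamma(2.0, 50.0, size=(200, 4))   # hypothetical genes x cell types
truth = rng.dirichlet(np.ones(4), size=50)        # 50 simulated mixtures
mixtures = truth @ signature.T                    # samples x genes

def truth_vs_estimate_r(noise_level):
    """Pearson r between true and estimated proportions of the first cell type."""
    # Assumed noise model: SD equals noise_level times the mean of each gene.
    sd = noise_level * mixtures.mean(axis=0)
    noise = rng.normal(0.0, 1.0, mixtures.shape) * sd
    noisy = np.clip(mixtures + noise, 0.0, None)  # expression cannot be negative
    est = np.array([nnls(signature, m)[0] for m in noisy])
    est /= est.sum(axis=1, keepdims=True)
    return np.corrcoef(truth[:, 0], est[:, 0])[0, 1]

for level in (0.1, 0.5, 0.9):
    print(level, round(truth_vs_estimate_r(level), 3))
```

Comparing the correlation across noise levels, rather than the raw estimates, mirrors the way the figures report a linear relationship between ground truth and prediction.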

[Figure 5.5 panels: truth against estimated proportion for adipocytes, MVEC, macrophages and CD4+ cells (rows) at 10%, 50% and 90% noise (columns); both axes run from 0 to 1.]

Figure 5.5: Cell type estimates from in-silico simulations with added scaled Gaussian noise (10, 50 and 90%, respectively).

5.3.3 Estimation of relative cell type proportions using RNA-seq from primary subcutaneous adipose tissue biopsies

I utilized the previously published dataset of 766 subcutaneous adipose tissue biopsies obtained from females within TwinsUK. All 766 samples had RNA-seq data, of which 720 were also genotyped and imputed to 1000 Genomes Phase 1 (see chapter 2). I quantified the abundance of protein-coding genes (at the gene level) using GENCODE v19, as the library was polyA+ enriched. I used CIBERSORT with the adipose tissue signature matrix to estimate relative cell type proportions in all 766 individuals. Of the four cell types estimated, adipocytes are the dominant cell type in the adipose tissue data (µ = 0.95, range [0.73 - 0.99]), followed by macrophages (M1/M2 combined) (µ = 0.03, range [0.004 - 0.22]), MVEC (µ = 0.01, range [0 - 0.19]) and CD4+ T cells (µ = 0.002, range [0 - 0.11]). Figures: 5.6 and 5.7 show adipocyte and macrophage estimates in all individuals with expression data; these are the two cell types I will focus on for the rest of the chapter.

Figure 5.6: Estimated adipocyte proportion in TwinsUK (histogram of counts, n = 718).

Figure 5.7: Estimated macrophage proportion in TwinsUK (histogram of counts, n = 734).

These estimates are within the range of previously published studies that were performed using flow cytometry, summarised in Table 5.3. This suggests the estimates are accurate and that other cell types present in adipose tissue (fibroblasts, MSCs, neutrophils, CD8+ T cells) make up a small relative proportion of the overall cellular heterogeneity in adipose tissue.

Study                       Cell type          Proportion (%)  Sample size  Sex  Age    Correlated with adiposity?
Travers et al. (2015)       CD4+               3-4.7           17           M    35-55  n.s.
                            CD8+               0.5-5.7         -            -    -      n.s.
                            Macrophages        2.9-15.5        -            -    -      P < 0.05 (+)
Zimmerlin et al. (2010)     Endothelial cells  15.4 ± 4.8      8            F    -      not tested
Van Harmelen et al. (2003)  Adipocytes         85              49           M/F  16-73  P < 0.05 (+)

Table 5.3: Estimates from several studies that flow sorted subcutaneous adipose tissue to measure cell proportions and their relationship with adiposity.

5.3.4 Cell type proportions are heritable, explaining major components of gene expression variance

As the simulations show that MVEC estimates below 2.5% are unreliable (estimated as zero) (Figure: 5.3) and 97% of samples have CD4+ T cell estimates below 1%, I chose to focus on adipocytes and macrophages, retaining samples whose estimates fall within 2 standard deviations of the mean (total samples: n_Macrophages = 734, n_Adipocytes = 718). By exploiting the twin structure of the data, I fitted structural equation ACE models (SEM) and estimated the narrow-sense heritability (h^2) of adipocyte and macrophage variability to be 0.18 and 0.25, respectively. Previous studies have observed a wide range of heritabilities for cell types and their abundance in whole blood (monocyte h^2 = [0 - 0.23]; regulatory T cells (T-regs) h^2 = 0) (Brodin et al. 2015). Brodin et al. (2015) found that most blood cell types have small heritability estimates and that 58% of all cell types in blood had ≤ 20% of their total variance explained by additive genetic effects. These results seem consistent with what I observe in adipose tissue.
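For intuition about how twin data yield a heritability estimate, the sketch below uses Falconer's formula, h^2 = 2(r_MZ - r_DZ). This is a simpler, swapped-in approximation and not the structural equation ACE model fitted in this chapter; the twin-pair values and the correlations used to simulate them are hypothetical.

```python
import numpy as np

def falconer_h2(mz_pairs, dz_pairs):
    """Falconer's estimate h2 = 2 * (r_MZ - r_DZ) from twin-pair trait values."""
    mz = np.asarray(mz_pairs, dtype=float)
    dz = np.asarray(dz_pairs, dtype=float)
    r_mz = np.corrcoef(mz[:, 0], mz[:, 1])[0, 1]
    r_dz = np.corrcoef(dz[:, 0], dz[:, 1])[0, 1]
    return 2.0 * (r_mz - r_dz)

rng = np.random.default_rng(2)

def simulate_pairs(n_pairs, r):
    """Simulate standardised trait values for twin pairs with correlation r."""
    cov = [[1.0, r], [r, 1.0]]
    return rng.multivariate_normal([0.0, 0.0], cov, size=n_pairs)

# Hypothetical pair correlations chosen so that the expected h2 is 0.25.
mz = simulate_pairs(20000, 0.400)
dz = simulate_pairs(20000, 0.275)
print(round(falconer_h2(mz, dz), 2))
```

An ACE model estimates the same additive component while also partitioning shared and unique environmental variance and providing confidence intervals, which Falconer's formula does not.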

Principal component analysis (PCA) is a commonly used method to identify major sources of gene expression variance. In TwinsUK adipose tissue samples, PC2 is strongly correlated with adipocyte proportion and macrophage proportion (R = 0.46, P-value ≤ 2.2 × 10^-16; R = -0.59, P-value = 2.2 × 10^-16, respectively), with PC2 explaining 12.6% of adipose tissue gene expression variance (Figure: 5.8). This confirms that cell type heterogeneity at the population level contributes substantially to gene expression variation in adipose tissue, and that adjusting for principal components should account for this variability.
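The relationship between a principal component and a cell type proportion can be sketched as follows. The expression matrix is simulated so that a subset of genes scales with a hypothetical macrophage proportion; in this toy example the signal lands on PC1 rather than PC2, because no other large source of variance is simulated.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
n_samples, n_genes = 300, 500

# Hypothetical data: the first 100 genes scale with macrophage proportion.
macrophage = rng.beta(2.0, 50.0, size=n_samples)
loadings = np.zeros(n_genes)
loadings[:100] = rng.normal(0.0, 30.0, size=100)
noise = rng.normal(0.0, 1.0, size=(n_samples, n_genes))
expression = np.outer(macrophage, loadings) + noise

# PCA via singular value decomposition of the gene-centred matrix.
centred = expression - expression.mean(axis=0)
u, s, _ = np.linalg.svd(centred, full_matrices=False)
pcs = u * s                                # sample scores on each component
var_explained = s**2 / np.sum(s**2)

r, p = pearsonr(pcs[:, 0], macrophage)
print(f"PC1 explains {var_explained[0]:.1%} of variance; r = {r:.2f}, P = {p:.2g}")
```

The sign of a principal component is arbitrary, so only the magnitude of the correlation is meaningful.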

[Figure 5.8 panel: 'Macrophage proportion vs PC2'; x-axis: macrophage proportion (0.02 to 0.08); y-axis: TwinsUK adipose PC2 (-200 to 100).]

Figure 5.8: Adipose tissue RNA-seq PC2 captures macrophage proportion heterogeneity amongst samples.

Genes are not independent units, and often work in pathways and networks. Co-expression analysis methods try to functionally group co-expressed genes to understand global gene expression patterns in terms of networks. Weighted Gene Co-Expression Network Analysis (WGCNA) is a widely used technique that uses the correlation structure of global gene expression profiles to construct functionally distinct modules. The previously mentioned Macrophage Enriched Metabolic Network (MEMN) was identified using WGCNA, and characterises the inflammatory processes that take place in adipose tissue (Emilsson et al. 2008, Chen et al. 2008). I performed a WGCNA to see whether I could recapitulate the MEMN, and whether it was associated with the estimated macrophage proportions. WGCNA revealed several modules that correlate strongly with macrophage estimates, including the MEMN (Figure: 5.9b). The most significant module, representing the MEMN, was the green module, a co-expression network whose genes are enriched for glycoproteins (P-value = 7.1 × 10^-63), immunity (P-value = 1.1 × 10^-23) and the innate immune response (P-value = 4.5 × 10^-12). This module is positively associated with estimated macrophage proportion (Pearson’s R = 0.67, P-value = 9.86 × 10^-97) (Figure: 5.9a). These findings reinforce the idea that macrophage infiltration has a very large effect on adipose tissue gene expression profiles and that cell type composition is a major driver of detected co-expression networks in bulk tissue RNA-seq samples.

[Figure 5.9 panels: (a) 'Macrophage proportion vs WGCNA inflammation enriched module'; x-axis: inflammation module (-0.1 to 0.1); y-axis: macrophage proportion (0.00 to 0.08). (b) 'Module-trait relationships', correlation (P-value) per module: MElightcyan -0.33 (1e-20); MEdarkgrey -0.25 (1e-12); MEorange -0.29 (2e-16); MEblue 0.22 (4e-10); MEdarkorange 0.046 (0.2); MEtan 0.058 (0.1); MEdarkturquoise 0.17 (4e-06); MEpurple 0.36 (3e-24); MEgreen 0.67 (9e-101); MElightgreen 0.48 (1e-44); MEbrown 0.14 (1e-04); MEblack 0.43 (2e-35); MEdarkgreen 0.26 (9e-13).]

Figure 5.9: (a) Macrophage proportion is highly correlated with the ‘green’ WGCNA co-expression module, which is enriched for immune response processes. (b) Correlation of macrophage proportion with all WGCNA co-expression modules in TwinsUK adipose tissue.

5.3.5 Macrophage estimates are associated to whole-body obesity traits but not age.

Macrophage infiltration and abundance in adipose tissue are known to correlate broadly with obesity and levels of inflammation. I recapitulate this finding, demonstrating a significant correlation between Body Mass Index (BMI) and the estimated macrophage proportions (R = 0.24, P-value = 2.20 × 10^-11), comparable to other published work using flow cytometry (Travers et al. 2015). A correlation analysis of age against macrophage and adipocyte proportion revealed no significant association (r = -0.01, P-value = 0.71), in contrast to the correlation between age and T cell proportion that has been observed in whole blood (Jaffe & Irizarry 2014). A subset of twins (n = 652) also had dual-energy X-ray absorptiometry (DXA) measures of fat stores available, such as visceral fat volume (VFAT), android/gynoid (A/G) ratio and subcutaneous adipose tissue (SAT) volume. Despite the drop in sample size, the correlations of android/gynoid ratio and visceral fat with relative macrophage estimates were higher than that observed for BMI (Pearson’s R_A/G ratio = 0.40, R_VFAT = 0.31, R_SAT = 0.25). This confirms the importance of macrophage biology in obesity and also suggests that inflammation plays a more prominent role in body fat distribution. These associations also validate the ability to estimate cell type proportions from adipose tissue to gain biological insights.

5.3.6 Adjusting for Macrophage heterogeneity accounts for 22% of all BMI TWAS associations

It has been previously established that BMI has a profound effect on adipose tissue gene expression regulation, with the majority of the adipose transcriptome associated with BMI in multiple Transcriptome-Wide Association Studies (TWAS) conducted using both microarrays and RNA-seq in independent populations (Emilsson et al. 2008, Glastonbury et al. 2016). Because inflammation is a major component of obesity, I sought to assess how BMI TWAS associations change after adjusting for macrophage proportion heterogeneity. I fitted two TWAS models, one adjusting and one not adjusting for macrophage proportion amongst samples, to assess the impact BMI has on the protein-coding transcriptome. I recapitulate previous work, with 6,032/13,174 protein-coding genes significantly associated with BMI (Bonferroni-corrected P-value threshold: 3.8 × 10^-6). Adjusting for macrophage proportion resulted in 22% of associations no longer being significant, a loss of 1,309 gene expression-BMI associations (Figure: 5.10) (Appendix table: A.7). This demonstrates that whilst inflammation is an important aspect of obesity etiology, the majority (78%) of BMI TWAS associations are likely to be independent of macrophage proportion, but could still be dependent on activation state (e.g. resting vs activated). One example of the 1,309 genes that are no longer significant after adjusting for macrophage proportion is sialic acid-binding immunoglobulin-type lectin 1 (SIGLEC1; P-value_orig = 5.29 × 10^-8, P-value_adj = 0.19), a gene that encodes a protein found primarily on the surface of macrophages. Genes no longer significant also include C1QA, ITGAM and CD14, which are known to be primarily expressed in macrophage and immune cell lineages.
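The comparison of adjusted and unadjusted TWAS models can be sketched with ordinary least squares on simulated data. The two genes below are hypothetical: one is driven only by macrophage proportion (a cell type marker), the other by BMI itself. The covariates used in the actual analysis (technical factors, twin structure) are omitted for brevity.

```python
import numpy as np
from scipy import stats

def ols_pvalue(y, design, column):
    """Two-sided P-value for one coefficient of an ordinary least squares fit."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = design.shape[0] - design.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(sigma2 * np.linalg.inv(design.T @ design)[column, column])
    return 2.0 * stats.t.sf(abs(beta[column] / se), dof)

rng = np.random.default_rng(4)
n = 700
macrophage = rng.normal(size=n)                     # standardised proportion
bmi = 0.5 * macrophage + rng.normal(size=n)         # BMI tracks macrophage content
marker_gene = 1.0 * macrophage + rng.normal(size=n) # composition-driven gene
direct_gene = 0.5 * bmi + rng.normal(size=n)        # gene driven by BMI itself

threshold = 0.05 / 13174                            # Bonferroni, about 3.8e-6

ones = np.ones(n)
naive = np.column_stack([ones, bmi])                # expression ~ BMI
adjusted = np.column_stack([ones, bmi, macrophage]) # expression ~ BMI + macrophage
for name, gene in [("marker", marker_gene), ("direct", direct_gene)]:
    significant_naive = ols_pvalue(gene, naive, 1) < threshold
    significant_adj = ols_pvalue(gene, adjusted, 1) < threshold
    print(name, significant_naive, significant_adj)
```

A gene such as SIGLEC1 would be expected to behave like the hypothetical marker gene here: associated with BMI only through its dependence on macrophage content.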

[Figure 5.10 panel: 'BMI TWAS adjustment'; x-axis: unadjusted BMI TWAS; y-axis: macrophage-adjusted BMI TWAS; both axes run from 0 to 100.]

Figure 5.10: Dotted lines – Bonferroni correction threshold (P-value = 4.2 × 10^-7). Blue – significant only in the unadjusted TWAS; red – not significant in either TWAS; teal – significant only in the macrophage-adjusted TWAS; green – significant in both TWAS.

5.3.7 GTEx adipose tissue gene expression strongly influenced by ischemic time

I sought to compare the primary subcutaneous adipose tissue biopsy cell type estimates with those obtained from cadaveric tissue as part of the Genotype-Tissue Expression consortium (GTEx). I remapped GTEx subcutaneous adipose tissue RNA-seq samples (Freeze v6; n = 310 after QC, see methods) and quantified gene expression using the same pipeline as for TwinsUK to ensure comparability (see methods). Ischemic time has previously been shown to be one of the most important latent variables for GTEx expression data, and it correlates with the leading principal components of the adipose expression matrix (PC1: R = -0.40, P-value = 1.34 × 10^-13; PC2: R = -0.27, P-value = 1.94 × 10^-6; PC3: R = -0.34, P-value = 9.10 × 10^-10) (Figure: 5.11). I therefore tested whether significant fibrosis driven by prolonged ischaemia could be detected by observing a change in cell type composition, and therefore in the underlying biology of the tissues collected.

[Figure 5.11 panels: 'Ischaemic time vs GTEx expression PC1' (x-axis: ischaemia time; y-axis: PC1) and 'Ischemic time distribution' (x-axis: sample ischaemia; y-axis: frequency).]

Figure 5.11: Majority of Subcutaneous adipose tissue gene expression variance in GTEx dataset can be explained by ischaemic time (mins) variation.

I used CIBERSORT to provide bootstrapped estimates of cell type proportions in GTEx adipose tissue RNA-seq samples. All GTEx samples that pass deconvolution have a lower estimated adipocyte fraction (median adipocyte proportion = 0.67) than TwinsUK (median proportion = 0.96). However, samples that have a high probability of being well deconvoluted (defined as r ≥ 0.50, mean ± 2 SD, n = 70) have a higher mean adipocyte estimate (0.82), closer to the TwinsUK mean adipocyte proportion, with a similar right-shifted distribution (Figure: 5.12). This likely reflects the difference between punch biopsies obtained from live individuals (TwinsUK) and fibrotic cadaveric tissue biopsies (GTEx). Of the 310 GTEx samples, 19% failed deconvolution (1% FDR), suggesting substantial differences between the cell types present in the tissue and those being estimated. Finally, in the samples that pass deconvolution (n = 249), I observe a correlation between ischaemic time and endothelial proportion (R = 0.19, P-value = 2.8 × 10^-3), suggesting that hypoxia and fibrosis are the likely cause of these differences between primary and cadaveric tissue profiles.

Figure 5.12: Distribution of relative adipocyte estimates in TwinsUK and GTEx.

To validate the difference I observe between TwinsUK and GTEx adipocyte estimates, I focused on the expression of adiponectin (ADIPOQ), a highly expressed and specific expression marker of adipocytes. In TwinsUK adipose tissue biopsies, ADIPOQ is expressed 4-fold higher (median CPM = 5,576, expression rank = 15) than in GTEx (median CPM = 1,381, expression rank = 41). ADIPOQ expression is low in some GTEx samples, which suggests that few viable adipocytes remain (GTEx ADIPOQ_min = 47.2 CPM; TwinsUK ADIPOQ_min = 1,455 CPM) (Figure: 5.13).

[Figure 5.13 panel: 'ADIPOQ expression across TwinsUK and GTEx'; density curves for GTEx and TwinsUK samples; x-axis: ADIPOQ expression (CPM), 0 to 9000; y-axis: density.]

Figure 5.13: Distribution of ADIPOQ expression (CPM) in TwinsUK and GTEx samples.

There are several possible explanations for why cell type estimates differ between TwinsUK and GTEx. All GTEx adipose samples were obtained from the lower left leg of cadaveric donors, and it is known that there are cell type differences between fat depots throughout the human body. In addition, fibrosis due to ischemia is likely to alter the number of viable cells available for sequencing. All GTEx samples were fixed and inspected by trained clinicians, with many samples being reported as containing large fibrotic regions (up to 50%) and nerve tissue. I expect that cell type composition estimation using additional cell types (additional blood cell types, fibroblasts, neuronal cells) may improve these estimates, but this analysis demonstrates the usefulness of cell type composition estimation for inferring important technical covariates that may be unknown to the experimenter, and for potentially identifying samples with purity issues.

5.3.8 Correcting for macrophage heterogeneity in adipose tissue increases cis-eQTL discovery yield

It is possible that eQTL analysis can be confounded by cell type heterogeneity, because cell type composition varies between samples and gene expression from individual cells approximates a linear relationship with the cell's abundance. To assess this hypothesis, I tested whether correcting for macrophage proportion alters cis-eQTL discovery yield. I ran two separate eQTL analyses, with and without correcting for macrophage proportion. First, I obtained gene expression residuals with all RNA-seq technical covariates and twin structure regressed out (see methods chapter). I then fitted these models using Matrix eQTL, assessing all 1000G Phase 1 imputed common variants (MAF ≥ 5%) within a given 1 Mb cis-window. To account for multiple testing, an inherent problem in eQTL-based analyses, cis-eQTL P-values were corrected using eigenMT, which has been shown to approximate non-parametric permutation methods (Davis et al. 2016). Adjusting for macrophage heterogeneity amongst samples leads to a modest increase in cis-eQTL yield (2.3%) (naive model = 5,531, macrophage-adjusted = 5,665), confirming the utility of estimating cell types from gene expression data.
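Why adjusting for a cell type proportion can raise eQTL yield is shown in the sketch below: removing expression variance explained by the covariate shrinks the residual variance against which the genotype effect is tested. The approach (residualising both expression and genotype on the covariates, then testing their correlation) is a generic illustration and is not claimed to reproduce the Matrix eQTL or eigenMT implementations; all data and effect sizes are simulated.

```python
import numpy as np
from scipy import stats

def residualise(values, covariates):
    """Remove the least-squares fit of covariates (plus intercept) from values."""
    design = np.column_stack([np.ones(len(values)), covariates])
    beta, *_ = np.linalg.lstsq(design, values, rcond=None)
    return values - design @ beta

def eqtl_pvalue(expression, genotype, covariates):
    """P-value for a SNP-expression association after covariate adjustment."""
    e = residualise(expression, covariates)
    g = residualise(genotype, covariates)
    r = np.corrcoef(e, g)[0, 1]
    dof = len(e) - covariates.shape[1] - 2   # intercept, SNP and covariates
    t = r * np.sqrt(dof / (1.0 - r**2))
    return 2.0 * stats.t.sf(abs(t), dof)

rng = np.random.default_rng(5)
n = 700
genotype = rng.binomial(2, 0.3, size=n).astype(float)  # allele dosage 0/1/2
macrophage = rng.normal(size=n)                        # hypothetical covariate
expression = 0.5 * genotype + 1.5 * macrophage + rng.normal(size=n)

no_covariates = np.empty((n, 0))
print("naive   :", eqtl_pvalue(expression, genotype, no_covariates))
print("adjusted:", eqtl_pvalue(expression, genotype, macrophage[:, None]))
```

In a genome-wide scan, the same gain in precision across many genes would move some borderline associations past the multiple-testing threshold.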

It has become standard practice in eQTL studies to use gene expression principal components, PEER factors, or other factor-analysis based methods to estimate and adjust out confounding factors from gene expression data. To test whether I can account for cell type proportions using latent factors, I utilised PEER. By adjusting for 50 PEER factors in both the naive model and the macrophage-adjusted model, I achieve a 39% and 36% increase in cis-eQTL yield, respectively (naive = 7,665, macrophage-adjusted = 7,664). This confirms that latent factors estimated from gene expression can capture the effect of cell type composition differences amongst samples, as well as many other unmeasured latent factors. It is clear from this analysis that PEER captures cell type heterogeneity and that PCA/PEER correction is sufficient to account for these differences.

5.3.9 CECR1 - rs1807517 is a macrophage proportion-dependent cis-eQTL in adipose tissue

I have shown that cell type heterogeneity within a population is variable and heritable (macrophage proportion h^2 = 0.25), and that accounting for it can increase cis-eQTL discovery. As some cell types are more abundant in some individuals, power to detect eQTLs active in specific cell types increases as a function of that cell's relative proportion. Additionally, I have shown previously that obesity can interact with adipose tissue gene expression to create BMI-dependent eQTLs (G × BMI eQTLs) (Glastonbury et al. 2016) (chapter 4). A potential explanation for some of these G × BMI eQTLs is that some genetic variants can predispose their carrier to, or protect them from, inflammation and other obesity-associated co-morbidities that are known to be heterogeneous amongst the obese population. Alternatively, cell type differences in adipose tissue amongst individuals with different BMI could drive the detection of cis-eQTLs specific to one cell type.

Previously published work has shown that it is possible to detect cell-type specific eQTLs in whole blood without cell sorting, by using cell type quantification methods or a proxy phenotype for a cell type of interest (Westra et al. 2015). I utilised estimates of relative cell type proportion in interaction models to see whether I could detect cell-type dependent eQTLs in bulk adipose tissue samples. I restricted the analysis to common genetic variants in cis (5% MAF) and used PEER to account for latent unmeasured factors in the gene expression data (see chapter 2). I found one macrophage-dependent cis-eQTL in adipose tissue that is Bonferroni significant (P-value threshold = 1.1 × 10^-9, based on 44,967,459 association tests). The gene involved, the adenosine deaminase CECR1/ADA2 (CECR1-rs1807517, P-value = 7.8 × 10^-13), is expressed primarily in monocytes/macrophages and is thought to promote the differentiation of monocytes into tissue-resident macrophages. In the data, CECR1 expression is dependent on macrophage proportion, with CECR1 expression increasing with the dosage of rs1807517's C allele [0,1,2] (m0 = 0.33, m1 = 0.40, m2 = 0.51) (Figure: 5.14).
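An interaction model of this kind can be sketched as a linear regression with a genotype × macrophage term. The data below are simulated, with per-genotype slopes loosely inspired by the values quoted above (on the assumption that m0, m1 and m2 denote per-genotype slopes); the PEER factors and other covariates of the real analysis are omitted.

```python
import numpy as np
from scipy import stats

n_tests = 44_967_459
bonferroni = 0.05 / n_tests            # about 1.1e-9

rng = np.random.default_rng(6)
n = 700
dosage = rng.binomial(2, 0.4, size=n).astype(float)   # copies of the effect allele
macrophage = rng.normal(size=n)                       # inverse-normalised proportion
# Simulated expression whose slope on macrophage proportion grows with dosage.
expression = (0.33 + 0.09 * dosage) * macrophage + rng.normal(scale=0.5, size=n)

# Model: expression ~ dosage + macrophage + dosage:macrophage
design = np.column_stack([np.ones(n), dosage, macrophage, dosage * macrophage])
beta, *_ = np.linalg.lstsq(design, expression, rcond=None)
resid = expression - design @ beta
dof = n - design.shape[1]
cov = (resid @ resid / dof) * np.linalg.inv(design.T @ design)
t_int = beta[3] / np.sqrt(cov[3, 3])
p_int = 2.0 * stats.t.sf(abs(t_int), dof)
print(f"interaction beta = {beta[3]:.3f}, P = {p_int:.2g}, "
      f"passes Bonferroni = {p_int < bonferroni}")

# Slope of expression on macrophage proportion within each genotype class.
for g in (0, 1, 2):
    keep = dosage == g
    slope = np.polyfit(macrophage[keep], expression[keep], 1)[0]
    print(g, round(slope, 2))
```

The interaction coefficient measures how much the macrophage slope changes per additional allele, so a modest per-allele change needs a large sample to clear a threshold as strict as the Bonferroni level used here.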

Figure 5.14: Inverse normalized macrophage proportion vs CECR1 PEER expression residuals.

rs1807517 is in tight LD with 3 intronic SNPs (rs1807518, rs2013910, rs5992639) and one 5’-UTR SNP (rs17807317). All of these SNPs have been previously shown to be multi-tissue eQTLs, fall within promoter and enhancer histone peaks of multiple tissues and are in elements that have been observed to harbor protein binding motifs (JUND, TBP, P300 ) (Haploreg v4, accessed: 18/08/2016).

I replicate this adipose macrophage eQTL finding in a context-specific monocyte eQTL dataset (Fairfax et al. 2014). In monocytes, CECR1-rs17807317 is a strong eQTL when challenged with interferon gamma (IFN-γ) (t-statistic = 8.97; P-value = 2.44 × 10^-17), but much weaker in LPS2 (t-statistic = 3.48; P-value = 5.9 × 10^-4) and LPS24 (t-statistic = 5.71; P-value = 2.8 × 10^-8) challenged monocytes, or in their naïve state (t-statistic = 4.59; P-value = 6.12 × 10^-6). This provides evidence that CECR1-rs17807317 is more active in an inflammatory state. CECR1 expression is also positively correlated with BMI (R = 0.34, P-value ≤ 2.2 × 10^-16) in all TwinsUK individuals, but its eQTL is only very weakly dependent on BMI, as my previous work demonstrates (CECR1-rs17807317, top exon start: 17660194, stop: 17662466, P-value = 1.7 × 10^-3, corrected for nine exons: 0.014) (Glastonbury et al. 2016).

5.3.10 G × Macrophage eQTLs at relaxed threshold

I expect some macrophage-dependent eQTLs to also be BMI-dependent, due to the role of inflammation and macrophage infiltration in obesity. To assess this, I relaxed my significance threshold to P-value ≤ 1 × 10^-6. At this significance threshold, there are 10 macrophage-dependent eQTLs. 60% of these interactions show evidence of also being BMI-dependent (P-values corrected for 10 tests) (Glastonbury et al. 2016) (Table: 5.4). However, of the 16 G × BMI eQTLs (FDR 5%) previously described in chapter 4, none was a strong G × Macrophage eQTL (lowest: CHURC1-rs7143432, P-value = 0.003, P-value corrected for 16 tests = 0.05). This suggests my previous analysis in chapter 4 was not affected by cell-type composition confounding.

SNP         Gene     G × BMI P-value  G × Macrophage β  G × Macrophage P-value
rs1807517   CECR1    0.01             0.084             7.75 × 10^-13
rs12549382  DERL1    0.01             -0.086            1.01 × 10^-8
rs13397309  PLEK     1.5 × 10^-4      -0.081            8.72 × 10^-8
rs12191668  CD83     6.3 × 10^-5      -0.098            2.01 × 10^-7
rs13038557  ATP5E    0.02             -0.13             2.98 × 10^-7
rs6677754   FNBP1L   0.01             -0.089            3.18 × 10^-7
rs2425170   RBM12    0.08             0.1               3.87 × 10^-7
rs10917148  USP48    0.07             0.21              7.27 × 10^-7
rs4246055   PRRT1    0.08             0.35              8.65 × 10^-7
rs7232352   IER3IP1  0.59             -0.12             9.98 × 10^-7

Table 5.4: 60% of G×Macrophage eQTLs are nominally significant G×BMI eQTLs. P-values corrected for 16 tests.

Two macrophage-dependent eQTLs at this significance threshold (P-value ≤ 1 × 10^-6) are also strong BMI-dependent eQTLs. Pleckstrin (PLEK) is exclusively expressed in immune cells and is associated with phagosomal membranes in macrophages. Fairfax et al. (2014) called PLEK-rs1867313 (rs1867313-rs13397309 LD r = 0.743) an eQTL in both the naive state and the IFN-γ treated state (IFN-γ t-stat = -9.02, naive t-stat = -8.55). CD83, a cluster of differentiation molecule expressed in a wide variety of immune cells, strikingly shows exclusive eQTL activity in IFN-γ stimulated monocytes in the Fairfax et al. (2014) dataset (rs12191668-rs9296918 LD r = 0.93) (IFN-γ t-stat = -3.95, naive = n.s., LPS2 = n.s., LPS24 = n.s.). Collectively, these G × Macrophage eQTLs seem to represent a change to a pro-inflammatory activation state, which is also known to be true for obese compared to lean individuals.

5.4 Discussion

RNA-seq profiling of primary tissue biopsies is becoming widely used. I and others have shown its utility and its ability to capture more in-vivo effects than cell lines such as LCLs. The cellular complexity of primary tissue biopsies is often unaccounted for and can confound certain types of analysis. I show that many of the BMI gene-expression associations are explained by differences in macrophage infiltration between individuals, and that adipose tissue derived macrophage proportion is linearly associated with a range of adiposity measurements but not with age. By fitting structural equation models, I also show that cell type proportions for macrophages and adipocytes are heritable. As cell type heterogeneity has a genetic component, it is likely that certain genotypes can predispose individuals to, or protect them from, macrophage infiltration and the consequences of inflammation in obesity. Additionally, variability in adipocyte number could also influence an individual's capacity for adipose tissue expansion and storage.

I demonstrate that TWAS is confounded by cell type composition, though cis-eQTL analysis is robust against such confounding if adjusted for principal components or factors derived from the gene expression profile being analysed. I show that whilst adjusting for cell type composition does increase the cis-eQTL discovery yield, adjusting for PCs results in a much more dramatic improvement and captures cell type heterogeneity. This is because PCs capture additional hidden latent variables that have a broad effect on global gene expression variance. Such factors capture both biological factors, for example age and BMI, and technical factors such as library size, GC content and batch effects, as well as cell type composition.

It has been demonstrated that cell proportions can be used to identify cell type specific eQTLs (Westra et al. 2015). Whilst the ability to detect interactions with estimated cell proportions is limited both by sample size and by the accuracy of cell type estimation from a complex tissue such as adipose, I have demonstrated that it is possible to detect cell-type proportion dependent eQTLs in whole tissues. This allows us to study the genetics of gene expression of cells in-vivo, in their natural biological environment, which in-vitro experiments cannot fully capture. I reason, as others have shown, that the ability to detect cell-type specific eQTLs in complex tissues will increase approximately linearly with sample size (Westra et al. 2015). Future work will likely involve the collection and analysis of many more purified cell populations and single cell RNA-sequencing datasets. This will allow for the deconvolution of further publicly available whole tissue datasets. Cell type estimation methods may become more accurate by utilising multiple assays to uncover cell type specific features (such as chromatin marks, CpG DNA methylation and gene expression). In the future, eQTL analysis will be routinely performed on thousands of samples, and the ability to detect cell-type specific interactions and to characterise the cell and tissue specific landscape of genetic regulatory effects will therefore reach fruition.

Chapter 6

Concluding Remarks

In this thesis, I have described the regulation of gene expression in multiple peripheral tissues. In chapter 3, by utilising a unique, deeply phenotyped and well-characterised study population, TwinsUK, I found that cardio-metabolic traits are under extensive gene expression regulation. To my knowledge, this represents the largest single study performed to date that combines multi-tissue gene expression and phenotypic data. For the first time using RNA-sequencing data, I demonstrate that, of the traits studied, BMI has the most pervasive effect on multiple peripheral tissue gene expression profiles, with gene expression enrichment for both metabolic and inflammatory processes. Whilst only BMI had multi-tissue gene expression associations, many traits were specific to single tissues, highlighting the importance of using the tissue relevant to disease manifestation. To characterise the genetic basis of gene expression, I conducted both a gene expression heritability and a cis-eQTL analysis, demonstrating that gene expression associated with cardio-metabolic traits is enriched for heritability, particularly in adipose tissue. Approximately 50% of trait-associated gene expression is regulated by at least one cis-eQTL, suggesting that many of the remaining TWAS findings are either regulated by trans-acting genetic variation or are environmentally driven.

Previously published evidence suggests that obesity can modify the heritability of traits. Given that BMI has a pervasive effect on multi-tissue gene expression (as shown in chapter 3), I explored the possibility of identifying BMI-dependent eQTLs. In chapter 4, I identify for the first time genetic variants whose effect on adipose tissue gene expression is modified by an individual’s obesity status, as measured using BMI (G × BMI eQTLs). I replicate these variants in an independent dataset obtained from Icelandic individuals (DeCODE) and characterise their genomic enrichment. I find that G × BMI eQTLs are enriched for main effect eQTLs in several tissues, particularly in adipose tissue, whilst the interaction effect is tissue specific. I validate the use of BMI as an obesity measurement by using a highly accurate DXA measure of visceral fat area. I find that two of the genome-wide significant G × BMI eQTLs are present in the MEMN and that G × BMI eQTLs overall show an enrichment for both immune cell and metabolic processes. By integrating the results with GWAS and performing colocalisation analysis, I demonstrate that one G × BMI eQTL tags a GWAS locus for esophageal cancer which has previously been shown to interact with alcohol consumption.

In chapter 4, I explored multiple possible explanations for G × BMI eQTLs and conducted extensive robustness analysis. However, given the enrichment for immune cell processes, I wanted to determine whether the 16 genome-wide significant G × BMI eQTLs were caused by cell type heterogeneity between individual adipose tissue samples. To do this, in chapter 5 I demonstrate for the first time that it is possible to estimate relative constituent cell type proportions using gene expression obtained from bulk adipose tissue RNA-seq. I start the analysis by performing extensive simulations to determine the sensitivity and robustness of the estimates to noise and unknown cell type content. After estimating four separate cell types present in adipose tissue, I focus on macrophage and adipocyte proportions, guided by the simulations. I show that macrophage and adipocyte variability is heritable. Additionally, adjusting for cell type heterogeneity can decrease TWAS cell-type confounding and increase cis-eQTL discovery yield. I also show that cell type composition does not explain the G × BMI eQTL findings described in chapter 4. Finally, I demonstrate the ability to characterise cis-eQTLs that are dependent on macrophage proportion, and replicate these findings in an independent monocyte dataset. Collectively, these analyses represent a vast compendium of results that give insight into the tissue specificity, genetic regulation and complexity of peripheral tissue biology and its relationship with several traits.

6.0.1 Improvements to this study

In hindsight, there are several aspects of this study that can be improved upon. First, since starting this analysis, several more advanced and sensitive methods for many processes that form part of the RNA-sequencing methodology have become available. These include a new reference genome build (GRCh38) and gene annotations (GENCODE v25), as well as improvements in RNA-seq alignment and quantification (Dobin et al. 2013, Robinson & Oshlack 2010). Whilst none of these improvements would invalidate the findings presented here, they would potentially increase discovery power. Second, in chapter 3 I used a parametric FDR, whilst a non-parametric FDR (permutation-based method) has been shown to be more sensitive and to control for test-statistic inflation more rigorously (van Iterson et al. 2016). Of course, permutations come with a vast computational burden, and I determined that for the analysis of 23 traits across four tissues, this was not a feasible approach. Finally, in chapter 5, I use simulations to benchmark cell type composition estimation performance, similar to many previous studies. However, it would have been ideal to have a flow-sorted reference population of cells (ground truth) obtained from adipose tissue to compare with the cell proportion estimates. As mentioned throughout chapter 5, obtaining a large enough purified sub-sample of cells from a tissue as complex as adipose tissue is extremely difficult, with most flow cytometry analyses conducted on adipose tissue being restricted to tens of samples.
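To make the computational cost of permutations concrete, the sketch below computes an empirical p-value for a single gene-trait association by shuffling the trait values. It is an illustration under assumed choices (Pearson correlation as the test statistic, 999 permutations, a fixed random seed) and is not the procedure of van Iterson et al. (2016) or of this thesis. The function names and the ten data points are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_pvalue(expression, trait, n_perm=999, seed=0):
    """Two-sided empirical p-value from shuffling the trait labels."""
    rng = random.Random(seed)
    observed = abs(pearson(expression, trait))
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count permutations at least as extreme as the observed statistic.
        if abs(pearson(expression, shuffled)) >= observed:
            hits += 1
    # Adding one to numerator and denominator keeps the p-value above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: one gene and one trait measured in ten individuals.
expression = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.0, 8.3, 9.1, 9.9]
trait = [1.2, 1.9, 3.1, 3.8, 5.3, 6.1, 6.8, 8.0, 9.2, 10.1]
p_value = permutation_pvalue(expression, trait)
```

Each permutation repeats the full association test, so the cost grows as the number of permutations multiplied by the number of gene-trait-tissue tests. The smallest attainable p-value is 1 / (n_perm + 1), which is why stringent thresholds demand many permutations.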

6.0.2 Future work

Regulatory genomics is an extremely exciting and promising field, with several rapid technological developments taking place. Therefore, there are several interesting analyses that could be performed to build and improve on this work. First, in the introduction of this thesis (chapter 1), I alluded to how alternative splicing can dramatically increase the functional repertoire of expressed genes. Several studies have demonstrated that alternative splicing is regulated extensively by the environment, and work has started to identify context-specific alternative splicing eQTLs (Pai et al. 2016). These analyses typically focus on using exon-exon junction reads to characterise exon exclusion/inclusion events in mRNA (Li et al. 2016). However, with long-read sequencing coming to fruition, it may become routinely possible to perform full mRNA isoform quantification, something that is currently very inaccurate with short-read sequencing technology. A natural next step for this analysis is therefore to understand the cardio-metabolic regulation of isoforms in peripheral tissues, and the genetic basis of isoform usage in the population.

Second, several recent studies have demonstrated that allele-specific expression (ASE) is a more powerful approach to identify G × E effects on expression (Knowles et al. 2015). Whilst ASE has several non-trivial artifacts that need to be accounted for, it represents a promising strategy to identify further G × BMI effects in adipose tissue. I am currently performing follow-up analysis and have designed and used a personalised genome framework, an analysis technique that can minimise the mapping artifacts inherent in ASE analysis. To do this, I am utilising whole genome sequencing data (UK10K) from the same twins presented here, to explore ASE in multiple peripheral tissues. Finally, I expect the discovery of context-specific eQTLs to be a function of sample size. Few eQTL studies have been performed with thousands of samples, with no single study analysing more than 10,000 individuals. Future work will have to implement meta-analysis across cohorts to gain the power to detect variants of smaller effect size.
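One common way to combine cohorts is a fixed-effect, inverse-variance-weighted meta-analysis, sketched below. It is offered only as an example of how pooling can sharpen a small effect, assuming each cohort reports an effect size and standard error for the same variant-gene pair. The function name `fixed_effect_meta` and the two cohort estimates are hypothetical, and other schemes (for example random-effects models) may suit real data better.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    Returns the pooled effect size, its standard error and the z-score.
    """
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical interaction effects for one variant in two cohorts.
pooled_beta, pooled_se, pooled_z = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
```

With two equally precise cohorts the pooled standard error shrinks by a factor of the square root of two, so the pooled z-score exceeds that of either cohort alone.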

Bibliography

Abbas, A. R., Wolslegel, K., Seshasayee, D., Modrusan, Z. & Clark, H. F. (2009), ‘Deconvolution of blood microarray data identifies cellular activation patterns in systemic lupus erythematosus’, PloS one 4(7), e6098.

Abreu-Vieira, G., Fischer, A. W., Mattsson, C., de Jong, J. M., Shabalina, I. G., Rydén, M., Laurencikiene, J., Arner, P., Cannon, B., Nedergaard, J. et al. (2015), ‘CIDEA improves the metabolic profile through expansion of adipose tissue’, Nature communications 6.

Aguet, F., Brown, A. A., Castel, S., Davis, J. R., Mohammadi, P. et al. (2016), ‘Local genetic effects on gene expression across 44 human tissues’, bioRxiv .

Alexa, A. & Rahnenfuhrer, J. (2010), ‘topGO: enrichment analysis for gene ontology’, R package version 2(0).

Allison, M. B. & Jr., M. G. M. (2014), ‘Connecting leptin signaling to biological function’, Journal of Endocrinology 223(1), T25–T35.

Azeez, O. I., Meintjes, R. & Chamunorwa, J. P. (2014), ‘Fat body, fat pad and adipose tissues in invertebrates and vertebrates: the nexus’, Lipids in health and disease 13(1), 1.

Bai, Y. & Sun, Q. (2015), ‘Macrophage recruitment in obese adipose tissue’, obesity reviews 16(2), 127–136.

Baker, C. & Bate, A. (2016), Obesity statistics, Technical Report 3336, House of Commons Library.

Barreiro, L. B., Tailleux, L., Pai, A. A., Gicquel, B., Marioni, J. C. & Gilad, Y. (2012), ‘Deciphering the genetic architecture of variation in the immune response to mycobacterium tuberculosis infection’, Proceedings of the National Academy of Sciences 109(4), 1204–1209.

Bates, D., Mächler, M., Bolker, B. & Walker, S. (2014), ‘Fitting linear mixed-effects models using lme4’, arXiv preprint arXiv:1406.5823 .

Beasley, T. M., Erickson, S. & Allison, D. B. (2009), ‘Rank-based inverse normal transformations are increasingly used, but are they merited?’, Behavior genetics 39(5), 580–595.

Bell, C., Walley, A. & Froguel, P. (2005), ‘The genetics of human obesity’, Nature Reviews Genetics 6(3), 221–234.

Berg, A. H. & Scherer, P. E. (2005), ‘Adipose tissue, inflammation, and cardiovascular disease’, Circulation research 96(9), 939–949.

Bjorntorp, P. (1990), ‘Portal adipose tissue as a generator of risk factors for cardiovascular disease and diabetes’, Arteriosclerosis, Thrombosis, and Vascular Biology 10, 493–496.

Blignaut, M. (2012), ‘Review of non-coding RNAs and the epigenetic regulation of gene expression: A book edited by kevin morris’, epigenetics 7(6), 664–666.

Blüher, M. (2009), ‘Adipose tissue dysfunction in obesity’, Experimental and Clinical Endocrinology & Diabetes 117(06), 241–250.

Boden, G. (1997), ‘Role of fatty acids in the pathogenesis of insulin resistance and niddm’, Diabetes 46(1), 3–10.

Boker, S., Neale, M., Maes, H., Wilde, M., Spiegel, M., Brick, T., Spies, J., Estabrook, R., Kenny, S., Bates, T. et al. (2011), ‘OpenMx: an open source extended structural equation modeling framework’, Psychometrika 76(2), 306–317.

Boutens, L. & Stienstra, R. (2016), ‘Adipose tissue macrophages: going off track during obesity’, Diabetologia 59(5), 879–894.

Brem, R. B., Yvert, G., Clinton, R. & Kruglyak, L. (2002), ‘Genetic dissection of transcriptional regulation in budding yeast’, Science 296(5568), 752–755.

Brodin, P., Jojic, V., Gao, T., Bhattacharya, S., Angel, C. J. L., Furman, D., Shen-Orr, S., Dekker, C. L., Swan, G. E., Butte, A. J. et al. (2015), ‘Variation in the human immune system is largely driven by non-heritable influences’, Cell 160(1), 37–47.

Brown, A. A., Buil, A., Viñuela, A., Lappalainen, T., Zheng, H.-F., Richards, J. B., Small, K. S., Spector, T. D., Dermitzakis, E. T. & Durbin, R. (2014), ‘Genetic interactions affecting human gene expression identified by variance association mapping’, Elife 3, e01381.

Buil, A., Brown, A. A., Lappalainen, T., Viñuela, A., Davies, M. N., Zheng, H.-F., Richards, J. B., Glass, D., Small, K. S., Durbin, R. et al. (2015), ‘Gene-gene and gene-environment interactions detected by transcriptome sequence analysis in twins’, Nature genetics 47(1), 88–91.

Buil, A., Viñuela, A., Brown, A., Davies, M., Padioleau, I., Bielser, D., Romano, L., Glass, D., Di Meglio, P., Small, K., Spector, T. & Dermitzakis, E. T. (2016), ‘Quantifying the degree of sharing of genetic and non-genetic causes of gene expression variability across four tissues.’, bioRxiv .

Çalışkan, M., Cusanovich, D. A., Ober, C. & Gilad, Y. (2011), ‘The effects of EBV transformation on gene expression levels and methylation profiles’, Human molecular genetics p. ddr041.

Carnevalli, L. S., Masuda, K., Frigerio, F., Le Bacquer, O., Um, S. H., Gandin, V., Topisirovic, I., Sonenberg, N., Thomas, G. & Kozma, S. C. (2010), ‘S6K1 plays a critical role in early adipocyte differentiation’, Developmental cell 18(5), 763–774.

Chen, Y., Zhu, J., Lum, P. Y., Yang, X., Pinto, S., MacNeil, D. J., Zhang, C., Lamb, J., Edwards, S., Sieberts, S. K. et al. (2008), ‘Variations in DNA elucidate molecular networks that cause disease’, Nature 452(7186), 429–435.

Cho, Y. S., Chen, C.-H., Hu, C., Long, J., Ong, R. T. H., Sim, X., Takeuchi, F., Wu, Y., Go, M. J., Yamauchi, T. et al. (2012), ‘Meta-analysis of genome-wide association studies identifies eight new loci for type 2 diabetes in east asians’, Nature genetics 44(1), 67–72.

Cinti, S., Mitchell, G., Barbatelli, G., Murano, I., Ceresi, E., Faloia, E., Wang, S., Fortier, M., Greenberg, A. S. & Obin, M. S. (2005), ‘Adipocyte death defines macrophage localization and function in adipose tissue of obese mice and humans’, Journal of lipid research 46(11), 2347–2355.

Claussnitzer, M., Dankel, S. N., Kim, K.-H., Quon, G., Meuleman, W., Haugen, C., Glunk, V., Sousa, I. S., Beaudry, J. L., Puviindran, V. et al. (2015), ‘FTO obesity variant circuitry and adipocyte browning in humans’, N Engl J Med 2015(373), 895–907.

Dalmas, E., Clément, K. & Guerre-Millo, M. (2011), ‘Defining macrophage phenotype and function in adipose tissue’, Trends in immunology 32(7), 307–314.

Dastani, Z., Hivert, M.-F., Timpson, N., Perry, J. R. B. et al. (2012), ‘Novel loci for adiponectin levels and their influence on type 2 diabetes and metabolic traits: A multi-ethnic meta-analysis of 45,891 individuals’, PLoS Genet 8(3), 1–23.

Davis, J. R., Fresard, L., Knowles, D. A., Pala, M., Bustamante, C. D., Battle, A. & Montgomery, S. B. (2016), ‘An efficient multiple-testing adjustment for eQTL studies that accounts for linkage disequilibrium between variants’, The American Journal of Human Genetics 98(1), 216–224.

Davis, R. L., Weintraub, H. & Lassar, A. B. (1987), ‘Expression of a single transfected cDNA converts fibroblasts to myoblasts’, Cell 51(6), 987–1000.

Daye, Z. J., Chen, J. & Li, H. (2012), ‘High-dimensional heteroscedastic regression with an application to eQTL data analysis’, Biometrics 68(1), 316–326.

de González, A. B., Cox, D. R. et al. (2007), ‘Interpretation of interaction: A review’, The Annals of Applied Statistics 1(2), 371–385.

DiMaio, T. A., Wentz, B. L. & Lagunoff, M. (2016), ‘Isolation and characterization of circulating lymphatic endothelial colony forming cells’, Experimental cell research 340(1), 159–169.

Djebali, S., Davis, C. A., Merkel, A., Dobin, A., Lassmann, T., Mortazavi, A., Tanzer, A., Lagarde, J., Lin, W., Schlesinger, F. et al. (2012), ‘Landscape of transcription in human cells’, Nature 489(7414), 101–108.

Dobin, A., Davis, C. A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M. & Gingeras, T. R. (2013), ‘Star: ultrafast universal RNA-seq aligner’, Bioinformatics 29(1), 15–21.

Dopico, X. C., Evangelou, M., Ferreira, R. C., Guo, H., Pekalski, M. L., Smyth, D. J., Cooper, N., Burren, O. S., Fulford, A. J., Hennig, B. J. et al. (2015), ‘Widespread seasonal gene expression reveals annual differences in human immunity’.

Dudbridge, F. & Fletcher, O. (2014), ‘Gene-environment dependence creates spurious gene-environment interaction’, The American Journal of Human Genetics 95(3), 301–307.

Eichler, E. E., Flint, J., Gibson, G., Kong, A., Leal, S. M., Moore, J. H. & Nadeau, J. H. (2010), ‘Viewpoint missing heritability and strategies for finding the underlying causes of complex disease’, Nature Reviews Genetics 11(6), 446–450.

Emilsson, V., Thorleifsson, G., Zhang, B., Leonardson, A. S., Zink, F., Zhu, J., Carlson, S., Helgason, A., Walters, G. B., Gunnarsdottir, S. et al. (2008), ‘Genetics of gene expression and its effect on disease’, Nature 452(7186), 423–428.

ENCODE Consortium (2012), ‘An integrated encyclopedia of DNA elements in the human genome’, Nature 489(7414), 57–74.

Enerbäck, S. (2009), ‘The origins of brown adipose tissue’, New England Journal of Medicine 360(19), 2021–2023. PMID: 19420373.

Fadista, J., Vikman, P., Laakso, E. O., Mollet, I. G., Esguerra, J. L., Taneera, J., Storm, P., Osmark, P., Ladenvall, C., Prasad, R. B. et al. (2014), ‘Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism’, Proceedings of the National Academy of Sciences 111(38), 13924–13929.

Fairfax, B. P., Humburg, P., Makino, S., Naranbhai, V., Wong, D., Lau, E., Jostins, L., Plant, K., Andrews, R., McGee, C. et al. (2014), ‘Innate immune activity conditions the effect of regulatory variants upon monocyte gene expression’, Science 343(6175), 1246949.

Fall, T., Hägg, S., Mägi, R., Ploner, A., Fischer, K., Horikoshi, M., Sarin, A.-P., Thorleifsson, G., Ladenvall, C., Kals, M. et al. (2013), ‘The role of adiposity in cardiometabolic traits: a mendelian randomization analysis’, PLoS Med 10(6), e1001474.

Farooqi, I. & O’Rahilly, S. (2005), ‘Monogenic obesity in humans’, Annual Review of Medicine 56, 443.

Fisher, R. A. (1930), The genetical theory of natural selection, Oxford University Press.

Francesconi, M. & Lehner, B. (2014), ‘The effects of genetic variation on gene expression dynamics during development’, Nature 505(7482), 208–211.

Frayling, T. M., Timpson, N. J., Weedon, M. N., Zeggini, E., Freathy, R. M., Lindgren, C. M. et al. (2007), ‘A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity’, Science 316(5826), 889–94.

Fritsche, L. G., Igl, W., Bailey, J. N. C., Grassmann, F., Sengupta, S. et al. (2016), ‘A large genome-wide association study of age-related macular degeneration highlights contributions of rare and common variants’, Nature genetics 48(2), 134–143.

Fruebis, J., Tsao, T.-S., Javorschi, S., Ebbets-Reed, D., Erickson, M. R. S., Yen, F. T., Bihain, B. E. & Lodish, H. F. (2001), ‘Proteolytic cleavage product of 30-kda adipocyte complement-related protein increases fatty acid oxidation in muscle and causes weight loss in mice’, Proceedings of the National Academy of Sciences 98(4), 2005–2010.

Fuchsberger, C., Flannick, J., Teslovich, T. M., Mahajan, A. et al. (2016), ‘The genetic architecture of type 2 diabetes’, Nature 536(7614), 41–47.

Fukuhara, A., Matsuda, M., Nishizawa, M., Segawa, K., Tanaka, M. et al. (2005), ‘Visfatin: A protein secreted by visceral fat that mimics the effects of insulin’, Science 307(5708), 426–430.

Galic, S., Oakhill, J. S. & Steinberg, G. R. (2010), ‘Adipose tissue as an endocrine organ’, Molecular and Cellular Endocrinology 316(2), 129–139. Endocrine Aspects of Obesity.

Gaujoux, R. & Seoighe, C. (2012), ‘Semi-supervised nonnegative matrix factorization for gene expression deconvolution: a case study’, Infection, Genetics and Evolution 12(5), 913–921.

Gershenzon, N. I. & Ioshikhes, I. P. (2005), ‘Synergy of human Pol II core promoter elements revealed by statistical sequence analysis’, Bioinformatics 21(8), 1295–1300.

Glass, D., Viñuela, A., Davies, M. N., Ramasamy, A., Parts, L., Knowles, D., Brown, A. A., Hedman, Å. K., Small, K. S., Buil, A. et al. (2013), ‘Gene expression changes with age in skin, adipose tissue, blood and brain’, Genome biology 14(7), 1.

Glastonbury, C. A., Viñuela, A., Buil, A., Halldorsson, G. H., Thorleifsson, G., Helgason, H., Thorsteinsdottir, U., Stefansson, K., Dermitzakis, E. T., Spector, T. D. et al. (2016), ‘Adiposity-dependent regulatory effects on multi-tissue transcriptomes’, The American Journal of Human Genetics 99(3), 567–579.

Global Lipids Genetics Consortium (2013), ‘Discovery and refinement of loci associated with lipid levels’, Nature genetics 45(11), 1274–1283.

Gong, T. & Szustakowski, J. D. (2013), ‘DeconRNASeq: a statistical framework for deconvolution of heterogeneous tissue samples based on mRNA-Seq data’, Bioinformatics 29(8), 1083–1085.

Gonzàlez-Porta, M., Frankish, A., Rung, J., Harrow, J. & Brazma, A. (2013), ‘Transcriptome analysis of human tissues and cell lines reveals one dominant transcript per gene’, Genome biology 14(7), 1.

Goodpaster, B., Thaete, F., Simoneau, J. & Kelley, D. (1997), ‘Subcutaneous abdominal fat and thigh muscle composition predict insulin sensitivity independently of visceral fat’, Diabetes 46(10), 1579–1585.

Grant, R. W. & Dixit, V. D. (2015), ‘Adipose tissue as an immunological organ’, Obesity 23(3), 512–518.

Greenawalt, D. M., Dobrin, R., Chudin, E., Hatoum, I. J., Suver, C., Beaulaurier, J., Zhang, B., Castro, V., Zhu, J., Sieberts, S. K. et al. (2011), ‘A survey of the genetics of stomach, liver, and adipose gene expression from a morbidly obese cohort’, Genome research 21(7), 1008–1016.

Greenberg, A. S. & Obin, M. S. (2006), ‘Obesity and the role of adipose tissue in inflammation and metabolism’, The American journal of clinical nutrition 83(2), 461S–465S.

Greenwood, D. C., Gilthorpe, M. S. & Cade, J. E. (2006), ‘The impact of imprecisely measured covariates on estimating gene-environment interactions’, BMC medical research methodology 6(1), 1.

Grundberg, E., Small, K. S., Hedman, A. K., Nica, A. C., Buil, A., Keildson, S., Bell, J. T., Yang, T.-P., Meduri, E., Barrett, A. et al. (2012), ‘Mapping cis- and trans-regulatory effects across multiple tissues in twins’, Nature genetics 44(10), 1084–1089.

Gupta, R. K., Arany, Z., Seale, P., Mepani, R. J., Ye, L., Conroe, H. M., Roby, Y. A., Kulaga, H., Reed, R. R. & Spiegelman, B. M. (2010), ‘Transcriptional control of preadipocyte determination by ZFP423’, Nature 464(7288), 619–623.

Gusev, A., Ko, A., Shi, H., Bhatia, G., Chung, W., Penninx, B. W., Jansen, R., De Geus, E. J., Boomsma, D. I., Wright, F. A. et al. (2016), ‘Integrative approaches for large-scale transcriptome-wide association studies’, Nature genetics .

Harms, M. & Seale, P. (2013), ‘Brown and beige fat: development, function and therapeutic potential’, Nature medicine 19(10), 1252–1263.

Haworth, C. M. A., Carnell, S., Meaburn, E. L., Davis, O. S. P., Plomin, R. & Wardle, J. (2008), ‘Increasing heritability of BMI and stronger associations with the FTO gene over childhood’, Obesity 16(12), 2663–2668.

Heinz, S., Romanoski, C. E., Benner, C. & Glass, C. K. (2015), ‘The selection and function of cell type-specific enhancers’, Nature Reviews Molecular Cell Biology 16(3), 144–154.

Holmes, M. V., Lange, L. A., Palmer, T., Lanktree, M. B., North, K. E. et al. (2014), ‘Causal effects of body mass index on cardiometabolic traits and events: A mendelian randomization analysis’, American Journal of Human Genetics 94(2), 198–208.

Houseman, E. A., Kelsey, K. T., Wiencke, J. K. & Marsit, C. J. (2015), ‘Cell-composition effects in the analysis of DNA methylation array data: a mathematical perspective’, BMC bioinformatics 16(1), 1.

Houseman, E. A., Molitor, J. & Marsit, C. J. (2014), ‘Reference-free cell mixture adjustments in analysis of DNA methylation data’, Bioinformatics 30(10), 1431–1439.

Hsueh, W., Wagner, M., Mitchell, B., Jean, P. S., Aburomia, R., Knowler, W., Pollin, T., Burns, D., Sakul, H., Bell, C., Ehm, M., Shuldiner, A. & Michelsen, B. (2000), ‘Diabetes in the old order amish - characterization and heritability analysis of the amish family diabetes study’, Diabetes care 23(5), 595–601.

Huan, T., Esko, T., Peters, M. J., Pilling, L. C., Schramm, K., Schurmann, C., Chen, B. H., Liu, C., Joehanes, R., Johnson, A. D. et al. (2015), ‘A meta-analysis of gene expression signatures of blood pressure and hypertension’, PLoS Genet 11(3), e1005035.

Huang, Z., Hoffmann, F. W., Norton, R. L., Hashimoto, A. C. & Hoffmann, P. R. (2011), ‘Selenoprotein k is a novel target of m-calpain, and cleavage is regulated by toll-like receptor-induced calpastatin in macrophages’, Journal of Biological Chemistry 286(40), 34830–34838.

Hunt, S., Hasstedt, S., Kuida, H., Stults, B., Hopkins, P. & Williams, R. (1989), ‘Genetic heritability and common environmental components of resting and stressed blood pressures, lipids, and body-mass index in utah pedigrees and twins’, American Journal of Epidemiology 129(3), 625–638.

Idaghdour, Y., Czika, W., Shianna, K. V., Lee, S. H., Visscher, P. M., Martin, H. C., Miclaus, K., Jadallah, S. J., Goldstein, D. B., Wolfinger, R. D. et al. (2010), ‘Geographical genomics of human leukocyte gene expression variation in southern morocco’, Nature genetics 42(1), 62–67.

Jaffe, A. E. & Irizarry, R. A. (2014), ‘Accounting for cellular heterogeneity is critical in epigenome-wide association studies’, Genome biology 15(2), 1.

Jimenez, E. G., Cordero, M. J. A., Lopez, C. A. P. & Garcia, I. G. (2012), ‘Monogenic human obesity: role of the leptin-melanocortin system in the regulation of food intake and body weight in humans’, Anales Del Sistema Sanitario De Navarra 35(2), 285–293.

Jo, B., He, Y., Strober, B. J., Parsana, P., Aguet, F. et al. (2016), ‘Distant regulatory effects of genetic variation in multiple human tissues’, bioRxiv .

Juven-Gershon, T. & Kadonaga, J. T. (2010), ‘Regulation of gene expression via the core promoter and the basal transcriptional machinery’, Developmental biology 339(2), 225–229.

Kato, N., Loh, M., Takeuchi, F., Verweij, N., Wang, X. et al. (2015), ‘Trans-ancestry genome-wide association study identifies 12 genetic loci influencing blood pressure and implicates a role for DNA methylation’, Nature genetics 47(11), 1282.

Kawano, J. & Arora, R. (2009), ‘The role of adiponectin in obesity, diabetes, and cardiovascular disease’, Journal of the CardioMetabolic Syndrome 4(1), 44–49.

Keildson, S., Fadista, J., Ladenvall, C., Hedman, Å. K., Elgzyri, T., Small, K. S., Grundberg, E., Nica, A. C., Glass, D., Richards, J. B. et al. (2013), ‘Skeletal muscle expression of phosphofructokinase is influenced by genetic variation and associated with insulin sensitivity’, Diabetes p. DB 131301.

Kelesidis, T., Kelesidis, I., Chou, S. & Mantzoros, C. S. (2010), ‘Narrative review: The role of leptin in human physiology: Emerging clinical applications’, Annals of Internal Medicine 152(2), 93.

Klöting, N., Fasshauer, M., Dietrich, A., Kovacs, P., Schön, M. R., Kern, M., Stumvoll, M. & Blüher, M. (2010), ‘Insulin-sensitive obesity’, American Journal of Physiology - Endocrinology and Metabolism 299(3), E506–E515.

Knowles, D. A., Davis, J. R., Raj, A., Zhu, X., Potash, J. B., Weissman, M. M., Shi, J., Levinson, D. F., Mostafavi, S., Montgomery, S. B. et al. (2015), ‘Allele-specific expression reveals interactions between genetic variation and environment’, bioRxiv p. 025874.

Kölling, N. (2016), Quantitative genetics of gene expression during fruit fly development, PhD thesis, University of Cambridge.

Kruijer, W. (2016), ‘Misspecification in mixed-model-based association analysis’, Genetics 202(1), 363–366.

Kukurba, K. R., Parsana, P., Balliu, B., Smith, K. S., Zappala, Z., Knowles, D. A., Favé, M.-J., Davis, J. R., Li, X., Zhu, X. et al. (2016), ‘Impact of the X and sex on regulatory variation’, Genome research 26(6), 768–777.

Lappalainen, T. et al. (2013), ‘Transcriptome and genome sequencing uncovers functional variation in humans’, Nature 501(7468), 506–511.

Latchman, D. S. (2011), ‘Transcriptional gene regulation in eukaryotes’, eLS .

Lee, T. I. & Young, R. A. (2013), ‘Transcriptional regulation and its misregulation in disease’, Cell 152(6), 1237–1251.

Lewis, G. F., Carpentier, A., Adeli, K. & Giacca, A. (2002), ‘Disordered fat storage and mobilization in the pathogenesis of insulin resistance and type 2 diabetes’, Endocrine reviews 23(2), 201–229.

Li, P., Piao, Y., Shon, H. S. & Ryu, K. H. (2015), ‘Comparing the normalization methods for the differential analysis of illumina high-throughput RNA-seq data’, BMC bioinformatics 16(1), 1.

Li, Y., Álvarez, O. A., Gutteling, E. W., Tijsterman, M., Fu, J., Riksen, J. A., Hazendonk, E., Prins, P., Plasterk, R. H., Jansen, R. C. et al. (2006), ‘Mapping determinants of gene expression plasticity by genetical genomics in c. elegans’, PLoS Genet 2(12), e222.

Li, Y. I., van de Geijn, B., Raj, A., Knowles, D. A., Petti, A. A., Golan, D., Gilad, Y. & Pritchard, J. K. (2016), ‘RNA splicing is a primary link between genetic variation and disease’, Science 352(6285), 600–604.

Lihn, A. S., Pedersen, S. B. & Richelsen, B. (2005), ‘Adiponectin: action, regulation and association to insulin sensitivity’, Obesity Reviews 6(1), 13–21.

Liu, X., Finucane, H. K., Gusev, A., Bhatia, G., Gazal, S., O’Connor, L., Bulik-Sullivan, B., Wright, F., Sullivan, P., Neale, B. & Price, A. (2016), ‘Functional partitioning of local and distal gene expression regulation in multiple human tissues’, bioRxiv .

Lo, K. A. & Sun, L. (2013), ‘Turning WAT into BAT: a review on regulators controlling the browning of white adipocytes’, Bioscience reports 33, 711–719.

Locke, A. E., Kahali, B., Berndt, S. I., Justice, A. E. et al. (2015), ‘Genetic studies of body mass index yield new insights for obesity biology’, Nature 518(7538), 197–U401.

Loh, P., Danecek, P., Palamara, P. F., Fuchsberger, C., Reshef, Y. A., Finucane, H. K., Schoenherr, S., Forer, L., McCarthy, S., Abecasis, G. R., Durbin, R. & Price, A. L. (2016), ‘Reference-based phasing using the haplotype reference consortium panel’.

Long, J. S. & Ervin, L. H. (2000), ‘Using heteroscedasticity consistent standard errors in the linear regression model’, The American Statistician 54(3), 217–224.

Lotta, L. A., Gulati, P., Day, F. R., Payne, F., Ongen, H., van de Bunt, M., Gaulton, K. J., Eicher, J. D., Sharp, S. J., Luan, J. et al. (2016), ‘Integrative genomic analysis implicates limited peripheral adipose storage capacity in the pathogenesis of human insulin resistance’, Nature Genetics .

Mahabir, S., Baer, D., Johnson, L. L., Roth, M., Campbell, W., Clevidence, B. & Taylor, P. R. (2007), ‘Body mass index, percent body fat, and regional body fat distribution in relation to leptin concentrations in healthy, non-smoking postmenopausal women in a feeding study’, Nutrition journal 6(1), 1.

Manning, A. K., Hivert, M.-F., Scott, R. A., Grimsby, J. L., Bouatia-Naji, N., Chen, H., Rybin, D., Liu, C.-T., Bielak, L. F., Prokopenko, I. et al. (2012), ‘A genome-wide approach accounting for body mass index identifies genetic variants influencing fasting glycemic traits and insulin resistance’, Nature genetics 44(6), 659–669.

Manolio, T. A., Collins, F. S., Cox, N. J., Goldstein, D. B., Hindorff, L. A., Hunter, D. J., McCarthy, M. I., Ramos, E. M., Cardon, L. R., Chakravarti, A., Cho, J. H., Guttmacher, A. E., Kong, A., Kruglyak, L., Mardis, E., Rotimi, C. N., Slatkin, M., Valle, D., Whittemore, A. S., Boehnke, M., Clark, A. G., Eichler, E. E., Gibson, G., Haines, J. L., Mackay, T. F. C., McCarroll, S. A. & Visscher, P. M. (2009), ‘Finding the missing heritability of complex diseases’, Nature 461(7265), 747–753.

Maranville, J. C., Luca, F., Richards, A. L., Wen, X., Witonsky, D. B., Baxter, S., Stephens, M. & Di Rienzo, A. (2011), ‘Interactions between glucocorticoid treatment and cis-regulatory polymorphisms contribute to cellular response phenotypes’, PLoS Genet 7(7), e1002162.

Maston, G. A., Evans, S. K. & Green, M. R. (2006), ‘Transcriptional regulatory elements in the human genome’, Annu. Rev. Genomics Hum. Genet. 7, 29–59.

Matlin, A. J., Clark, F. & Smith, C. W. (2005), ‘Understanding alternative splicing: towards a cellular code’, Nature reviews Molecular cell biology 6(5), 386–398.

Matsubara, M., Maruoka, S. & Katayose, S. (2002), ‘Inverse relationship between plasma adiponectin and leptin concentrations in normal-weight and obese women’, European Journal of Endocrinology 147(2), 173–180.

McGarry, J. D. (1998), ‘Glucose-fatty acid interactions in health and disease.’, The American journal of clinical nutrition 67(3), 500S–504S.

McLaughlin, T., Abbasi, F., Lamendola, C. & Reaven, G. (2007), ‘Heterogeneity in the prevalence of risk factors for cardiovascular disease and type 2 diabetes mellitus in obese individuals: effect of differences in insulin sensitivity’, Archives of internal medicine 167(7), 642–648.

McLaughlin, T., Stuhlinger, M., Lamendola, C., Abbasi, F., Bialek, J., Reaven, G. M. & Tsao, P. S. (2006), ‘Plasma asymmetric dimethylarginine concentrations are elevated in obese insulin-resistant women and fall with weight loss’, The Journal of Clinical Endocrinology & Metabolism 91(5), 1896–1900.

Minokoshi, Y., Kim, Y.-B., Peroni, O. D., Fryer, L. G., Müller, C., Carling, D. & Kahn, B. B. (2002), ‘Leptin stimulates fatty-acid oxidation by activating amp-activated protein kinase’, Nature 415(6869), 339–343.

Mitchell, B., Kammerer, C., Blangero, J., Mahaney, M., Rainwater, D., Dyke, B., Hixson, J., Henkel, R., Sharp, R., Comuzzie, A., VandeBerg, J., Stern, M. & MacCluer, J. (1996), ‘Genetic and environmental contributions to cardiovascular risk factors in mexican americans - the San Antonio Family Heart Study’, Circulation 94(9), 2159–2170.

Moayyeri, A., Hammond, C. J., Valdes, A. M. & Spector, T. D. (2013), ‘Cohort profile: Twinsuk and healthy ageing twin study’, International journal of epidemiology 42(1), 76–85.

Moisan, A., Lee, Y.-K., Zhang, J. D., Hudak, C. S., Meyer, C. A., Prummer, M., Zoffmann, S., Truong, H. H., Ebeling, M., Kiialainen, A. et al. (2015), ‘White-to-brown metabolic conversion of human adipocytes by JAK inhibition’, Nature cell biology 17(1), 57–67.

Montague, C. T., Farooqi, I. S., Whitehead, J. P., Soos, M. A., Rau, H., Wareham, N. J., Sewter, C. P., Digby, J. E., Mohammed, S. N., Hurst, J. A. et al. (1997), ‘Congenital leptin deficiency is associated with severe early-onset obesity in humans’, Nature 387(6636), 903–907.

Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. (2008), ‘Mapping and quantifying mammalian transcriptomes by RNA-seq’, Nature methods 5(7), 621–628.

Moutsianas, L., Jostins, L., Beecham, A. H., Dilthey, A. T., Xifara, D. K. et al. (2015), ‘Class II HLA interactions modulate genetic risk for multiple sclerosis’, Nature genetics 47(10), 1107.

MuTHER Consortium (2011), ‘Identification of an imprinted master trans regulator at the KLF14 locus related to multiple metabolic phenotypes’, Nature genetics 43(6), 561–564.

Neph, S., Vierstra, J., Stergachis, A. B., Reynolds, A. P., Haugen, E., Vernot, B., Thurman, R. E., John, S., Sandstrom, R., Johnson, A. K. et al. (2012), ‘An expansive human regulatory lexicon encoded in transcription factor footprints’, Nature 489(7414), 83–90.

Newman, A. M., Liu, C. L., Green, M. R., Gentles, A. J., Feng, W., Xu, Y., Hoang, C. D., Diehn, M. & Alizadeh, A. A. (2015), ‘Robust enumeration of cell subsets from tissue expression profiles’, Nature methods 12(5), 453–457.

Nica, A. C., Montgomery, S. B., Dimas, A. S., Stranger, B. E., Beazley, C., Barroso, I. & Dermitzakis, E. T. (2010), ‘Candidate causal regulatory effects by integration of expression QTLs with complex trait genetic associations’, PLoS Genet 6(4), e1000895.

North, K., Howard, B., Welty, T., Best, L., Lee, E., Yeh, J., Fabsitz, R., Roman, M. & MacCluer, J. (2003), ‘Genetic and environmental contributions to cardiovascular disease risk in american indians - the strong heart family study’, American Journal of Epidemiology 157(4), 303–314.

Ongen, H. & Dermitzakis, E. T. (2015), ‘Alternative splicing QTLs in european and african populations’, The American Journal of Human Genetics 97(4), 567–575.

Oral, E. A., Simha, V., Ruiz, E., Andewelt, A., Premkumar, A., Snell, P., Wagner, A. J., DePaoli, A. M., Reitman, M. L., Taylor, S. I., Gorden, P. & Garg, A. (2002), ‘Leptin-replacement therapy for lipodystrophy’, New England Journal of Medicine 346(8), 570–578. PMID: 11856796.

Oswal, A. & Yeo, G. S. H. (2007), ‘Appetite regulatory peptides - the leptin melanocortin pathway and the control of body weight: lessons from human and murine genetics’, Obesity Reviews 8(4), 293–306.

Pai, A. A., Baharian, G., Sabourin, A. P., Brinkworth, J. F., Nedelec, Y., Foley, J. W., Grenier, J.-C., Siddle, K. J., Dumaine, A., Yotova, V. et al. (2016), ‘Widespread shortening of 3’ untranslated regions and increased exon inclusion are evolutionarily conserved features of innate immune responses to infection’, bioRxiv p. 026831.

Peake, P., Kriketos, A., Denyer, G., Campbell, L. & Charlesworth, J. (2003), ‘The postprandial response of adiponectin to a high-fat meal in normal and insulin-resistant subjects’, International journal of obesity 27(6), 657–662.

Pelham, H. R. (1982), ‘A regulatory upstream promoter element in the drosophila HSP 70 heat-shock gene’, Cell 30(2), 517–528.

Pleiss, J. A., Whitworth, G. B., Bergkessel, M. & Guthrie, C. (2007), ‘Rapid, transcript-specific changes in splicing in response to environmental stress’, Molecular cell 27(6), 928–937.

Poulsen, P., Kyvik, K., Vaag, A. & Beck-Nielsen, H. (1999), ‘Heritability of type II (non-insulin-dependent) diabetes mellitus and abnormal glucose tolerance - a population-based twin study’, Diabetologia 42(2), 139–145.

Powell, J. E., Henders, A. K., McRae, A. F., Wright, M. J., Martin, N. G., Dermitzakis, E. T., Montgomery, G. W. & Visscher, P. M. (2012), ‘Genetic control of gene expression in whole blood and lymphoblastoid cell lines is largely independent’, Genome research 22(3), 456–466.

Price, A. L., Zaitlen, N. A., Reich, D. & Patterson, N. (2010), ‘New approaches to population stratification in genome-wide association studies’, Nat Rev Genet 11(7).

Puri, V., Ranjit, S., Konda, S., Nicoloro, S. M., Straubhaar, J., Chawla, A., Lin, C., Burkart, A., Corvera, S., Perugini, R. A. et al. (2008), ‘CIDEA is associated with lipid droplets and insulin sensitivity in humans’, Proceedings of the National Academy of Sciences 105(22), 7833–7838.

Ràfols, M. E. (2014), ‘Adipose tissue: cell heterogeneity and functional diversity’, Endocrinología y Nutrición (English Edition) 61(2), 100–112.

Richards, J., Rivadeneira, F., Inouye, M., Pastinen, T., Soranzo, N., Wilson, S., Andrew, T., Falchi, M., Gwilliam, R., Ahmadi, K. et al. (2008), ‘Bone mineral density, osteoporosis, and osteoporotic fractures: a genome-wide association study’, The Lancet 371(9623), 1505–1512.

Rijsdijk, F. V. & Sham, P. C. (2002), ‘Analytic approaches to twin data using structural equation models’, Briefings in Bioinformatics 3(2), 119–133.

Robinson, M. D. & Oshlack, A. (2010), ‘A scaling normalization method for differential expression analysis of RNA-seq data’, Genome biology 11(3), 1.

Romanoski, C. E., Lee, S., Kim, M. J., Ingram-Drake, L., Plaisier, C. L., Yordanova, R., Tilford, C., Guan, B., He, A., Gargalovic, P. S. et al. (2010), ‘Systems genetics analysis of gene-by-environment interactions in human cells’, The American Journal of Human Genetics 86(3), 399–410.

Romere, C., Duerrschmid, C., Bournat, J., Constable, P., Jain, M., Xia, F., Saha, P. K., Del Solar, M., Zhu, B., York, B. et al. (2016), ‘Asprosin, a fasting-induced glucogenic protein hormone’, Cell 165(3), 566–579.

Rose, A., Huang, Z., Hoffmann, F., Denk, T., Hashimoto, A. & Hoffmann, P. (2013), ‘Calpastatin prevents NF-κB mediated hyperactivation of macrophages and attenuates colitis’, The Journal of Immunology 190(1 Supplement), 43–37.

Rosen, E. D. & MacDougald, O. A. (2006), ‘Adipocyte differentiation from the inside out’, Nature reviews Molecular cell biology 7(12), 885–896.

Ruden, D. M., Chen, L., Possidente, D., Possidente, B., Rasouli, P., Wang, L., Lu, X., Garfinkel, M. D., Hirsch, H. V. & Page, G. P. (2009), ‘Genetical toxicogenomics in drosophila identifies master-modulatory loci that are regulated by developmental exposure to lead’, Neurotoxicology 30(6), 898–914.

Rutkowski, J. M., Stern, J. H. & Scherer, P. E. (2015), ‘The cell biology of fat expansion’, The Journal of cell biology 208(5), 501–512.

Sacks, H. & Symonds, M. E. (2013), ‘Anatomical locations of human brown adipose tissue’, Diabetes 62(6), 1783–1790.

Saely, C. H., Geiger, K. & Drexel, H. (2012), ‘Brown versus white adipose tissue: A mini-review’, Gerontology 58(1), 15–23.

Sanyal, A., Lajoie, B. R., Jain, G. & Dekker, J. (2012), ‘The long-range interaction landscape of gene promoters’, Nature 489(7414), 109–113.

Shabalin, A. A. (2012), ‘Matrix eQTL: ultra fast eQTL analysis via large matrix operations’, Bioinformatics 28(10), 1353–1358.

Shen-Orr, S. S., Tibshirani, R., Khatri, P., Bodian, D. L., Staedtler, F., Perry, N. M., Hastie, T., Sarwal, M. M., Davis, M. M. & Butte, A. J. (2010), ‘Cell type–specific gene expression differences in complex tissues’, Nature methods 7(4), 287–289.

Shimomura, I., Hammer, R., Ikemoto, S., Brown, M. & Goldstein, J. (1999), ‘Leptin reverses insulin resistance and diabetes mellitus in mice with congenital lipodystrophy’, Nature 401(6748), 73–76.

Shlyueva, D., Stampfel, G. & Stark, A. (2014), ‘Transcriptional enhancers: from properties to genome-wide predictions’, Nature Reviews Genetics 15(4), 272–286.

Shungin, D., Winkler, T. W., Croteau-Chonka, D. C., Ferreira, T. et al. (2015), ‘New genetic loci link adipose and insulin biology to body fat distribution’, Nature 518(7538), 187–U378.

Simino, J., Shi, G., Weder, A., Boerwinkle, E., Hunt, S. C. & Rao, D. C. (2013), ‘Body mass index modulates blood pressure heritability: The family blood pressure program’, American journal of hypertension p. hpt144.

Smemo, S., Tena, J. J., Kim, K.-H., Gamazon, E. R., Sakabe, N. J., Gómez-Marín, C., Aneas, I., Credidio, F. L., Sobreira, D. R., Wasserman, N. F. et al. (2014), ‘Obesity-associated variants within FTO form long-range functional connections with irx3’, Nature 507(7492), 371–375.

Smirnov, D. A., Morley, M., Shin, E., Spielman, R. S. & Cheung, V. G. (2009), ‘Genetic analysis of radiation-induced changes in human gene expression’, Nature 459(7246), 587–591.

Smith, E. N. & Kruglyak, L. (2008), ‘Gene–environment interaction in yeast gene expression’, PLoS Biol 6(4), e83.

Smith, P. & Day, N. (1984), ‘The design of case-control studies: the influence of confounding and interaction effects’, International journal of epidemiology 13(3), 356–365.

Spalding, K. L., Arner, E., Westermark, P. O., Bernard, S., Buchholz, B. A., Bergmann, O., Blomqvist, L., Hoffstedt, J., Näslund, E., Britton, T. et al. (2008), ‘Dynamics of fat cell turnover in humans’, Nature 453(7196), 783–787.

Steppan, C. M., Bailey, S. T., Bhat, S., Brown, E. J., Banerjee, R. R., Wright, C. M., Patel, H. R., Ahima, R. S. & Lazar, M. A. (2001), ‘The hormone resistin links obesity to diabetes’, Nature 409(6818), 307–312.

Storey, J. D. & Tibshirani, R. (2003), ‘Statistical significance for genomewide studies’, Proceedings of the National Academy of Sciences 100(16), 9440–9445.

Stranger, B. E., Montgomery, S. B., Dimas, A. S., Parts, L., Stegle, O., Ingle, C. E., Sekowska, M., Smith, G. D., Evans, D., Gutierrez-Arcelus, M. et al. (2012), ‘Patterns of cis regulatory variation in diverse human populations’, PLoS Genet 8(4), e1002639.

Sul, J. H., Bilow, M., Yang, W.-Y., Kostem, E., Furlotte, N., He, D. & Eskin, E. (2016), ‘Accounting for population structure in gene-by-environment interactions in genome-wide association studies using mixed models’, PLoS Genet 12(3), e1005849.

Teucher, B., Skinner, J., Skidmore, P. M., Cassidy, A., Fairweather-Tait, S. J., Hooper, L., Roe, M. A., Foxall, R., Oyston, S. L., Cherkas, L. F. et al. (2007), ‘Dietary patterns and heritability of food choice in a uk female twin cohort’, Twin Research and Human Genetics 10(05), 734–748.

Thomas, S. M., Kagan, C., Pavlovic, B. J., Burnett, J., Patterson, K., Pritchard, J. K. & Gilad, Y. (2015), ‘Reprogramming LCLs to iPSCs results in recovery of donor-specific gene expression signature’, PLoS Genet 11(5), e1005216.

Thrift, A. P., Shaheen, N. J., Gammon, M. D., Bernstein, L., Reid, B. J., Onstad, L., Risch, H. A., Liu, G., Bird, N. C., Wu, A. H., Corley, D. A., Romero, Y., Chanock, S. J., Chow, W.-H., Casson, A. G., Levine, D. M., Zhang, R., Ek, W. E., MacGregor, S., Ye, W., Hardie, L. J., Vaughan, T. L. & Whiteman, D. C. (2014), ‘Obesity and risk of esophageal adenocarcinoma and barrett’s esophagus: A mendelian randomization study’, Jnci-Journal of the National Cancer Institute 106(11), dju252.

Tilgner, H., Knowles, D. G., Johnson, R., Davis, C. A., Chakrabortty, S., Djebali, S., Curado, J., Snyder, M., Gingeras, T. R. & Guigó, R. (2012), ‘Deep sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefficient for lncrnas’, Genome research 22(9), 1616–1625.

Travers, R., Motta, A., Betts, J., Bouloumié, A. & Thompson, D. (2015), ‘The impact of adiposity on adipose tissue-resident lymphocyte activation in humans’, International Journal of Obesity 39(5), 762–769.

Trayhurn, P. (2013), ‘Hypoxia and adipose tissue function and dysfunction in obesity’, Physiological reviews 93(1), 1–21.

Trayhurn, P. & Beattie, J. (2001), ‘Physiological role of adipose tissue: white adipose tissue as an endocrine and secretory organ’, Proceedings of the Nutrition Society 60(3), 329–339.

Tsao, D., Thomsen, H. K., Chou, J., Stratton, J., Hagen, M., Loo, C., Garcia, C., Sloane, D. L., Rosenthal, A. & Lin, J. C. (2008), ‘Trkb agonists ameliorate obesity and associated metabolic conditions in mice’, Endocrinology 149(3), 1038–1048.

Van Harmelen, V., Skurk, T., Röhrig, K., Lee, Y., Halbleib, M., Aprath-Husmann, I. & Hauner, H. (2003), ‘Effect of BMI and age on adipose tissue cellularity and differentiation capacity in women’, International journal of obesity 27(8), 889–895.

van Iterson, M. M., van Zwet, E. W., Slagboom, P. E., Heijmans, B. T., Consortium, B. et al. (2016), ‘Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution’, bioRxiv p. 055772.

Vinuela, A., Brown, A. A., Buil, A., Tsai, P.-C., Davies, M. N., Bell, J. T., Dermitzakis, E., Spector, T. & Small, K. (2016), ‘Age-dependent changes in mean and variance of gene expression across tissues in a twin cohort’, bioRxiv.

Viñuela, A., Snoek, L. B., Riksen, J. A. & Kammenga, J. E. (2010), ‘Genome-wide gene expression regulation as a function of genotype and age in c. elegans’, Genome research 20(7), 929–937.

Visscher, P. M., Benyamin, B. & White, I. (2004), ‘The use of linear mixed models to estimate variance components from data on twin pairs by maximum likelihood’, Twin Research 7(6), 670–674.

Visscher, P. M., Brown, M. A., McCarthy, M. I. & Yang, J. (2012), ‘Five years of gwas discovery’, The American Journal of Human Genetics 90(1), 7–24.

Walley, A. J., Asher, J. E. & Froguel, P. (2009), ‘The genetic contribution to non-syndromic human obesity’, Nature Reviews Genetics 10(7), 431–442.

Walter, K., Min, J. L., Huang, J., Crooks, L. et al. (2015), ‘The uk10k project identifies rare variants in health and disease’, Nature 526(7571), 82.

Wang, H., Maurano, M. T., Qu, H., Varley, K. E., Gertz, J., Pauli, F., Lee, K., Canfield, T., Weaver, M., Sandstrom, R. et al. (2012), ‘Widespread plasticity in CTCF occupancy linked to DNA methylation’, Genome research 22(9), 1680–1688.

Wang, N., Hoffman, E. P., Chen, L., Chen, L., Zhang, Z., Liu, C., Yu, G., Herrington, D. M., Clarke, R. & Wang, Y. (2016), ‘Mathematical modelling of transcriptional heterogeneity identifies novel markers and subpopulations in complex tissues’, Scientific reports 6.

Wang, X., Ding, X., Su, S., Spector, T., Mangino, M., Iliadou, A. & Snieder, H. (2009), ‘Heritability of insulin sensitivity and lipid profile depend on BMI: evidence for gene–obesity interaction’, Diabetologia 52(12), 2578–2584.

Weinstein, J. S., Lezon-Geyda, K., Maksimova, Y., Craft, S., Zhang, Y., Su, M., Schulz, V. P., Craft, J. & Gallagher, P. G. (2014), ‘Global transcriptome analysis and enhancer landscape of human primary T follicular helper and T effector lymphocytes’, Blood 124(25), 3719–3729.

Westra, H.-J., Arends, D., Esko, T., Peters, M. J., Schurmann, C., Schramm, K., Kettunen, J., Yaghootkar, H., Fairfax, B. P., Andiappan, A. K. et al. (2015), ‘Cell specific eQTL analysis without sorting cells’, PLoS Genet 11(5), e1005223.

Wheeler, H. E. & Dolan, M. E. (2012), ‘Lymphoblastoid cell lines in pharmacogenomic discovery and clinical translation’, Pharmacogenomics 13(1), 55–70.

Wheeler, H. E., Shah, K. P., Brenner, J., Garcia, T., Aquino-Michaels, K., Consortium, G., Cox, N. J., Nicolae, D. L. & Im, H. K. (2016), ‘Survey of the heritability and sparse architecture of gene expression traits across human tissues’, PLOS Genetics 12(11), 1–23.

Willer, C. J., Schmidt, E. M., Sengupta, S., Peloso, G. M. et al. (2013), ‘Discovery and refinement of loci associated with lipid levels’, Nature Genetics 45(11), 1274–1283.

Wong, J. J.-L., Au, A. Y., Ritchie, W. & Rasko, J. E. (2016), ‘Intron retention in mRNA: No longer nonsense’, BioEssays 38(1), 41–49.

Wood, A. R., Tyrrell, J., Beaumont, R., Jones, S. E., Tuke, M. A., Ruth, K. S., Yaghootkar, H., Freathy, R. M., Murray, A., Frayling, T. M. & Weedon, M. N. (2016), ‘Variants in the FTO and CDKAL1 loci have recessive effects on risk of obesity and type 2 diabetes, respectively’, Diabetologia 59(6), 1214–1221.

Wright, F. A., Sullivan, P. F., Brooks, A. I., Zou, F., Sun, W., Xia, K., Madar, V., Jansen, R., Chung, W., Zhou, Y.-H. et al. (2014), ‘Heritability and genomics of gene expression in peripheral blood’, Nature genetics 46(5), 430–437.

Wu, C., Kraft, P., Zhai, K., Chang, J., Wang, Z., Li, Y., Hu, Z., He, Z., Jia, W., Abnet, C. C. et al. (2012), ‘Genome-wide association analyses of esophageal squamous cell carcinoma in chinese identify multiple susceptibility loci and gene-environment interactions’, Nature genetics 44(10), 1090–1097.

Wu, L., Zhou, L., Chen, C., Gong, J., Xu, L., Ye, J., Li, P. et al. (2014), ‘Cidea controls lipid droplet fusion and lipid storage in brown and white adipose tissue’, Science China Life sciences 57(1), 107–116.

Xu, H., Barnes, G. T., Yang, Q., Tan, G., Yang, D., Chou, C. J., Sole, J., Nichols, A., Ross, J. S., Tartaglia, L. A. et al. (2003), ‘Chronic inflammation in fat plays a crucial role in the development of obesity-related insulin resistance’, The Journal of clinical investigation 112(12), 1821–1830.

Yang, J., Bakshi, A., Zhu, Z., Hemani, G., Vinkhuyzen, A. A. E., Lee, S. H., Robinson, M. R., Perry, J. R. B., Nolte, I. M., van Vliet-Ostaptchouk, J. V., Snieder, H., Esko, T., Milani, L., Maegi, R., Metspalu, A., Hamsten, A., Magnusson, P. K. E., Pedersen, N. L., Ingelsson, E., Soranzo, N., Keller, M. C., Wray, N. R., Goddard, M. E., Visscher, P. M. & Study, L. C. (2015), ‘Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index’, Nature genetics 47(10), 1114.

Yip, J., Facchini, F. S. & Reaven, G. M. (1998), ‘Resistance to insulin-mediated glucose disposal as a predictor of cardiovascular disease’, The Journal of Clinical Endocrinology & Metabolism 83(8), 2773–2776.

Young, A. I., Wauthier, F. & Donnelly, P. (2016), ‘Multiple novel gene-by-environment interactions modify the effect of FTO variants on body mass index’, Nature Communications 7, 12724.

Yuan, M., Konstantopoulos, N., Lee, J., Hansen, L., Li, Z.-W., Karin, M. & Shoelson, S. E. (2001), ‘Reversal of obesity- and diet-induced insulin resistance with salicylates or targeted disruption of ikkb’, Science 293(5535), 1673–1677.

Yvert, G., Brem, R. B., Whittle, J., Akey, J., Foss, E., Smith, E., Mackelprang, R. & Kruglyak, L. (2003), ‘Trans acting regulatory variation in saccharomyces cerevisiae and the role of transcription factors’, Nature genetics 35(1), 57–64.

Zavaroni, I., Bonini, L., Gasparini, P., Barilli, A., Zuccarelli, A., Dall’Aglio, E., Delsignore, R. & Reaven, G. (1999), ‘Hyperinsulinemia in a normal population as a predictor of non-insulin-dependent diabetes mellitus, hypertension, and coronary heart disease: The barilla factory revisited’, Metabolism 48(8), 989–994.

Zhang, H., Xue, C., Shah, R., Bermingham, K., Hinkle, C. C., Li, W., Rodrigues, A., Tabita-Martinez, J., Millar, J. S., Cuchel, M. et al. (2015), ‘Functional analysis and transcriptomic profiling of ipsc-derived macrophages and their application in modeling mendelian disease’, Circulation research 117(1), 17–28.

Zhao, K., Lu, Z.-x., Park, J. W., Zhou, Q. & Xing, Y. (2013), ‘Glimmps: robust statistical model for regulatory variation of alternative splicing using RNA-seq data’, Genome biology 14(7), 1.

Zhernakova, D., Deelen, P., Vermaat, M., van Iterson, M., van Galen, M., Arindrarto, W., van t Hof, P., Mei, H., van Dijk, F., Westra, H.-J. et al. (2015), ‘Hypothesis-free identification of modulators of genetic risk factors’, bioRxiv p. 033217.

Zhou, Z., Toh, S. Y., Chen, Z., Guo, K., Ng, C. P., Ponniah, S., Lin, S.-C., Hong, W. & Li, P. (2003), ‘Cidea-deficient mice have lean phenotype and are resistant to obesity’, Nature genetics 35(1), 49–56.

Zhu, Z., Zhang, F., Hu, H., Bakshi, A., Robinson, M. R., Powell, J. E., Montgomery, G. W., Goddard, M. E., Wray, N. R., Visscher, P. M. et al. (2016), ‘Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets’, Nature genetics .

Zimmerlin, L., Donnenberg, V. S., Pfeifer, M. E., Meyer, E. M., Péault, B., Rubin, J. P. & Donnenberg, A. D. (2010), ‘Stromal vascular progenitors in adult human adipose tissue’, Cytometry Part A 77(1), 22–30.

Appendix A

This appendix contains the supplementary tables that are too large to fit into the main body of the thesis.

Sample size by trait and tissue

| Trait | Adipose | Blood | Skin | LCLs |
| --- | --- | --- | --- | --- |
| Adiponectin | 377 | 192 | 344 | 404 |
| Leptin | 421 | 208 | 390 | 452 |
| Insulin | 691 | 350 | 647 | 739 |
| Glucose | 722 | 365 | 676 | 768 |
| LDL | 701 | 357 | 657 | 746 |
| HDL | 718 | 364 | 675 | 764 |
| Total Cholesterol | 724 | 365 | 677 | 769 |
| Triglycerides | 709 | 360 | 663 | 755 |
| Waist | 574 | 292 | 540 | 621 |
| Hip | 576 | 291 | 541 | 623 |
| WHR | 572 | 290 | 538 | 619 |
| BMI | 726 | 366 | 680 | 772 |
| Fruit & Vegetable | 684 | 349 | 641 | 730 |
| High Alcohol | 684 | 349 | 641 | 730 |
| Traditional English | 684 | 349 | 641 | 730 |
| Dieting | 684 | 349 | 641 | 730 |
| Low Meat | 684 | 349 | 641 | 730 |
| Trunk fat | 682 | 312 | 577 | 650 |
| Lean mass | 682 | 312 | 577 | 650 |
| Trunk % fat | 682 | 312 | 577 | 650 |
| Whole-body fat | 682 | 312 | 577 | 650 |
| Whole-body lean | 682 | 312 | 577 | 650 |
| Whole-body % fat | 682 | 312 | 577 | 650 |

Table A.1: Number of individuals per phenotype in each tissue.

Adipose Blood LCL Skin Gene SNP Allele Frequency Meta exon coordinates chr:start-stop Adipose GxBMI β GxBMI GxBMI GxBMI GxBMI P-value P-value P-value P-value
CHURC1 rs7143432 0.784 chr14:65398856-65401913 0.026 1.97 × 10^-12 0.33 0.26 0.45
CAST rs13160562 0.688 chr5:96104652-96106306 -0.032 3.91 × 10^-12 0.28 0.29 0.66
CIDEA rs7505859 0.617 chr18:12262824-12262968 -0.028 3.12 × 10^-11 0.82 NA NA
ZNF117 rs6948760 0.403 chr7:64450325-64451414 0.039 4.44 × 10^-11 0.72 0.92 0.97
ADH1A rs1693457 0.183 chr4:100202972-100203763 0.034 5.87 × 10^-11 NA NA NA
RP11-71E19.1 rs1980140 0.786 chr3:126006856-126007260 -0.058 6.05 × 10^-11 NA NA NA
PEPD rs10415555 0.811 chr19:33877855-33878387 -0.044 4.82 × 10^-10 0.71 0.91 0.64
PHACTR3 rs6070866 0.514 chr20:58418510-58420320 -0.052 7.22 × 10^-10 NA NA NA
ANXA5 rs2306420 0.714 chr4:122589110-122589682 0.022 1.37 × 10^-9 0.58 0.51 0.50
SIK1 rs12482956 0.710 chr21:44845287-44845403 0.058 3.00 × 10^-9 0.78 NA 0.36
HLA-DQB2 rs114370295 0.267 chr6:32726627-32726908 -0.050 3.50 × 10^-9 0.81 NA 0.59
ERV3-1 rs11979998 0.517 chr7:64451187-64453792 0.032 8.40 × 10^-9 0.32 0.03 0.004
POU6F2 rs34792397 0.747 chr7:39503781-39508200 -0.041 9.98 × 10^-9 NA NA NA
IFNAR1 rs2834098 0.781 chr21:34721695-34721849 -0.047 1.38 × 10^-8 0.52 0.72 0.90
SCFD2 rs7687982 0.754 chr4:54139993-54140168 -0.059 1.52 × 10^-8 0.78 0.68 0.27
SPAG17 rs9661038 0.637 chr1:118514403-118514670 0.043 2.77 × 10^-8 NA NA NA

Table A.2: 16 FDR5% G × BMI eQTLs in adipose tissue and their association P-values in Skin, Blood and LCLs. Table lists lead exon coordinates in TwinsUK. NA = not expressed in tissue.

Adipose Skin main Adipose main LCL Skin Blood Blood Gene SNP main effect LCL main effect P-value effect effect β main effect β main effect β main effect β main effect P-value P-value P-value
CHURC1 rs7143432 -0.765 6.00 × 10^-38 -0.762 4.02 × 10^-39 -0.846 3.89 × 10^-43 -0.737 9.56 × 10^-18
CAST rs13160562 -0.740 1.98 × 10^-43 -0.755 5.02 × 10^-49 -0.851 3.14 × 10^-56 -0.866 1.59 × 10^-32
CIDEA rs7505859 -0.469 4.63 × 10^-16 NA NA -0.310 2.74 × 10^-7 NA NA
ZNF117 rs6948760 0.575 1.14 × 10^-27 0.376 9.80 × 10^-13 0.474 1.29 × 10^-17 0.481 2.94 × 10^-11
ADH1A rs1693457 -0.893 2.85 × 10^-41 NA NA NA NA NA NA
RP11-71E19.1 rs1980140 -0.820 4.83 × 10^-42 NA NA NA NA NA NA
PEPD rs10415555 -0.905 1.90 × 10^-45 -0.006 0.93 -0.334 2.60 × 10^-6 -0.692 0.49
PHACTR3 rs6070866 -0.544 9.59 × 10^-25 NA NA NA NA NA NA
ANXA5 rs2306420 0.597 1.22 × 10^-26 0.403 1.29 × 10^-13 0.713 5.61 × 10^-37 0.478 1.34 × 10^-9
SIK1 rs12482956 0.117 0.0419 0.039 0.50 -0.067 0.27 NA NA
HLA-DQB2 rs114370295 -0.617 1.60 × 10^-26 -0.580 6.12 × 10^-24 -0.003 0.96 NA NA
ERV3-1 rs11979998 0.514 2.36 × 10^-21 0.303 2.45 × 10^-8 0.315 5.17 × 10^-8 0.333 1.60 × 10^-5
POU6F2 rs34792397 0.805 5.47 × 10^-46 NA NA NA NA NA NA
IFNAR1 rs2834098 -0.100 0.115 0.090 0.15 0.010 0.87 -0.010 0.87
SCFD2 rs7687982 -0.036 0.574 -0.067 0.28 0.010 0.87 0.111 0.20
SPAG17 rs9661038 0.041 0.467 NA NA NA NA NA NA

Table A.3: 16 FDR5% G × BMI eQTL main effect in each tissue.

| Gene | chr | β | t-stat | P-value |
| --- | --- | --- | --- | --- |
| ZNF423 | 16 | 0.06 | 7.29 | 8.23 × 10^-13 |
| THBS2 | 6 | 0.068 | 6.62 | 7.02 × 10^-11 |
| FBLN5 | 14 | 0.049 | 5.99 | 3.27 × 10^-9 |
| SLC44A2 | 19 | 0.045 | 5.99 | 3.40 × 10^-9 |
| NDRG1 | 8 | 0.052 | 5.92 | 5.03 × 10^-9 |
| GPR124 | 8 | 0.052 | 5.86 | 7.15 × 10^-9 |
| ACAA2 | 18 | -0.048 | -5.83 | 8.31 × 10^-9 |
| TLN2 | 15 | 0.059 | 5.80 | 9.91 × 10^-9 |
| A2M | 12 | 0.046 | 5.78 | 1.12 × 10^-8 |
| HSPG2 | 1 | 0.048 | 5.77 | 1.20 × 10^-8 |
| NFE2L1 | 17 | 0.052 | 5.76 | 1.26 × 10^-8 |
| AMOTL2 | 3 | 0.049 | 5.71 | 1.67 × 10^-8 |
| MYLK | 3 | 0.053 | 5.71 | 1.67 × 10^-8 |
| SEC14L1 | 17 | 0.049 | 5.70 | 1.72 × 10^-8 |
| ATP5F1 | 1 | -0.055 | -5.68 | 1.98 × 10^-8 |
| ARID1B | 6 | 0.059 | 5.67 | 2.09 × 10^-8 |
| TGFBR2 | 3 | 0.053 | 5.67 | 2.10 × 10^-8 |
| KIAA0195 | 17 | 0.054 | 5.65 | 2.37 × 10^-8 |
| LRP1 | 12 | 0.049 | 5.61 | 2.85 × 10^-8 |
| PLXND1 | 3 | 0.049 | 5.61 | 2.95 × 10^-8 |
| COX6C | 8 | -0.048 | -5.60 | 3.05 × 10^-8 |
| PACS1 | 11 | 0.056 | 5.60 | 3.10 × 10^-8 |
| COX7A2 | 6 | -0.048 | -5.60 | 3.13 × 10^-8 |
| HIP1 | 7 | 0.051 | 5.59 | 3.17 × 10^-8 |
| XXbac-B461K10.4 | 22 | 0.046 | 5.59 | 3.24 × 10^-8 |
| MBNL2 | 13 | 0.05 | 5.59 | 3.29 × 10^-8 |
| ANTXR1 | 2 | 0.059 | 5.56 | 3.76 × 10^-8 |
| TTC28 | 22 | 0.054 | 5.55 | 4.01 × 10^-8 |
| CALD1 | 7 | 0.044 | 5.55 | 4.13 × 10^-8 |
| PUM2 | 2 | 0.052 | 5.55 | 4.13 × 10^-8 |
| IMMT | 2 | -0.043 | -5.54 | 4.17 × 10^-8 |
| MAN1A1 | 6 | 0.052 | 5.51 | 5.14 × 10^-8 |
| RFTN1 | 3 | 0.048 | 5.51 | 5.14 × 10^-8 |
| SLC25A3 | 12 | -0.05 | -5.50 | 5.32 × 10^-8 |
| FNDC3B | 3 | 0.048 | 5.50 | 5.33 × 10^-8 |
| KLF6 | 10 | 0.051 | 5.48 | 5.85 × 10^-8 |
| LAMB2 | 3 | 0.047 | 5.47 | 6.34 × 10^-8 |
| TPP1 | 11 | 0.052 | 5.47 | 6.41 × 10^-8 |
| TRAM1 | 8 | 0.051 | 5.46 | 6.54 × 10^-8 |
| MCM3 | 6 | 0.058 | 5.44 | 7.36 × 10^-8 |
| COX6B1 | 19 | -0.055 | -5.44 | 7.47 × 10^-8 |
| ZC3H7A | 16 | 0.05 | 5.43 | 7.58 × 10^-8 |
| AC007098.1 | 2 | 0.048 | 5.42 | 8.19 × 10^-8 |
| TIMP2 | 17 | 0.046 | 5.42 | 8.21 × 10^-8 |
| KANK2 | 19 | 0.054 | 5.41 | 8.41 × 10^-8 |
| AC027323.1 | 5 | 0.047 | 5.41 | 8.43 × 10^-8 |
| ESYT2 | 7 | 0.049 | 5.41 | 8.73 × 10^-8 |
| LIMCH1 | 4 | 0.046 | 5.38 | 1.01 × 10^-7 |
| PTPN21 | 14 | 0.05 | 5.38 | 1.01 × 10^-7 |
| COL8A1 | 3 | 0.03 | 5.38 | 1.01 × 10^-7 |
| RPRD1A | 18 | 0.056 | 5.38 | 1.01 × 10^-7 |
| CAPN3 | 15 | 0.054 | 5.37 | 1.04 × 10^-7 |
| SMAD9 | 13 | 0.058 | 5.37 | 1.06 × 10^-7 |

Table A.4: All 53 ALG9 trans-G × BMI eQTL genes.

| Gene name | Mediation score | Sobel’s P-value | original G × BMI β | adjusted G × BMI β | original P-value | adjusted P-value |
| --- | --- | --- | --- | --- | --- | --- |
| ZNF423 | 0.300 | 0.0001 | 0.060 | 0.042 | 8.23 × 10^-13 | 8.72 × 10^-8 |
| THBS2 | 0.236 | 0.0001 | 0.068 | 0.052 | 7.02 × 10^-11 | 3.45 × 10^-7 |
| FBLN5 | 0.336 | 0.001 | 0.049 | 0.032 | 3.27 × 10^-9 | 3.58 × 10^-5 |
| SLC44A2 | 0.403 | 0.002 | 0.045 | 0.027 | 3.40 × 10^-9 | 1.25 × 10^-4 |
| NDRG1 | 0.386 | 0.002 | 0.052 | 0.032 | 5.03 × 10^-9 | 1.15 × 10^-4 |
| GPR124 | 0.375 | 0.002 | 0.052 | 0.032 | 7.15 × 10^-9 | 1.13 × 10^-4 |
| ACAA2 | 0.380 | 0.002 | -0.048 | -0.030 | 8.31 × 10^-9 | 1.33 × 10^-4 |
| TLN2 | 0.319 | 0.001 | 0.059 | 0.040 | 9.91 × 10^-9 | 4.89 × 10^-5 |
| A2M | 0.318 | 0.002 | 0.046 | 0.032 | 1.12 × 10^-8 | 5.13 × 10^-5 |
| HSPG2 | 0.371 | 0.002 | 0.048 | 0.030 | 1.20 × 10^-8 | 1.40 × 10^-4 |
| NFE2L1 | 0.416 | 0.005 | 0.052 | 0.030 | 1.26 × 10^-8 | 3.06 × 10^-4 |
| AMOTL2 | 0.357 | 0.003 | 0.049 | 0.032 | 1.67 × 10^-8 | 1.31 × 10^-4 |
| MYLK | 0.280 | 0.001 | 0.053 | 0.038 | 1.67 × 10^-8 | 3.17 × 10^-5 |
| SEC14L1 | 0.286 | 0.001 | 0.049 | 0.035 | 1.72 × 10^-8 | 3.63 × 10^-5 |
| ATP5F1 | 0.391 | 0.003 | -0.055 | -0.034 | 1.98 × 10^-8 | 2.58 × 10^-4 |
| ARID1B | 0.387 | 0.003 | 0.059 | 0.036 | 2.09 × 10^-8 | 2.48 × 10^-4 |
| TGFBR2 | 0.348 | 0.003 | 0.053 | 0.034 | 2.10 × 10^-8 | 1.28 × 10^-4 |
| KIAA0195 | 0.315 | 0.001 | 0.054 | 0.037 | 2.37 × 10^-8 | 7.57 × 10^-5 |
| LRP1 | 0.379 | 0.002 | 0.049 | 0.030 | 2.85 × 10^-8 | 2.54 × 10^-4 |
| PLXND1 | 0.387 | 0.004 | 0.049 | 0.030 | 2.95 × 10^-8 | 2.96 × 10^-4 |
| COX6C | 0.459 | 0.006 | -0.048 | -0.026 | 3.05 × 10^-8 | 9.37 × 10^-4 |
| PACS1 | 0.277 | 0.001 | 0.056 | 0.041 | 3.10 × 10^-8 | 4.39 × 10^-5 |
| COX7A2 | 0.447 | 0.005 | -0.048 | -0.026 | 3.13 × 10^-8 | 7.91 × 10^-4 |
| HIP1 | 0.244 | 0.001 | 0.051 | 0.039 | 3.17 × 10^-8 | 2.35 × 10^-5 |
| XXbac-B461K10.4 | 0.388 | 0.004 | 0.046 | 0.028 | 3.24 × 10^-8 | 3.18 × 10^-4 |
| MBNL2 | 0.364 | 0.003 | 0.050 | 0.032 | 3.29 × 10^-8 | 2.16 × 10^-4 |
| ANTXR1 | 0.274 | 0.001 | 0.059 | 0.043 | 3.76 × 10^-8 | 4.62 × 10^-5 |
| TTC28 | 0.314 | 0.001 | 0.054 | 0.037 | 4.01 × 10^-8 | 1.01 × 10^-4 |
| CALD1 | 0.337 | 0.003 | 0.044 | 0.029 | 4.13 × 10^-8 | 1.54 × 10^-4 |
| PUM2 | 0.333 | 0.002 | 0.052 | 0.035 | 4.13 × 10^-8 | 1.43 × 10^-4 |
| IMMT | 0.435 | 0.006 | -0.043 | -0.024 | 4.17 × 10^-8 | 7.58 × 10^-4 |
| MAN1A1 | 0.404 | 0.005 | 0.052 | 0.031 | 5.14 × 10^-8 | 5.22 × 10^-4 |
| RFTN1 | 0.454 | 0.006 | 0.048 | 0.026 | 5.14 × 10^-8 | 1.12 × 10^-3 |
| SLC25A3 | 0.390 | 0.004 | -0.050 | -0.031 | 5.32 × 10^-8 | 4.20 × 10^-4 |
| FNDC3B | 0.306 | 0.002 | 0.048 | 0.033 | 5.33 × 10^-8 | 1.02 × 10^-4 |
| KLF6 | 0.434 | 0.008 | 0.051 | 0.029 | 5.85 × 10^-8 | 8.82 × 10^-4 |
| LAMB2 | 0.399 | 0.005 | 0.047 | 0.028 | 6.34 × 10^-8 | 5.37 × 10^-4 |
| TPP1 | 0.367 | 0.004 | 0.052 | 0.033 | 6.41 × 10^-8 | 3.21 × 10^-4 |
| TRAM1 | 0.369 | 0.005 | 0.051 | 0.032 | 6.54 × 10^-8 | 3.36 × 10^-4 |
| MCM3 | 0.249 | 0.001 | 0.058 | 0.044 | 7.36 × 10^-8 | 4.37 × 10^-5 |
| COX6B1 | 0.426 | 0.005 | -0.055 | -0.032 | 7.47 × 10^-8 | 8.75 × 10^-4 |
| ZC3H7A | 0.355 | 0.003 | 0.050 | 0.032 | 7.58 × 10^-8 | 2.87 × 10^-4 |
| AC007098.1 | 0.212 | 0.001 | 0.048 | 0.037 | 8.19 × 10^-8 | 2.31 × 10^-5 |
| TIMP2 | 0.336 | 0.002 | 0.046 | 0.030 | 8.21 × 10^-8 | 2.20 × 10^-4 |
| KANK2 | 0.424 | 0.005 | 0.054 | 0.031 | 8.41 × 10^-8 | 9.04 × 10^-4 |
| AC027323.1 | 0.285 | 0.001 | 0.047 | 0.034 | 8.43 × 10^-8 | 9.30 × 10^-5 |
| ESYT2 | 0.428 | 0.005 | 0.049 | 0.028 | 8.73 × 10^-8 | 9.75 × 10^-4 |
| LIMCH1 | 0.263 | 0.001 | 0.046 | 0.034 | 1.01 × 10^-7 | 6.97 × 10^-5 |
| PTPN21 | 0.423 | 0.007 | 0.050 | 0.029 | 1.01 × 10^-7 | 9.67 × 10^-4 |
| COL8A1 | 0.380 | 0.003 | 0.030 | 0.019 | 1.01 × 10^-7 | 5.01 × 10^-4 |
| RPRD1A | 0.299 | 0.002 | 0.056 | 0.039 | 1.01 × 10^-7 | 1.32 × 10^-4 |
| CAPN3 | 0.221 | 0.001 | 0.054 | 0.042 | 1.04 × 10^-7 | 3.18 × 10^-5 |
| SMAD9 | 0.273 | 0.001 | 0.058 | 0.042 | 1.06 × 10^-7 | 8.52 × 10^-5 |

Table A.5: Mediation analysis suggests that ALG9 mediates the trans-network via its cis-eQTL.

| Gene | chr | SNP | Effect allele | β | P-value |
| --- | --- | --- | --- | --- | --- |
| CHURC1 | chr14 | rs7143432 | A | 0.026 | 1.97 × 10^-12 |
| CAST | chr5 | rs13160562 | G | -0.032 | 3.91 × 10^-12 |
| CIDEA | chr18 | rs7505859 | C | -0.028 | 3.12 × 10^-11 |
| ZNF117 | chr7 | rs6948760 | T | 0.039 | 4.44 × 10^-11 |
| ADH1A | chr4 | rs1693457 | C | 0.034 | 5.87 × 10^-11 |
| RP11-71E19.1 | chr3 | rs1980140 | A | -0.058 | 6.05 × 10^-11 |
| PEPD | chr19 | rs10415555 | A | -0.044 | 4.82 × 10^-10 |
| PHACTR3 | chr20 | rs6070866 | G | -0.052 | 7.22 × 10^-10 |
| ANXA5 | chr4 | rs2306420 | G | 0.022 | 1.37 × 10^-9 |
| SIK1 | chr21 | rs12482956 | A | 0.058 | 3.00 × 10^-9 |
| HLA-DQB2 | chr6 | rs114370295 | T | -0.050 | 3.50 × 10^-9 |
| ERV3-1 | chr7 | rs11979998 | C | 0.032 | 8.40 × 10^-9 |
| POU6F2 | chr7 | rs34792397 | G | -0.041 | 9.98 × 10^-9 |
| IFNAR1 | chr21 | rs2834098 | C | -0.047 | 1.38 × 10^-8 |
| SCFD2 | chr4 | rs7687982 | A | -0.059 | 1.52 × 10^-8 |
| ATP5E | chr20 | rs6070701 | T | -0.031 | 1.63 × 10^-8 |
| RAB23 | chr6 | rs2109640 | A | -0.035 | 1.76 × 10^-8 |
| RBM6 | chr3 | rs1076872 | A | 0.029 | 1.88 × 10^-8 |
| SPAG17 | chr1 | rs9661038 | G | 0.043 | 2.77 × 10^-8 |
| HLA-DQA2 | chr6 | rs241434 | C | 0.063 | 3.13 × 10^-8 |
| LTBP1 | chr2 | rs13030608 | C | 0.047 | 4.29 × 10^-8 |
| STT3B | chr3 | rs60808883 | T | -0.037 | 4.82 × 10^-8 |
| SND1 | chr7 | rs7785092 | T | 0.035 | 5.56 × 10^-8 |
| PI4KB | chr1 | rs3002292 | C | 0.067 | 6.19 × 10^-8 |
| HACL1 | chr3 | rs1464171 | C | 0.051 | 6.20 × 10^-8 |
| HABP4 | chr9 | rs10820507 | A | -0.034 | 6.37 × 10^-8 |
| SLC7A8 | chr14 | rs2145541 | C | -0.049 | 6.53 × 10^-8 |
| NDOR1 | chr9 | rs2501568 | T | 0.065 | 6.60 × 10^-8 |
| FADS2 | chr11 | rs34685600 | T | 0.036 | 8.10 × 10^-8 |
| ICAM3 | chr19 | rs61451658 | C | -0.038 | 8.88 × 10^-8 |
| HSPG2 | chr1 | rs12410718 | C | 0.019 | 8.95 × 10^-8 |
| PLEKHH2 | chr2 | rs4952959 | A | -0.032 | 9.22 × 10^-8 |
| MBTPS1 | chr16 | rs2549175 | T | -0.040 | 9.65 × 10^-8 |
| SKIV2L2 | chr5 | rs161704 | G | 0.055 | 9.90 × 10^-8 |
| FABP4 | chr8 | rs75843534 | T | -0.038 | 1.02 × 10^-7 |
| ZNHIT6 | chr1 | rs1750491 | T | 0.058 | 1.06 × 10^-7 |
| PLAT | chr8 | rs34060364 | C | 0.073 | 1.07 × 10^-7 |
| ATAD2 | chr8 | rs16899510 | A | -0.065 | 1.09 × 10^-7 |
| SMARCD2 | chr17 | rs7219143 | A | -0.058 | 1.13 × 10^-7 |
| ARL16 | chr17 | rs12937453 | C | 0.043 | 1.15 × 10^-7 |
| GOLGA8A | chr15 | rs4924108 | C | -0.048 | 1.23 × 10^-7 |
| TPRG1L | chr1 | rs2493324 | G | 0.043 | 1.42 × 10^-7 |
| XXbac-BPG246D15.9 | chr6 | rs3819721 | G | -0.044 | 1.51 × 10^-7 |
| GAA | chr17 | 17-78673474 | G | 0.040 | 1.62 × 10^-7 |
| FAT1 | chr4 | rs11930461 | C | -0.027 | 1.92 × 10^-7 |
| USP39 | chr2 | rs6733569 | C | -0.039 | 2.10 × 10^-7 |
| WDR86 | chr7 | rs34812229 | G | 0.041 | 2.11 × 10^-7 |
| PLEKHM2 | chr1 | rs486557 | G | 0.044 | 2.18 × 10^-7 |
| RABGGTB | chr1 | rs76613167 | C | 0.047 | 2.23 × 10^-7 |
| VARS | chr6 | rs114881836 | T | -0.039 | 2.36 × 10^-7 |
| TYSND1 | chr10 | rs4746060 | C | -0.046 | 2.44 × 10^-7 |
| COG7 | chr16 | rs1946274 | C | -0.066 | 2.55 × 10^-7 |
| ABCA1 | chr9 | rs2472390 | T | 0.028 | 2.59 × 10^-7 |
| WISP2 | chr20 | rs148428568 | A | 0.027 | 2.67 × 10^-7 |
| FASN | chr17 | rs11653879 | C | -0.017 | 2.91 × 10^-7 |
| MYH9 | chr22 | rs9610474 | T | 0.039 | 2.93 × 10^-7 |
| GHITM | chr10 | rs11201352 | T | -0.019 | 2.99 × 10^-7 |
| C16orf89 | chr16 | rs12921958 | G | 0.055 | 3.01 × 10^-7 |
| FN1 | chr2 | rs4673914 | C | -0.021 | 3.24 × 10^-7 |
| CIITA | chr16 | rs8052709 | A | -0.059 | 3.37 × 10^-7 |
| LENG8 | chr19 | rs10426302 | G | 0.026 | 3.37 × 10^-7 |
| C1orf115 | chr1 | rs774132 | T | -0.036 | 3.55 × 10^-7 |
| YIPF1 | chr1 | rs12746317 | G | 0.045 | 3.82 × 10^-7 |
| SFT2D1 | chr6 | rs7741165 | T | 0.041 | 3.99 × 10^-7 |
| EPS15 | chr1 | rs11577254 | C | -0.068 | 4.07 × 10^-7 |
| GRB10 | chr7 | rs73344604 | C | 0.075 | 4.08 × 10^-7 |
| UPP1 | chr7 | rs9639996 | A | -0.034 | 4.09 × 10^-7 |
| SELP | chr1 | rs140855401 | C | -0.043 | 4.12 × 10^-7 |
| SPARC | chr5 | rs13168551 | T | 0.014 | 4.30 × 10^-7 |
| SERPINF1 | chr17 | rs2957924 | A | 0.040 | 4.33 × 10^-7 |
| HLA-H | chr6 | rs145825529 | A | -0.107 | 4.35 × 10^-7 |
| SCUBE1 | chr22 | rs5759291 | T | -0.046 | 4.57 × 10^-7 |
| ACTN1 | chr14 | rs10130694 | T | 0.026 | 4.74 × 10^-7 |
| DIRC2 | chr3 | 3-121765134 | C | -0.076 | 4.83 × 10^-7 |
| DIS3L | chr15 | rs899086 | G | -0.042 | 4.87 × 10^-7 |
| SCUBE2 | chr11 | rs10743099 | C | 0.042 | 4.94 × 10^-7 |
| ACO1 | chr9 | rs706134 | A | 0.027 | 5.03 × 10^-7 |
| UTP15 | chr5 | rs115171889 | A | -0.076 | 5.07 × 10^-7 |
| CD36 | chr7 | rs2906199 | A | 0.061 | 5.16 × 10^-7 |
| IRF8 | chr16 | rs35615495 | C | 0.027 | 5.28 × 10^-7 |
| COL4A2 | chr13 | rs336231 | C | -0.029 | 5.29 × 10^-7 |
| TAF1C | chr16 | rs6563998 | G | -0.079 | 5.36 × 10^-7 |
| INTS1 | chr7 | rs10267796 | G | -0.038 | 5.44 × 10^-7 |
| NXN | chr17 | rs7210753 | G | 0.043 | 5.73 × 10^-7 |
| SLC43A2 | chr17 | rs7220186 | T | 0.042 | 5.73 × 10^-7 |
| DTX1 | chr12 | rs12821546 | A | -0.036 | 5.87 × 10^-7 |
| TAP2 | chr6 | rs3819721 | G | -0.041 | 5.90 × 10^-7 |
| TRIP12 | chr2 | rs73992965 | A | 0.024 | 5.93 × 10^-7 |
| AAK1 | chr2 | rs3771537 | A | 0.035 | 6.18 × 10^-7 |
| ADAMTSL3 | chr15 | rs2667385 | G | -0.028 | 6.19 × 10^-7 |
| RPS27L | chr15 | rs139076858 | T | 0.053 | 6.45 × 10^-7 |
| AC006276.1 | chr19 | rs8108656 | T | 0.051 | 6.49 × 10^-7 |
| CD46 | chr1 | rs1853526 | G | 0.018 | 6.59 × 10^-7 |
| LARGE | chr22 | rs61517662 | T | -0.064 | 6.59 × 10^-7 |
| ABCA5 | chr17 | rs11652207 | T | -0.038 | 6.63 × 10^-7 |
| MRPL43 | chr10 | rs912477 | A | 0.079 | 6.85 × 10^-7 |
| ANAPC7 | chr12 | rs117514387 | C | 0.075 | 6.88 × 10^-7 |
| WDR3 | chr1 | rs6692932 | G | 0.064 | 7.09 × 10^-7 |
| ATPIF1 | chr1 | rs4908343 | G | -0.059 | 7.39 × 10^-7 |
| ITPR3 | chr6 | rs7775537 | C | -0.053 | 7.45 × 10^-7 |
| COL5A2 | chr2 | rs6749559 | A | -0.070 | 7.53 × 10^-7 |
| TBX3 | chr12 | rs66524339 | T | 0.061 | 7.63 × 10^-7 |
| GRB14 | chr2 | rs9798198 | C | -0.090 | 7.64 × 10^-7 |
| STC2 | chr5 | rs792757 | T | 0.073 | 7.90 × 10^-7 |
| DENND4A | chr15 | rs8025898 | C | -0.053 | 8.01 × 10^-7 |
| ARF3 | chr12 | rs7316688 | G | -0.031 | 8.06 × 10^-7 |
| MCC | chr5 | rs72803252 | C | 0.059 | 8.14 × 10^-7 |
| PLOD1 | chr1 | rs12731666 | T | 0.042 | 8.31 × 10^-7 |
| PDCD6IP | chr3 | rs77436863 | G | -0.083 | 8.32 × 10^-7 |
| DIRC3 | chr2 | rs1179691 | G | -0.037 | 8.40 × 10^-7 |
| SPATA7 | chr14 | rs113406033 | G | -0.077 | 8.43 × 10^-7 |
| ARFGEF2 | chr20 | rs852368 | A | 0.027 | 8.53 × 10^-7 |
| C10orf76 | chr10 | rs7918630 | C | 0.032 | 8.58 × 10^-7 |
| IQGAP1 | chr15 | rs72748836 | T | -0.019 | 8.66 × 10^-7 |
| DDX17 | chr22 | 22-39040451 | A | 0.015 | 8.76 × 10^-7 |
| CCNT2 | chr2 | rs1519308 | A | -0.044 | 8.77 × 10^-7 |
| MTA2 | chr11 | rs9943597 | A | 0.028 | 8.81 × 10^-7 |
| TP53I11 | chr11 | rs835745 | A | -0.036 | 8.95 × 10^-7 |
| SLC9A8 | chr20 | rs13038606 | A | -0.091 | 9.27 × 10^-7 |
| H1FX-AS1 | chr3 | rs57322580 | G | 0.063 | 9.27 × 10^-7 |
| PLOD2 | chr3 | rs6440413 | T | -0.021 | 9.54 × 10^-7 |
| RINT1 | chr7 | rs62481297 | G | 0.084 | 9.58 × 10^-7 |
| LRPAP1 | chr4 | rs2916426 | G | 0.033 | 9.61 × 10^-7 |
| RSL1D1 | chr16 | rs8045743 | C | -0.058 | 9.68 × 10^-7 |
| TRAP1 | chr16 | rs140423238 | A | -0.074 | 9.70 × 10^-7 |
| RP11-438L7.3 | chr12 | rs11045619 | A | -0.045 | 9.76 × 10^-7 |
| RP11-290P14.2 | chr1 | rs12566094 | G | -0.053 | 9.99 × 10^-7 |

Table A.6: All 127 G × BMI eQTLs with P-value ≤ 1 × 10^-6

Gene  Original P-value  BMI Beta  Adj P-value  Adj Beta

ENSG00000173372.12  3.38 × 10^-15  0.269609937  5.88 × 10^-6  0.110411758
ENSG00000169896.12  1.11 × 10^-14  0.283622396  1.45 × 10^-5  0.111293749
ENSG00000170458.9  1.26 × 10^-14  0.260534275  6.38 × 10^-6  0.107695417
ENSG00000124491.11  1.58 × 10^-14  0.288550398  1.11 × 10^-5  0.121842429
ENSG00000110324.5  2.40 × 10^-14  0.277937302  9.30 × 10^-6  0.114736864
ENSG00000026297.11  2.41 × 10^-14  0.243479282  7.44 × 10^-6  0.106930664
ENSG00000163154.5  2.65 × 10^-14  0.27300384  1.18 × 10^-5  0.122326855
ENSG00000110079.12  2.72 × 10^-14  0.28124658  3.15 × 10^-5  0.105497338
ENSG00000006534.11  3.61 × 10^-14  0.255958004  5.74 × 10^-6  0.124503012
ENSG00000160014.12  3.64 × 10^-14  0.257277785  8.61 × 10^-6  0.122853499
ENSG00000165457.9  4.02 × 10^-14  0.265995114  5.52 × 10^-5  0.097602776
ENSG00000104972.10  4.18 × 10^-14  0.279051753  4.35 × 10^-6  0.14004233
ENSG00000010327.6  4.19 × 10^-14  0.248126389  4.32 × 10^-6  0.122121649
ENSG00000140749.7  4.86 × 10^-14  0.27759499  7.38 × 10^-6  0.134634725
ENSG00000187474.4  6.01 × 10^-14  0.280548028  3.59 × 10^-5  0.113335391
ENSG00000169403.7  8.38 × 10^-14  0.27398772  8.75 × 10^-6  0.127778974
ENSG00000123329.13  8.71 × 10^-14  0.268207614  1.26 × 10^-5  0.128006162
ENSG00000160791.12  9.66 × 10^-14  0.277238073  6.72 × 10^-6  0.137798434
ENSG00000110876.8  1.06 × 10^-13  0.267722551  2.67 × 10^-5  0.113675525
ENSG00000104894.7  1.09 × 10^-13  0.262845718  6.37 × 10^-6  0.133950244
ENSG00000239998.1  1.26 × 10^-13  0.270564397  1.12 × 10^-5  0.132142959
ENSG00000179163.11  1.40 × 10^-13  0.26609435  8.49 × 10^-5  0.103617208
ENSG00000167642.8  1.43 × 10^-13  0.256746787  6.95 × 10^-6  0.132464853
ENSG00000184574.5  1.74 × 10^-13  0.269271619  7.84 × 10^-6  0.134675768
ENSG00000163563.7  1.90 × 10^-13  0.274535424  1.75 × 10^-5  0.128943449
ENSG00000173369.11  2.24 × 10^-13  0.258948721  0.000169295  0.091938106
ENSG00000128973.7  2.99 × 10^-13  0.225380159  6.31 × 10^-6  0.120782256
ENSG00000141574.3  3.28 × 10^-13  0.246811084  7.11 × 10^-6  0.13403383
ENSG00000183801.3  3.43 × 10^-13  0.270109371  4.52 × 10^-6  0.148492921
ENSG00000163803.8  3.47 × 10^-13  0.264030128  1.77 × 10^-5  0.129574512
ENSG00000130592.9  3.56 × 10^-13  0.239125667  5.28 × 10^-5  0.109026937
ENSG00000155307.13  4.55 × 10^-13  0.268659309  2.61 × 10^-5  0.126658621
ENSG00000143387.8  5.94 × 10^-13  0.259685816  1.59 × 10^-5  0.134383468
ENSG00000185482.3  6.02 × 10^-13  0.252376378  8.13 × 10^-6  0.139905106

ENSG00000161381.9  6.25 × 10^-13  0.267442578  3.73 × 10^-5  0.128173075
ENSG00000106952.3  7.48 × 10^-13  0.266217768  4.73 × 10^-5  0.123113888
ENSG00000042493.11  7.97 × 10^-13  0.241976983  2.77 × 10^-5  0.119566488
ENSG00000138172.6  1.00 × 10^-12  0.2436147  1.62 × 10^-5  0.1266924
ENSG00000174004.5  1.03 × 10^-12  0.247616066  8.66 × 10^-6  0.129803439
ENSG00000145730.16  1.07 × 10^-12  0.262898754  6.25 × 10^-6  0.146017704
ENSG00000132205.6  1.21 × 10^-12  0.257491754  0.000791721  0.089250187
ENSG00000111802.9  1.43 × 10^-12  0.236674232  9.55 × 10^-6  0.133129282
ENSG00000105383.10  1.83 × 10^-12  0.262212051  0.000782558  0.092898365
ENSG00000171812.6  2.18 × 10^-12  0.257718214  4.57 × 10^-5  0.12866248
ENSG00000107819.9  2.18 × 10^-12  0.253184158  2.13 × 10^-5  0.136420014
ENSG00000135124.10  2.33 × 10^-12  0.236966961  0.000233923  0.100144427
ENSG00000188112.4  2.48 × 10^-12  0.252704581  7.87 × 10^-6  0.146769762
ENSG00000196639.6  2.56 × 10^-12  0.263778871  9.65 × 10^-5  0.120401892
ENSG00000110931.14  2.78 × 10^-12  0.252527108  4.21 × 10^-6  0.149983469
ENSG00000102390.6  3.09 × 10^-12  0.24333911  3.98 × 10^-6  0.150508169
ENSG00000181467.2  3.27 × 10^-12  0.248489585  1.38 × 10^-5  0.135723261
ENSG00000113552.11  3.45 × 10^-12  0.245129304  3.91 × 10^-5  0.124743829
ENSG00000188921.12  3.71 × 10^-12  0.254496969  1.98 × 10^-5  0.133805524
ENSG00000244482.5  3.84 × 10^-12  0.256301866  9.47 × 10^-5  0.117235341
ENSG00000167925.11  4.04 × 10^-12  0.191877629  5.42 × 10^-6  0.116179041
ENSG00000198771.6  4.31 × 10^-12  0.259308292  2.39 × 10^-5  0.137678869
ENSG00000136040.4  4.40 × 10^-12  0.255534353  1.76 × 10^-5  0.139195952
ENSG00000174600.9  4.86 × 10^-12  0.255568935  0.00085345  0.092260358
ENSG00000167895.10  5.08 × 10^-12  0.238335101  9.87 × 10^-6  0.138550733
ENSG00000114268.7  5.68 × 10^-12  0.249882138  5.20 × 10^-5  0.129014579
ENSG00000174083.13  6.12 × 10^-12  0.247242745  4.18 × 10^-5  0.128678063
ENSG00000169410.5  6.25 × 10^-12  0.251745379  4.79 × 10^-6  0.153943609
ENSG00000113273.11  8.42 × 10^-12  0.248586371  5.87 × 10^-6  0.150951631
ENSG00000125354.18  9.21 × 10^-12  0.253245139  7.01 × 10^-6  0.152082617
ENSG00000127838.9  9.35 × 10^-12  0.199995404  4.20 × 10^-6  0.127498453
ENSG00000162739.9  1.05 × 10^-11  0.253107088  1.09 × 10^-5  0.146893125
ENSG00000173020.6  1.06 × 10^-11  0.171526242  1.17 × 10^-5  0.101840828
ENSG00000182578.9  1.07 × 10^-11  0.244137059  0.002546497  0.07491729
ENSG00000162711.12  1.10 × 10^-11  0.252521988  0.000518751  0.100859003
ENSG00000166734.14  1.10 × 10^-11  0.211592751  7.50 × 10^-6  0.125986493

ENSG00000136754.12  1.11 × 10^-11  0.230701478  1.60 × 10^-5  0.130638433
ENSG00000072274.8  1.26 × 10^-11  0.255802183  9.23 × 10^-6  0.15039755
ENSG00000171777.11  1.27 × 10^-11  0.242695544  0.000904914  0.093909464
ENSG00000156587.11  1.34 × 10^-11  0.235368087  7.64 × 10^-6  0.147266831
ENSG00000225614.2  1.40 × 10^-11  0.211378725  8.21 × 10^-6  0.129288962
ENSG00000186340.10  1.44 × 10^-11  0.255237231  2.07 × 10^-5  0.149631763
ENSG00000116824.4  1.44 × 10^-11  0.251397316  7.31 × 10^-6  0.154347129
ENSG00000235568.2  1.56 × 10^-11  0.239625702  0.001363578  0.087802053
ENSG00000115165.5  1.58 × 10^-11  0.253713927  9.86 × 10^-6  0.150693155
ENSG00000059377.11  1.75 × 10^-11  0.242384428  0.001209453  0.089334082
ENSG00000085265.6  1.75 × 10^-11  0.248130433  2.43 × 10^-5  0.143279684
ENSG00000158717.6  1.91 × 10^-11  0.197240268  1.03 × 10^-5  0.119458531
ENSG00000106948.12  1.92 × 10^-11  0.238117753  9.16 × 10^-6  0.146950366
ENSG00000129515.14  1.97 × 10^-11  0.232496151  6.08 × 10^-5  0.120700179
ENSG00000092009.6  1.97 × 10^-11  0.244297481  1.15 × 10^-5  0.147606714
ENSG00000185905.3  2.00 × 10^-11  0.24865281  2.63 × 10^-5  0.140699202
ENSG00000038427.11  2.10 × 10^-11  0.248428763  0.000183744  0.120217784
ENSG00000109113.13  2.13 × 10^-11  0.213940204  1.05 × 10^-5  0.132318542
ENSG00000185215.4  2.14 × 10^-11  0.231279562  6.74 × 10^-5  0.122049096
ENSG00000006194.6  2.18 × 10^-11  0.235024538  6.09 × 10^-6  0.15097296
ENSG00000196209.8  2.22 × 10^-11  0.25070611  0.000422271  0.107179163
ENSG00000163399.11  2.24 × 10^-11  0.211881636  1.83 × 10^-5  0.123139239
ENSG00000164440.10  2.26 × 10^-11  0.248417699  9.23 × 10^-6  0.153278997
ENSG00000182287.9  2.29 × 10^-11  0.234311636  2.45 × 10^-5  0.133069693
ENSG00000009790.10  2.30 × 10^-11  0.249703331  9.01 × 10^-6  0.154936226
ENSG00000139974.11  2.35 × 10^-11  0.239761376  5.98 × 10^-5  0.13096258
ENSG00000196533.6  2.40 × 10^-11  0.250219329  1.19 × 10^-5  0.15323489
ENSG00000159239.7  2.64 × 10^-11  0.220639385  6.47 × 10^-6  0.140808147
ENSG00000170684.4  2.92 × 10^-11  0.212788136  6.87 × 10^-6  0.136651128
ENSG00000164934.9  2.93 × 10^-11  0.220010895  4.32 × 10^-6  0.146645957
ENSG00000185515.10  3.02 × 10^-11  0.220169468  7.77 × 10^-6  0.139646461
ENSG00000111729.8  3.08 × 10^-11  0.24621533  0.000120763  0.124737793
ENSG00000012660.9  3.12 × 10^-11  -0.218227781  6.21 × 10^-6  -0.141607755
ENSG00000111670.10  3.18 × 10^-11  0.221764661  2.46 × 10^-5  0.125583831
ENSG00000135842.12  3.23 × 10^-11  0.239971938  7.59 × 10^-6  0.150669605
ENSG00000090659.13  3.24 × 10^-11  0.245847462  0.006087436  0.07154081

ENSG00000137404.10  3.43 × 10^-11  0.219212435  2.23 × 10^-5  0.131119069
ENSG00000160932.6  3.44 × 10^-11  0.210270799  6.46 × 10^-6  0.139493791
ENSG00000105464.3  3.88 × 10^-11  0.229850261  4.85 × 10^-6  0.152322944
ENSG00000156966.6  4.04 × 10^-11  0.241022623  0.000751171  0.100142514
ENSG00000148344.10  4.04 × 10^-11  0.231057513  4.66 × 10^-5  0.133348375
ENSG00000171603.12  4.35 × 10^-11  0.219711314  9.65 × 10^-5  0.11688049
ENSG00000141968.3  4.36 × 10^-11  0.234018874  0.001156992  0.093158195
ENSG00000134569.5  4.42 × 10^-11  0.245031945  1.09 × 10^-5  0.154806363
ENSG00000167604.9  4.44 × 10^-11  0.216437057  9.38 × 10^-6  0.138450326
ENSG00000127334.10  4.50 × 10^-11  0.232635678  9.52 × 10^-5  0.11932727
ENSG00000174780.11  4.62 × 10^-11  0.22155175  1.31 × 10^-5  0.136651243
ENSG00000115232.9  4.67 × 10^-11  0.239604776  1.24 × 10^-5  0.147047032
ENSG00000148672.7  4.71 × 10^-11  0.227223681  1.94 × 10^-5  0.136864303
ENSG00000106483.7  4.72 × 10^-11  0.246221807  4.56 × 10^-5  0.138902717
ENSG00000158481.8  4.74 × 10^-11  0.247299722  6.30 × 10^-5  0.135223062
ENSG00000122861.11  4.79 × 10^-11  0.235925982  4.80 × 10^-6  0.156639066
ENSG00000112799.4  5.01 × 10^-11  0.243452264  4.60 × 10^-5  0.138322315
ENSG00000135655.9  5.25 × 10^-11  0.209937792  1.06 × 10^-5  0.132252981
ENSG00000167994.7  5.34 × 10^-11  0.210709488  0.000135313  0.107294455
ENSG00000178904.14  6.00 × 10^-11  0.217308363  1.38 × 10^-5  0.133966118
ENSG00000068912.9  6.48 × 10^-11  0.225013027  1.23 × 10^-5  0.140234246
ENSG00000169490.12  6.55 × 10^-11  0.231441348  9.02 × 10^-6  0.15013623
ENSG00000204516.5  6.91 × 10^-11  0.246115971  1.21 × 10^-5  0.158384942
ENSG00000101342.5  7.01 × 10^-11  0.238344703  7.02 × 10^-5  0.129810453
ENSG00000174130.8  7.35 × 10^-11  0.240989304  0.000459927  0.111187099
ENSG00000168118.7  7.47 × 10^-11  0.228145644  6.72 × 10^-6  0.150205877
ENSG00000131795.8  7.57 × 10^-11  0.236181432  3.86 × 10^-6  0.160644522
ENSG00000119333.7  7.80 × 10^-11  0.188465944  7.94 × 10^-6  0.123460191
ENSG00000109586.7  7.80 × 10^-11  0.233794892  5.80 × 10^-5  0.130280302
ENSG00000100767.11  8.01 × 10^-11  0.235026101  0.000319679  0.115513563
ENSG00000137273.3  8.28 × 10^-11  0.239358565  6.29 × 10^-6  0.160912308
ENSG00000136628.13  8.94 × 10^-11  0.173456582  8.06 × 10^-6  0.113911163
ENSG00000147394.14  9.82 × 10^-11  0.243581033  0.000165133  0.124820684
ENSG00000121933.13  1.00 × 10^-10  0.239388243  0.000824528  0.103458803
ENSG00000159322.13  1.02 × 10^-10  0.236446087  4.63 × 10^-5  0.139124633
ENSG00000135211.5  1.06 × 10^-10  0.214888522  1.42 × 10^-5  0.139361953

ENSG00000100603.9  1.07 × 10^-10  0.215939956  1.57 × 10^-5  0.137679373
ENSG00000198786.2  1.07 × 10^-10  -0.20252869  9.60 × 10^-6  -0.132666753
ENSG00000185567.6  1.08 × 10^-10  0.241289315  0.000132438  0.132825739
ENSG00000113643.4  1.10 × 10^-10  0.177496566  1.01 × 10^-5  0.116837915
ENSG00000143222.7  1.11 × 10^-10  0.200096483  8.98 × 10^-6  0.132980375
ENSG00000124786.8  1.18 × 10^-10  0.204651526  8.33 × 10^-6  0.135198837
ENSG00000130813.13  1.26 × 10^-10  0.209906549  7.38 × 10^-6  0.142846723
ENSG00000113368.7  1.26 × 10^-10  0.239022217  0.000103622  0.128121547
ENSG00000137845.10  1.27 × 10^-10  0.215381329  5.88 × 10^-5  0.121382772
ENSG00000125821.7  1.28 × 10^-10  0.204596131  5.26 × 10^-5  0.122037795
ENSG00000187796.9  1.29 × 10^-10  0.218529827  0.000171315  0.112793401
ENSG00000204209.6  1.29 × 10^-10  0.213150365  1.74 × 10^-5  0.134829172
ENSG00000113648.12  1.36 × 10^-10  0.238301843  0.000857889  0.105308179
ENSG00000255302.3  1.36 × 10^-10  0.221870764  2.81 × 10^-5  0.133982966
ENSG00000198795.6  1.57 × 10^-10  0.231058997  4.95 × 10^-6  0.159357125
ENSG00000136895.14  1.58 × 10^-10  0.234226485  3.75 × 10^-5  0.141313495
ENSG00000099377.9  1.58 × 10^-10  0.202406872  0.000225858  0.105167921
ENSG00000153823.14  1.59 × 10^-10  0.239853628  4.65 × 10^-5  0.142888301
ENSG00000140367.7  1.70 × 10^-10  0.220348116  3.25 × 10^-5  0.133377673
ENSG00000121297.6  1.75 × 10^-10  0.233195385  4.43 × 10^-6  0.161761289
ENSG00000173200.8  1.80 × 10^-10  0.234585604  0.000337252  0.116122856
ENSG00000132341.7  1.87 × 10^-10  0.218513065  2.88 × 10^-5  0.137012025
ENSG00000198832.6  1.92 × 10^-10  0.187841159  6.35 × 10^-6  0.129326369
ENSG00000227057.3  1.94 × 10^-10  0.179440974  3.66 × 10^-5  0.109837324
ENSG00000092010.10  2.00 × 10^-10  0.208367812  3.93 × 10^-5  0.127370393
ENSG00000106829.14  2.03 × 10^-10  0.236329097  0.000167068  0.126678509
ENSG00000139190.12  2.06 × 10^-10  0.23772463  4.16 × 10^-6  0.168779846
ENSG00000177106.10  2.08 × 10^-10  0.213454086  0.00017613  0.115162628
ENSG00000167600.9  2.21 × 10^-10  0.229499567  0.00097895  0.10105882
ENSG00000158473.6  2.24 × 10^-10  0.236431083  3.90 × 10^-5  0.144868473
ENSG00000180448.6  2.25 × 10^-10  0.180976727  4.21 × 10^-5  0.107824259
ENSG00000204386.6  2.26 × 10^-10  0.20522375  5.05 × 10^-5  0.12278487
ENSG00000135845.5  2.39 × 10^-10  0.224520295  5.23 × 10^-5  0.135934033
ENSG00000143515.12  2.41 × 10^-10  0.233064838  1.07 × 10^-5  0.156258138
ENSG00000125744.7  2.42 × 10^-10  0.203688782  7.13 × 10^-5  0.118220297
ENSG00000131080.10  2.43 × 10^-10  0.233326565  5.50 × 10^-6  0.16223549

ENSG00000174720.11  2.60 × 10^-10  0.17634824  6.78 × 10^-6  0.122097505
ENSG00000102007.6  2.61 × 10^-10  0.23080839  0.001216531  0.102368434
ENSG00000100196.6  2.63 × 10^-10  0.235034958  6.24 × 10^-6  0.16349642
ENSG00000183431.7  2.68 × 10^-10  0.202775163  1.13 × 10^-5  0.136595364
ENSG00000136235.11  2.77 × 10^-10  0.233805647  0.00591567  0.07919648
ENSG00000100351.12  2.87 × 10^-10  0.2345257  3.97 × 10^-6  0.16691652
ENSG00000122705.12  2.98 × 10^-10  0.173039692  8.02 × 10^-6  0.120264945
ENSG00000112249.9  3.13 × 10^-10  0.218289741  9.47 × 10^-6  0.147384686
ENSG00000113575.5  3.23 × 10^-10  0.213038187  3.93 × 10^-5  0.131022885
ENSG00000108950.7  3.25 × 10^-10  0.228314635  0.006635455  0.076668326
ENSG00000139178.6  3.27 × 10^-10  0.234248884  0.000220743  0.125455596
ENSG00000114978.13  3.30 × 10^-10  0.201263846  0.000117833  0.1109309
ENSG00000156973.9  3.38 × 10^-10  0.227679843  1.26 × 10^-5  0.155542088
ENSG00000100030.10  3.54 × 10^-10  0.211288327  6.45 × 10^-6  0.146543019
ENSG00000198700.5  3.59 × 10^-10  0.223757179  3.97 × 10^-6  0.160606498
ENSG00000100983.5  3.63 × 10^-10  0.185672767  5.36 × 10^-6  0.132906566
ENSG00000175224.12  3.69 × 10^-10  0.192951823  1.47 × 10^-5  0.129419146
ENSG00000044459.10  3.70 × 10^-10  0.218571574  2.46 × 10^-5  0.140776227
ENSG00000168056.10  3.80 × 10^-10  0.144969689  2.28 × 10^-5  0.093741108
ENSG00000104889.4  3.82 × 10^-10  0.210921941  5.41 × 10^-6  0.14962304
ENSG00000108883.8  3.88 × 10^-10  0.166761909  8.51 × 10^-6  0.116454891
ENSG00000181195.6  4.07 × 10^-10  0.232819613  0.000313386  0.12017343
ENSG00000204577.7  4.23 × 10^-10  0.22996086  0.001103015  0.103427041
ENSG00000108479.7  4.26 × 10^-10  0.167862201  0.000110235  0.09739374
ENSG00000115318.7  4.50 × 10^-10  0.224795484  0.00048161  0.113719759
ENSG00000102466.11  4.70 × 10^-10  0.233434823  8.46 × 10^-5  0.139253113
ENSG00000144909.7  4.86 × 10^-10  0.218749348  5.10 × 10^-6  0.155430048
ENSG00000214753.2  4.88 × 10^-10  0.206705462  5.24 × 10^-5  0.127355308
ENSG00000146112.7  4.96 × 10^-10  0.223463168  0.000142318  0.126021158
ENSG00000065600.8  5.01 × 10^-10  0.232936588  0.000311166  0.121080323
ENSG00000198938.2  5.09 × 10^-10  -0.228296673  2.57 × 10^-5  -0.14823164
ENSG00000100612.9  5.18 × 10^-10  0.171316577  3.89 × 10^-5  0.108173586
ENSG00000123609.6  5.21 × 10^-10  0.215471337  0.000121874  0.126765577
ENSG00000158526.7  5.24 × 10^-10  0.191006762  9.26 × 10^-6  0.133965586
ENSG00000211460.7  5.25 × 10^-10  0.202611522  1.05 × 10^-5  0.139522281
ENSG00000018510.8  5.27 × 10^-10  0.19598698  5.66 × 10^-5  0.116616713

ENSG00000002933.3  5.29 × 10^-10  0.221721662  0.000507625  0.111152455
ENSG00000166986.8  5.40 × 10^-10  0.212773939  6.27 × 10^-6  0.151554579
ENSG00000197629.5  5.45 × 10^-10  0.233596168  0.018912807  0.064865324
ENSG00000123815.7  5.59 × 10^-10  0.174533472  5.11 × 10^-6  0.12668089
ENSG00000147400.8  5.86 × 10^-10  0.203119702  5.89 × 10^-6  0.14661312
ENSG00000116809.7  6.10 × 10^-10  0.15755777  2.31 × 10^-5  0.104519838
ENSG00000157350.8  6.20 × 10^-10  0.189987275  8.99 × 10^-6  0.132478854
ENSG00000142867.8  6.20 × 10^-10  0.218387421  1.98 × 10^-5  0.143948732
ENSG00000163751.3  6.28 × 10^-10  0.231515417  0.000428673  0.11948735
ENSG00000134910.8  6.30 × 10^-10  0.173534991  0.000301523  0.091763599
ENSG00000125166.8  6.37 × 10^-10  0.214415753  4.47 × 10^-6  0.156598578
ENSG00000158715.5  6.39 × 10^-10  0.230517517  0.000410614  0.116493793
ENSG00000113758.9  6.69 × 10^-10  0.198376874  9.17 × 10^-5  0.118753097
ENSG00000163875.11  6.84 × 10^-10  0.213195716  0.000222845  0.118523329
ENSG00000204619.3  7.01 × 10^-10  0.212328205  0.000139374  0.1230432
ENSG00000127084.13  7.02 × 10^-10  0.213667754  0.001306695  0.09577473
ENSG00000136816.11  7.30 × 10^-10  0.232374637  0.000112181  0.134069799
ENSG00000139726.6  7.43 × 10^-10  0.204938714  5.76 × 10^-6  0.149111422
ENSG00000166946.9  7.44 × 10^-10  0.220288876  0.000117949  0.127838506
ENSG00000090581.5  7.46 × 10^-10  0.165066118  2.88 × 10^-5  0.108222923
ENSG00000117602.7  7.47 × 10^-10  0.215246271  3.82 × 10^-6  0.158465092
ENSG00000150636.11  7.55 × 10^-10  0.224903399  5.61 × 10^-6  0.16450572
ENSG00000184428.8  7.58 × 10^-10  0.209430758  3.57 × 10^-5  0.134412948
ENSG00000108828.11  7.69 × 10^-10  0.202605681  0.000529818  0.103171051
ENSG00000107968.5  8.08 × 10^-10  0.228021684  1.02 × 10^-5  0.160105472
ENSG00000198821.6  8.09 × 10^-10  0.225514683  1.21 × 10^-5  0.157047462
ENSG00000019991.11  8.24 × 10^-10  0.222157718  0.000197317  0.123427815
ENSG00000151491.8  8.24 × 10^-10  0.206923253  1.60 × 10^-5  0.139739726
ENSG00000086589.7  8.42 × 10^-10  0.210271866  8.93 × 10^-5  0.128005953
ENSG00000198851.5  8.53 × 10^-10  0.227320783  3.47 × 10^-5  0.146912174
ENSG00000170340.10  8.60 × 10^-10  0.217335621  1.33 × 10^-5  0.148756668
ENSG00000117691.5  8.91 × 10^-10  0.183418467  4.18 × 10^-5  0.119020321
ENSG00000084234.12  8.96 × 10^-10  0.226194926  0.000716166  0.110550778
ENSG00000135749.14  9.10 × 10^-10  0.232353028  2.04 × 10^-5  0.156182226
ENSG00000183397.5  9.17 × 10^-10  0.193953547  1.27 × 10^-5  0.134221665
ENSG00000139597.12  9.18 × 10^-10  0.219544632  3.88 × 10^-5  0.141147795

ENSG00000005020.8  9.35 × 10^-10  0.199094351  4.57 × 10^-5  0.125715181
ENSG00000162819.7  9.49 × 10^-10  0.185260201  5.78 × 10^-5  0.115254962
ENSG00000116819.6  9.55 × 10^-10  0.222816933  4.60 × 10^-6  0.16470375
ENSG00000063978.11  9.63 × 10^-10  0.225897884  1.23 × 10^-5  0.156703523
ENSG00000174243.5  9.63 × 10^-10  0.177014025  1.55 × 10^-5  0.121852804
ENSG00000047578.8  9.80 × 10^-10  0.217667587  4.30 × 10^-6  0.161997655
ENSG00000138600.5  9.82 × 10^-10  0.226433024  5.55 × 10^-6  0.164840531
ENSG00000120332.11  1.00 × 10^-9  -0.225807842  5.91 × 10^-6  -0.163486405
ENSG00000185669.5  1.00 × 10^-9  0.198254742  1.30 × 10^-5  0.138055364
ENSG00000163683.7  1.08 × 10^-9  0.209330759  0.000190909  0.12004698
ENSG00000072858.6  1.14 × 10^-9  0.227087728  0.000148982  0.130051428
ENSG00000168546.6  1.15 × 10^-9  0.222445416  0.017348529  0.066929563
ENSG00000187955.7  1.17 × 10^-9  0.21930983  0.000252684  0.122222305
ENSG00000131475.2  1.22 × 10^-9  0.185542676  5.87 × 10^-6  0.13816434
ENSG00000152558.10  1.22 × 10^-9  0.204994059  0.000878164  0.097426039
ENSG00000106052.9  1.23 × 10^-9  0.16095873  4.04 × 10^-5  0.104437395
ENSG00000188603.12  1.24 × 10^-9  0.189030896  8.31 × 10^-5  0.117421267
ENSG00000106605.6  1.25 × 10^-9  0.180437229  2.41 × 10^-5  0.123397874
ENSG00000177426.16  1.26 × 10^-9  0.226560446  8.05 × 10^-5  0.139918038
ENSG00000196465.6  1.30 × 10^-9  0.188199609  3.97 × 10^-6  0.143717039
ENSG00000156113.16  1.32 × 10^-9  0.226956364  0.000363951  0.122811016
ENSG00000116991.6  1.35 × 10^-9  0.221864616  7.74 × 10^-6  0.160877489
ENSG00000143164.11  1.35 × 10^-9  0.180810022  3.97 × 10^-6  0.136371788
ENSG00000145817.12  1.36 × 10^-9  0.205305022  3.72 × 10^-5  0.134402118
ENSG00000086062.8  1.38 × 10^-9  0.228282141  3.16 × 10^-5  0.15075207
ENSG00000181274.5  1.45 × 10^-9  0.220925506  2.06 × 10^-5  0.151000606
ENSG00000178397.8  1.45 × 10^-9  0.226525252  1.18 × 10^-5  0.160195819
ENSG00000028310.13  1.45 × 10^-9  0.180485492  7.32 × 10^-6  0.132583454
ENSG00000116586.7  1.45 × 10^-9  0.174732904  6.69 × 10^-5  0.110657901
ENSG00000183020.9  1.51 × 10^-9  0.17087113  0.002369196  0.073610584
ENSG00000169689.10  1.51 × 10^-9  0.177256079  6.69 × 10^-6  0.131591756
ENSG00000156273.11  1.54 × 10^-9  0.197557804  0.000226219  0.110019541
ENSG00000148444.11  1.56 × 10^-9  0.198316406  8.25 × 10^-6  0.145945089
ENSG00000095794.15  1.56 × 10^-9  0.219527922  2.77 × 10^-5  0.149045659
ENSG00000105374.5  1.59 × 10^-9  0.21709158  2.11 × 10^-5  0.149828319
ENSG00000135605.8  1.59 × 10^-9  0.225066621  0.000402741  0.121031552

ENSG00000163156.7  1.65 × 10^-9  0.190310994  0.000163203  0.113180202
ENSG00000108349.10  1.66 × 10^-9  0.224400987  5.94 × 10^-6  0.167691713
ENSG00000116679.11  1.68 × 10^-9  0.2184088  2.80 × 10^-5  0.146376089
ENSG00000181929.7  1.69 × 10^-9  0.189616246  6.47 × 10^-6  0.141448414
ENSG00000196396.5  1.70 × 10^-9  0.222646912  3.27 × 10^-5  0.148189473
ENSG00000169515.5  1.71 × 10^-9  0.213111948  9.59 × 10^-5  0.13174648
ENSG00000155304.4  1.73 × 10^-9  0.191792799  7.81 × 10^-6  0.139164043
ENSG00000183765.16  1.77 × 10^-9  0.224520772  4.91 × 10^-5  0.148055306
ENSG00000033178.8  1.78 × 10^-9  0.189454318  9.06 × 10^-6  0.13643016
ENSG00000126749.10  1.84 × 10^-9  0.172708574  2.04 × 10^-5  0.120684923
ENSG00000139921.8  1.85 × 10^-9  0.197273722  1.11 × 10^-5  0.141580835
ENSG00000067992.8  1.85 × 10^-9  0.222275658  8.16 × 10^-6  0.163473982
ENSG00000129521.9  1.89 × 10^-9  0.223869486  5.88 × 10^-6  0.167199915
ENSG00000183486.8  1.92 × 10^-9  0.225256554  0.000169621  0.134320571
ENSG00000110107.4  1.95 × 10^-9  0.163458228  0.000160534  0.097579145
ENSG00000213390.6  1.97 × 10^-9  0.222566631  4.32 × 10^-6  0.169398715
ENSG00000143727.11  1.97 × 10^-9  0.210314047  1.73 × 10^-5  0.148036782
ENSG00000111237.14  1.98 × 10^-9  0.170261498  8.78 × 10^-5  0.106327612
ENSG00000166483.6  1.99 × 10^-9  0.221569281  1.57 × 10^-5  0.157240547
ENSG00000104299.10  2.02 × 10^-9  0.184198577  1.01 × 10^-5  0.134808593
ENSG00000178988.10  2.04 × 10^-9  0.211241972  0.000276878  0.118812571
ENSG00000115539.9  2.06 × 10^-9  0.208884349  2.03 × 10^-5  0.145252503
ENSG00000163486.8  2.12 × 10^-9  0.225615301  0.000445734  0.121424581
ENSG00000116863.10  2.13 × 10^-9  0.169987824  7.68 × 10^-6  0.126495733
ENSG00000185442.8  2.13 × 10^-9  0.211525423  7.09 × 10^-6  0.158061273
ENSG00000130958.7  2.17 × 10^-9  0.202702254  8.16 × 10^-5  0.128133485
ENSG00000085514.11  2.23 × 10^-9  0.217912846  0.000902908  0.109734093
ENSG00000078043.11  2.24 × 10^-9  0.196803182  4.38 × 10^-6  0.149232102
ENSG00000087076.4  2.26 × 10^-9  0.204775307  8.62 × 10^-5  0.129420952
ENSG00000163071.6  2.30 × 10^-9  0.221463796  3.84 × 10^-6  0.169803529
ENSG00000187514.10  2.38 × 10^-9  0.212626258  0.000166859  0.12648669
ENSG00000164144.10  2.40 × 10^-9  0.198763231  3.59 × 10^-5  0.134182211
ENSG00000125148.6  2.41 × 10^-9  0.196680912  0.000524848  0.106723416
ENSG00000243649.4  2.51 × 10^-9  0.221107191  0.00040933  0.121037962
ENSG00000164961.11  2.55 × 10^-9  0.166549451  7.95 × 10^-5  0.10531297
ENSG00000130270.12  2.62 × 10^-9  0.213079497  1.56 × 10^-5  0.152657651

ENSG00000198951.7  2.62 × 10^-9  0.212081125  0.023639805  0.06284394
ENSG00000009694.9  2.67 × 10^-9  0.207369193  5.27 × 10^-6  0.157264731
ENSG00000074410.9  2.74 × 10^-9  0.221611561  0.000104653  0.139589176
ENSG00000253626.2  2.80 × 10^-9  0.173014606  4.55 × 10^-6  0.13384611
ENSG00000159753.9  2.82 × 10^-9  0.206899885  4.20 × 10^-5  0.139158286
ENSG00000107485.11  2.88 × 10^-9  -0.219277031  9.82 × 10^-6  -0.160424602
ENSG00000129911.4  3.00 × 10^-9  0.188182024  0.000258523  0.108019943
ENSG00000010295.15  3.02 × 10^-9  0.182491824  0.000340303  0.101863594
ENSG00000175544.9  3.03 × 10^-9  0.210057534  8.72 × 10^-5  0.131600534
ENSG00000123737.8  3.08 × 10^-9  0.208099766  3.46 × 10^-5  0.142109656
ENSG00000196504.11  3.19 × 10^-9  0.180860194  0.000199445  0.107332672
ENSG00000104903.4  3.38 × 10^-9  0.177222115  0.000232765  0.103877525
ENSG00000005175.5  3.42 × 10^-9  0.192653509  5.32 × 10^-6  0.148842442
ENSG00000164691.12  3.50 × 10^-9  0.223705266  0.000829698  0.112016455
ENSG00000159348.8  3.59 × 10^-9  0.16916579  4.00 × 10^-6  0.132977913
ENSG00000166825.9  3.61 × 10^-9  0.20473723  0.012112674  0.074286211
ENSG00000115085.9  3.61 × 10^-9  0.208515355  1.10 × 10^-5  0.153326058
ENSG00000158019.16  3.64 × 10^-9  0.18244854  8.29 × 10^-5  0.118569791
ENSG00000119682.12  3.69 × 10^-9  0.216008893  1.18 × 10^-5  0.158026235
ENSG00000136758.14  3.70 × 10^-9  0.183643832  4.94 × 10^-5  0.122156474
ENSG00000102316.12  3.75 × 10^-9  0.17894974  7.31 × 10^-5  0.115705238
ENSG00000100982.7  3.80 × 10^-9  0.165362919  3.45 × 10^-5  0.114211151
ENSG00000041353.5  3.83 × 10^-9  0.219989316  5.79 × 10^-5  0.145616561
ENSG00000177971.7  3.85 × 10^-9  0.181148786  6.07 × 10^-6  0.139404365
ENSG00000110628.9  3.85 × 10^-9  0.18371982  1.49 × 10^-5  0.134582016
ENSG00000166813.10  3.91 × 10^-9  0.192306298  0.000319356  0.111532109
ENSG00000223501.4  4.06 × 10^-9  0.181863208  1.52 × 10^-5  0.133218492
ENSG00000155158.16  4.08 × 10^-9  0.205816981  0.000844769  0.107329988
ENSG00000121060.10  4.13 × 10^-9  0.217904211  4.06 × 10^-5  0.147662999
ENSG00000144554.6  4.16 × 10^-9  0.218234011  6.06 × 10^-5  0.145705287
ENSG00000175220.7  4.27 × 10^-9  0.177038435  0.000106976  0.112108749
ENSG00000100632.6  4.32 × 10^-9  0.18610364  1.43 × 10^-5  0.13742059
ENSG00000164087.3  4.39 × 10^-9  0.202681776  3.91 × 10^-6  0.159806597
ENSG00000110665.7  4.44 × 10^-9  0.219692879  0.000143286  0.136615601
ENSG00000150459.8  4.45 × 10^-9  0.206981311  9.30 × 10^-5  0.132620072
ENSG00000187239.12  4.89 × 10^-9  0.215624388  0.004433172  0.092262254

ENSG00000149591.12  4.94 × 10^-9  0.200404239  8.49 × 10^-5  0.12978811
ENSG00000014824.9  4.96 × 10^-9  0.185264633  4.86 × 10^-5  0.126749426
ENSG00000151651.11  4.98 × 10^-9  0.19933158  0.000163624  0.123030622
ENSG00000198517.5  4.99 × 10^-9  0.163379921  1.42 × 10^-5  0.119553633
ENSG00000188342.7  5.01 × 10^-9  0.196211136  5.01 × 10^-5  0.134270296
ENSG00000033867.12  5.02 × 10^-9  0.193347508  5.16 × 10^-5  0.128703328
ENSG00000125753.9  5.04 × 10^-9  0.195703652  3.02 × 10^-5  0.136142045
ENSG00000171055.10  5.06 × 10^-9  0.195572993  9.98 × 10^-6  0.147127664
ENSG00000088756.8  5.13 × 10^-9  0.204325174  5.01 × 10^-6  0.1597757
ENSG00000087245.8  5.16 × 10^-9  0.204034682  0.003588  0.091124917
ENSG00000160602.9  5.31 × 10^-9  0.207397146  4.60 × 10^-6  0.1634603
ENSG00000128791.7  5.37 × 10^-9  0.187808241  0.000117192  0.117900547
ENSG00000171227.6  5.39 × 10^-9  0.211385339  2.11 × 10^-5  0.151645954
ENSG00000119632.3  5.55 × 10^-9  0.174513255  0.001395184  0.087630965
ENSG00000104695.8  5.60 × 10^-9  0.192565839  1.61 × 10^-5  0.14176165
ENSG00000090061.13  5.65 × 10^-9  0.218779009  0.000151459  0.139379997
ENSG00000182827.8  5.67 × 10^-9  0.18749707  9.04 × 10^-5  0.119918899
ENSG00000174125.3  5.70 × 10^-9  0.214938962  0.010444326  0.080009565
ENSG00000135341.13  5.74 × 10^-9  0.180304028  0.000150421  0.110691018
ENSG00000123427.11  5.74 × 10^-9  0.211581562  6.27 × 10^-5  0.140935977
ENSG00000146457.10  5.82 × 10^-9  0.212644623  0.00022474  0.128484927
ENSG00000162512.11  5.93 × 10^-9  0.20354835  0.000134843  0.127123462
ENSG00000149218.4  5.96 × 10^-9  0.211431138  2.74 × 10^-5  0.14906272
ENSG00000109220.6  6.01 × 10^-9  0.209957448  5.30 × 10^-5  0.141620561
ENSG00000134698.10  6.30 × 10^-9  0.196514533  1.53 × 10^-5  0.143827095
ENSG00000077264.10  6.33 × 10^-9  0.215337291  4.27 × 10^-5  0.147400088
ENSG00000064995.12  6.36 × 10^-9  0.203907395  6.60 × 10^-6  0.159136555
ENSG00000092969.7  6.44 × 10^-9  0.215871591  0.000624576  0.119481116
ENSG00000151135.5  6.62 × 10^-9  0.195408057  3.75 × 10^-5  0.134477428
ENSG00000100629.12  6.66 × 10^-9  0.20713334  6.94 × 10^-5  0.139396664
ENSG00000075618.13  6.71 × 10^-9  0.174372059  8.93 × 10^-6  0.132935036
ENSG00000117228.9  6.72 × 10^-9  0.212683878  7.19 × 10^-6  0.165580107
ENSG00000196290.10  6.73 × 10^-9  0.194877107  8.41 × 10^-6  0.150771562
ENSG00000076928.13  6.79 × 10^-9  0.148147211  9.89 × 10^-5  0.096609984
ENSG00000163069.8  6.84 × 10^-9  0.197479283  8.68 × 10^-5  0.128808914
ENSG00000171150.7  6.91 × 10^-9  0.181817341  5.16 × 10^-6  0.141833372

ENSG00000173193.9  6.93 × 10^-9  0.208566933  1.97 × 10^-5  0.153557664
ENSG00000198689.5  7.09 × 10^-9  0.199286112  8.58 × 10^-5  0.129161782
ENSG00000077463.10  7.11 × 10^-9  0.158027079  3.34 × 10^-5  0.111732683
ENSG00000204315.3  7.17 × 10^-9  0.202751847  1.71 × 10^-5  0.150264239
ENSG00000196470.7  7.20 × 10^-9  0.204212433  4.20 × 10^-5  0.142587362
ENSG00000116670.10  7.35 × 10^-9  0.1728335  3.05 × 10^-5  0.123562529
ENSG00000113263.8  7.35 × 10^-9  0.218044304  0.00012953  0.138849731
ENSG00000081154.7  7.40 × 10^-9  0.170180149  0.000196427  0.103864133
ENSG00000204161.9  7.44 × 10^-9  0.211091885  1.96 × 10^-5  0.154679231
ENSG00000181896.7  7.50 × 10^-9  0.183212171  1.11 × 10^-5  0.13819252
ENSG00000214021.11  7.53 × 10^-9  0.210038836  9.11 × 10^-5  0.138182791
ENSG00000198026.6  7.55 × 10^-9  0.180678705  8.06 × 10^-6  0.13905949
ENSG00000067955.9  7.57 × 10^-9  0.202924105  0.001484898  0.100702105
ENSG00000163539.11  7.63 × 10^-9  0.177934756  4.76 × 10^-5  0.123015208
ENSG00000023318.7  7.67 × 10^-9  0.143198652  0.000156484  0.090614462
ENSG00000115604.6  7.99 × 10^-9  0.2116348  1.46 × 10^-5  0.158060962
ENSG00000136146.10  8.51 × 10^-9  0.186995764  4.68 × 10^-5  0.130402022
ENSG00000177854.7  8.55 × 10^-9  0.166789955  1.52 × 10^-5  0.124895688
ENSG00000118515.7  8.64 × 10^-9  0.218730288  2.70 × 10^-5  0.15614482
ENSG00000183023.14  8.64 × 10^-9  0.204803451  3.15 × 10^-5  0.144366596
ENSG00000143514.12  8.66 × 10^-9  0.208602203  0.00011297  0.133652077
ENSG00000182158.10  8.68 × 10^-9  0.204070477  1.34 × 10^-5  0.153291325
ENSG00000142765.13  8.75 × 10^-9  0.190531862  3.65 × 10^-5  0.133567148
ENSG00000227507.2  8.88 × 10^-9  0.205261889  4.02 × 10^-5  0.143446024
ENSG00000058804.10  9.24 × 10^-9  0.19843767  4.29 × 10^-5  0.137660406
ENSG00000100225.13  9.26 × 10^-9  0.168371519  0.000231964  0.104267102
ENSG00000135426.10  9.38 × 10^-9  0.215355262  0.000144455  0.137273413
ENSG00000155846.12  9.38 × 10^-9  -0.200521448  6.13 × 10^-6  -0.159150898
ENSG00000101493.6  9.45 × 10^-9  0.210974941  3.09 × 10^-5  0.149707262
ENSG00000006074.4  9.56 × 10^-9  0.215630082  0.030704261  0.064256437
ENSG00000138698.10  9.58 × 10^-9  0.192734052  9.99 × 10^-5  0.127334724
ENSG00000180957.13  9.65 × 10^-9  0.190379028  0.000131486  0.123243016
ENSG00000160284.10  9.72 × 10^-9  0.19457446  7.81 × 10^-6  0.151156727
ENSG00000102893.11  9.74 × 10^-9  0.176787615  6.71 × 10^-6  0.138721106
ENSG00000104549.7  9.76 × 10^-9  0.214052044  1.09 × 10^-5  0.164727656
ENSG00000116741.6  9.86 × 10^-9  0.214137788  0.000334614  0.126735914

ENSG00000140368.8  1.00 × 10^-8  0.192481074  0.009216629  0.07649488
ENSG00000181631.6  1.04 × 10^-8  0.215712064  0.003225791  0.098054096
ENSG00000165084.11  1.04 × 10^-8  -0.206073868  1.11 × 10^-5  -0.158494536
ENSG00000160803.7  1.06 × 10^-8  0.212882856  9.23 × 10^-5  0.14215277
ENSG00000135317.8  1.07 × 10^-8  0.178665733  0.000279904  0.106110823
ENSG00000184988.4  1.09 × 10^-8  0.185293691  0.008378006  0.071772714
ENSG00000185522.4  1.12 × 10^-8  0.18253425  6.11 × 10^-6  0.146483751
ENSG00000114331.8  1.13 × 10^-8  0.168943951  2.84 × 10^-5  0.122163961
ENSG00000164022.12  1.16 × 10^-8  0.179223959  2.08 × 10^-5  0.133435396
ENSG00000138160.4  1.16 × 10^-8  0.207937939  0.000507169  0.123117365
ENSG00000168214.16  1.17 × 10^-8  0.192580725  0.018179113  0.066943357
ENSG00000100600.10  1.17 × 10^-8  0.202111364  0.145835457  0.03742199
ENSG00000145908.8  1.20 × 10^-8  0.207499167  0.000133064  0.13576345
ENSG00000188042.5  1.21 × 10^-8  0.215551734  0.006356039  0.089774177
ENSG00000162695.7  1.23 × 10^-8  0.176494109  6.17 × 10^-5  0.119732984
ENSG00000125505.12  1.25 × 10^-8  0.16390117  6.35 × 10^-5  0.112922949
ENSG00000126698.6  1.26 × 10^-8  0.199987414  0.000192993  0.127326174
ENSG00000127152.13  1.27 × 10^-8  0.213511611  6.89 × 10^-5  0.146405099
ENSG00000168685.10  1.29 × 10^-8  0.214651894  0.000188774  0.133924178
ENSG00000113272.9  1.29 × 10^-8  0.200251667  0.000935537  0.108924598
ENSG00000158234.8  1.29 × 10^-8  0.195449732  2.35 × 10^-5  0.145860987
ENSG00000106415.8  1.31 × 10^-8  -0.200743854  7.92 × 10^-6  -0.158306706
ENSG00000092094.6  1.33 × 10^-8  0.207458823  0.000164347  0.13387728
ENSG00000137752.18  1.33 × 10^-8  0.202189286  0.000897558  0.110509846
ENSG00000054392.8  1.34 × 10^-8  0.213189903  3.53 × 10^-5  0.153255024
ENSG00000198189.6  1.36 × 10^-8  0.196944628  0.002262967  0.096329654
ENSG00000075142.9  1.38 × 10^-8  0.186561305  0.000296346  0.11301551
ENSG00000067533.5  1.40 × 10^-8  0.190280404  9.68 × 10^-6  0.147813692
ENSG00000119396.6  1.44 × 10^-8  0.179442463  0.000109631  0.117577824
ENSG00000125814.13  1.44 × 10^-8  0.198859529  5.89 × 10^-6  0.159324019
ENSG00000128335.9  1.45 × 10^-8  0.205789434  6.45 × 10^-6  0.164631389
ENSG00000257704.2  1.47 × 10^-8  0.144547605  8.44 × 10^-5  0.09850839
ENSG00000167371.12  1.47 × 10^-8  0.206750482  7.19 × 10^-5  0.143048478
ENSG00000139433.5  1.48 × 10^-8  0.203502864  0.000203752  0.129324733
ENSG00000183726.6  1.48 × 10^-8  0.195788382  0.001859004  0.099382165
ENSG00000133703.7  1.49 × 10^-8  0.184598867  6.63 × 10^-5  0.12609032

ENSG00000117010.11  1.49 × 10^-8  0.205766493  1.36 × 10^-5  0.158622085
ENSG00000182004.8  1.49 × 10^-8  0.178608385  0.0001061  0.120946083
ENSG00000185963.9  1.57 × 10^-8  0.210688573  3.89 × 10^-6  0.172397565
ENSG00000131323.10  1.62 × 10^-8  0.203043756  7.34 × 10^-6  0.161811984
ENSG00000153563.11  1.63 × 10^-8  0.212848577  0.000171483  0.137372592
ENSG00000129538.9  1.63 × 10^-8  0.198047186  0.055406964  0.053527176
ENSG00000138092.6  1.66 × 10^-8  0.204486453  1.03 × 10^-5  0.161255004
ENSG00000173681.12  1.66 × 10^-8  0.183477828  6.08 × 10^-6  0.148017522
ENSG00000122884.8  1.67 × 10^-8  0.207992779  0.00093842  0.113138042
ENSG00000119688.16  1.68 × 10^-8  0.192079664  9.86 × 10^-5  0.129431474
ENSG00000136193.12  1.69 × 10^-8  0.213263631  0.001218003  0.113189628
ENSG00000119685.15  1.69 × 10^-8  0.192450673  6.92 × 10^-5  0.134399954
ENSG00000205352.6  1.70 × 10^-8  0.18316003  0.000369733  0.112762078
ENSG00000117226.7  1.72 × 10^-8  0.202683335  8.54 × 10^-6  0.160750237
ENSG00000162881.5  1.73 × 10^-8  0.189564857  0.001967271  0.096541164
ENSG00000215187.5  1.76 × 10^-8  -0.212701909  0.000741927  -0.120430482
ENSG00000012048.15  1.77 × 10^-8  0.19978536  0.000113018  0.133657847
ENSG00000141499.12  1.82 × 10^-8  0.179193794  1.45 × 10^-5  0.138433798
ENSG00000183049.8  1.82 × 10^-8  0.210692427  0.005935336  0.091599844
ENSG00000105088.4  1.88 × 10^-8  -0.200030172  7.59 × 10^-5  -0.137996618
ENSG00000134697.8  1.88 × 10^-8  0.166380708  5.54 × 10^-5  0.118665271
ENSG00000166166.8  1.91 × 10^-8  0.138412921  2.83 × 10^-5  0.103151719
ENSG00000134049.3  1.93 × 10^-8  0.189055226  0.000373718  0.115862728
ENSG00000162869.11  1.94 × 10^-8  0.192153498  0.000966543  0.105727716
ENSG00000146192.10  1.95 × 10^-8  0.204023988  0.06800797  0.052234427
ENSG00000177675.4  1.95 × 10^-8  0.210390297  0.02989577  0.066640142
ENSG00000173674.6  1.96 × 10^-8  0.178649634  4.36 × 10^-5  0.128502266
ENSG00000136280.11  1.96 × 10^-8  0.144418351  7.02 × 10^-5  0.10102603
ENSG00000108798.4  1.97 × 10^-8  0.199723977  2.89 × 10^-5  0.147883468
ENSG00000082641.11  1.98 × 10^-8  0.206028754  0.00030366  0.127813344
ENSG00000112367.6  1.98 × 10^-8  0.15566183  0.000211072  0.099801416
ENSG00000240065.3  2.02 × 10^-8  0.185183301  1.70 × 10^-5  0.141920202
ENSG00000131116.7  2.02 × 10^-8  0.137333166  4.18 × 10^-6  0.114530427
ENSG00000172465.9  2.03 × 10^-8  0.204847852  0.000369495  0.125296382
ENSG00000205423.7  2.04 × 10^-8  0.195182268  0.000341877  0.11856485
ENSG00000155363.14  2.05 × 10^-8  0.191223255  4.50 × 10^-5  0.138868534

ENSG00000160219.7 2.09 × 10^-8 0.207084494 0.000268385 0.130284037
ENSG00000116685.11 2.09 × 10^-8 0.127345309 8.82 × 10^-6 0.101515156
ENSG00000123124.9 2.11 × 10^-8 0.172447587 0.004113242 0.079513353
ENSG00000145293.10 2.13 × 10^-8 0.19105066 0.000565348 0.109485278
ENSG00000124151.14 2.13 × 10^-8 0.180560489 1.04 × 10^-5 0.14127458
ENSG00000151689.8 2.14 × 10^-8 0.193266212 5.25 × 10^-6 0.158935618
ENSG00000068885.10 2.17 × 10^-8 0.193146089 0.001371064 0.103540663
ENSG00000070214.11 2.18 × 10^-8 0.192259465 0.00291742 0.091916905
ENSG00000157181.10 2.19 × 10^-8 0.173621818 6.54 × 10^-6 0.140256474
ENSG00000123384.9 2.20 × 10^-8 0.185654105 1.33 × 10^-5 0.144565859
ENSG00000185298.8 2.21 × 10^-8 0.168937624 4.69 × 10^-6 0.140073952
ENSG00000143919.10 2.24 × 10^-8 0.202239213 5.86 × 10^-6 0.167177832
ENSG00000198901.9 2.24 × 10^-8 0.209779059 0.000116973 0.14204249
ENSG00000135914.5 2.29 × 10^-8 0.210014939 0.002258712 0.107264256
ENSG00000171606.13 2.30 × 10^-8 0.209156676 8.24 × 10^-5 0.145951012
ENSG00000172375.8 2.30 × 10^-8 0.191470804 8.78 × 10^-5 0.131498572
ENSG00000152240.8 2.30 × 10^-8 0.194129976 9.87 × 10^-5 0.133268863
ENSG00000134215.11 2.36 × 10^-8 0.210455572 6.83 × 10^-5 0.146137912
ENSG00000165355.7 2.38 × 10^-8 0.190322193 4.48 × 10^-5 0.137378895
ENSG00000108700.4 2.39 × 10^-8 0.209708379 0.000243617 0.131838137
ENSG00000187741.10 2.43 × 10^-8 0.200463427 9.87 × 10^-6 0.161254819
ENSG00000105486.9 2.44 × 10^-8 0.201431288 5.25 × 10^-5 0.145419316
ENSG00000131508.11 2.44 × 10^-8 0.194811077 0.00053152 0.116889834
ENSG00000198522.9 2.45 × 10^-8 0.178685922 0.000126538 0.12128449
ENSG00000186222.3 2.46 × 10^-8 0.176757394 1.03 × 10^-5 0.141093174
ENSG00000148484.13 2.47 × 10^-8 0.186410948 5.61 × 10^-6 0.153168713
ENSG00000185697.12 2.48 × 10^-8 0.208396988 6.40 × 10^-5 0.146088306
ENSG00000165669.9 2.49 × 10^-8 0.199815444 4.63 × 10^-5 0.145261216
ENSG00000102910.9 2.49 × 10^-8 -0.175143397 5.40 × 10^-6 -0.14522764
ENSG00000095739.7 2.50 × 10^-8 0.20847833 5.52 × 10^-5 0.149570346
ENSG00000176809.6 2.53 × 10^-8 0.210313652 4.00 × 10^-6 0.176622516
ENSG00000181481.9 2.59 × 10^-8 0.177517601 0.004535579 0.082662459
ENSG00000169372.8 2.60 × 10^-8 0.206827071 4.35 × 10^-6 0.172461378
ENSG00000120690.9 2.61 × 10^-8 0.187658373 0.000250034 0.11651843
ENSG00000133619.13 2.66 × 10^-8 0.158201009 4.20 × 10^-5 0.114318403
ENSG00000105619.9 2.69 × 10^-8 0.164690407 2.01 × 10^-5 0.127243424

ENSG00000120159.7 2.73 × 10^-8 0.191133545 9.62 × 10^-6 0.152358489
ENSG00000065809.9 2.82 × 10^-8 0.201715901 1.23 × 10^-5 0.160092934
ENSG00000102003.6 2.82 × 10^-8 0.180556947 4.83 × 10^-5 0.131468519
ENSG00000236287.3 2.83 × 10^-8 0.185932231 0.00024961 0.119003323
ENSG00000196954.8 2.90 × 10^-8 0.175065801 0.006001882 0.076529081
ENSG00000197102.6 2.92 × 10^-8 0.194337605 4.43 × 10^-6 0.162268587
ENSG00000134874.13 2.94 × 10^-8 0.207229811 0.000469525 0.12435734
ENSG00000132842.9 2.97 × 10^-8 0.160626756 0.000890147 0.091499473
ENSG00000167770.7 2.98 × 10^-8 0.142145373 0.000286223 0.090804154
ENSG00000152213.3 2.99 × 10^-8 0.205268072 0.039676003 0.064129959
ENSG00000107929.10 3.03 × 10^-8 0.200810346 3.43 × 10^-5 0.149672186
ENSG00000167874.6 3.04 × 10^-8 -0.198232768 4.64 × 10^-5 -0.14441127
ENSG00000115548.12 3.10 × 10^-8 0.194206188 5.96 × 10^-5 0.138008233
ENSG00000120688.7 3.12 × 10^-8 0.187273146 4.63 × 10^-6 0.156652719
ENSG00000071073.8 3.14 × 10^-8 0.196343042 4.44 × 10^-6 0.163738058
ENSG00000188993.3 3.15 × 10^-8 0.209030139 1.43 × 10^-5 0.164672027
ENSG00000077458.8 3.19 × 10^-8 0.182423447 1.41 × 10^-5 0.144452156
ENSG00000213024.6 3.21 × 10^-8 0.178978401 0.001172772 0.099510608
ENSG00000119729.6 3.29 × 10^-8 0.192515754 6.97 × 10^-5 0.137289172
ENSG00000219200.6 3.34 × 10^-8 0.176783921 2.90 × 10^-5 0.134611943
ENSG00000072042.8 3.37 × 10^-8 0.177523026 0.000160682 0.119057009
ENSG00000078142.7 3.44 × 10^-8 0.185549688 6.64 × 10^-5 0.131298292
ENSG00000065325.8 3.45 × 10^-8 0.202871315 1.12 × 10^-5 0.16251186
ENSG00000087152.11 3.47 × 10^-8 0.178544863 1.15 × 10^-5 0.142309163
ENSG00000064199.2 3.49 × 10^-8 0.206378291 0.000192998 0.135805588
ENSG00000131196.13 3.54 × 10^-8 0.195811339 0.00020951 0.128385226
ENSG00000082996.15 3.56 × 10^-8 0.185504008 0.008921521 0.078264077
ENSG00000182179.6 3.62 × 10^-8 0.182838734 4.96 × 10^-5 0.134522681
ENSG00000187068.2 3.63 × 10^-8 -0.197551461 8.84 × 10^-6 -0.161314073
ENSG00000064012.17 3.63 × 10^-8 0.200836383 0.00055325 0.121276174
ENSG00000185627.13 3.64 × 10^-8 0.142786518 0.000114589 0.099118688
ENSG00000144591.13 3.65 × 10^-8 0.159636372 0.000598383 0.096219672
ENSG00000111412.4 3.68 × 10^-8 0.195138415 2.43 × 10^-5 0.14868715
ENSG00000163479.9 3.76 × 10^-8 0.175022735 0.001874502 0.09293187
ENSG00000175324.5 3.77 × 10^-8 0.184535909 1.46 × 10^-5 0.147196345
ENSG00000014914.15 3.82 × 10^-8 0.201361826 0.000900996 0.115393794

ENSG00000128908.11 3.90 × 10^-8 0.194170686 0.001154722 0.110679766
ENSG00000213625.4 3.90 × 10^-8 0.169205344 0.000113799 0.11636049
ENSG00000130038.5 3.92 × 10^-8 0.205476332 0.001339979 0.113585658
ENSG00000119599.12 3.93 × 10^-8 0.185262731 0.000127651 0.128694619
ENSG00000114480.8 3.95 × 10^-8 -0.169643955 6.44 × 10^-6 -0.141124036
ENSG00000145882.6 4.01 × 10^-8 0.203530908 0.000598894 0.121252381
ENSG00000176871.4 4.04 × 10^-8 0.201261496 2.79 × 10^-5 0.154404193
ENSG00000095585.12 4.10 × 10^-8 0.203685711 0.011731315 0.08283417
ENSG00000126456.11 4.12 × 10^-8 0.15557489 0.000102943 0.109358755
ENSG00000144736.9 4.16 × 10^-8 0.199188028 0.000140138 0.135522698
ENSG00000107833.6 4.29 × 10^-8 0.180531897 0.000125959 0.125129873
ENSG00000172578.7 4.32 × 10^-8 0.204659029 0.001610283 0.109291965
ENSG00000092841.14 4.33 × 10^-8 0.153187812 2.32 × 10^-5 0.119450665
ENSG00000115685.10 4.34 × 10^-8 0.151943942 0.000153109 0.104098703
ENSG00000126581.8 4.35 × 10^-8 0.186387901 1.59 × 10^-5 0.147058316
ENSG00000136997.10 4.37 × 10^-8 0.20424374 6.62 × 10^-5 0.147118155
ENSG00000186153.12 4.42 × 10^-8 0.20245745 5.06 × 10^-6 0.171589946
ENSG00000175203.11 4.46 × 10^-8 0.150869471 0.001657225 0.081856668
ENSG00000136098.12 4.46 × 10^-8 0.204884954 6.27 × 10^-5 0.148778043
ENSG00000135722.4 4.51 × 10^-8 0.144237158 2.10 × 10^-5 0.112456307
ENSG00000246705.3 4.68 × 10^-8 0.152613623 4.17 × 10^-6 0.13107492
ENSG00000058799.9 4.72 × 10^-8 0.18942192 0.000503175 0.11769641
ENSG00000159377.6 4.79 × 10^-8 0.161652009 4.25 × 10^-5 0.122107542
ENSG00000106565.13 4.80 × 10^-8 0.199162098 0.009526847 0.084235882
ENSG00000148481.9 4.82 × 10^-8 0.16987305 8.14 × 10^-6 0.140710189
ENSG00000243811.3 4.84 × 10^-8 0.196814042 0.000256041 0.129013445
ENSG00000182866.12 4.87 × 10^-8 0.201998753 0.00040352 0.126025893
ENSG00000168032.4 4.87 × 10^-8 -0.195713766 5.45 × 10^-6 -0.165531892
ENSG00000075151.15 4.93 × 10^-8 0.180224457 2.65 × 10^-5 0.138889058
ENSG00000104974.6 4.94 × 10^-8 0.205847009 0.005324879 0.095229099
ENSG00000060339.9 4.97 × 10^-8 0.176053603 1.20 × 10^-5 0.142401164
ENSG00000100764.9 4.97 × 10^-8 0.193941206 1.29 × 10^-5 0.157072627
ENSG00000120805.9 5.07 × 10^-8 0.178445984 0.000365708 0.113042281
ENSG00000144580.9 5.09 × 10^-8 0.196521927 9.11 × 10^-5 0.139606049
ENSG00000120314.14 5.10 × 10^-8 0.191330841 2.42 × 10^-5 0.148630855
ENSG00000102401.15 5.11 × 10^-8 0.171875332 2.87 × 10^-5 0.131939924

ENSG00000181852.13 5.22 × 10^-8 -0.202154117 1.30 × 10^-5 -0.162953961
ENSG00000170502.8 5.28 × 10^-8 0.170349252 2.99 × 10^-5 0.131067738
ENSG00000088827.8 5.29 × 10^-8 0.197007535 0.190403806 0.0359303
ENSG00000172059.6 5.31 × 10^-8 0.198603952 4.37 × 10^-5 0.14826503
ENSG00000127463.9 5.32 × 10^-8 0.188405761 3.39 × 10^-5 0.143534264
ENSG00000196090.8 5.40 × 10^-8 0.205332006 4.40 × 10^-5 0.153995608
ENSG00000161217.7 5.41 × 10^-8 0.184444481 4.52 × 10^-6 0.158052262
ENSG00000110057.3 5.45 × 10^-8 0.144441586 0.022833647 0.052304188
ENSG00000143178.8 5.46 × 10^-8 0.195810237 0.000725186 0.117016642
ENSG00000108465.10 5.47 × 10^-8 0.16105102 0.000481956 0.100477869
ENSG00000173264.9 5.47 × 10^-8 0.139472884 5.91 × 10^-5 0.102899164
ENSG00000141252.15 5.53 × 10^-8 0.183563019 1.62 × 10^-5 0.14666263
ENSG00000137801.9 5.56 × 10^-8 0.205039284 0.00397411 0.100247148
ENSG00000103495.9 5.57 × 10^-8 0.148210101 0.001855189 0.079931165
ENSG00000198763.3 5.59 × 10^-8 -0.185044083 0.000485557 -0.114497502
ENSG00000107874.6 5.62 × 10^-8 0.153986376 0.00012751 0.107958681
ENSG00000075399.8 5.70 × 10^-8 0.149676993 1.26 × 10^-5 0.121846168
ENSG00000100097.7 5.79 × 10^-8 0.128532926 0.000107154 0.091436104
ENSG00000112335.10 5.81 × 10^-8 0.183835501 1.10 × 10^-5 0.150436282
ENSG00000163162.4 5.82 × 10^-8 0.200922349 0.011935488 0.082791114
ENSG00000173208.3 5.89 × 10^-8 -0.175298611 0.001257858 -0.098919276
ENSG00000119431.5 5.92 × 10^-8 -0.182911099 4.74 × 10^-6 -0.156070075
ENSG00000013725.10 5.98 × 10^-8 0.196558893 0.000186409 0.132449068
ENSG00000116830.7 5.99 × 10^-8 0.189894514 0.00031347 0.122871167
ENSG00000115207.9 6.07 × 10^-8 0.200710803 1.20 × 10^-5 0.164344979
ENSG00000135205.10 6.11 × 10^-8 0.197583588 5.02 × 10^-5 0.147562818
ENSG00000138496.12 6.20 × 10^-8 0.198661689 0.000207115 0.134947448
ENSG00000151292.13 6.20 × 10^-8 0.166813246 4.65 × 10^-6 0.142423423
ENSG00000171984.10 6.23 × 10^-8 0.194294587 1.00 × 10^-5 0.16078596
ENSG00000119616.7 6.25 × 10^-8 0.193971271 6.59 × 10^-5 0.14291737
ENSG00000111206.8 6.27 × 10^-8 0.19281149 0.000276344 0.126228919
ENSG00000172687.9 6.29 × 10^-8 0.183243332 1.89 × 10^-5 0.147508125
ENSG00000159685.6 6.30 × 10^-8 0.160102019 0.000169013 0.110538181
ENSG00000124762.9 6.32 × 10^-8 0.19963584 0.000194279 0.134659126
ENSG00000174799.6 6.34 × 10^-8 0.189733361 0.000104544 0.133546997
ENSG00000148358.15 6.35 × 10^-8 0.19315036 0.000393588 0.123463814

ENSG00000101405.3 6.36 × 10^-8 0.171916951 0.000290718 0.112342876
ENSG00000143196.4 6.38 × 10^-8 0.19067488 0.000448077 0.119758018
ENSG00000196867.3 6.39 × 10^-8 0.188861105 4.34 × 10^-6 0.162582103
ENSG00000173581.3 6.40 × 10^-8 -0.143264836 1.73 × 10^-5 -0.113470826
ENSG00000140280.9 6.44 × 10^-8 0.200655868 7.03 × 10^-5 0.146737113
ENSG00000126709.10 6.49 × 10^-8 0.190334119 0.000791768 0.115285166
ENSG00000023734.6 6.49 × 10^-8 0.167741412 3.98 × 10^-6 0.146082101
ENSG00000090989.13 6.53 × 10^-8 0.165949646 0.003130837 0.084386043
ENSG00000088035.11 6.55 × 10^-8 0.19155247 0.000236494 0.127170039
ENSG00000119535.13 6.55 × 10^-8 0.198387895 0.011838204 0.082314543
ENSG00000116663.6 6.65 × 10^-8 0.1841915 4.84 × 10^-6 0.158765487
ENSG00000198393.3 6.66 × 10^-8 0.195073246 0.000440881 0.122838429
ENSG00000163694.10 6.67 × 10^-8 0.199027591 0.005574897 0.092573097
ENSG00000138303.13 6.78 × 10^-8 0.198471225 6.94 × 10^-5 0.146148699
ENSG00000147144.8 6.83 × 10^-8 0.174445074 5.63 × 10^-6 0.149353952
ENSG00000133136.3 6.84 × 10^-8 0.19307958 0.002773348 0.103109641
ENSG00000138018.13 6.94 × 10^-8 0.180697329 7.92 × 10^-6 0.15127455
ENSG00000118242.11 6.96 × 10^-8 -0.178080737 1.55 × 10^-5 -0.144713217
ENSG00000179403.10 7.04 × 10^-8 0.133274405 8.20 × 10^-6 0.111582378
ENSG00000100239.11 7.05 × 10^-8 0.135259563 1.66 × 10^-5 0.109183101
ENSG00000185774.10 7.20 × 10^-8 0.190955623 4.97 × 10^-5 0.144348005
ENSG00000083817.8 7.21 × 10^-8 0.195803104 8.16 × 10^-5 0.143494534
ENSG00000123191.9 7.22 × 10^-8 -0.200699436 7.19 × 10^-5 -0.147834988
ENSG00000108262.11 7.22 × 10^-8 0.151586935 3.91 × 10^-6 0.132508555
ENSG00000125817.7 7.23 × 10^-8 0.123324243 1.51 × 10^-5 0.100411945
ENSG00000120265.12 7.24 × 10^-8 0.170051662 0.000102683 0.122203813
ENSG00000178234.8 7.43 × 10^-8 0.188186293 0.004133471 0.092994718
ENSG00000005238.15 7.43 × 10^-8 0.187336636 0.000131653 0.132486097
ENSG00000181827.10 7.46 × 10^-8 0.178737564 7.45 × 10^-6 0.150897478
ENSG00000137760.10 7.48 × 10^-8 0.187625193 0.000320323 0.121515482
ENSG00000116752.5 7.52 × 10^-8 0.180253704 6.52 × 10^-5 0.134088303
ENSG00000109270.8 7.55 × 10^-8 0.17319995 0.001685866 0.095283846
ENSG00000095139.9 7.58 × 10^-8 0.187753693 0.000119338 0.133188346
ENSG00000198816.5 7.62 × 10^-8 0.130435399 2.94 × 10^-5 0.10227865
ENSG00000198952.7 7.64 × 10^-8 0.156654804 0.000100404 0.112422636
ENSG00000125676.15 7.74 × 10^-8 0.164242587 0.00032888 0.106258255

ENSG00000124767.6 7.75 × 10^-8 0.174534117 0.000175263 0.120479046
ENSG00000185359.8 7.76 × 10^-8 0.119675836 0.000358286 0.078119974
ENSG00000153485.5 7.87 × 10^-8 0.19100836 0.000167041 0.133975559
ENSG00000146731.6 7.89 × 10^-8 0.183644573 0.000139435 0.129389059
ENSG00000160214.8 7.98 × 10^-8 0.148311412 0.000406186 0.096762721
ENSG00000139508.10 8.05 × 10^-8 0.189329653 3.84 × 10^-5 0.145536125
ENSG00000123106.6 8.08 × 10^-8 0.168073933 3.17 × 10^-5 0.13150994
ENSG00000054116.7 8.10 × 10^-8 0.163933925 2.92 × 10^-5 0.129136264
ENSG00000138050.10 8.17 × 10^-8 0.195876845 2.27 × 10^-5 0.157028291
ENSG00000127481.10 8.23 × 10^-8 0.196135348 1.49 × 10^-5 0.159924846
ENSG00000013563.9 8.32 × 10^-8 0.186396915 0.000217144 0.126560122
ENSG00000074603.14 8.32 × 10^-8 0.158364902 5.21 × 10^-6 0.136510177
ENSG00000143924.14 8.38 × 10^-8 0.174986411 0.000356679 0.112344912
ENSG00000175550.3 8.41 × 10^-8 0.14433954 0.000578649 0.089826406
ENSG00000145246.9 8.48 × 10^-8 0.188474461 0.000219245 0.125873353
ENSG00000147324.6 8.49 × 10^-8 0.200547102 0.021100166 0.075124823
ENSG00000081014.6 8.59 × 10^-8 0.178543415 7.32 × 10^-6 0.151703244
ENSG00000217128.7 8.69 × 10^-8 0.180742964 9.12 × 10^-6 0.151889461
ENSG00000008323.11 8.75 × 10^-8 -0.19531856 6.03 × 10^-6 -0.167595266
ENSG00000179832.13 8.81 × 10^-8 0.170404399 7.06 × 10^-6 0.145047704
ENSG00000241852.5 8.90 × 10^-8 0.186707273 1.14 × 10^-5 0.155646617
ENSG00000073578.12 8.97 × 10^-8 -0.152155118 5.38 × 10^-6 -0.131707126
ENSG00000182541.13 8.99 × 10^-8 0.19352118 1.31 × 10^-5 0.159466485
ENSG00000092531.5 9.00 × 10^-8 0.182532185 0.002593194 0.096964561
ENSG00000134262.8 9.04 × 10^-8 0.19941508 1.61 × 10^-5 0.162467809
ENSG00000164924.13 9.18 × 10^-8 0.183985233 5.04 × 10^-6 0.160182836
ENSG00000139626.11 9.20 × 10^-8 0.190921091 0.000501856 0.12148717
ENSG00000189143.8 9.22 × 10^-8 0.198882028 0.000471729 0.126843598
ENSG00000128652.7 9.27 × 10^-8 -0.197840885 2.38 × 10^-5 -0.157734376
ENSG00000106244.8 9.27 × 10^-8 0.144407374 0.000380545 0.094712482
ENSG00000140836.10 9.27 × 10^-8 0.169735999 2.85 × 10^-5 0.133347196
ENSG00000168329.9 9.31 × 10^-8 0.20197299 8.16 × 10^-5 0.148589663
ENSG00000004961.10 9.35 × 10^-8 0.190981149 7.17 × 10^-6 0.163960011
ENSG00000203950.6 9.42 × 10^-8 0.155377323 0.000271946 0.104931442
ENSG00000084207.11 9.56 × 10^-8 0.153789275 0.000322536 0.102252067
ENSG00000198843.8 9.59 × 10^-8 0.179168741 0.000116777 0.127831745

ENSG00000005243.5 9.63 × 10^-8 0.152662676 2.40 × 10^-5 0.122570971
ENSG00000097046.8 9.63 × 10^-8 0.202438285 0.000604237 0.125307089
ENSG00000169814.8 9.92 × 10^-8 0.199384955 4.09 × 10^-5 0.153350516
ENSG00000165684.3 9.94 × 10^-8 0.138116238 3.26 × 10^-5 0.10814546
ENSG00000123179.9 9.96 × 10^-8 0.196099311 0.000251195 0.134000226
ENSG00000086666.14 9.98 × 10^-8 0.178744144 0.000171588 0.126507564
ENSG00000158887.11 1.01 × 10^-7 0.194461843 2.25 × 10^-5 0.155981096
ENSG00000130382.7 1.03 × 10^-7 -0.160110217 4.75 × 10^-6 -0.1406454
ENSG00000135506.11 1.03 × 10^-7 0.153534166 0.004987245 0.074821601
ENSG00000128596.12 1.03 × 10^-7 0.199209156 0.001790823 0.111568963
ENSG00000171223.4 1.04 × 10^-7 0.182590469 7.55 × 10^-5 0.135257329
ENSG00000131944.5 1.04 × 10^-7 0.193739207 0.000103067 0.140838464
ENSG00000109971.9 1.05 × 10^-7 0.200850711 1.99 × 10^-5 0.162542804
ENSG00000130724.4 1.06 × 10^-7 0.170130345 0.000128479 0.122670478
ENSG00000172123.8 1.06 × 10^-7 0.190408471 3.72 × 10^-5 0.146996977
ENSG00000123374.6 1.08 × 10^-7 0.193907878 9.10 × 10^-6 0.163842172
ENSG00000127415.8 1.09 × 10^-7 0.123841955 3.47 × 10^-5 0.097420954
ENSG00000177868.7 1.10 × 10^-7 0.171132857 5.19 × 10^-5 0.130580801
ENSG00000136738.10 1.11 × 10^-7 0.187899125 0.000255696 0.12757281
ENSG00000022840.11 1.12 × 10^-7 0.159871866 4.70 × 10^-5 0.122871894
ENSG00000125450.6 1.14 × 10^-7 0.18635084 0.000778488 0.114494988
ENSG00000198089.10 1.15 × 10^-7 0.158066916 0.000202241 0.109653677
ENSG00000100360.10 1.15 × 10^-7 0.184563498 9.07 × 10^-5 0.136095939
ENSG00000115840.9 1.18 × 10^-7 0.18428297 0.000256487 0.125442081
ENSG00000105185.7 1.18 × 10^-7 0.170378471 0.000117245 0.123920638
ENSG00000172725.9 1.18 × 10^-7 0.114719304 0.000805842 0.070371352
ENSG00000110442.7 1.20 × 10^-7 0.171128943 0.000227474 0.118739079
ENSG00000174939.6 1.20 × 10^-7 0.189484385 0.001702693 0.105797745
ENSG00000119917.9 1.20 × 10^-7 0.194969941 0.000175145 0.138252212
ENSG00000168772.9 1.20 × 10^-7 -0.186232401 8.23 × 10^-6 -0.16003536
ENSG00000081041.8 1.21 × 10^-7 0.199293455 2.38 × 10^-5 0.160469238
ENSG00000156103.11 1.22 × 10^-7 0.187671323 0.000330459 0.124264005
ENSG00000139116.13 1.25 × 10^-7 -0.172623278 2.16 × 10^-5 -0.140350924
ENSG00000151239.9 1.25 × 10^-7 0.153084407 2.14 × 10^-5 0.12405499
ENSG00000105609.12 1.26 × 10^-7 0.194700435 0.181207409 0.03787675
ENSG00000144677.10 1.26 × 10^-7 0.196091943 0.004064384 0.099366336

ENSG00000086300.11 1.28 × 10^-7 0.184834254 8.96 × 10^-6 0.157238695
ENSG00000107263.14 1.30 × 10^-7 0.181033243 2.25 × 10^-5 0.1466294
ENSG00000106624.4 1.31 × 10^-7 0.178349254 0.004624898 0.089857767
ENSG00000104738.12 1.31 × 10^-7 0.198542545 0.000719734 0.124882636
ENSG00000137198.5 1.31 × 10^-7 0.179001664 2.89 × 10^-5 0.143237115
ENSG00000130518.12 1.31 × 10^-7 0.174897861 1.18 × 10^-5 0.147276792
ENSG00000082074.11 1.32 × 10^-7 0.193943263 0.002296011 0.103485575
ENSG00000166592.7 1.34 × 10^-7 0.18478489 1.51 × 10^-5 0.152509065
ENSG00000112118.13 1.34 × 10^-7 0.145918833 0.000625191 0.092516412
ENSG00000108679.8 1.37 × 10^-7 0.161445777 0.000168048 0.115228098
ENSG00000122417.11 1.38 × 10^-7 0.17082174 8.92 × 10^-6 0.146770858
ENSG00000179010.10 1.40 × 10^-7 0.176132108 0.002891785 0.093774572
ENSG00000102445.14 1.41 × 10^-7 0.198747202 0.02345585 0.075365763
ENSG00000105717.9 1.42 × 10^-7 0.182169113 0.002743248 0.098864715
ENSG00000129675.11 1.44 × 10^-7 0.189783001 3.39 × 10^-5 0.150783366
ENSG00000155324.5 1.44 × 10^-7 0.190909156 2.56 × 10^-5 0.153918065
ENSG00000139505.10 1.44 × 10^-7 0.167952491 0.000166383 0.119023586
ENSG00000132256.14 1.45 × 10^-7 0.186237603 3.32 × 10^-5 0.147677673
ENSG00000119661.10 1.45 × 10^-7 0.171442941 9.55 × 10^-6 0.146472922
ENSG00000155256.13 1.48 × 10^-7 0.162635939 8.85 × 10^-5 0.121957754
ENSG00000141753.5 1.52 × 10^-7 0.162657302 0.002037153 0.091674609
ENSG00000116521.6 1.54 × 10^-7 0.146226316 7.53 × 10^-5 0.111199399
ENSG00000182853.7 1.54 × 10^-7 0.180709044 0.005526934 0.087686894
ENSG00000170348.4 1.54 × 10^-7 0.178353301 0.003239212 0.092820787
ENSG00000169718.13 1.55 × 10^-7 0.131791077 1.09 × 10^-5 0.112690325
ENSG00000068137.10 1.55 × 10^-7 -0.163169427 3.00 × 10^-5 -0.131219566
ENSG00000153936.12 1.55 × 10^-7 0.172177718 0.000244012 0.118117133
ENSG00000187608.5 1.55 × 10^-7 0.170364207 0.000350665 0.115962595
ENSG00000198931.6 1.56 × 10^-7 0.137185281 0.00012342 0.100373592
ENSG00000204514.5 1.58 × 10^-7 0.170616985 5.93 × 10^-5 0.13143436
ENSG00000103253.13 1.60 × 10^-7 0.163664282 0.000436685 0.108292033
ENSG00000138336.8 1.63 × 10^-7 -0.171866836 7.39 × 10^-6 -0.150709292
ENSG00000073584.14 1.64 × 10^-7 0.171866277 0.000526705 0.111400347
ENSG00000145979.13 1.64 × 10^-7 0.17902971 3.67 × 10^-5 0.14277983
ENSG00000167207.7 1.64 × 10^-7 0.196143202 0.016972133 0.081075016
ENSG00000146828.13 1.65 × 10^-7 0.143400197 0.002923035 0.077255188

ENSG00000179071.3 1.67 × 10^-7 0.195591306 0.002180137 0.109028857
ENSG00000134884.9 1.68 × 10^-7 0.183055837 3.98 × 10^-5 0.145034949
ENSG00000075702.12 1.69 × 10^-7 0.179479402 4.05 × 10^-5 0.142146808
ENSG00000070785.12 1.71 × 10^-7 0.115493013 1.33 × 10^-5 0.098413832
ENSG00000156265.11 1.71 × 10^-7 0.195447357 7.68 × 10^-5 0.146946676
ENSG00000101463.5 1.72 × 10^-7 -0.197508722 4.09 × 10^-5 -0.156174318
ENSG00000267216.1 1.72 × 10^-7 0.16935795 0.000490633 0.111076081
ENSG00000121766.10 1.72 × 10^-7 0.156979817 4.19 × 10^-5 0.124796629
ENSG00000081177.14 1.72 × 10^-7 0.188541451 4.41 × 10^-6 0.168476086
ENSG00000108551.4 1.76 × 10^-7 0.181655727 8.05 × 10^-5 0.137343432
ENSG00000166928.6 1.79 × 10^-7 0.197789887 0.02230144 0.075679507
ENSG00000175115.7 1.80 × 10^-7 0.167055327 9.44 × 10^-5 0.124908831
ENSG00000168283.9 1.83 × 10^-7 0.161877255 3.31 × 10^-5 0.129513692
ENSG00000007933.8 1.84 × 10^-7 0.188131618 1.71 × 10^-5 0.156912322
ENSG00000168803.10 1.85 × 10^-7 -0.184747555 7.94 × 10^-6 -0.161902907
ENSG00000156504.12 1.88 × 10^-7 0.177573897 0.000299301 0.119975734
ENSG00000126759.8 1.89 × 10^-7 0.183941575 0.047710918 0.060096281
ENSG00000188186.6 1.90 × 10^-7 0.142425592 0.000436504 0.095287393
ENSG00000186166.4 1.91 × 10^-7 0.174474722 2.03 × 10^-5 0.145291842
ENSG00000069493.10 1.92 × 10^-7 0.190431778 0.000847308 0.117787546
ENSG00000106976.14 1.94 × 10^-7 0.184319726 0.047224312 0.061702763
ENSG00000184939.11 1.96 × 10^-7 0.178373447 0.000794115 0.111794166
ENSG00000198301.7 1.96 × 10^-7 0.18658092 0.000816755 0.117208925
ENSG00000163638.9 1.97 × 10^-7 -0.195504825 5.44 × 10^-6 -0.174721721
ENSG00000049883.10 1.98 × 10^-7 -0.186213196 8.38 × 10^-6 -0.162917784
ENSG00000186907.3 2.00 × 10^-7 0.170040375 2.24 × 10^-5 0.140757715
ENSG00000102078.11 2.00 × 10^-7 0.178131157 4.15 × 10^-5 0.142641795
ENSG00000165097.9 2.00 × 10^-7 0.179873084 0.008380748 0.082371626
ENSG00000138035.10 2.01 × 10^-7 0.190073138 0.000141642 0.139583065
ENSG00000135931.13 2.03 × 10^-7 0.194995213 0.001543048 0.113994452
ENSG00000122481.12 2.04 × 10^-7 0.177155418 5.08 × 10^-5 0.137998073
ENSG00000167641.6 2.04 × 10^-7 -0.16662502 4.03 × 10^-5 -0.132508019
ENSG00000162105.12 2.07 × 10^-7 -0.193874774 2.66 × 10^-5 -0.161705501
ENSG00000177030.12 2.16 × 10^-7 0.146377475 0.00062565 0.094118443
ENSG00000232112.3 2.17 × 10^-7 0.179501397 0.000482762 0.119231607
ENSG00000106009.11 2.18 × 10^-7 0.112518193 7.73 × 10^-6 0.099093468

ENSG00000175463.7 2.18 × 10^-7 0.17254229 4.32 × 10^-5 0.137754077
ENSG00000187778.9 2.23 × 10^-7 0.137528253 0.00067749 0.089040531
ENSG00000139155.4 2.24 × 10^-7 -0.184960648 3.44 × 10^-5 -0.149690065
ENSG00000237765.2 2.24 × 10^-7 0.176661124 0.000106217 0.132780738
ENSG00000188807.8 2.24 × 10^-7 -0.171964297 8.29 × 10^-6 -0.151088059
ENSG00000197496.4 2.25 × 10^-7 0.192962548 0.001357176 0.114964781
ENSG00000073111.9 2.26 × 10^-7 0.181510914 0.001574686 0.106748481
ENSG00000128563.9 2.28 × 10^-7 0.155113067 3.23 × 10^-5 0.126443802
ENSG00000119862.8 2.29 × 10^-7 0.17943937 3.98 × 10^-6 0.163365423
ENSG00000074201.4 2.32 × 10^-7 0.172704524 0.000249377 0.121660683
ENSG00000179348.7 2.37 × 10^-7 0.186214189 4.06 × 10^-6 0.169592286
ENSG00000243667.2 2.40 × 10^-7 0.190714363 3.81 × 10^-6 0.174935854
ENSG00000047579.15 2.40 × 10^-7 0.162952618 0.002366892 0.092932887
ENSG00000114520.6 2.41 × 10^-7 0.167248355 0.000292807 0.115634078
ENSG00000196812.4 2.43 × 10^-7 0.188506123 3.29 × 10^-5 0.154108138
ENSG00000183748.4 2.44 × 10^-7 0.1919369 0.011841208 0.085097015
ENSG00000165689.12 2.44 × 10^-7 0.167120343 2.72 × 10^-5 0.137967435
ENSG00000107651.8 2.47 × 10^-7 0.171959684 6.18 × 10^-5 0.134802624
ENSG00000110321.11 2.48 × 10^-7 0.173180246 7.20 × 10^-5 0.133429874
ENSG00000198695.2 2.49 × 10^-7 -0.182947888 0.00119704 -0.110733677
ENSG00000114315.3 2.49 × 10^-7 0.191448267 1.70 × 10^-5 0.162084783
ENSG00000105939.8 2.49 × 10^-7 0.174649936 0.006296622 0.086588974
ENSG00000126088.8 2.50 × 10^-7 0.147695579 0.000286487 0.10318167
ENSG00000261115.1 2.53 × 10^-7 0.191704639 2.09 × 10^-5 0.160342831
ENSG00000047056.10 2.55 × 10^-7 0.191833968 0.000148901 0.140207788
ENSG00000182831.7 2.55 × 10^-7 0.173875475 1.25 × 10^-5 0.149266963
ENSG00000183763.4 2.55 × 10^-7 0.180428858 0.001018559 0.112354902
ENSG00000078668.9 2.56 × 10^-7 0.16083037 0.000178781 0.117462644
ENSG00000198554.7 2.56 × 10^-7 0.185842506 0.000466875 0.120412792
ENSG00000143669.9 2.57 × 10^-7 0.17213395 5.89 × 10^-5 0.133392795
ENSG00000113845.5 2.57 × 10^-7 0.173074204 0.000417479 0.117612693
ENSG00000099956.13 2.58 × 10^-7 0.143058302 0.000150458 0.104922251
ENSG00000167578.12 2.59 × 10^-7 0.162547325 1.26 × 10^-5 0.141217816
ENSG00000128534.3 2.59 × 10^-7 0.178005504 8.79 × 10^-5 0.135055321
ENSG00000164161.5 2.59 × 10^-7 -0.188013015 5.27 × 10^-6 -0.170277421
ENSG00000094841.9 2.59 × 10^-7 0.18247073 0.000437177 0.123046588

ENSG00000089693.6 2.61 × 10^-7 0.161425643 1.23 × 10^-5 0.14004717
ENSG00000213402.2 2.63 × 10^-7 0.160864167 0.000187643 0.115481137
ENSG00000100344.6 2.65 × 10^-7 -0.178479236 0.005291113 -0.09106625
ENSG00000161265.10 2.66 × 10^-7 0.168302057 0.000213635 0.121243666
ENSG00000116120.8 2.67 × 10^-7 0.155143418 0.000223467 0.111403607
ENSG00000186575.13 2.68 × 10^-7 0.140160163 0.000120247 0.105031422
ENSG00000111647.8 2.69 × 10^-7 0.173929518 0.000335063 0.119386827
ENSG00000040487.8 2.72 × 10^-7 0.148865917 4.28 × 10^-6 0.136597103
ENSG00000144744.12 2.73 × 10^-7 0.165250951 0.000460062 0.111625307
ENSG00000172531.10 2.73 × 10^-7 0.13387807 0.002885933 0.074719208
ENSG00000179950.9 2.76 × 10^-7 0.130312675 0.000171529 0.095452851
ENSG00000040531.10 2.78 × 10^-7 0.156808026 0.000136192 0.115999197
ENSG00000178226.6 2.79 × 10^-7 0.170720549 0.114380304 0.04330955
ENSG00000105677.7 2.81 × 10^-7 0.1509433 0.000793364 0.096585709
ENSG00000198133.4 2.84 × 10^-7 0.189154016 0.000127445 0.141729941
ENSG00000120063.5 2.85 × 10^-7 0.164349457 0.004067363 0.085391538
ENSG00000165810.12 2.87 × 10^-7 -0.192693653 0.000161107 -0.141519597
ENSG00000164163.6 2.89 × 10^-7 0.159342019 0.000334022 0.110779794
ENSG00000135929.4 2.90 × 10^-7 0.169716349 0.08110444 0.049439283
ENSG00000155265.6 2.91 × 10^-7 0.18516505 0.000418804 0.125884924
ENSG00000196776.10 2.95 × 10^-7 0.17198539 0.0167316 0.071453539
ENSG00000157538.9 2.99 × 10^-7 0.189760494 0.002328508 0.107383
ENSG00000162639.11 3.00 × 10^-7 0.182464284 0.000175967 0.133781483
ENSG00000136840.14 3.00 × 10^-7 0.153015929 0.000897956 0.09675483
ENSG00000103043.10 3.02 × 10^-7 0.126426382 0.002116449 0.073286207
ENSG00000174010.9 3.06 × 10^-7 -0.152351602 6.43 × 10^-6 -0.137690263
ENSG00000096093.10 3.06 × 10^-7 0.184477579 0.002596181 0.10420934
ENSG00000185267.5 3.08 × 10^-7 -0.190184263 7.77 × 10^-6 -0.169309663
ENSG00000103196.7 3.08 × 10^-7 0.195839378 0.000437981 0.133611529
ENSG00000105248.11 3.12 × 10^-7 0.150889015 0.000174989 0.110893676
ENSG00000163866.8 3.13 × 10^-7 0.182850199 1.05 × 10^-5 0.160266025
ENSG00000090020.6 3.15 × 10^-7 0.167827075 4.59 × 10^-6 0.153137173
ENSG00000065833.7 3.15 × 10^-7 -0.179111275 0.006915294 -0.08821225
ENSG00000079156.12 3.18 × 10^-7 -0.18315937 3.89 × 10^-6 -0.169403652
ENSG00000179029.10 3.19 × 10^-7 0.193228944 0.000671218 0.126512672
ENSG00000197329.7 3.24 × 10^-7 0.176215332 0.002175442 0.099939111

ENSG00000033011.7 3.24 × 10^-7 0.18000383 1.29 × 10^-5 0.156363916
ENSG00000002330.9 3.25 × 10^-7 0.131100507 0.000315401 0.092076035
ENSG00000132170.15 3.27 × 10^-7 -0.173079468 0.00041217 -0.118549633
ENSG00000072849.6 3.29 × 10^-7 0.181916225 0.001237871 0.112822354
ENSG00000107020.5 3.29 × 10^-7 0.158258563 0.000140282 0.118939987
ENSG00000052795.8 3.33 × 10^-7 -0.160294237 2.14 × 10^-5 -0.136116201
ENSG00000254901.3 3.37 × 10^-7 -0.151331057 3.10 × 10^-5 -0.125388427
ENSG00000175455.10 3.37 × 10^-7 0.161647851 0.000628712 0.106193161
ENSG00000108830.7 3.40 × 10^-7 0.18779816 2.06 × 10^-5 0.158996068
ENSG00000124596.12 3.41 × 10^-7 0.180731081 0.002233612 0.105210906
ENSG00000108582.7 3.42 × 10^-7 0.164764077 3.89 × 10^-6 0.152661878
ENSG00000163521.11 3.44 × 10^-7 0.189118812 0.000435726 0.128937495
ENSG00000185869.9 3.45 × 10^-7 0.173039405 0.000254859 0.123733867
ENSG00000125510.11 3.50 × 10^-7 0.181971613 4.22 × 10^-6 0.168158208
ENSG00000118894.10 3.54 × 10^-7 0.170154943 0.000131396 0.128848308
ENSG00000034693.10 3.59 × 10^-7 0.177163462 0.00014677 0.131769079
ENSG00000103245.9 3.60 × 10^-7 0.136596904 1.05 × 10^-5 0.121068173
ENSG00000126216.8 3.65 × 10^-7 0.186136736 0.001297637 0.11532129
ENSG00000124249.5 3.68 × 10^-7 0.18096359 4.24 × 10^-6 0.16716282
ENSG00000108591.5 3.68 × 10^-7 0.134150147 0.000174012 0.099783118
ENSG00000142230.7 3.69 × 10^-7 0.148255341 0.000352402 0.103968814
ENSG00000113356.6 3.70 × 10^-7 0.18052564 0.000156868 0.133648619
ENSG00000213995.7 3.77 × 10^-7 0.152885861 0.000425367 0.105349353
ENSG00000143368.9 3.78 × 10^-7 0.164863099 0.001885997 0.09851459
ENSG00000137200.8 3.89 × 10^-7 0.143258233 0.000366215 0.10022899
ENSG00000135686.8 3.92 × 10^-7 -0.163347968 4.08 × 10^-6 -0.15271738
ENSG00000140718.14 3.92 × 10^-7 0.172519159 9.88 × 10^-6 0.153324123
ENSG00000106462.6 3.94 × 10^-7 0.190880992 0.00049798 0.1290654
ENSG00000169504.10 3.96 × 10^-7 0.154872231 1.31 × 10^-5 0.134275939
ENSG00000120662.11 4.04 × 10^-7 0.177917963 0.000395737 0.123476823
ENSG00000131697.13 4.04 × 10^-7 0.17640572 0.000792358 0.114079176
ENSG00000222009.4 4.04 × 10^-7 0.153635807 0.00016475 0.112407504
ENSG00000010626.10 4.05 × 10^-7 0.179982124 9.28 × 10^-5 0.140318935
ENSG00000014919.8 4.05 × 10^-7 0.177621207 1.51 × 10^-5 0.154918271
ENSG00000079785.10 4.06 × 10^-7 0.130768903 0.000439096 0.090368345
ENSG00000136935.9 4.08 × 10^-7 0.182825023 0.000477389 0.12476912

ENSG00000173588.10 4.15 × 10^-7 0.178832763 0.000189391 0.131266381
ENSG00000132514.9 4.17 × 10^-7 0.18873845 0.060060545 0.060095273
ENSG00000172932.10 4.20 × 10^-7 0.140761586 0.004159823 0.075649227
ENSG00000186198.3 4.21 × 10^-7 0.172732834 1.31 × 10^-5 0.15197472
ENSG00000161509.9 4.22 × 10^-7 0.172618363 3.77 × 10^-5 0.143664157
ENSG00000136960.8 4.23 × 10^-7 0.179912501 0.001012133 0.113700219
ENSG00000188783.5 4.23 × 10^-7 0.186830446 0.004609603 0.09978183
ENSG00000163528.8 4.23 × 10^-7 0.184040384 6.59 × 10^-5 0.14694792
ENSG00000143416.16 4.25 × 10^-7 -0.154321316 1.08 × 10^-5 -0.136963462
ENSG00000089847.8 4.25 × 10^-7 -0.181023786 2.30 × 10^-5 -0.154105759
ENSG00000169131.6 4.32 × 10^-7 0.17529658 0.000165135 0.130966066
ENSG00000141577.9 4.33 × 10^-7 0.141403195 1.22 × 10^-5 0.12506891
ENSG00000153094.17 4.41 × 10^-7 0.187505879 0.00017781 0.139467014
ENSG00000154175.12 4.42 × 10^-7 0.185570389 0.067182205 0.058160382
ENSG00000179715.8 4.45 × 10^-7 0.176502432 0.000128806 0.135350216
ENSG00000121807.5 4.52 × 10^-7 0.191563202 0.00853805 0.093114719
ENSG00000169682.13 4.59 × 10^-7 0.111999849 0.002990952 0.063155938
ENSG00000163349.17 4.61 × 10^-7 0.162351981 1.40 × 10^-5 0.142167569
ENSG00000099899.10 4.61 × 10^-7 0.124364311 0.00020003 0.092377115
ENSG00000163344.5 4.62 × 10^-7 0.122150799 0.000130587 0.093452266
ENSG00000161904.7 4.68 × 10^-7 0.137008704 0.000445265 0.094775369
ENSG00000166002.2 4.70 × 10^-7 0.173098953 0.015935757 0.076491261
ENSG00000164949.3 4.77 × 10^-7 0.18725766 0.006857207 0.094178111
ENSG00000137942.12 4.82 × 10^-7 -0.159782718 4.50 × 10^-5 -0.131894737
ENSG00000166794.4 4.90 × 10^-7 0.100290538 0.000412998 0.069739251
ENSG00000090661.7 4.94 × 10^-7 0.174605727 0.000220807 0.128605107
ENSG00000157916.14 5.00 × 10^-7 0.159176478 0.001115947 0.101102929
ENSG00000111669.10 5.07 × 10^-7 0.155056461 0.00044236 0.108753322
ENSG00000203760.4 5.07 × 10^-7 0.174017458 4.04 × 10^-5 0.144769152
ENSG00000168404.8 5.09 × 10^-7 0.187772756 0.002411174 0.111019006
ENSG00000127511.5 5.12 × 10^-7 0.155628778 1.18 × 10^-5 0.138325581
ENSG00000067167.3 5.14 × 10^-7 0.161231414 0.000635913 0.107003383
ENSG00000157404.11 5.15 × 10^-7 0.189094979 0.001710032 0.113750263
ENSG00000117448.9 5.17 × 10^-7 0.139360395 0.008440699 0.068871581
ENSG00000135587.4 5.19 × 10^-7 0.160556353 0.000344231 0.114764468
ENSG00000110171.14 5.22 × 10^-7 0.163521855 5.44 × 10^-5 0.133143882

ENSG00000033327.8 5.29 × 10^-7 -0.188652081 7.62 × 10^-6 -0.172414639
ENSG00000132965.5 5.35 × 10^-7 0.186474351 0.18250318 0.040748242
ENSG00000149541.5 5.40 × 10^-7 0.13513711 0.00030459 0.097077553
ENSG00000125970.7 5.43 × 10^-7 0.121724047 0.003781939 0.067771647
ENSG00000174326.7 5.44 × 10^-7 -0.1670647 9.91 × 10^-6 -0.150905658
ENSG00000236609.3 5.45 × 10^-7 0.162842052 7.66 × 10^-5 0.130235716
ENSG00000166128.8 5.46 × 10^-7 0.161945423 0.015744901 0.070151131
ENSG00000120324.4 5.49 × 10^-7 0.186150532 0.000567528 0.12688002
ENSG00000149743.9 5.51 × 10^-7 -0.152103793 4.86 × 10^-6 -0.14207961
ENSG00000125779.17 5.52 × 10^-7 0.186880335 7.54 × 10^-5 0.148001031
ENSG00000198018.6 5.54 × 10^-7 0.173607615 5.85 × 10^-6 0.160574181
ENSG00000120217.9 5.60 × 10^-7 0.186110158 0.000994393 0.120565365
ENSG00000187479.4 5.62 × 10^-7 0.163264218 2.11 × 10^-5 0.140826622
ENSG00000167118.6 5.64 × 10^-7 0.137458208 0.000340671 0.098632516
ENSG00000151883.12 5.66 × 10^-7 0.18170083 0.002134764 0.10574218
ENSG00000067182.3 5.72 × 10^-7 0.145291852 0.008229215 0.072114171
ENSG00000175193.8 5.72 × 10^-7 0.137801925 0.000209087 0.103185942
ENSG00000178896.6 5.73 × 10^-7 0.143722855 3.90 × 10^-5 0.120430236
ENSG00000129636.8 5.77 × 10^-7 0.13906178 0.000495606 0.095520718
ENSG00000049239.8 5.78 × 10^-7 0.134210919 0.000744041 0.089481596
ENSG00000124357.8 5.82 × 10^-7 0.159014845 0.088114121 0.046938404
ENSG00000213949.4 5.91 × 10^-7 0.167616075 2.63 × 10^-5 0.143418639
ENSG00000177370.4 5.91 × 10^-7 0.158686775 0.000152592 0.121579072
ENSG00000169249.8 5.94 × 10^-7 0.172168288 0.003917616 0.096288623
ENSG00000253159.1 5.97 × 10^-7 0.187576106 0.000779729 0.124186302
ENSG00000125733.13 6.05 × 10^-7 0.153859432 0.001875088 0.09316968
ENSG00000158483.11 6.10 × 10^-7 0.186325758 1.96 × 10^-5 0.162299827
ENSG00000060688.8 6.12 × 10^-7 0.175219922 0.000487834 0.122090472
ENSG00000078549.10 6.14 × 10^-7 -0.185687475 1.25 × 10^-5 -0.166117248
ENSG00000120029.8 6.15 × 10^-7 0.169772719 0.002305124 0.101466519
ENSG00000127533.3 6.16 × 10^-7 0.175380987 4.71 × 10^-6 0.16514804
ENSG00000137965.6 6.16 × 10^-7 0.18269108 7.29 × 10^-5 0.148181742
ENSG00000108175.12 6.21 × 10^-7 0.182539852 0.000527225 0.126262093
ENSG00000134057.10 6.25 × 10^-7 0.185708574 5.51 × 10^-6 0.173878279
ENSG00000215018.5 6.30 × 10^-7 -0.183966016 4.62 × 10^-6 -0.173377288
ENSG00000141376.16 6.40 × 10^-7 0.158095603 3.33 × 10^-5 0.134953089

ENSG00000164976.8 6.52 × 10^-7 -0.187424078 1.00 × 10^-5 -0.170355126
ENSG00000114450.5 6.61 × 10^-7 0.155938818 0.000579007 0.104876437
ENSG00000090554.8 6.63 × 10^-7 0.159793892 0.00051028 0.110544778
ENSG00000180828.1 6.69 × 10^-7 0.18743511 7.72 × 10^-5 0.149945836
ENSG00000251192.3 6.75 × 10^-7 0.166558226 3.12 × 10^-5 0.142147217
ENSG00000113594.5 6.78 × 10^-7 0.162221883 4.42 × 10^-6 0.153055679
ENSG00000145390.7 6.83 × 10^-7 0.167075094 0.000731187 0.112043655
ENSG00000146859.6 6.91 × 10^-7 0.185417282 0.000264276 0.136044111
ENSG00000176155.14 6.96 × 10^-7 0.159379155 0.000168094 0.12222009
ENSG00000119318.8 7.11 × 10^-7 0.168547096 1.13 × 10^-5 0.152675531
ENSG00000100591.3 7.14 × 10^-7 0.138037475 0.000659625 0.09457345
ENSG00000132694.14 7.17 × 10^-7 0.175511916 0.00049362 0.122395384
ENSG00000227345.4 7.34 × 10^-7 0.173629505 0.00049281 0.121551446
ENSG00000115523.12 7.38 × 10^-7 0.183224047 0.000573569 0.126695718
ENSG00000154127.5 7.41 × 10^-7 0.184758221 0.003463772 0.105988548
ENSG00000119326.10 7.41 × 10^-7 -0.154257777 0.000159002 -0.118187896
ENSG00000166164.11 7.44 × 10^-7 0.182180127 0.009762661 0.089769218
ENSG00000117335.14 7.46 × 10^-7 0.154732461 1.26 × 10^-5 0.139250143
ENSG00000120068.5 7.54 × 10^-7 -0.180777436 7.29 × 10^-5 -0.145245719
ENSG00000153575.6 7.58 × 10^-7 0.178921804 1.57 × 10^-5 0.159444118
ENSG00000140987.15 7.62 × 10^-7 0.181771611 0.000250028 0.135232198
ENSG00000125378.11 7.70 × 10^-7 0.185541604 7.70 × 10^-6 0.171708187
ENSG00000162231.9 7.72 × 10^-7 0.174717685 0.00054442 0.120865495
ENSG00000164989.11 7.77 × 10^-7 0.183072315 0.000138068 0.142061845
ENSG00000125835.13 7.77 × 10^-7 0.137082176 0.001090991 0.089936852
ENSG00000092931.7 7.88 × 10^-7 0.183210181 0.000886862 0.119316151
ENSG00000254470.2 7.91 × 10^-7 -0.140647002 2.90 × 10^-5 -0.121187721
ENSG00000101096.15 7.93 × 10^-7 0.185602729 0.001188155 0.119744822
ENSG00000142453.7 7.95 × 10^-7 0.141393773 0.01332964 0.06674658
ENSG00000080546.9 8.04 × 10^-7 0.177476342 0.00949516 0.086391395
ENSG00000109046.10 8.12 × 10^-7 0.164641128 0.000404855 0.117115738
ENSG00000152242.6 8.18 × 10^-7 0.15401458 1.17 × 10^-5 0.13908482
ENSG00000099949.14 8.20 × 10^-7 0.117488485 0.000180278 0.089449752
ENSG00000101104.8 8.21 × 10^-7 0.186070735 0.000764705 0.126342851
ENSG00000153066.8 8.27 × 10^-7 0.142546664 0.000973783 0.094789393
ENSG00000154930.10 8.29 × 10^-7 0.172997527 0.000894467 0.114588245

ENSG00000236104.2 8.37 × 10^-7 0.154811383 0.000267488 0.114352412
ENSG00000163536.8 8.47 × 10^-7 -0.174231891 0.000165579 -0.134400083
ENSG00000115461.4 8.52 × 10^-7 0.186259114 8.14 × 10^-6 0.172796496
ENSG00000182566.8 8.53 × 10^-7 0.177589189 0.09181614 0.052751458
ENSG00000138686.5 8.55 × 10^-7 0.170250711 0.005451984 0.089819755
ENSG00000154358.15 8.60 × 10^-7 -0.160662453 8.20 × 10^-5 -0.130658075
ENSG00000086061.11 8.70 × 10^-7 0.173402579 0.000624655 0.120252301
ENSG00000184863.6 8.72 × 10^-7 0.180121821 8.16 × 10^-5 0.14586952
ENSG00000122643.14 8.76 × 10^-7 0.169932792 0.002498445 0.102477645
ENSG00000125503.8 8.86 × 10^-7 0.113612274 0.000417288 0.081519306
ENSG00000081059.15 8.98 × 10^-7 -0.186010222 4.68 × 10^-6 -0.177660414
ENSG00000058056.4 9.30 × 10^-7 -0.171913752 5.21 × 10^-6 -0.163768028
ENSG00000115884.6 9.39 × 10^-7 0.173458441 0.000596672 0.121335729
ENSG00000178927.12 9.42 × 10^-7 0.130720294 0.016603194 0.059511763
ENSG00000152133.10 9.43 × 10^-7 -0.165756626 0.001185573 -0.107824508
ENSG00000170312.11 9.57 × 10^-7 0.179802783 0.001820407 0.112620777
ENSG00000009307.11 9.59 × 10^-7 0.162439308 0.005918922 0.08700577
ENSG00000185928.7 9.74 × 10^-7 0.152645891 0.000657488 0.105268591
ENSG00000161800.8 9.76 × 10^-7 0.181014047 0.003731439 0.102982265
ENSG00000127666.8 9.99 × 10^-7 0.148254613 0.002946483 0.08683893
ENSG00000198399.10 9.99 × 10^-7 0.162356579 0.000272746 0.120565101
ENSG00000166275.11 1.01 × 10^-6 0.166255917 0.000260589 0.124509657
ENSG00000119899.11 1.01 × 10^-6 0.183079939 0.009376577 0.091756439
ENSG00000126945.8 1.01 × 10^-6 0.169312706 0.005711414 0.090800697
ENSG00000198160.10 1.03 × 10^-6 0.149069939 0.001361123 0.094770603
ENSG00000113569.11 1.03 × 10^-6 0.153459646 5.76 × 10^-6 0.145638275
ENSG00000152484.9 1.04 × 10^-6 0.153815546 0.000704549 0.105141022
ENSG00000184349.8 1.04 × 10^-6 0.178172058 3.24 × 10^-5 0.155584915
ENSG00000011465.12 1.04 × 10^-6 0.18003169 0.127920626 0.048241272
ENSG00000160194.13 1.06 × 10^-6 0.17198995 1.23 × 10^-5 0.157298393
ENSG00000150712.6 1.06 × 10^-6 0.171510283 2.89 × 10^-5 0.149315504
ENSG00000089123.11 1.06 × 10^-6 0.180698668 0.002200924 0.110218289
ENSG00000164603.7 1.08 × 10^-6 0.158658111 0.001168985 0.103861738
ENSG00000055163.14 1.08 × 10^-6 0.181480093 0.000688478 0.124657025
ENSG00000022277.8 1.08 × 10^-6 0.130326717 0.001409204 0.084951399
ENSG00000158941.12 1.09 × 10^-6 0.146421926 0.000198299 0.112695313

ENSG00000163596.12 1.09 × 10^-6 0.159052045 1.20 × 10^-5 0.146019796
ENSG00000189308.6 1.11 × 10^-6 0.14981415 0.000366152 0.109242573
ENSG00000134248.9 1.13 × 10^-6 0.156331306 0.00068903 0.109473908
ENSG00000146872.13 1.14 × 10^-6 0.177100665 0.000493252 0.126876167
ENSG00000147119.3 1.14 × 10^-6 0.1491162 0.000235079 0.112912399
ENSG00000131969.10 1.14 × 10^-6 0.168521079 1.96 × 10^-5 0.151303895
ENSG00000100626.12 1.15 × 10^-6 0.179077657 0.012269653 0.087438213
ENSG00000182518.9 1.17 × 10^-6 0.177138541 0.002561809 0.108167456
ENSG00000148429.10 1.18 × 10^-6 -0.163292647 9.91 × 10^-6 -0.152326903
ENSG00000148337.15 1.19 × 10^-6 0.147577571 0.005207646 0.081853071
ENSG00000127580.11 1.20 × 10^-6 -0.123502113 1.24 × 10^-5 -0.113566716
ENSG00000085415.11 1.21 × 10^-6 0.166818062 0.000121172 0.133320207
ENSG00000150540.9 1.21 × 10^-6 0.170054345 2.23 × 10^-5 0.151547285
ENSG00000156239.7 1.24 × 10^-6 -0.176359198 4.46 × 10^-6 -0.171603827
ENSG00000010404.13 1.24 × 10^-6 0.163948573 0.005739104 0.08911265
ENSG00000120875.4 1.25 × 10^-6 -0.181577282 2.45 × 10^-5 -0.160084977
ENSG00000166411.9 1.25 × 10^-6 -0.165188049 7.20 × 10^-5 -0.138061614
ENSG00000164404.4 1.26 × 10^-6 -0.178425935 1.22 × 10^-5 -0.165466434
ENSG00000111335.8 1.26 × 10^-6 0.181189755 0.000313204 0.13623533
ENSG00000122692.7 1.26 × 10^-6 0.172135706 0.003306765 0.101881411
ENSG00000258839.2 1.27 × 10^-6 0.167097301 6.48 × 10^-5 0.140582447
ENSG00000131828.9 1.27 × 10^-6 -0.164419679 0.008250469 -0.085300991
ENSG00000204852.11 1.27 × 10^-6 0.181251784 0.003667208 0.10472288
ENSG00000124191.13 1.28 × 10^-6 0.177155872 4.40 × 10^-6 0.172043081
ENSG00000257923.5 1.29 × 10^-6 0.168979768 0.000466087 0.121946701
ENSG00000163328.9 1.30 × 10^-6 0.166385494 0.000158378 0.131200593
ENSG00000166321.9 1.33 × 10^-6 0.178273022 0.000227555 0.13749497
ENSG00000184635.9 1.33 × 10^-6 0.173082259 0.003631989 0.100822085
ENSG00000125703.10 1.34 × 10^-6 0.166316774 0.003868967 0.095969803
ENSG00000068305.13 1.34 × 10^-6 0.156619456 0.000111369 0.126081975
ENSG00000100911.9 1.35 × 10^-6 0.157644634 0.002562831 0.097155657
ENSG00000151690.10 1.36 × 10^-6 0.168823204 3.84 × 10^-6 0.164731045
ENSG00000169976.5 1.36 × 10^-6 0.14110353 0.00281568 0.085852506
ENSG00000010072.11 1.36 × 10^-6 0.169069948 4.31 × 10^-5 0.145016175
ENSG00000017797.7 1.36 × 10^-6 0.181337399 0.003720569 0.104753232
ENSG00000172731.9 1.37 × 10^-6 0.165969406 2.93 × 10^-5 0.146890858

ENSG00000159588.10 1.37 × 10^-6 0.170849109 0.000103342 0.140039148
ENSG00000165630.9 1.38 × 10^-6 0.167489902 0.000530201 0.120998461
ENSG00000116030.12 1.38 × 10^-6 0.166412841 0.004136572 0.096705051
ENSG00000004059.6 1.38 × 10^-6 0.13324389 0.005446969 0.074358973
ENSG00000177989.9 1.38 × 10^-6 0.131884493 0.002153494 0.082885456
ENSG00000168675.14 1.38 × 10^-6 0.176063951 7.93 × 10^-5 0.146063069
ENSG00000160753.11 1.40 × 10^-6 0.162477528 0.001905578 0.10215876
ENSG00000213918.6 1.41 × 10^-6 -0.17510431 0.000113654 -0.142318218
ENSG00000134853.7 1.42 × 10^-6 0.174821445 0.103283303 0.051943223
ENSG00000022267.12 1.42 × 10^-6 0.179604991 0.007462834 0.095773911
ENSG00000132016.7 1.43 × 10^-6 0.168800547 1.33 × 10^-5 0.156021867
ENSG00000043093.9 1.43 × 10^-6 0.141351805 0.001718087 0.090457272
ENSG00000002549.8 1.44 × 10^-6 0.165953589 0.000272192 0.126259769
ENSG00000174903.10 1.45 × 10^-6 0.12352104 4.03 × 10^-5 0.107517275
ENSG00000018236.10 1.46 × 10^-6 0.178113082 0.000130133 0.143572674
ENSG00000183484.7 1.47 × 10^-6 0.173568005 0.127217355 0.046647884
ENSG00000015532.5 1.48 × 10^-6 0.136123845 9.40 × 10^-5 0.112208676
ENSG00000203832.6 1.49 × 10^-6 0.180701041 4.38 × 10^-5 0.156643885
ENSG00000140682.14 1.49 × 10^-6 0.135485381 0.000861438 0.093466894
ENSG00000205670.6 1.51 × 10^-6 0.175654211 0.001316657 0.11689665
ENSG00000071462.7 1.53 × 10^-6 0.149920903 0.000616984 0.10666949
ENSG00000166747.8 1.54 × 10^-6 0.156484156 0.000256425 0.119327605
ENSG00000182621.12 1.55 × 10^-6 0.170619406 5.55 × 10^-6 0.165124362
ENSG00000099849.10 1.55 × 10^-6 0.149683573 0.001482089 0.098280093
ENSG00000112110.5 1.55 × 10^-6 0.168697309 0.001677208 0.108838549
ENSG00000152583.8 1.57 × 10^-6 0.157271316 5.59 × 10^-6 0.152377515
ENSG00000010270.9 1.59 × 10^-6 0.167840808 0.00382391 0.098642427
ENSG00000156500.10 1.59 × 10^-6 0.172994079 1.90 × 10^-5 0.158062021
ENSG00000160323.14 1.60 × 10^-6 -0.168711387 1.69 × 10^-5 -0.155035523
ENSG00000002822.11 1.60 × 10^-6 0.13977421 0.002243613 0.087123134
ENSG00000127314.13 1.62 × 10^-6 0.155493642 0.006331132 0.084828713
ENSG00000167536.9 1.62 × 10^-6 0.169943571 5.51 × 10^-5 0.145630624
ENSG00000178222.8 1.63 × 10^-6 -0.181816902 1.01 × 10^-5 -0.172073668
ENSG00000168818.5 1.63 × 10^-6 0.151091847 0.003009733 0.092055067
ENSG00000160783.15 1.64 × 10^-6 0.15590015 0.004214663 0.091267819
ENSG00000180008.8 1.64 × 10^-6 0.146139832 6.75 × 10^-5 0.123123393

ENSG00000132432.9 1.64 × 10^-6 0.114933457 0.000105337 0.094699856
ENSG00000081307.8 1.65 × 10^-6 0.156268187 3.61 × 10^-5 0.13738817
ENSG00000117118.5 1.65 × 10^-6 -0.102223384 3.09 × 10^-5 -0.090227844
ENSG00000075914.8 1.66 × 10^-6 0.13550282 0.00286836 0.082761293
ENSG00000214026.6 1.66 × 10^-6 0.1143404 0.00011948 0.093648586
ENSG00000108797.7 1.67 × 10^-6 0.165717786 0.000174403 0.131082258
ENSG00000136059.10 1.68 × 10^-6 0.144546694 0.000769139 0.101083324
ENSG00000008853.12 1.68 × 10^-6 0.156841703 0.000148678 0.126079297
ENSG00000121542.7 1.69 × 10^-6 0.166394555 0.001623507 0.10875648
ENSG00000124201.10 1.69 × 10^-6 0.179961824 0.000622147 0.127697983
ENSG00000158315.6 1.70 × 10^-6 -0.17887314 3.54 × 10^-5 -0.157569036
ENSG00000180098.5 1.70 × 10^-6 0.158205364 0.002720743 0.09752431
ENSG00000100146.12 1.70 × 10^-6 -0.175215848 2.44 × 10^-5 -0.158431789
ENSG00000139620.8 1.71 × 10^-6 0.16690233 0.00011616 0.136705235
ENSG00000107077.13 1.73 × 10^-6 0.17381602 0.000714092 0.121514796
ENSG00000239900.7 1.74 × 10^-6 0.140628052 0.004086203 0.081831988
ENSG00000186298.7 1.74 × 10^-6 0.157280865 0.068795902 0.052798202
ENSG00000105447.8 1.74 × 10^-6 0.128717719 0.000413208 0.095790564
ENSG00000214413.3 1.75 × 10^-6 0.148584304 0.000286936 0.113486742
ENSG00000149451.13 1.75 × 10^-6 0.164077636 0.009454056 0.085490937
ENSG00000157601.9 1.76 × 10^-6 0.170194624 0.00071453 0.121411049
ENSG00000125971.12 1.77 × 10^-6 0.142212577 0.00078915 0.100393343
ENSG00000179388.8 1.77 × 10^-6 0.179078597 0.000164794 0.14304294
ENSG00000165487.9 1.77 × 10^-6 0.152487025 0.002424122 0.095641395
ENSG00000003989.12 1.78 × 10^-6 0.167627176 2.82 × 10^-5 0.149706991
ENSG00000120509.6 1.79 × 10^-6 0.162273886 0.000962339 0.11279023
ENSG00000107249.17 1.81 × 10^-6 0.173623977 0.004463548 0.100941383
ENSG00000164134.8 1.83 × 10^-6 0.160800906 0.001004082 0.109972716
ENSG00000115520.4 1.83 × 10^-6 0.167839899 5.85 × 10^-5 0.144263515
ENSG00000170037.9 1.84 × 10^-6 0.157355385 0.007747631 0.086397085
ENSG00000144366.11 1.84 × 10^-6 0.161660059 1.93 × 10^-5 0.147831195
ENSG00000172977.8 1.85 × 10^-6 0.161224895 0.004068921 0.09494918
ENSG00000180370.6 1.85 × 10^-6 0.152421381 0.004062238 0.088632725
ENSG00000168026.12 1.85 × 10^-6 0.176311028 0.00192803 0.113535812
ENSG00000148950.5 1.86 × 10^-6 0.165461131 0.000390351 0.124886978
ENSG00000070444.10 1.91 × 10^-6 0.158084835 9.69 × 10^-5 0.131437768

ENSG00000174652.13 1.91 × 10^-6 0.176144852 0.000595309 0.127093774
ENSG00000119673.10 1.93 × 10^-6 -0.161726733 0.004729295 -0.092614936
ENSG00000184005.9 1.93 × 10^-6 -0.171089051 0.000127821 -0.139688876
ENSG00000122203.10 1.93 × 10^-6 -0.161150952 5.14 × 10^-6 -0.15872807
ENSG00000122565.14 1.97 × 10^-6 0.16134798 0.010829499 0.082873678
ENSG00000006652.9 1.97 × 10^-6 0.169977064 9.08 × 10^-5 0.142032894
ENSG00000175691.8 1.98 × 10^-6 0.176782723 7.27 × 10^-5 0.150992788
ENSG00000170421.7 1.99 × 10^-6 0.169723503 0.000379172 0.127792797
ENSG00000139428.7 1.99 × 10^-6 -0.150422065 0.003677066 -0.088736851
ENSG00000198888.2 2.01 × 10^-6 -0.164229542 0.017431989 -0.075821407
ENSG00000255767.1 2.02 × 10^-6 0.165493651 0.000173839 0.132849656
ENSG00000130227.12 2.04 × 10^-6 0.164438655 0.008313928 0.087093246
ENSG00000166199.8 2.07 × 10^-6 0.151056449 0.000485637 0.111934867
ENSG00000184451.5 2.07 × 10^-6 -0.150403517 0.000643158 -0.108120074
ENSG00000221838.5 2.08 × 10^-6 0.147377814 0.000768967 0.104900845
ENSG00000213221.4 2.10 × 10^-6 0.141060838 0.000112805 0.116525028
ENSG00000110697.8 2.10 × 10^-6 0.140796358 0.003989302 0.083093043
ENSG00000117174.6 2.13 × 10^-6 0.161848678 0.000986109 0.111919908
ENSG00000168924.10 2.14 × 10^-6 -0.172402404 0.008463937 -0.092426527
ENSG00000108771.8 2.14 × 10^-6 0.157414277 0.002303837 0.100519261
ENSG00000120697.4 2.15 × 10^-6 0.12610953 0.000290083 0.097978334
ENSG00000157557.7 2.16 × 10^-6 0.178458213 5.73 × 10^-5 0.153831809
ENSG00000125818.13 2.17 × 10^-6 0.15845414 0.015536096 0.076229143
ENSG00000137802.9 2.17 × 10^-6 0.15301034 4.34 × 10^-5 0.134828452
ENSG00000104880.13 2.18 × 10^-6 0.134368209 0.009087589 0.071794772
ENSG00000112419.10 2.18 × 10^-6 0.147419091 5.42 × 10^-5 0.127764565
ENSG00000205531.8 2.18 × 10^-6 0.144484102 0.001652124 0.094853318
ENSG00000118922.12 2.19 × 10^-6 0.161239548 0.001835526 0.103205113
ENSG00000163606.6 2.19 × 10^-6 0.179535756 0.193412768 0.041310999
ENSG00000085185.11 2.20 × 10^-6 -0.177697104 6.06 × 10^-6 -0.174534484
ENSG00000139998.10 2.23 × 10^-6 0.17582065 1.65 × 10^-5 0.163909478
ENSG00000256229.3 2.24 × 10^-6 0.162253382 0.00096607 0.112340466
ENSG00000103005.7 2.25 × 10^-6 0.169851902 8.44 × 10^-5 0.143690424
ENSG00000050327.10 2.28 × 10^-6 0.170135523 0.000250645 0.133518477
ENSG00000169239.8 2.29 × 10^-6 0.172901705 0.003025913 0.105804577
ENSG00000225828.1 2.30 × 10^-6 0.122064897 2.63 × 10^-5 0.111369913

ENSG00000172071.7 2.30 × 10^-6 0.163940925 0.003989017 0.097219158
ENSG00000126821.7 2.32 × 10^-6 0.154663942 0.004793165 0.08927418
ENSG00000139971.11 2.33 × 10^-6 0.172754256 9.08 × 10^-5 0.145455075
ENSG00000174606.8 2.36 × 10^-6 0.154555648 0.000299052 0.119642723
ENSG00000168476.7 2.37 × 10^-6 0.138708046 0.026024302 0.061470619
ENSG00000166333.9 2.39 × 10^-6 0.147763291 0.009679007 0.077864245
ENSG00000155729.8 2.41 × 10^-6 0.168337426 0.000472875 0.125143081
ENSG00000100401.15 2.42 × 10^-6 0.120618007 0.000635969 0.087656814
ENSG00000056097.11 2.47 × 10^-6 0.15028851 0.001518281 0.099925304
ENSG00000139233.2 2.47 × 10^-6 0.167988743 0.002805051 0.1061294
ENSG00000104219.8 2.48 × 10^-6 0.159764165 0.000337277 0.121840196
ENSG00000091073.15 2.49 × 10^-6 0.130540591 0.000152724 0.105679076
ENSG00000089723.5 2.52 × 10^-6 0.176786294 0.000465299 0.131358589
ENSG00000188153.8 2.57 × 10^-6 0.17440859 0.000126823 0.143810546
ENSG00000170522.5 2.58 × 10^-6 -0.174956823 0.013226095 -0.088195221
ENSG00000080371.4 2.60 × 10^-6 -0.134313088 4.79 × 10^-6 -0.134590781
ENSG00000118690.8 2.61 × 10^-6 0.170612635 0.01047175 0.089997148
ENSG00000076662.5 2.61 × 10^-6 -0.166554764 2.92 × 10^-5 -0.151621787
ENSG00000072506.8 2.63 × 10^-6 -0.145923705 2.28 × 10^-5 -0.134377714
ENSG00000138395.10 2.63 × 10^-6 0.176556566 0.055398553 0.065138972
ENSG00000132581.5 2.64 × 10^-6 0.154063057 0.003527034 0.094395051
ENSG00000092068.14 2.64 × 10^-6 0.173490292 0.0585633 0.064108976
ENSG00000186787.7 2.67 × 10^-6 0.172190968 0.000196301 0.138492342
ENSG00000100902.6 2.71 × 10^-6 0.116656264 6.51 × 10^-5 0.101729509
ENSG00000099364.12 2.73 × 10^-6 0.132218083 0.0007008 0.095478584
ENSG00000167085.7 2.75 × 10^-6 0.134775249 7.50 × 10^-5 0.116178401
ENSG00000129968.11 2.76 × 10^-6 0.101961316 2.21 × 10^-5 0.09438346
ENSG00000141519.10 2.76 × 10^-6 0.166302414 5.28 × 10^-5 0.146302543
ENSG00000101109.7 2.77 × 10^-6 0.143182119 0.065363704 0.049861007
ENSG00000140743.3 2.78 × 10^-6 0.17520188 0.000127254 0.145552648
ENSG00000143393.12 2.80 × 10^-6 0.134913474 0.001321571 0.092219664
ENSG00000097033.10 2.83 × 10^-6 0.14725817 0.000130511 0.121946551
ENSG00000111364.11 2.86 × 10^-6 0.169371544 6.92 × 10^-5 0.14724589
ENSG00000160446.14 2.87 × 10^-6 0.129342069 0.004986363 0.075527632
ENSG00000166912.12 2.88 × 10^-6 -0.14237653 4.86 × 10^-5 -0.126020379
ENSG00000137275.9 2.89 × 10^-6 0.170017413 0.000935709 0.120350193

ENSG00000145354.5 2.92 × 10^-6 0.168464518 0.060052562 0.062049083
ENSG00000095303.10 2.93 × 10^-6 0.176466251 0.26939369 0.034343202
ENSG00000019169.9 3.01 × 10^-6 0.170543458 0.148809153 0.045286434
ENSG00000043514.11 3.02 × 10^-6 0.166777885 0.000765899 0.121003754
ENSG00000153046.13 3.04 × 10^-6 0.169835945 0.000244282 0.134635426
ENSG00000196814.10 3.04 × 10^-6 0.151627366 0.000200049 0.122308643
ENSG00000165806.15 3.05 × 10^-6 0.171470554 0.001467471 0.1162644
ENSG00000198598.2 3.05 × 10^-6 -0.157306505 9.44 × 10^-5 -0.133729038
ENSG00000122035.6 3.07 × 10^-6 0.168869693 0.001736067 0.112422782
ENSG00000112769.14 3.08 × 10^-6 0.146879831 0.006415175 0.083774791
ENSG00000176986.10 3.08 × 10^-6 0.162037998 0.002013349 0.106112689
ENSG00000128394.12 3.09 × 10^-6 0.160559847 0.00190334 0.105759169
ENSG00000198944.4 3.10 × 10^-6 -0.174093024 0.002450067 -0.111665692
ENSG00000175267.10 3.12 × 10^-6 -0.176749816 7.25 × 10^-6 -0.175754403
ENSG00000206527.5 3.15 × 10^-6 -0.159201001 0.004887524 -0.094169585
ENSG00000159374.13 3.16 × 10^-6 0.171946476 0.008220508 0.095401967
ENSG00000070718.7 3.17 × 10^-6 0.16616676 0.000482745 0.12481708
ENSG00000111664.6 3.17 × 10^-6 0.172358318 0.000868316 0.123834454
ENSG00000100029.13 3.22 × 10^-6 0.128405624 0.001963933 0.085304299
ENSG00000001629.5 3.25 × 10^-6 0.144065592 0.000222191 0.115116406
ENSG00000164307.8 3.30 × 10^-6 0.165299908 0.000141805 0.136900334
ENSG00000161929.10 3.34 × 10^-6 0.171150194 0.189901192 0.041465649
ENSG00000221949.2 3.35 × 10^-6 0.16392145 8.40 × 10^-6 0.161429084
ENSG00000102967.7 3.36 × 10^-6 -0.168042651 4.28 × 10^-5 -0.15130507
ENSG00000114529.8 3.37 × 10^-6 -0.171960806 6.82 × 10^-5 -0.150345436
ENSG00000165475.9 3.37 × 10^-6 0.160427947 0.079545569 0.054589467
ENSG00000174840.8 3.37 × 10^-6 0.140037834 3.82 × 10^-6 0.142740633
ENSG00000171522.5 3.38 × 10^-6 0.173887611 0.000471127 0.131197423
ENSG00000153064.7 3.38 × 10^-6 0.157013879 5.10 × 10^-5 0.139459976
ENSG00000151470.8 3.39 × 10^-6 0.163699379 0.001821218 0.109054069
ENSG00000148841.11 3.40 × 10^-6 0.172204024 0.000731673 0.12535987
ENSG00000119231.6 3.47 × 10^-6 0.155151558 5.80 × 10^-5 0.137116855
ENSG00000231672.2 3.51 × 10^-6 -0.172696029 0.000156542 -0.143013828
ENSG00000017621.11 3.52 × 10^-6 0.164240913 5.14 × 10^-6 0.165747015
ENSG00000143079.10 3.52 × 10^-6 0.160031088 0.000423567 0.121120439
ENSG00000103174.7 3.53 × 10^-6 0.128514931 0.012176653 0.06690085

ENSG00000203485.8 3.53 × 10^-6 0.11305903 2.24 × 10^-5 0.106044482
ENSG00000067334.9 3.58 × 10^-6 0.142517725 0.004189587 0.086273532
ENSG00000127241.12 3.59 × 10^-6 0.17521238 0.006954494 0.099395146
ENSG00000008018.8 3.60 × 10^-6 0.10685683 0.000623533 0.079724782
ENSG00000150471.11 3.61 × 10^-6 0.16799556 1.31 × 10^-5 0.161969347
ENSG00000257103.4 3.63 × 10^-6 0.152314832 0.007737481 0.08465603
ENSG00000103549.17 3.65 × 10^-6 0.140876271 0.00252716 0.090925503
ENSG00000175265.13 3.65 × 10^-6 0.172342459 0.00034691 0.134080512
ENSG00000168779.15 3.66 × 10^-6 0.175421542 0.001951336 0.115429915
ENSG00000184117.7 3.68 × 10^-6 0.169199718 0.001411781 0.116081311
ENSG00000100299.13 3.68 × 10^-6 0.110480165 0.000208869 0.08968784
ENSG00000170175.6 3.73 × 10^-6 0.162138108 0.001598028 0.109859113
ENSG00000143842.10 3.77 × 10^-6 0.168277428 1.05 × 10^-5 0.164242227
ENSG00000165389.6 3.77 × 10^-6 0.163090483 0.000827976 0.118417789
ENSG00000166479.5 3.79 × 10^-6 0.132126105 0.00248938 0.084964698
ENSG00000205268.6 3.79 × 10^-6 0.167563208 0.017636312 0.082124088

Table A.7: List of genes whose expression is no longer significantly associated with BMI after adjusting for macrophage proportion.
