G3: |Genomes|Genetics Early Online, published on May 18, 2016 as doi:10.1534/g3.116.029439

ForestPMPlot: a flexible tool for visualizing heterogeneity

between studies in meta-analysis

Eun Yong Kang1, Yurang Park2, Xiao Li3, Ayellet V. Segre`3, Buhm Han4,*, and Eleazar Eskin1,5,*

1Department of Computer Science, University of California, Los Angeles, CA, USA 2Asan Institute for Life Sciences, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea 3The Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, Massachusetts 02142, USA. 4Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea 5Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA *These authors contributed equally to this work

May 17, 2016

Abstract

Meta-analysis has become a popular tool for genetic association studies to combine different genetic studies. A key challenge in meta-analysis is heterogeneity or the differences in effect sizes between studies. Heterogeneity complicates the interpretation of meta-analyses. In this paper, we describe ForestPMPlot, a flexible visualization tool for analyzing studies included in a meta-analysis. The main feature of the tool is visualizing the differences in the effect sizes of the studies to understand why the studies exhibit heterogeneity for a particular phenotype

1

© The Author(s) 2013. Published by the Genetics Society of America. and pair under different conditions. We show the application of this tool to interpret a meta-analysis of 17 mouse studies and to interpret a multi-tissue eQTL study.

1 Introduction

Meta-analysis has become a popular tool for genetic association studies to achieve higher power in identifying genetic variants that affect a trait (Evangelou and Ioannidis, 2013). Recently, by combining multiple studies through meta-analysis, a large number of genetic studies have successfully identified novel associated loci that were not identified by any single study included in the meta-analysis (Morris et al., 2012; Lango Allen et al., 2010; Lambert et al., 2013; Heid et al., 2010; Anttila et al., 2013). As more genetic studies of phenotypes become available, meta-analysis will become even more widely utilized in the genetic association studies. Interpreting and understanding the results of meta-analysis is now becoming important, yet it remains a challenge. The combined studies in a meta-analysis are often heterogeneous. For example, genetic association studies can differ from each other in terms of environmental con- ditions (Kang et al., 2014), study design, populations, statistical noise and the use of covariates in the analysis (Manning et al., 2012). These factors can make the effect sizes differ between studies, a phenomenon called between-study heterogeneity (Han and Eskin, 2011). A correct interpretation of this heterogeneity will lead us to a better understanding of the effect under specific conditions and to an informed decision in the replication study. In this paper, we describe ForestPMPlot, a flexible visualization tool for analyzing studies included in a meta-analysis. The main feature of the tool is visualizing the differences in the effect sizes of the studies to understand why the studies exhibit heterogeneity for a particular phenotype and locus pair under different conditions. Unlike traditional forest plots, which only display effect size magnitude and its standard error for each study, ForestPMPlot displays the P-value and the posterior probability prediction for the existence of the effect in each study (m-value), which is estimated by utilizing cross-study information (Han and Eskin, 2012). The main advantage of the m-value is that it can effectively segregate from one another the studies predicted to have an effect, the studies predicted to not have an effect, and the ambiguous studies that are underpowered. ForestPMPlot visualizes the relationship between P-values and m-values in a plot called PM-Plot, and displays it along with the forest plot. By visualizing much richer information than the traditional forest plot, ForestPMPlot can considerably facilitate the

2 interpretation of the results of meta-analysis

2 Methods

In this section, we first review the two different meta-analysis approaches (fixed effects model and random effects model) and explain our approaches for visualizing the results of meta-analysis.

2.1 Fixed effects (FE) model meta-analysis

The underlying assumption of the fixed effects meta-analysis is the same effect size across the studies included in meta-analysis (Mantel and Haenszel, 1959). Under this assumption, in the fixed effects meta-analysis, the effect size estimates of studies such as the log odds ratios or regression coefficients, are combined and summarized into one summary statistic. The common method of combining effect sizes under fixed effects model is the inverse variance weighted effect size estimate (de Bakker et al., 2008; Fleiss, 1993). Let X1, ..., Xc be the effect size estimates from c studies included in a meta-analysis and let Vi be the variance of Xi. Then for fixed effects meta-analysis, the weight for each effect size is set to the inverse variance of the effect

−1 size estimate (Wi = Vi ). Thus the inverse-variance-weighted effect size estimate is

P WiXi X = P Wi

pP −1 And the standard error of X is Wi . Because X asymptotically follows a normal distri- bution, we can compute the fixed effects meta-analysis statistic in the following way.

X P W X Z = = i i FE pP −1 pP Wi Wi

The above statistic (ZFE) follows the standard normal distribution under the null hypothesis of no association. Thus the P-value can be computed by

PFE = 2Φ(−|ZFE|) where Φ denotes the cumulative density function of the standard normal distribution.

3 2.2 Random effects (RE) model meta-analysis

Unlike fixed effects model meta-analysis, random effects model meta-analysis treats the under- lying effect size of each study as a random variable. Specifically, a typical assumption is that the effect size of each study follows a normal distribution with the grand mean β¯ and the variance τ 2 (Han and Eskin, 2011; DerSimonian and Laird, 1986):

¯ 2 βi ∼ N(β, τ ).

In this model, τ 2 represents the between-study variance. In other words, the more the effect sizes of studies included in a meta-analysis are different, the larger the between-study variance (τ 2) is. Given the above model for effect sizes of the studies, the traditional random effects model tests the null hypothesis β¯ = 0 versus the alternative hypothesis β¯ 6= 0. Recently, Han and Eskin (Han and Eskin, 2011) have shown that under the condition that there is no heterogeneity under the null hypothesis, which is often the case in genetic association studies, the traditional random effects model can be overly conservative. Instead, they proposed a new random effects model approach that increases power by testing the null hypothesis β¯ = 0 and τ 2 = 0 versus the alternative hypothesis β¯ 6= 0 or τ 2 6= 0. Han-Eskin model uses the following likelihood model:

Y 1  β2  L = exp − i 0 p 2 2σ2 i 2πσi i  2  Y 1 (βi − µ) L1 = exp − . p 2 2 2(σ2 + τ 2) i 2π(σi + τ ) i

The maximum likelihood estimates µˆ and τˆ2 can be found by an iterative procedure suggested by Hardy and Thompson (Hardy and Thompson, 1996). Then the likelihood ratio test statistic can be built

 2  2 2 X σi X βi X (βi − µˆ) SHan−Eskin = −2 log(λ)= log 2 2 + 2 − 2 2 . (1) σi +τ ˆ σi σi +τ ˆ which asymptotically follows a mixture of 1 and 2 degrees of freedom χ2. Accurate p-values with small sample correction can be calculated efficiently using pre-computed tabulated values (Han and Eskin, 2011).

4 2.3 Identifying studies with an effect through m-value

To distinguish the studies with an effect and the studies without an effect, we utilize the m-value framework. The m-value (Han and Eskin, 2012; Kang et al., 2014) is the posterior probability that the effect exists in each study. Thus one can interpret m-value in the following way: a small m-value (e.g. 0.1) represents a study that is predicted to not have an effect, a large m-value (e.g. 0.9) represents a study that is predicted to have an effect, and otherwise it is ambiguous to make a prediction. In the following we explain how to compute m-value. Suppose we have n studies we want to combine. Let E = [δ1, δ2, ..., δn] be the vector of estimated effect sizes and V = [V1,V2, ..., Vn] be the vector of variances of n effect sizes. We assume that the effect size δi follows the following normal distribution.

P (δi|no effect) = N(δi; 0,Vi) (2)

P (δi|effect)= N(δi; µ, Vi) (3)

We assume that the prior for the effect size is

µ ∼ N(0, σ2) (4)

A possible choice for σ in GWAS is 0.2 for small effect and 0.4 for large effects (Stephens and

Balding, 2009). Let Ci be a random variable whose value is 1 if a study i has an effect and 0 otherwise. Let C be a vector of Ci for n studies. Since C includes n binary variables, C can

n have 2 possible configurations. Let U = [c1, ..., c2n ] be a vector containing all these possible configurations. We define m-value mi as the probability P (Ci = 1|E), which is the probability of study i having an effect given the observed effect size estimates. We can compute this probability using the Bayes’ theorem in the following way.

P P (E|C = c)P (C = c) c∈Ui mi = P (Ci = 1|E) = P (5) c∈U P (E|C = c)P (C = c)

th where Ui is a subset of U whose elements’ i value is 1. Now we need to compute P (E|C = c)

5 and P (C = c). P (C = c) can be computed as

B(|c| + α, n − |c| + β) P (C = c) = (6) B(α, β) where |c| denotes the number of 1’s in c and B denotes the beta function where we set α and β be 1 (Han and Eskin, 2012). The probability E given configuration c, P (E|C = c), can be computed as

Z ∞ Y Y P (E|C = c) = N(δi; 0,Vi) N(δi; µ, Vi)p(µ) dµ (7) −∞ i∈c0 i∈c1 ¯ 2 Y = CN¯ (δ; 0, V¯ + σ ) N(δi; 0, Vi) (8)

i∈c0

P ¯ i Wiδi ¯ 1 δ = P and V = P (9) i Wi i Wi where c0 is the indices of 0 in c and c1 is the indices of 1 in c, N(δ; a, b) denotes the probability −1 density function of the normal distribution with mean a and variance b. Wi = Vi is the inverse variance or precision and C¯ is a scaling factor.

s Q ( P 2 !) 1 i Wi 1 X 2 ( i Wiδi) C¯ = √ exp − Wiδ − (10) N−1 P W 2 i P W ( 2π) i i i i i

2.4 PM-Plot

For interpreting and understanding the result of a meta-analysis, it is informative to look at the P-values and m-values at the same time. We propose the PM-plot framework (Han and Eskin, 2012), which plots the P-values and m-values of each study together and visualizes the relationship between m-values and P-values in a two-dimensional space. Through the PM-Plot, a researcher can easily distinguish which study is predicted to have an effect and which study is predicted to not have an effect. The X-axis of the PM-Plot represents the m-value between

0 and 1, and the Y-axis represents the statistical significance of association, −log10(p-value) and the dashed horizontal line is the significance threshold. The colored circle for each study is placed in the PM-Plot according to its m-value and P-value. We classify the estimated posterior probability for each study into three categories: a study that has an effect (m > .9) is denoted

6 by a red dot, a study that does not have an effect (m < .1) is denoted by a blue dot, and a study whose effect is uncertain (.1< m <.9) is denoted by a green dot. The dot size represents the study’s sample size. For the multiple tissue eQTL application (Figure 2), we can add particular color for dots which represents the corresponding tissue type. Figure 1 (b) shows one example of a PM-plot. One reason that studies are ambiguous (0.9 ≤ m-value ≤ 0.1) is that they are underpowered due to small sample size. If the sample size increases, the study can be drawn to either the left or the right side. ForestPMPlot utilizes an automatic algorithm to place the study names (numbers corresponding to the actual names in the forest plot) to minimize the overlap between labels (Lemon, 2006).

3 Results

3.1 Application to GWAS Meta-Analysis

Figure 1 is an example of the output of ForestPMPlot for a mouse HDL study (Kang et al., 2014), which combines 17 mouse studies. The 17 HDL mouse studies included in this meta-analysis have different environmental/genetic conditions such as diet (high fat/low fat etc) and various knockouts including homozygous deficiency in leptin receptor (db/db), ldl recepter knockouts and Apoe gene knockouts. In Figure 1, study names describe the characteristic of each study. In this example, the study name is encoded as {mouse-strain}-{condition} format. HMDP stands for hybrid mouse diversity panel, which combines classic inbred strains for mapping resolution and recombinant inbred strains for mapping power (Bennett et al., 2010). Mice for the HMDPxB panel were created by breeding females of the various HMDP inbred strains (Bennett et al., 2010) to males carrying transgenes for both Apoe Leiden (van den Maagdenberg et al., 1993) and for human Cholesterol Ester Transfer (CETP) (Jiang et al., 1992) on a C57BL/6 genetic background, which cause the progression of atherosclerosis along the arterial tree. BXH wild type (BXH/wt) mice were produced as previously described (van Nas et al., 2010). Briefly, C57BL/6J mice were intercrossed with C3H/HeJ mice to generate 321 F2 progeny. BXD-db strain is an F2 intercross between the inbred strains DBA/2 and C57BL/6 (Davis et al., 2012). The male C57BL/6 parents carried heterozygosity deficiency in the leptin receptor (db +/-) and F1 progeny were selected for homozygosity of the mutant allele. Among F2 progeny, only those with homozygous deficiency in leptin receptor (db/db) were selected. For CXB-ldlr strain, female BALB/cByJ-LDLRKO (designated as C) mice were crossed with

7 rs32595861 (FE P = 5.71 x 10−6 , RE P = 1.41 x 10−7 ) PM−Plot Gene : Fabp3 10 P −value Study Name 0.0004 ● 1.HMDPxB−chow(M) 0.0219 ● 2.HMDPxB−chow(F) 0.4636 ● 3.BXH−wt(M)

8 ● Study has an effect (m > .9) 0.0149 ● 4.BXH−wt(F) ● 0.0046 ● 5.HMDP−chow(M) Study does not have an effect (m < .1) 0.0016 ● 6.HMDP−fat(M) ● Study's effect is uncertain (.1< m < .9) 0.0313 ● 7.HMDP−fat(F) 0.3556 ● 8.BXD−db−12(M) 6 0.1820 ● 9.BXD−db−12(F) ) p 0.5774 ● 10.BXD−db−5(M) ( 10 0.7739 ● 11.BXD−db−5(F) log

0.7152 ● 12.BXH−apoe(M) −

0.8704 ● 13.BXH−apoe(F) 4 0.8517 ● 14.HMDPxB−ath(M) 1 ● 0.0013 ● 15.HMDPxB−ath(F) 15 6 ● 0.7775 ● 16.CXB−ldlr(M) 5 ● 0.6493 ● 17.CXB−ldlr(F) 2 4 9 2 ● ● ●7 FE Summary 3 12 ● 10 ● 8 RE Summary ● ● ●11 ●1413●● 16 0 ● 17

(A) −1.0 −0.5 0.0 0.5 1.0 (B) 0.0 0.2 0.4 0.6 0.8 1.0 Log odds ratio m−value

Figure 1: 17 mouse HDL studies with various environmental/genetic conditions are combined in this meta-analysis (Kang et al., 2014). In this example, we want to focus on 3. BXH-wt(M) and 4. BXH-wt(F) studies. These BXH strains are F2 mice constructed from a cross between C57BL/6J x C3H/HeJ F2 wild type strains under the western diet condition (van Nas et al., 2010), but differ by sex. When we only consider the effect size estimates in forest plot, two confidence intervals of effect estimates overlap each other, making it ambiguous if the observed heterogeneity is a result of stochastic errors. However in the PM-Plot, since the m-values are calculated utilizing cross-study information, the posterior probabilities are well segregated for these two studies (m-value: 0.93 vs 0.03), allowing us to hypothesize that the SNP effects on HDL in C57BL/6J x C3H/HeJ F2 strains under the western diet condition can be interacting with sex. Implicated genes are Fabp3 which is also known as fatty acid binding protein 3, which is well known gene for playing regulatory roles at the nexus of lipid metabolism and signaling including HDL-cholesterol, LDL-cholesterol and fasting insulin (Mitchell et al., 1996; Zhang et al., 2013). (A) Forest plot and (B) PM-plot for rs32595861 locus (Fabp3 Gene) analyzing data from the Kang et al., (2014) study. FE: fixed effects model, RE: random effects model.

8 male C57BL/6J-LDLRKO (designated as B) to generate F1 mice. Then intercross of F1 was performed to generate F2 mice. Given these heterogeneous natures of studies that differ in many different dimensions, in- terpreting a significant but heterogeneous result can be challenging. Examining both the forest plot and the PM-Plot allows us to to generate appropriate hypothesis on why the effect size differences are occurring. For example, consider studies 3 and 4 which contain mouse C57BL/6J x C3H/HeJ F2 wild type strains under the western diet condition (van Nas et al., 2010), but differ by sex. These two studies show heterogeneous effects in the forest plot, but the two confidence intervals of effect estimates overlap each other, making it ambiguous if the observed heterogeneity is a result of stochastic errors. A researcher can perceive, however, that there exist other studies that have similar effect sizes to these two studies, increasing our belief that this observed heterogeneity is driven by true interactions of sex, genetic background, and diet. Nevertheless, the forest plot alone does not display or systematically infer such cross-study in- formation. In the PM-Plot, the m-values are calculated utilizing cross-study information. The posterior probabilities are well segregated for these two studies (m-value: 0.93 vs 0.03), allowing us to hypothesize that the SNP effects on HDL in C57BL/6J x C3H/HeJ F2 strains under the western diet condition can be interacting with sex. This shows an example that our visualization framework can lead to plausible interpretations, which would not have been straightforward, had we used the traditional forest plot alone.

3.2 Application to Multi-Tissue eQTL analysis

One powerful application of our proposed framework is multi-tissue eQTL analysis in Genotype- Tissue Expression (GTEx) project. Genotype-Tissue Expression (GTEx) project studies human gene expression and genetic regulation in multiple tissues, providing valuable insights into the mechanisms of gene regulation, which can lead to the new discovery of disease-related pertur- bations. In this project, genetic variation between individuals will be examined for correlation with differences in gene expression level to identify regions of the genome that influence whether and how much a gene is expressed. Particularly examining multiple tissues can give us valuable insights to the genetic architecture of the regulatory mechanism, because many regulatory re- gions is known to act in a tissue-specific manner (Ernst et al., 2011; Consortium, 2012). Hence, understanding the role of regulatory variants, and the tissues in which they act, is essential for the functional interpretation of GWAS loci and insights into disease etiology.

9 Figure 2: 13 multiple-tissue eQTL studies analyzed in (Consortium, 2015). In this example, 13 different tissue eQTLs are analyzed together for SEMA3B gene expression levels. The first column shows the P-value for each tissue specific eQTL study. The different colors in dots represent the different tissues. study name column shows the various tissue names included in this multi-tissue eQTL analysis. The forest plot shows that the SNP rs28559826 seems better association with the SEMA3B gene expression level in 3 tissues (Heart Left Ventricle, Stomach, and Thyroid), although the confidence intervals are overlapped between many tissues. On the other hand, PM- plot clearly shows that the association of top 3 tissues (Heart Left Ventricle, Stomach, and Thyroid) are outstanding compared to other tissue eQTLs. The gene SEMA3B which is also known as the semaphorin/collapsin family of molecules. This gene plays a critical role in the guidance of growth cones during neuronal development. It has been shown to act as a tumor suppressor by inducing apoptosis (sem, 2015). (A) Forest plot and (B) PM-plot for rs28559826 locus (SEMA3B Gene) analyzing data from the GTex study (Consortium, 2015). RE: random effects model.

10 Figure 2 is an example of the output of ForestPMPlot for a mult-tissue eQTL study for SEMA3B gene (Consortium, 2015). Examining both the forest plot and the PM-Plot allows us to obtain an insight into the tissue specific genetics effects in eQTL analysis, which leads to identify 3 significant eQTL tissues (Heart Left Ventricle, Stomach, and Thyroid). This example clearly shows that examining both the forest plot and the PM-Plot allows us to easily hypothesize that there is a specific group of studies showing tissue difference in eQTL analysis.

4 Conclusions

In conclusion, we describe ForestPMPlot, a flexible visualization tool for analyzing studies in- cluded in a meta-analysis, such as meta-analysis of genome-wide association studies. The main feature of the tool is visualizing the differences in the effect sizes of the studies for better under- standing of why the studies exhibit heterogeneity. Unlike the traditional forest plot framework, which only displays effect size magnitude and its standard error for each study, ForestPMPlot additionally displays the posterior probability prediction for the existence of the effect in each study, and the P-values. This allows us to effectively segregate from one another the stud- ies predicted to have an effect and the studies predicted not to have an effect. Through the visualization of these estimates and predictions, ForestPMPlot can considerably facilitate the interpretation of the results of meta-analysis. We show an example analysis where our visualiza- tion framework leads to plausible interpretations gene-by-environment interaction and multiple tissue eQTL, which would not have been straightforward with the traditional framework.

Competing interests

The authors declare that they have no competing interests.

Acknowledgements

Funding: E.Y.K. and E.E. are supported by National Science Foundation grants 0513612, 0731455, 0729049, 0916676 and 1065276, and National Institutes of Health grants K25-HL080079, U01-DA024417, P01-HL30568 and PO1-HL28481. X.L. and A.S. are supported by Contract HHSN268201000029C (Broad Institute). B.H. is supported by a grant 2015-0222 from the Asan

11 Institute for Life Sciences, Asan Medical Center, Seoul, Korea, and by a grant HI14C1731 of the Korean Health Technology R&D Project, Ministry of Health & Welfare, Republic of Korea.

References

(2015). gene: Sema3b sema domain, immunoglobulin domain (ig), short basic domain, secreted, (semaphorin)

3b".

Anttila, V., Winsvold, B. S., Gormley, P., Kurth, T., Bettella, F., McMahon, G., Kallela, M., Malik, R., de Vries, B.,

Terwindt, G., Medland, S. E., Todt, U., McArdle, W. L., Quaye, L., Koiranen, M., Ikram, M. A., Lehtimaki, T.,

Stam, A. H., Ligthart, L., Wedenoja, J., Dunham, I., Neale, B. M., Palta, P., Hamalainen, E., Schurks, M., Rose,

L. M., Buring, J. E., Ridker, P. M., Steinberg, S., Stefansson, H., Jakobsson, F., Lawlor, D. A., Evans, D. M., Ring,

S. M., Farkkila, M., Artto, V., Kaunisto, M. A., Freilinger, T., Schoenen, J., Frants, R. R., Pelzer, N., Weller, C. M.,

Zielman, R., Heath, A. C., Madden, P. A. F., Montgomery, G. W., Martin, N. G., Borck, G., Gobel, H., Heinze,

A., Heinze-Kuhn, K., Williams, F. M. K., Hartikainen, A.-L. . L., Pouta, A., Ende, J. v. d., Uitterlinden, A. G.,

Hofman, A., Amin, N., Hottenga, J.-J. . J., Vink, J. M., Heikkila, K., Alexander, M., Muller-Myhsok, B., Schreiber,

S., Meitinger, T., Wichmann, H. E., Aromaa, A., Eriksson, J. G., Traynor, B. J., Trabzuni, D., Consortium, N.

A. B. E., Consortium, U. K. B. E., Rossin, E., Lage, K., Jacobs, S. B. R., Gibbs, J. R., Birney, E., Kaprio, J.,

Penninx, B. W., Boomsma, D. I., Duijn, C. v., Raitakari, O., Jarvelin, M.-R. . R., Zwart, J.-A. . A., Cherkas, L.,

Strachan, D. P., Kubisch, C., Ferrari, M. D., Maagdenberg, A. M. J. M. v. d., Dichgans, M., Wessman, M., Smith,

G. D., Stefansson, K., Daly, M. J., and Consortium, t. I. H. G. (2013). Genome-wide meta-analysis identifies new

susceptibility loci for migraine. Nature Genetics, 45(8), 912.

Bennett, B. J., Farber, C. R., Orozco, L., Kang, H. M., Ghazalpour, A., Siemers, N., Neubauer, M., Neuhaus, I.,

Yordanova, R., Guan, B., Truong, A., Yang, W.-p. P., He, A., Kayne, P., Gargalovic, P., Kirchgessner, T., Pan,

C., Castellani, L. W., Kostem, E., Furlotte, N., Drake, T. A., Eskin, E., and Lusis, A. J. (2010). A high-resolution

association mapping panel for the dissection of complex traits in mice. Genome Res, 20(2), 281–90.

Consortium, E. P. (2012). An integrated encyclopedia of dna elements in the . Nature, 489(7414), 57–74.

Consortium, G. (2015). Human genomics. the genotype-tissue expression (gtex) pilot analysis: multitissue gene regula-

tion in humans. Science, 348(6235), 648–60.

Davis, R. C., van Nas, A., Castellani, L. W., Zhao, Y., Zhou, Z., Wen, P., Yu, S., Qi, H., Rosales, M., Schadt, E. E.,

Broman, K. W., Péterfy, M., and Lusis, A. J. (2012). Systems genetics of susceptibility to obesity-induced diabetes

in mice. Physiol Genomics, 44(1), 1–13. de Bakker, P. I. W., Ferreira, M. A. R., Jia, X., Neale, B. M., Raychaudhuri, S., and Voight, B. F. (2008). Practical

aspects of imputation-driven meta-analysis of genome-wide association studies. Hum Mol Genet, 17(R2), R122–8.

DerSimonian, R. and Laird, N. (1986). Meta-analysis in clinical trials. Control Clin Trials, 7(3), 177–88.

Ernst, J., Kheradpour, P., Mikkelsen, T. S., Shoresh, N., Ward, L. D., Epstein, C. B., Zhang, X., Wang, L., Issner, R.,

Coyne, M., Ku, M., Durham, T., Kellis, M., and Bernstein, B. E. (2011). Mapping and analysis of chromatin state

dynamics in nine human cell types. Nature, 473(7345), 43–9.

12 Evangelou, E. and Ioannidis, J. P. A. (2013). Meta-analysis methods for genome-wide association studies and beyond.

Nat Rev Genet, 14(6), 379–89.

Fleiss, J. L. (1993). The statistical basis of meta-analysis. Stat Methods Med Res, 2(2), 121–45.

Han, B. and Eskin, E. (2011). Random-effects model aimed at discovering associations in meta-analysis of genome-wide

association studies. Am J Hum Genet, 88(5), 586–98.

Han, B. and Eskin, E. (2012). Interpreting meta-analyses of genome-wide association studies. PLoS Genet, 8(3),

e1002555.

Hardy, R. J. and Thompson, S. G. (1996). A likelihood approach to meta-analysis with random effects. Stat Med,

15(6), 619–29.

Heid, I. M., Jackson, A. U., Randall, J. C., Winkler, T. W., Qi, L., Steinthorsdottir, V., Thorleifsson, G., Zillikens,

M. C., Speliotes, E. K., Magi, R., Workalemahu, T., White, C. C., Bouatia-Naji, N., Harris, T. B., Berndt, S. I.,

Ingelsson, E., Willer, C. J., Weedon, M. N., Luan, J., Vedantam, S., Esko, T., Kilpelainen, T. O., Kutalik, Z., Li,

S., Monda, K. L., Dixon, A. L., Holmes, C. C., Kaplan, L. M., Liang, L., Min, J. L., Moffatt, M. F., Molony, C.,

Nicholson, G., Schadt, E. E., Zondervan, K. T., Feitosa, M. F., Ferreira, T., Allen, H. L., Weyant, R. J., Wheeler,

E., Wood, A. R., MAGIC, Estrada, K., Goddard, M. E., Lettre, G., Mangino, M., Nyholt, D. R., Purcell, S., Smith,

A. V., Visscher, P. M., Yang, J., McCarroll, S. A., Nemesh, J., Voight, B. F., Absher, D., Amin, N., Aspelund, T.,

Coin, L., Glazer, N. L., Hayward, C., Heard-Costa, N. L., Hottenga, J.-J. . J., Johansson, A., Johnson, T., Kaakinen,

M., Kapur, K., Ketkar, S., Knowles, J. W., Kraft, P., Kraja, A. T., Lamina, C., Leitzmann, M. F., McKnight, B.,

Morris, A. P., Ong, K. K., Perry, J. R. B., Peters, M. J., Polasek, O., Prokopenko, I., Rayner, N. W., Ripatti, S.,

Rivadeneira, F., Robertson, N. R., Sanna, S., Sovio, U., Surakka, I., Teumer, A., Wingerden, S. v., Vitart, V., Zhao,

J. H., Cavalcanti-Proenca, C., Chines, P. S., Fisher, E., Kulzer, J. R., Lecoeur, C., Narisu, N., Sandholt, C., Scott,

L. J., Silander, K., Stark, K., Tammesoo, M.-L. . L., Teslovich, T. M., Timpson, N. J., Watanabe, R. M., Welch,

R., Chasman, D. I., Cooper, M. N., Jansson, J.-O. . O., Kettunen, J., Lawrence, R. W., Pellikka, N., Perola, M.,

Vandenput, L., Alavere, H., Almgren, P., Atwood, L. D., Bennett, A. J., Biffar, R., Bonnycastle, L. L., Bornstein,

S. R., Buchanan, T. A., Campbell, H., Day, I. N. M., Dei, M., Dorr, M., Elliott, P., Erdos, M. R., Eriksson, J. G.,

Freimer, N. B., Fu, M., Gaget, S., Geus, E. J. C., Gjesing, A. P., Grallert, H., Grassler, J., Groves, C. J., Guiducci,

C., Hartikainen, A.-L. . L., Hassanali, N., Havulinna, A. S., Herzig, K.-H. . H., Hicks, A. A., Hui, J., Igl, W.,

Jousilahti, P., Jula, A., Kajantie, E., Kinnunen, L., Kolcic, I., Koskinen, S., Kovacs, P., Kroemer, H. K., Krzelj,

V., Kuusisto, J., Kvaloy, K., Laitinen, J., Lantieri, O., Lathrop, G. M., Lokki, M.-L. . L., Luben, R. N., Ludwig,

B., McArdle, W. L., McCarthy, A., Morken, M. A., Nelis, M., Neville, M. J., Pare, G., Parker, A. N., Peden, J. F.,

Pichler, I., Pietilainen, K. H., Platou, C. G. P., Pouta, A., Ridderstrale, M., Samani, N. J., Saramies, J., Sinisalo,

J., Smit, J. H., Strawbridge, R. J., Stringham, H. M., Swift, A. J., Teder-Laving, M., Thomson, B., Usala, G.,

Meurs, J. B. J. v., Ommen, G.-J. v. . J. V., Vatin, V., Volpato, C. B., Wallaschofski, H., Walters, G. B., Widen,

E., Wild, S. H., Willemsen, G., Witte, D. R., Zgaga, L., Zitting, P., Beilby, J. P., James, A. L., Kahonen, M.,

Lehtimaki, T., Nieminen, M. S., Ohlsson, C., Palmer, L. J., Raitakari, O., Ridker, P. M., Stumvoll, M., Tonjes,

A., Viikari, J., Balkau, B., Ben-Shlomo, Y., Bergman, R. N., Boeing, H., Smith, G. D., Ebrahim, S., Froguel, P.,

Hansen, T., Hengstenberg, C., Hveem, K., Isomaa, B., Jorgensen, T., Karpe, F., Khaw, K.-T. . T., Laakso, M.,

Lawlor, D. A., Marre, M., Meitinger, T., Metspalu, A., Midthjell, K., Pedersen, O., Salomaa, V., Schwarz, P. E. H.,

Tuomi, T., Tuomilehto, J., Valle, T. T., Wareham, N. J., Arnold, A. M., Beckmann, J. S., Bergmann, S., Boerwinkle,

E., Boomsma, D. I., Caulfield, M. J., Collins, F. S., Eiriksdottir, G., Gudnason, V., Gyllensten, U., Hamsten, A.,

13 Hattersley, A. T., Hofman, A., Hu, F. B., Illig, T., Iribarren, C., Jarvelin, M.-R. . R., Kao, W. H. L., Kaprio,

J., Launer, L. J., Munroe, P. B., Oostra, B., Penninx, B. W., Pramstaller, P. P., Psaty, B. M., Quertermous, T.,

Rissanen, A., Rudan, I., Shuldiner, A. R., Soranzo, N., Spector, T. D., Syvanen, A.-C. . C., Uda, M., Uitterlinden, A.,

Volzke, H., Vollenweider, P., Wilson, J. F., Witteman, J. C., Wright, A. F., Abecasis, G. R., Boehnke, M., Borecki,

I. B., Deloukas, P., Frayling, T. M., Groop, L. C., Haritunians, T., Hunter, D. J., Kaplan, R. C., North, K. E.,

O’Connell, J. R., Peltonen, L., Schlessinger, D., Strachan, D. P., Hirschhorn, J. N., Assimes, T. L., Wichmann, H.-E.

. E., Thorsteinsdottir, U., Duijn, C. M. v., Stefansson, K., Cupples, L. A., Loos, R. J. F., Barroso, I., McCarthy,

M. I., Fox, C. S., Mohlke, K. L., and Lindgren, C. M. (2010). Meta-analysis identifies 13 new loci associated with

waist-hip ratio and reveals sexual dimorphism in the genetic basis of fat distribution. Nature Genetics, 42(11), 949.

Jiang, X. C., Agellon, L. B., Walsh, A., Breslow, J. L., and Tall, A. (1992). Dietary cholesterol increases transcription

of the human cholesteryl ester transfer protein gene in transgenic mice. dependence on natural flanking sequences.

J Clin Invest, 90(4), 1290–5.

Kang, E. Y., Han, B., Furlotte, N., Joo, J. W., Shih, D., Davis, C. R., Lusis, J. A., and Eskin, E. (2014). Meta-analysis

identifies gene-by-environment interactions as demonstrated in a study of 4,965 mice. PLoS Genet, 10(1), e1004022.

Lambert, J.-C. . C., Ibrahim-Verbaas, C. A., Harold, D., Naj, A. C., Sims, R., Bellenguez, C., Jun, G., DeStefano, A. L.,

Bis, J. C., Beecham, G. W., Grenier-Boley, B., Russo, G., Thornton-Wells, T. A., Jones, N., Smith, A. V., Chouraki,

V., Thomas, C., Ikram, M. A., Zelenika, D., Vardarajan, B. N., Kamatani, Y., Lin, C.-F. . F., Gerrish, A., Schmidt,

H., Kunkle, B., Dunstan, M. L., Ruiz, A., Bihoreau, M.-T. . T., Choi, S.-H. . H., Reitz, C., Pasquier, F., Hollingworth,

P., Ramirez, A., Hanon, O., Fitzpatrick, A. L., Buxbaum, J. D., Campion, D., Crane, P. K., Baldwin, C., Becker,

T., Gudnason, V., Cruchaga, C., Craig, D., Amin, N., Berr, C., Lopez, O. L., Jager, P. L. D., Deramecourt, V.,

Johnston, J. A., Evans, D., Lovestone, S., Letenneur, L., Moron, F. J., Rubinsztein, D. C., Eiriksdottir, G., Sleegers,

K., Goate, A. M., Fievet, N., Huentelman, M. J., Gill, M., Brown, K., Kamboh, M. I., Keller, L., Barberger-Gateau,

P., McGuinness, B., Larson, E. B., Green, R., Myers, A. J., Dufouil, C., Todd, S., Wallon, D., Love, S., Rogaeva, E.,

Gallacher, J., George-Hyslop, P. S., Clarimon, J., Lleo, A., Bayer, A., Tsuang, D. W., Yu, L., Tsolaki, M., Bossu, P.,

Spalletta, G., Proitsi, P., Collinge, J., Sorbi, S., Sanchez-Garcia, F., Fox, N. C., Hardy, J., Naranjo, M. C. D., Bosco,

P., Clarke, R., Brayne, C., Galimberti, D., Mancuso, M., Matthews, F., (EADI), E. D. I., Genetic, (GERAD), E.

R. i. D., (ADGC), A. D. G. C., Heart, C. f., (CHARGE), A. R. i. G. E., Moebus, S., Mecocci, P., Zompo, M. D.,

Maier, W., Hampel, H., Pilotto, A., Bullido, M., Panza, F., Caffarra, P., Nacmias, B., Gilbert, J. R., Mayhaus,

M., Lannfelt, L., Hakonarson, H., Pichler, S., Carrasquillo, M. M., Ingelsson, M., Beekly, D., Alvarez, V., Zou, F.,

Valladares, O., Younkin, S. G., Coto, E., Hamilton-Nelson, K. L., Gu, W., Razquin, C., Pastor, P., Mateo, I., Owen,

M. J., Faber, K. M., Jonsson, P. V., Combarros, O., O’Donovan, M. C., Cantwell, L. B., Soininen, H., Blacker, D.,

Mead, S., Jr, T. H. M., Bennett, D. A., Harris, T. B., Fratiglioni, L., Holmes, C., de Bruijn, R. F. A. G., Passmore,

P., Montine, T. J., Bettens, K., Rotter, J. I., Brice, A., Morgan, K., Foroud, T. M., Kukull, W. A., Hannequin,

D., Powell, J. F., Nalls, M. A., Ritchie, K., Lunetta, K. L., Kauwe, J. S. K., Boerwinkle, E., Riemenschneider, M.,

Boada, M., Hiltunen, M., Martin, E. R., Schmidt, R., Rujescu, D., Wang, L.-S. . S., Dartigues, J.-F. . F., Mayeux, R.,

Tzourio, C., Hofman, A., Nothen, M. M., Graff, C., Psaty, B. M., Jones, L., Haines, J. L., Holmans, P. A., Lathrop,

M., Pericak-Vance, M. A., Launer, L. J., Farrer, L. A., Duijn, C. M. v., Broeckhoven, C. V., Moskvina, V., Seshadri,

S., Williams, J., Schellenberg, G. D., and Amouyel, P. (2013). Meta-analysis of 74,046 individuals identifies 11 new

susceptibility loci for alzheimer’s disease. Nature Genetics, 45(12), 1452.

14 Lango Allen, H., Estrada, K., Lettre, G., and et al. (2010). Hundreds of variants clustered in genomic loci and biological

pathways affect human height. Nature, 467(7317), 832–8.

Lemon, J. (2006). Plotrix: a package in the red light district of r. R-News, 6(4), 8–12.

Manning, A. K., Hivert, M.-F. F., Scott, R. A., and et al. (2012). A genome-wide approach accounting for body mass

index identifies genetic variants influencing fasting glycemic traits and insulin resistance. Nat Genet, 44(6), 659–69.

Mantel, N. and Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. J

Natl Cancer Inst, 22(4), 719–48.

Mitchell, B. D., Kammerer, C. M., Blangero, J., Mahaney, M. C., Rainwater, D. L., Dyke, B., Hixson, J. E., Henkel,

R. D., Sharp, R. M., Comuzzie, A. G., VandeBerg, J. L., Stern, M. P., and MacCluer, J. W. (1996). Genetic and

environmental contributions to cardiovascular risk factors in mexican americans. the san antonio family heart study.

Circulation, 94(9), 2159–70.

Morris, A. P., Voight, B. F., Teslovich, T. M., and et al. (2012). Large-scale association analysis provides insights into

the genetic architecture and pathophysiology of type 2 diabetes. Nat Genet, 44(9), 981–90.

Stephens, M. and Balding, D. J. (2009). Bayesian statistical methods for genetic association studies. Nat Rev Genet,

10(10), 681–90. van den Maagdenberg, A. M., Hofker, M. H., Krimpenfort, P. J., de Bruijn, I., van Vlijmen, B., van der Boom, H.,

Havekes, L. M., and Frants, R. R. (1993). Transgenic mice carrying the apolipoprotein e3-leiden gene exhibit

hyperlipoproteinemia. J Biol Chem, 268(14), 10540–5. van Nas, A., Ingram-Drake, L., Sinsheimer, J. S., Wang, S. S., Schadt, E. E., Drake, T., and Lusis, A. J. (2010).

Expression quantitative trait loci: replication, tissue- and sex-specificity in mice. Genetics, 185(3), 1059–68.

Zhang, Y., Kent, J. W., Lee, A., Cerjak, D., Ali, O., Diasio, R., Olivier, M., Blangero, J., Carless, M. A., and Kissebah,

A. H. (2013). Fatty acid binding protein 3 (fabp3) is associated with insulin, lipids and cardiovascular phenotypes

of the metabolic syndrome through epigenetic modifications in a northern european family population. BMC Med

Genomics, 6, 9.

15