Heritability Estimates Obtained from Genome-Wide Association Studies (GWAS) Are Much Lower Than Those of Traditional Quantitative Methods
Dissolving the missing heritability problem

Abstract: Heritability estimates obtained from genome-wide association studies (GWAS) are much lower than those of traditional quantitative methods. This phenomenon has been called the “missing heritability problem”. By analyzing and comparing GWAS and traditional quantitative methods, we first show that the estimates obtained from the latter involve some terms other than additive genetic variance, while the estimates from the former do not. Second, GWAS, when used to estimate heritability, do not take into account additive epigenetic factors transmitted across generations, while traditional quantitative methods do. Given these two points we show that the missing heritability problem can largely be dissolved.

Pierrick Bourrat*
Macquarie University, Department of Philosophy, North Ryde, NSW 2109, Australia
The University of Sydney, Department of Philosophy, Unit for the History and Philosophy of Science & Charles Perkins Centre, Camperdown, NSW 2006, Australia

Qiaoying Lu*
Sun Yat-sen University, Department of Philosophy, Xingangxi Road 135, Guangzhou, Guangdong, China

* PB and QL contributed equally to this work.

Acknowledgements: We are thankful to Steve Downes, Paul Griffiths and Eva Jablonka for discussions on the topic. PB’s research was supported under the Australian Research Council's Discovery Projects funding scheme (project DP150102875). QL’s research was supported by a grant from the Ministry of Education of China (13JDZ004) and “Three Big Constructions” funds of Sun Yat-sen University.

1. Introduction. One pervasive problem encountered when estimating the heritability of quantitative traits is that the estimates obtained from genome-wide association studies (GWAS) are much smaller than those calculated by traditional quantitative methods. This problem has been called the missing heritability problem (Turkheimer 2011). Take human height for example. Traditional quantitative methods deliver a heritability estimate of about 0.8, while the first estimates using GWAS were 0.05 (Maher 2008). More recent GWAS methods have revised this number and estimate the heritability of height to be 0.45 [1] (Yang et al. 2010; Turkheimer 2011). Yet, compared to traditional quantitative methods, almost half of the heritability is still missing.

[1] According to Yang et al. (2015), GWAS may deliver a higher estimate of the heritability of height in the future.

In quantitative genetics, heritability is defined as the portion of phenotypic variance in a population that is due to genetic difference (Falconer and Mackay 1996; Downes 2015; Lynch and Bourrat 2017). Traditionally, this portion is estimated by measuring the phenotypic resemblance of genetically related individuals without identifying genes at the molecular level (more particularly DNA sequences). GWAS have been developed in order to locate the DNA sequences that influence the target trait and estimate their effects, especially for common complex diseases such as obesity, diabetes and heart disease (Visscher et al.
2012; Frazer et al. 2009). For height, GWAS have identified almost 300 000 common DNA variants in human populations that are associated with it (Yang et al. 2010). Because many grant that the heritability estimates obtained by traditional quantitative methods are quite reliable, the methods used in GWAS have been questioned (Eichler et al. 2010).

A number of partial solutions to the missing heritability problem have been proposed, with most of them focusing on improving the methodological aspects of GWAS in order to provide a more accurate estimate (e.g., Manolio et al. 2009; Eichler et al. 2010). Some authors have also suggested that heritable epigenetic factors might account for part of the missing heritability. For instance, in Eichler et al. (2010, 488), Kong notes that “[e]pigenetic effects beyond imprinting that are sequence-independent and that might be environmentally induced but can be transmitted for one or more generations could contribute to missing heritability.” Furrow et al. (2011) also claim that “[e]pigenetic variation, inherited both directly and through shared environmental effects, may make a key contribution to the missing heritability.” Others have made the same point (e.g., McCarthy and Hirschhorn 2008; Johannes et al. 2008). Yet, in the face of this idea one might notice what appears to be a contradiction: how can epigenetic factors account for the missing heritability, if heritability is about genes?

To answer this question as well as to analyze the missing heritability problem, we compare the assumptions underlying heritability estimates in traditional quantitative methods with those underlying estimates in GWAS. We make two points.
First, traditional methods typically overestimate heritability (narrow-sense heritability, $h^2$) because these estimates do not successfully isolate the additive genetic component of phenotypic variance, which is part of the definition of $h^2$ (see Section 2), from the non-additive genetic and non-genetic components and from the potential effects of assortative mating. Second, the concept of the gene used in the definition of $h^2$ is an evolutionary one, and it differs from the DNA-centered concept used in GWAS. This means that the heritability estimates obtained from traditional methods can include heritability due to heritable epigenetic factors (which can be regarded as evolutionary genes), while the effects of these factors are not included in the estimates obtained from GWAS. With these two points taken into account, we expect the missing heritability problem to be largely dissolved, which also sets the stage for further discussion.

The remainder of the paper is divided into three parts. First, we briefly introduce two ways in which heritability is estimated in traditional methods, namely twin studies and parent-offspring regression. We show that the estimates obtained in each way include some non-additive and/or non-genetic elements and consequently overestimate $h^2$. Second, we outline the basic rationale underlying GWAS and illustrate that they estimate heritability by considering solely DNA variants. By arguing that the notion of additive genetic variance used in traditional methods does not necessarily refer to DNA sequences but can also refer to epigenetic factors, we show that the notion of heritability estimated in GWAS is more restrictive than $h^2$. Finally, in Section 4, based on the conclusions from Section 2 and Section 3, we show that the missing heritability problem can be partly dissolved in two ways. One is that if non-additive and non-genetic variance were removed from the estimates obtained via traditional methods, these estimates would be lower.
The other is that if additive epigenetic factors were taken into account by GWAS, the heritability estimates obtained would be higher. We conclude Section 4 by demonstrating how our analysis sheds some light on a discussion about the role played by non-additive factors in the missing heritability problem. Because human height has been “the poster child” of the missing heritability problem (Turkheimer 2011, 232), we will use it to illustrate each of our points.

2. Heritability in Traditional Quantitative Methods. Although there exist different definitions of heritability (Jacquard 1983; Bourrat 2015; Downes 2009), according to the standard model of quantitative genetics, the phenotypic variance ($V_P$) of a population can be explained by two components, its genotypic variance ($V_G$) and its environmental variance ($V_E$). In the absence of gene-environment interaction and correlation, we have:

$$V_P = V_G + V_E \tag{1}$$

From this, broad-sense heritability ($H^2$) is defined as:

$$H^2 = \frac{V_G}{V_P} \tag{2}$$

$V_G$ can further be partitioned into the additive genetic variance ($V_A$), the dominance genetic variance ($V_D$) and the epistatic genetic variance ($V_I$). Thus Equation (1) can be rewritten as:

$$V_P = V_A + V_D + V_I + V_E \tag{3}$$

where $V_A$ is the variance due to alleles being transmitted from the parents to the offspring that contribute to the phenotype. $V_D$ is the variance due to interactions between alleles at one locus for diploid organisms, and $V_I$ is the variance due to interactions between alleles from different loci. $V_D$ and $V_I$ together represent the variance due to particular combinations of genes of an organism. Because genotypes of sexual organisms recombine at each generation via reproduction, the effects of combinations of genes, namely dominance and epistasis effects (measured respectively by $V_D$ and $V_I$), are not transmitted across generations; only the effects of the genes independent from their genetic background (measured by $V_A$) are.
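The partition in Equations (1) to (3) can be made concrete with a minimal sketch. It is not taken from the paper: it assumes a single biallelic locus under random mating, with illustrative genotypic values +a, d and -a in the textbook style of Falconer and Mackay, no epistasis ($V_I = 0$, since only one locus is involved), and a hypothetical environmental variance. All function names and parameter values are assumptions chosen for illustration.

```python
"""Single-locus illustration of the variance partition in Equations (1)-(3)."""


def additive_and_dominance_variance(p, a, d):
    """Return (V_A, V_D) for one biallelic locus under random mating.

    Assumed genotypic values (illustrative): A1A1 = +a, A1A2 = d, A2A2 = -a,
    with p the frequency of allele A1.
    """
    q = 1.0 - p
    alpha = a + d * (q - p)          # average effect of an allele substitution
    v_a = 2.0 * p * q * alpha ** 2   # additive genetic variance
    v_d = (2.0 * p * q * d) ** 2     # dominance genetic variance
    return v_a, v_d


def genotypic_variance_direct(p, a, d):
    """Compute V_G directly as the variance of the genotypic values."""
    q = 1.0 - p
    freqs = (p * p, 2.0 * p * q, q * q)   # Hardy-Weinberg genotype frequencies
    values = (a, d, -a)
    mean = sum(f * v for f, v in zip(freqs, values))
    return sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))


def broad_sense_heritability(v_a, v_d, v_i, v_e):
    """H^2 = V_G / V_P, with V_G = V_A + V_D + V_I and V_P = V_G + V_E."""
    v_g = v_a + v_d + v_i
    return v_g / (v_g + v_e)


if __name__ == "__main__":
    v_a, v_d = additive_and_dominance_variance(p=0.5, a=1.0, d=1.0)
    v_e = 0.25  # hypothetical environmental variance
    print("V_A =", v_a, "V_D =", v_d)
    print("V_G (direct) =", genotypic_variance_direct(0.5, 1.0, 1.0))
    print("H^2 =", broad_sense_heritability(v_a, v_d, 0.0, v_e))
    print("V_A / V_P =", v_a / (v_a + v_d + v_e))
```

In this toy case the additive share of phenotypic variance is smaller than $H^2$ whenever dominance is present (d different from 0), which is the kind of gap between additive and total genetic variance that the argument turns on.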
By taking only $V_A$ into account, narrow-sense heritability ($h^2$) which “expresses the extent to which phenotypes are determined by the genes transmitted from the parents”