<<

HIGHLIGHTED ARTICLE | COMMUNICATIONS

Do Molecular Markers Inform About ?

Daniel Gianola,*,†,‡,1 Gustavo de los Campos,§ Miguel A. Toro,** Hugo Naya,†† Chris-Carolin Schön,†,‡ and Daniel Sorensen‡‡ *Departments of Animal Sciences, Dairy Science, and Biostatistics and Medical Informatics, University of Wisconsin–Madison, Wisconsin 53706, †Department of Plant Sciences Technical University of Munich, Center for Life and Food Sciences, D-85354 Freising-Weihenstephan, Germany, ‡Institute of Advanced Study, Technical University of Munich, D-85748 Garching, Germany, §Department of Biostatistics, Michigan State University, East Lansing, Michigan 48824, **Escuela Técnica Superior de Ingenieros Agrónomos, Universidad Politécnica de Madrid, 20840 Madrid, Spain, ††Institut Pasteur de Montevideo, Mataojo 2020, Montevideo 11400, , and ‡‡Department of Molecular Biology and Genetics, , DK-8000 Aarhus C, Denmark

ABSTRACT The availability of dense panels of common single-nucleotide polymorphisms and sequence variants has facilitated the study of statistical features of the of and diseases via whole-genome regressions (WGRs). At the onset, traits were analyzed trait by trait, but recently, WGRs have been extended for analysis of several traits jointly. The expectation is that such an approach would offer insight into mechanisms that cause trait associations, such as pleiotropy. We demonstrate that correlation parameters inferred using markers can give a distorted picture of the between traits. In the absence of knowledge of relationships between quantitative or disease trait loci and markers, speculating about genetic correlation and its causes (e.g., pleiotropy) using genomic data is conjectural.

KEYWORDS genetic correlation; genomic correlation; genomic ; linkage disequilibrium; pleiotropy

HE interindividual differences for a trait or disease risk 1996; Lynch and Ritland 1999). This development has Tthat can be explained by genetic factors, such as trait opened new opportunities for study of the genetic architec- 2 heritability (h ), the genetic correlation (rG), and the coher- ture of complex traits and diseases. For instance, Yang et al. itability between two traits (rGh1h2), are very important pa- (2010) suggested using whole-genome regressions (WGRs) rameters in quantitative genetic studies of animals, humans, (Meuwissen et al. 2001) to assess the proportion of and plants. These quantities play a role in the study of evo- of a trait or disease risk that can be explained by a regression lution due to artificial and , and knowledge of on common SNPs or genomic heritability and thereof is required for statistical prediction of outcomes in a related parameter, the “missing heritability.” More recently, animal and as well as medicine. Traditionally, WGR models have been extended for the analysis of systems these parameters have been estimated using phenotypes and of multiple traits, so the concept of genomic correlation also pedigrees, e.g., family and data in human genetics. The has entered into the picture (Jia and Jannink 2012; Lee et al. availability of dense panels of common single-nucleotide 2012). For instance, Maier et al. (2015) used multivariate polymorphisms (SNPs) and of sequence data more recently WGR models and reported estimates of genetic correlations has made it possible to assess kinship among distantly related between psychiatric disorders, and Furlotte and Eskin (2015) individuals (Morton et al. 1971; Thompson 1975; Ritland presented a methodology that incorporates genetic marker information for the analysis of multiple traits that, according to the authors, “provide fundamental insights into the nature Copyright © 2015 by the Genetics Society of America of co-expressed .” In a similar spirit, Korte et al. (2012) doi: 10.1534/genetics.115.179978 Manuscript received June 26, 2015; accepted for publication July 21, 2015; published argued that multitrait-marker-enabled regressions can be useful Early Online July 23, 2015. for understanding pleiotropy. More recently, Bulik-Sullivan et al. 1Corresponding author: Department of Animal Sciences, University of Wisconsin– “ Madison, 1675 Observatory Drive, Madison, WI 53706. (2015) proposed a methodology for estimating genetic corre- E-mail: [email protected] lation” using derived from single-marker genome-wide

Genetics, Vol. 201, 23–29 September 2015 23 2 ¼ð 2 r2 Þ 2 association studies (GWAS) and reported estimates of such missing heritability is hmissing 1 QX h .Hence,missing correlations among 25 human traits. heritability is a function of the LD between the marker and de los Campos et al. (2015) discussed potential problems the QTL. Genomic heritability has h2 as an upper bound (de that emerge when trying to infer genetic parameters using los Campos et al. 2015). molecular markers that are imperfectly associated with the gen- The regression model just described can be extended to the otypes at the causal loci. In this paper, the framework described analysis of multiple traits affected by multiple QTL. For in de los Campos et al. (2015) is extended for the analysis of simplicity, we consider only two markers (X1 and X2) and systems of traits, and it is demonstrated that correlation param- two QTL (Q1 and Q2). A multivariate representation of the eters inferred using markers can give a distorted picture of the model with an arbitrary number of QTL and markers is pro- genetic correlation between traits. For instance, it is shown that vided in the Appendix. Figure 1 depicts a system with two an analysis based on markers may suggest a genetic correlation traits, two QTL, and two markers. The left panel represents when none exists or may fail to detect a genetic correlation the regression of the phenotypes on the two QTL, with blue when one does exist. It is concluded that in the absence of arrows denoting effects from QTL on traits and green arcs knowledge about linkage disequilibrium (LD) relationships be- denoting LD between QTL. In the QTL model of Figure 1, the tween quantitative trait loci (QTL) and markers, speculating genetic correlation is (see Appendix) about genetic correlations, and even more about their causes a9S a 1 Q 2 (e.g., pleiotropy), using genomic data is conjectural. rG ¼ qÀffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiÁÀ Á (3) a9S a a9S a 1 Q 1 2 Q 2 Theory a9 ¼ða a Þ where 1 11 12 contains the effects of QTL 1 and 2 To set the stage, consider a single- model. In an additive- a9 ¼ða a Þ on trait 1, and 2 21 22 contains the effects of QTL 1 inheritance framework, a (y) is regressed on a QTL and 2 on trait 2. The variance-covariance matrix between code Q (0, 1, and 2 for aa, Aa, and AA, QTL genotypes is given by SQ. If genotypes are standardized, respectively) according to the linear model   1 r Q12 SQ ¼ y ¼ a0 þ a1Q þ E (1) r Q21 1 where a0 and a1 are fixed parameters, and Q and E are in- r with Q12 being the correlation between genotypes at QTL 1 dependent random variables, the latter representing a model and 2. In the QTL model of Figure 1, there are two sources of residual. The proportion of phenotypic variance explained by genetic correlation: pleiotropy (i.e., the same QTL affects the on Q, or narrow-sense heritability, is more than one trait) and LD between QTL, in this case rep- resented by r 6¼ 0. This is well known in quantitative ge- a2S Q12 h2 ¼ 1 Q netics (Falconer and Mackay 1996; Knott and Haley 2000). a2S þ s2 1 Q E We now bring the two markers into the picture, as shown in therightpanelofFigure1;heregrayarrowsareregressionson where S ¼ varðQÞ is the variance in allelic content, and Q markers (these are distinct from regressions on QTL genotypes), s2 ¼ varðEÞ is the residual variance. If Q is standardized to E and arcs denote correlations between genotypes due to LD. In a unit variance, the Appendix, we show that the genomic correlation is 2 a 2 S ¼ 1 and h2 ¼ 1 a9S S 1S a Q a2 þ s2 ¼ rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1 QX X XQ 2 1 E rG;marked    a9S S21S a a9S S21S a 1 QX X XQ 1 2 QX X XQ 2 In quantitative genomic analysis, marker genotypes (X) are (4) used in lieu of the QTL genotypes Q because the latter are unknown or unobserved. The marker-based or instrumental In this expression, SQX is the covariance matrix between QTL model, assuming a single marker, is a linear regression on and marker genotypes (reflecting marker-QTL LD), and SX is marker genotype X with form the covariance matrix between marker genotypes, reflecting mutual LD relationships among markers. If markers and gen- ¼ a þ b þ 9 y 0 1X E (2) otypes are in standard deviation units,     where E9 is a regression residual. Assuming without loss of r r r S ¼ 1 X12 S ¼ Q1X1 Q1X2 X r and QX r r generality that both X and Q are in standard deviation units, X 1 Q X Q X b ¼ r a r 21 2 1 2 2 the marker effect can be shown to be 1 QX 1, where QX is the correlation between the marker and the QTL genotypes, Comparison of the genomic correlation (4) with the genetic S S21S which depends on their LD. In this setting, the proportion of correlation (3) indicates that in rrG;marked , QX X XQ replaces variance of phenotypes explained by the linear regression on SQ: Inspection of (4) reveals that the sources of the genomic 2 ¼ r2 2 a a the marker, or genomic heritability, is hmarked QX h ,and correlation are (1) pleiotropic QTL effects via 1 and 2, (2)

24 D. Gianola et al. Figure 1 Two-trait system. A system of two traits (Y) involving two QTL and two markers (X). Single-pointed blue arrows denote causal effects, green double-pointed arrows denote LD, and single-pointed gray arrows represent regression coefficients. marker-QTL LD patterns conveyed by SQX, and (3) among- marker LD relationships, as conveyed by SX : Notably, one of the sources of genetic correlation, i.e., LD between QTL, as Figure 2 Two-trait system. Four possible cases of interplay between QTL, conveyed by SQ, has no effect on rG;marked. Conversely, there markers, and phenotypes. The arrows have the same interpretation as in are sources that contribute to the genomic correlation, i.e., Figure 1. marker-marker and marker-QTL LD, that do not enter into r : G correlation: markers can induce genomic correlation when Because the sources affecting genetic and genomic corre- traits are genetically uncorrelated—a crucial issue. lations are distinct, the two parameters can differ greatly.This point is strengthened by considering four stylized cases rep- Case 3: Missing correlation (Figure 2, lower-left panel) resented in Figure 2. All the demonstrations supporting the This scenario illustrates a situation in which the genetic discussion that follows can be found in the Appendix. correlation is undetected by the markers and is obtained from case 1 by adding LD between QTL, which, in the absence of Application to Four Situations pleiotropy, is the only source of genetic correlation between S S21S Case 1: Independent marker-QTL pairs and absence of traits. However, QX X XQ remains diagonal as in case 1. a9a ¼ pleiotropy (Figure 2, upper-left panel) Furthermore, in the absence of pleiotropy, 1 2 0 (orthog- onality); consequently, rG;marked is null. This example shows This is the simplest case: it consists of two marker-QTL pairs how one source of genetic correlation, namely, LD among with linkage equilibrium (LE) between pairs but LD within QTL, may be completely lost in a genomic analysis. pairs. Each trait is affected by only one QTL; QTL 1 affects trait 1, and QTL 2 affects trait 2. Several simplifications take place Case 4: Pleiotropy (Figure 2, lower-right panel) here. For instance, because of LE between pairs, r ¼ 0, so Q1X2 Here we allow each of the two QTL to affect both traits; S QX becomes an identity matrix. Therefore, the genetic co- otherwise, the setting is as in case 1. Pleiotropy now induces a9a variance in the numerator of (3) reduces to 1 2. In the a genetic and a genomic correlation. However, r and r ; a9 ¼ða Þ a9 ¼ð a Þ G G marked absence of pleiotropy, 1 11 0 and 2 0 22 differ in magnitude depending on the patterns of LD and on a9a ¼ are orthogonal; i.e., 1 2 0. Therefore, the genetic corre- the magnitude of the pleiotropic effects. To illustrate, we set lation is null. Furthermore, with LE between pairs, SX ¼ SQ ¼ I2, an identity matrix of order 2; this implies LE r ¼ r ¼ r ¼ 0, leading also to an absence of geno- Q1X2 Q2X1 X1X2 between pairs of QTL and pairs of markers. Further, we take mic correlation. Thus, in case 1 there is complete agreement     between the genomic and genetic correlations: both are null. 0:50 0:20 S ¼ or S ¼ QX : QX : Case 2: Phantom correlation (Figure 2, 005 008 upper-right panel) i.e., homogeneity or heterogeneity of marker-QTL LD, respec- a9 ¼ð a Þ The setting is obtained by adding LD between the two markers tively. Finally, QTL effects are set to 1 1 12 and a9 ¼ða Þ a ¼ a to case 1. There is no pleiotropy, and the two QTL are in LE, so 2 21 1 , with 12 21; this pleiotropic effect was the genetic correlation is zero (genetically, the system is equiv- varied over the set of values ½20:9; 2 0:8; ...; 0:8; 0:9. Fig- alent to case 1). However, because of the LD between markers, ure 3 displays the resulting values of the genomic (vertical S S21S QX X XQ in (4) is no longer diagonal. Consequently, there axis) vs. genetic (horizontal axis) correlations computed us- will be nonzero genomic correlation even in absence of genetic ing (3) and (4); the blue curve represents the case where

Communications 25 marker-QTL LD was the same for both pairs, and the red curve represents the case where LD differed between pairs 1 and 2. The figure shows how different patterns of LD induce different magnitudes of genomic and genetic correlations that, however, do not differ in sign in this example. The genomic covariance does not always preserve the sign of the genetic covariance. Suppose that the two QTL are not a9 ¼ða Þ pleiotropic but are in LD, with effects 1 0 and a9 ¼ð 2a Þ 2 0 and with 2 3 1 6 1 7 6 2 7 SQ ¼ 4 5 1 1 2

Using the expression in the numerator of (3), the genetic covariance is 2 3 1 6 1 7 2 6 2 7 a ð a 0 Þ4 5ð 0 2a Þ9 ¼ 2 1 2 Figure 3 Genomic vs. genetic correlation in the system described by case 4. 1 2 which is negative at any nonnull value of a. Now let the LD component of rG that is due to LD among QTL is likely to relationships between markers and between QTL and be missed by an analysis based on markers that are in imper- markers be such that fect LD with QTL. Also, a fraction of the genetic correlation 2 3 2 3 due to pleiotropy is likely to be missed as a result of imperfect 4 4 LD between marker and QTL. Finally, LD between markers 6 1 7 6 0 7 6 5 7 6 5 7 can create illusory genetic correlations. SX ¼ 4 5 and SQX ¼ 4 5 4 4 What happens if all QTL genotypes are included in the 1 0 5 5 panel of markers, as may be expected if full DNA sequence information is available? Here the sequence can be partitioned The genetic system is such that QTL 1 (QTL 2) is in LD with into neutral markers (x) and QTL (q) such that for a given marker 1 (marker 2), but there is LE between QTL 1 and individual the genomic data presents as xseq ¼ðx9; q9Þ. Thus, marker 2 and QTL 2 and marker 1. In the numerator of the sequence covariance matrix is expression (4),   2 3 SX SXQ varðx Þ¼Sx ¼ (5) 16 64 seq seq S S 6 2 7 QX Q 21 6 9 45 7 SQX S SXQ ¼ 4 5 X The marked genotype for trait i using the DNA sequence is 264 16 45 9 ^ 21 G ¼ a9S x S x ; i ¼ 1; 2 (6) i i Q seq xseq seq and the genomic covariance is (64/45)a2, always positive. In this example, the genomic correlation is 4/5, and the genetic and the genomic or marked covariance is correlation is 21/2. ^ ^ 21 21 covðG ; G Þ ¼ a9S x S Sx S Sx a 1 2 1 Q seq xseq seq xseq seqQ 2 21 9 ¼ a9S x S Sx a (7) Discussion 1 Q seq xseq seqQ 2 In the analysis of systems of complex traits, none of the cases Using partitioned matrix techniques for obtaining the inverse “ ” just discussed are likely to hold exactly as described, and S of xseq de los Campos et al. (2015) showed that there is an enormous range of possibilities in terms of within   and between marker-QTL genotypes as well as allelic effects 21 0 S x Sx Sx ¼½SQX SQ ¼ S (8) sizes and signs. However, the underlying mechanisms that Q seq seq seqQ I Q our examples describe are an integral part of the multivariate ð^ ; ^ Þ¼a9S a system involving QTL and markers and are key to an under- Hence, cov G1 G2 1 Q 2, the genetic covariance de- standing of why genomic and genetic correlations are distinct fined in equation (A2) in the Appendix. This shows that if parameters. Importantly, there is an ambiguous link between the sequence information contains the variants at the causal the two parameters. For instance, all or a fraction of the loci, the marked covariance is equal to the genetic covariance

26 D. Gianola et al. between traits, and therefore, the genomic correlation is from the European Union’s Seventh Framework Programme identical to the genetic correlation in that case. However, (KBBE.2013.1.2-10) under grant agreement 61361. the genetic correlation depends on allelic frequencies and effect sizes at the QTL as well as on LD relationships Literature Cited between the QTL; these parameters, as well as the trait-specific QTL, will still need to be learned properly. Apart from finite- Bulik-Sullivan, B., H. K. Finucane, V. Anttila, A. Gusev, F. R. Day et al. sample-size statistical problems, technical issues such as a large 2015 An atlas of genetic correlations across human diseases percentage of singleton reads and incomplete coverage and traits. bioRxiv doi: http://dx.doi.org/10.1101/014498. de los Campos, G., D. Sorensen, and D. Gianola, 2015 Genomic will complicate matters (Kerr Wall 2009). Hence, when sequence heritability: what is it? PLoS Genet. 11: e1005048. data become available for quantitative genetic studies, unrav- Falconer, D. S., and T. F. C. Mackay, 1996 Introduction to Quan- eling the structure of the genetic correlation will not be an titative Genetics, Ed. 4. Longmans Green, Harlow, UK. easy task, even under the simplifying assumptions of an additive Furlotte, N. A., and E. Eskin, 2015 Efficient multiple-trait associ- model of inheritance. ation and estimation of genetic correlation using the matrix- variate linear mixed model. Genetics 200: 59–68. In conclusion, multivariate quantitative genetic analysis Knott, S. A., and C. S. Haley, 2000 Multitrait least qquares for based on markers can be used to obtain more accurate pre- quantitative trait loci detection. Genetics 156: 899–911. dictions of complex traits and to estimate genomic correla- Jia, Y., and J.-L. Jannink, 2012 Multiple trait genomic selection tions. However, these parameters cannot always be viewed as methods increase genetic value prediction accuracy. Genetics – genetic correlations because the sources of genetic and geno- 192: 1513 1522. Korte, A., B. J. Vilhjálmsson, V. Segura, A. Platt, Q. Long, and M. mic correlations are distinct. Imperfect LD between markers Nordborg, 2012 A mixed-model approach for genome-wide and QTL produces missing heritability in single-trait analysis; association studies of correlated traits in structured . in multivariate models, the problem becomes one of missing, Nat. Genet. 44: 1066–1071. excessive, or spurious (MES) correlation. Care must be exer- Lee, S. H., J. Yang, M. E. Goddard, P. M. Visscher, and N. R. Wray, cised when interpreting estimates of genomic correlations 2012 Estimation of pleiotropy between complex diseases using single-nucleotide -derived genomic relationships between complex traits when these traits are assessed by and restricted maximum likelihood. Bioinformatics 28: 2540–2542. molecular markers as opposed to QTL and even more so when Lynch, M., and K. Ritland, 1999 Estimation of pairwise related- interpreted from a causality perspective. Unfortunately, con- ness with molecular markers. Genetics 152: 1753–1766. siderably more information is needed than what is now avail- Maier, R. et al., 2015 Joint analysis of psychiatric disorders increases able for a meaningful interpretation of estimates of genomic accuracy of risk prediction for , , and major depressive disorder. Am. J. Hum. Genet. 96: 283–294. correlations between pairs of traits when gene action involves Meuwissen, T. H. E., B. J. Hayes, and M. E. Goddard, many additive QTL. Speculating on the multivariate statistical 2001 Prediction of total genetic value using genome-wide genetic architecture of complex traits using imperfect instru- dense marker maps. Genetics 157: 1819–1829. ments such as markers seems risky at this time. Morton, N. E., S. Yee, D. E. Harris, and R. Lew, 1971 Bioassay of kinship. Theor. Popul. Biol. 2: 507–524. Ritland, K., 1996 A marker-based method for inferences about Acknowledgments quantitative inheritance in natural populations. 50: 1062–1073. This work was supported in part by the Wisconsin Agricul- Thompson, E. A., 1975 The estimation of pairwise relationships. ture Experiment Station and by a U.S. Department of Ann. Hum. Genet. 39: 173–188. Agriculture Hatch Grant (142-PRJ63CV) to D.G. C.C.S. Kerr Wall, P., 2009 Comparison of next generation sequencing technologies for transcriptome characterization. BMC and D.G. acknowledge support of the Technische Universität 10: 347. München Institute for Advanced Study, funded by the Ger- Yang, J., B. Benyamin, B. P. McEvoy, S. Gordon, A. K. Henders et al., man Excellence Initiative. G.D.L.C. received support from 2010 Common SNPs explain a large proportion of heritability National Institutes of Health grants R01GM099992 and for . Nat. Genet. 42: 565–569. R01GM101219. M.A.T. wishes to acknowledge funding Communicating editor: G. A. Churchill

Communications 27 Appendix Genetic Correlation ¼ a9q ¼ a9 q a a fi Let G1 1 and G2 2 be additive genetic values for a pair of traits, where 1 and 2 are vectors of xed allelic substitution effects affecting traits 1 and 2, respectively, and q is a random vector indicating the incidence of genotypes at the corresponding QTL. Following de los Campos et al. (2015), the additive genetic variance of trait i is ð Þ¼a9 S a ; ¼ ; var Gi i Q i i 1 2 (A1)

The additive genetic covariance between traits 1 and 2 is then ð ; Þ¼a9S a cov G1 G2 1 Q 2 (A2) where SQ is a covariance matrix between allelic contents at loci affecting the traits. For example, with two QTL (assuming Hardy-Weinberg equilibrium at each of the two QTL),   2p1ð1 2 p1Þ 2D12 SQ ¼ (A3) 2D12 2p2ð1 2 p2Þ where pj is the frequency of the reference allele at locus j (j=1,2), and D12 is the LD between at the two loci. In scalar notation, (A2) takes the more explicit form

covðG1; G2Þ¼2½p1ð1 2 p1Þa11a21 þ p2ð1 2 p2Þa12a22þ2D12ða12a21 þ a11a22Þ (A4)

The genetic covariance has a pleiotropy component (the first part of the expression) plus a LD component that vanishes if the QTL are in pairwise equilibrium, i.e., D12 ¼ 0: The genetic correlation (Falconer and Mackay 1996) is a9S a 1 Q 2 rG ¼ qÀffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiÁÀ Á (A5) a9S a a9S a 1 Q 1 2 Q 2

Genomic Correlation

Let x be a vector of genotypes at p marker loci. The multiple linear regressions of G1 and G2 on x produce as fitted values ^ ¼ a9S S21x fi Gi i QX X . The genomic covariance (or marked genetic covariance) is de ned as ¼ ð^ ; ^ Þ¼a9S S21S a Cmarked cov G1 G2 1 QX X XQ 2 (A6)

The genomic correlation is

rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiCmarked rG;marked ¼    (A7) a9S S21S a a9S S21S a 1 QX X XQ 1 2 QX X XQ 2

Interpreting this parameter meaningfully requires knowledge of (1) bivariate QTL effects at all loci, (2) LD relationships between QTL affecting the two traits and the markers via the SQX matrices, and (3) LD relationships among markers. Unfortunately,only phenotypes, marker genotypes, and LD relationships between markers are observable. Most of the required ingredients in the formula are yet unknown. Importantly, note that SQ; conveying LD between QTL, does not enter into the genomic correlation.

Independent QTL-Marker Blocks (Case 1 in Figure 2) Each of two independently segregating QTL is in LD with a marker, with the two markers being in mutual LE, and there is no pleiotropy. Here   10 S ¼ S ¼ S ¼ (A8) QX XQ X 01 so the genetic and genomic correlations both become

28 D. Gianola et al. a9 a 1 2 r ¼ r ; ¼ pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi (A9) G G marked ða9a Þða9a Þ 1 1 2 2 a9a ¼ða Þ9ð a Þ¼ Because there is no pleiotropy, 1 2 11 0 0 22 0, and both correlations are null.

Phantom Correlation (Case 2 in Figure 2) S S21S Consider QX X XQ in (A6), where (given standardized genotypes) " # " # r r r 0 Q1X1 Q1X2 Q1X1 SQX ¼ ¼ r r 0 r Q2X1 Q2X2 Q2X2 " #2 " # 1 r 1 1 2r 2 X1X2 1 X1X2 S 1 ¼ ¼ (A10) X r 2 r2 2r X X 1 1 X X X X 1 " 1 2 # 1 2 1 2 r 0 Q1X1 SXQ ¼ 0 r Q2X2

Then " # r2 2r r r 21 1 Q1X1 Q1X1 Q2X2 X1X2 SQX S SXQ ¼ (A11) X 1 2 r2 2r r r r2 X1X2 Q1X1 Q2X2 X1X2 Q2X2

The off-diagonals of this matrix are nonnull, so a genomic correlation will arise when there is no genetic correlation.

Missing Correlation (Case 3 in Figure 2) S21 Because the markers are in LE, X is an identity matrix, so " # r2 0 21 QX SQX S SXQ ¼ SQX SXQ ¼ X r2 0 QX

Therefore, in the absence of pleiotropy, Cmarked in (A6) and, thus, rG;marked will be null no matter what the value of the genetic correlation.

Pleiotropy (Case 4 in Figure 2) The results in Figure 3 were obtained using expressions (3) and (4) with the parameter values described in the main body of the paper.

Communications 29