Copyright © 2004 by the Genetics Society of America DOI: 10.1534/genetics.103.025734

Quantitative Genetic Models for Describing Simultaneous and Recursive Relationships Between Phenotypes

Daniel Gianola*,†,1 and Daniel Sorensen‡

*Departments of Animal Sciences, Dairy Science, and Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin 53706, †Department of Animal and Aquacultural Sciences, Agricultural University of Norway, N-1432 Ås, Norway and ‡Department of Animal Breeding and Genetics, Danish Institute of Agricultural Sciences, 8830 Tjele, Denmark

Manuscript received December 11, 2003
Accepted for publication March 5, 2004

ABSTRACT

Multivariate models are of great importance in theoretical and applied quantitative genetics. We extend quantitative genetic theory to accommodate situations in which there is linear feedback or recursiveness between the phenotypes involved in a multivariate system, assuming an infinitesimal, additive model of inheritance. It is shown that structural parameters defining a simultaneous or recursive system have a bearing on the interpretation of quantitative genetic parameter estimates (e.g., heritability, offspring-parent regression, genetic correlation) when such features are ignored. Matrix representations are given for treating a plethora of feedback-recursive situations. The likelihood function is derived, assuming multivariate normality, and results from econometric theory for parameter identification are adapted to a quantitative genetic setting. A Bayesian treatment with a Markov chain Monte Carlo implementation is suggested and developed for inference. When the system is fully recursive, all conditional posterior distributions are in closed form, so Gibbs sampling is straightforward. If there is feedback, a Metropolis step may be embedded for sampling the structural parameters, since their conditional distributions are unknown. Extensions of the model to discrete random variables and to nonlinear relationships between phenotypes are discussed.

MULTIVARIATE models are of great importance in applied, evolutionary, and theoretical quantitative genetics. For example, in animal and plant breeding, the value of a candidate for selection as a prospective parent of the next generation often is a function of several traits, e.g., protein yield, milk somatic cell count, fertility, and life span in dairy cattle, or yield and resistance to disease in wheat. In evolutionary genetics, the effects of natural selection on mean fitness depend on the values of elements of the genetic variance-covariance matrix between quantitative characters (e.g., Cheverud 1984). Walsh (2003) and B. Walsh and M. Lynch (unpublished results) give a discussion of the dynamics of quantitative traits under multivariate selection.

A schematic of the standard multivariate model used in quantitative genetics is displayed in Figure 1, where a two-trait system is represented; for simplicity, all other explanatory variables are omitted. The diagram depicts a 2 × 1 vector of phenotypic values $(Y_1, Y_2)$ expressed as a function (typically linear) of genetic effects $(U_1, U_2)$, usually taken to be of an additive type, and of environmental or residual effects $(E_1, E_2)$. The genetic and environmental effects are assumed to be independently distributed random vectors, following the bivariate normal distributions $N(\mathbf{0}, \mathbf{G}_0)$ and $N(\mathbf{0}, \mathbf{R}_0)$, respectively. Here,

$$\mathbf{G}_0 = \begin{bmatrix} \sigma^2_{u_1} & \sigma_{u_{12}} \\ \sigma_{u_{12}} & \sigma^2_{u_2} \end{bmatrix} \quad (1)$$

and

$$\mathbf{R}_0 = \begin{bmatrix} \sigma^2_{e_1} & \sigma_{e_{12}} \\ \sigma_{e_{12}} & \sigma^2_{e_2} \end{bmatrix} \quad (2)$$

are genetic and residual variance-covariance matrices, respectively. For example, $\sigma^2_{u_1}$ is the variance between additive genetic effects affecting trait 1, and $\sigma_{e_{12}}$ is the residual covariance between traits 1 and 2.

The standard model depicted in Figure 1 does not allow for feedback or recursive relationships between phenotypes, which may be present in many biological systems. A classical example of feedback (that is, when changes of a quantity indirectly influence the quantity itself) is given by Haldane and Priestley (1905) and retaken by Turner and Stevens (1959) and by Wright (1960). These authors modeled feedback relationships between the concentration of CO2 in the air (A) and in the alveoli of the lungs (C) and the depth of respiration (D).

This article is dedicated to Arthur B. Chapman, teacher and mentor of numerous animal breeding students and disciple and friend of Sewall Wright.

1Corresponding author: Department of Animal Sciences, 1675 Observatory Dr., University of Wisconsin, Madison, WI 53706. E-mail: [email protected]

Genetics 167: 1407–1424 (July 2004)

Figure 1.—Standard bivariate model used in quantitative genetics: Y1 and Y2 are the phenotypic values; U1 and U2 are additive genetic effects acting on the traits; E1 and E2 are residual effects. A single-headed arrow (e.g., A → B) indicates that variable A affects variable B.
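As a numerical companion to Figure 1 (an added illustration, not part of the original article; all parameter values are hypothetical), the standard bivariate model can be simulated directly: phenotypes are the sum of independent genetic and residual vectors, so the phenotypic covariance matrix is simply $\mathbf{G}_0 + \mathbf{R}_0$.

```python
import numpy as np

# Hypothetical illustration of the standard bivariate model of Figure 1:
# y_i = u_i + e_i, with u_i ~ N(0, G0) and e_i ~ N(0, R0) independent,
# so that Cov(y_i) = G0 + R0.
rng = np.random.default_rng(1)

G0 = np.array([[1.0, 0.3],
               [0.3, 0.5]])        # genetic (co)variance matrix, Equation 1
R0 = np.array([[2.0, 0.2],
               [0.2, 1.0]])        # residual (co)variance matrix, Equation 2

n = 100_000
u = rng.multivariate_normal(np.zeros(2), G0, size=n)  # additive genetic effects
e = rng.multivariate_normal(np.zeros(2), R0, size=n)  # residual effects
y = u + e                                             # phenotypic values

P_hat = np.cov(y, rowvar=False)   # empirical phenotypic covariance, close to G0 + R0
print(np.round(P_hat, 2))
```

Because the genetic and residual vectors are independent, no structural parameter enters the phenotypic covariance here; this is the baseline against which the feedback and recursive models that follow are contrasted.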

As shown in Figure 2, Turner and Stevens (1959) posit that A affects C; in turn, C and D have a feedback relationship. Wright (1960) introduces residuals V and W for the C and D variables, respectively, and makes a further extension of the model. The extension consists of including a variable X for the actual concentration of CO2 in the lungs; in Figure 2, U represents "random" errors of technique; this is a "measurement error" model (Warren et al. 1974; Jöreskog and Sörbom 2001). In the Turner-Stevens model, the effect of C on D is through a coefficient $\lambda_{DC}$, whereas $\lambda_{CD}$ gives the rate of change of C with respect to D. Suppose that these two coefficients are not null, so that feedback takes place. Ignoring the actual biology of the problem, a model such as that of Turner and Stevens (1959) or of Wright (1960) implies the following: if A is increased and the relationship between C and A is such that C increases as well, then D will increase provided $\lambda_{DC}$ is positive. Further, if $\lambda_{CD}$ is positive, then C will increase further. If all the loops go in the same direction, there is positive feedback, which might lead to some equilibrium or to an eventual breakdown of the system (Turner and Stevens 1959).

Figure 2.—Haldane and Priestley (1905) respiration relations. Models for describing feedback relationships between concentrations of CO2 in the respired air (A), in the alveoli of the lungs (C), and depth of respiration (D); V, W, and U are residuals. Wright (1960) introduces the variable X, the actual concentration of CO2 in the alveoli; i.e., C is an imperfect measure.
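The behavior of such a loop can be sketched numerically (an assumed illustration, not from the article; rates and inputs are hypothetical): iterating two linear feedback equations converges, when the product of the rates is below 1 in absolute value, to a joint solution carrying the characteristic $1/(1 - \lambda_{12}\lambda_{21})$ factor that reappears throughout the reduced models of this article.

```python
# Assumed illustration of a two-variable linear feedback loop (in the spirit
# of the Turner-Stevens model, with hypothetical rates and inputs):
#   y1 <- lam12*y2 + m1
#   y2 <- lam21*y1 + m2
lam12, lam21 = -0.4, 0.6   # hypothetical rates of change
m1, m2 = 10.0, 3.0         # hypothetical external inputs

y1 = y2 = 0.0
for _ in range(200):       # fixed-point iteration around the loop
    y1 = lam12 * y2 + m1
    y2 = lam21 * y1 + m2

# Solving the two equations jointly gives the equilibrium in closed form:
y1_eq = (m1 + lam12 * m2) / (1.0 - lam12 * lam21)
y2_eq = (m2 + lam21 * m1) / (1.0 - lam12 * lam21)
print(round(y1, 6), round(y1_eq, 6))
```

Since $|\lambda_{12}\lambda_{21}| = 0.24 < 1$, the iteration is a contraction and settles at the closed-form equilibrium; a loop gain exceeding 1 in absolute value would make the iterates diverge, the "breakdown of the system" alluded to above.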

A second example of reciprocal interaction is the classical supply-demand problem of econometrics (Wright 1925; Johnston 1972; Judge et al. 1985). Also, the existence of feedback inhibition is well known in genetic regulation. For instance, the product of a metabolic pathway may bind to a gene product (enzyme) catalyzing a previous step, to prevent the channeling of additional molecules through the pathway (Lewin 1985). A discussion of the implications of interactive enzyme systems in genetics is in Kacser and Burns (1981). They write:

    In vivo enzymes do not act in isolation, but are kinetically linked to other enzymes via their substrates and products. These interactions modify the effect of enzyme variation on the phenotype, depending on the nature and quantity of the other enzymes present. An output of such a system, say a flux, is therefore a systemic property, and its response to variation at one locus must be measured in the whole system.

It has long been recognized in economics (e.g., Haavelmo 1943) that the existence of lagged or of instantaneous feedback (often referred to as "simultaneity") and of recursiveness between variables has implications for the understanding of multivariate systems, and that special statistical techniques are required for inference. Curiously, Sewall Wright's work on feedback mechanisms has received scant attention in population/quantitative genetics, in spite of his influence on the aforementioned fields and the pervasiveness of such mechanisms in regulation, as noted. An explanation may reside in the fact that, even though path analysis was "extremely powerful in the hands of Wright" (Kempthorne 1969), the lack of matrix representations in his writings hampered a general understanding of the method. This is especially true of Wright's treatment of reciprocal interaction with lags (Wright 1960), which is difficult to follow. Also, Goldberger (1972, p. 988) noted: "when there are more estimating equations than unknown parameters, path analysis gives no systematic guide to efficient estimation," a situation known as overidentification. At any rate, social scientists eventually embedded path analysis into the general framework of simultaneous systems and gave it a formal statistical structure (Goldberger 1972; Goldberger and Duncan 1973; Duncan 1975; Jöreskog and Sörbom 2001).

Our concern here is with the consequences of the existence of simultaneous ("feedback") and recursive relationships between phenotypic values on quantitative genetic parameters, as well as with statistical methods for appropriate inference. The outline of the article is as follows. GENETIC CONSEQUENCES OF SIMULTANEITY IN A TWO-TRAIT SYSTEM and GENETIC CONSEQUENCES OF RECURSIVENESS illustrate effects of simultaneity or recursiveness on simple two-trait systems. In particular, formulas are presented for heritability, for the offspring-parent regression, and for the genetic and environmental correlations between traits. Next, MATRIX REPRESENTATIONS shows that if four phenotypes are involved in a multivariate system, there can be as many as 128 mutually exclusive and exhaustive models for describing the relationships between phenotypic values. Formulas pertinent to multivariate selection (e.g., best prediction of genetic values) are given as well. Then the LIKELIHOOD FUNCTION and IDENTIFICATION OF PARAMETERS sections are presented. BAYESIAN MODEL addresses statistical inference in a structural equations model from a Bayesian perspective. It is shown in FULLY CONDITIONAL POSTERIOR DISTRIBUTIONS that, under normality assumptions, most conditional posterior distributions arising in the multivariate system are in recognizable form. The implication is that software for standard multiple-trait analysis of quantitative traits via Gibbs sampling (a Markov chain Monte Carlo method) can be adapted to handle simultaneity and recursiveness in a fairly direct manner. The article concludes with suggestions on how the approach can be extended to a wider class of models.

GENETIC CONSEQUENCES OF SIMULTANEITY IN A TWO-TRAIT SYSTEM

Let $y_{i1}$ and $y_{i2}$ be measurements on traits 1 and 2 observed in individual i. For example, in an animal breeding setting, $y_{i1}$ may represent the milk yield of dairy cow i and $y_{i2}$ may be a proxy for the level of some disease (e.g., milk somatic cell count as an indicator of mastitis, a bacterial-related inflammation of the mammary gland). Suppose that biological knowledge admits that production affects disease and, in turn, that disease affects production. As noted earlier, we refer to this as a simultaneous or instantaneous feedback system, following econometric terminology (Zellner 1979; Judge et al. 1985). Assume that the relationship between production and disease can be represented by the two-equation linear system

$$y_{i1} = \lambda_{12} y_{i2} + \mathbf{x}'_{i1}\boldsymbol{\beta}_1 + u_{i1} + e_{i1} \quad (3)$$

and

$$y_{i2} = \lambda_{21} y_{i1} + \mathbf{x}'_{i2}\boldsymbol{\beta}_2 + u_{i2} + e_{i2}. \quad (4)$$

Here, $\lambda_{12}$ is the rate of change of level of production with respect to a disease index and $\lambda_{21}$ is the gradient of disease with respect to production. A priori, one might expect $\lambda_{12}$ to be negative and $\lambda_{21}$ to be positive, since high output may be associated with "stress" in the dairy cow. It is assumed that $\lambda_{12}$ and $\lambda_{21}$ are homogeneous across individuals, but this can be relaxed. The vectors $\boldsymbol{\beta}_1$ and $\boldsymbol{\beta}_2$, often called fixed effects in the statistical literature (Searle 1971), are some location parameters such as age of the cow, parity, or breed affecting production (net of disease) and disease (net of production), respectively. Further, $\mathbf{x}'_{i1}$ and $\mathbf{x}'_{i2}$ are known incidence row vectors; whenever the same factors affect the two traits, the covariates are such that $\mathbf{x}'_{i1} = \mathbf{x}'_{i2} = \mathbf{x}'_i$, say. Finally, $u_{i1}$ and $u_{i2}$ are additive genetic effects (Fisher 1918; Bulmer 1980) intervening in the system and $e_{i1}$ and $e_{i2}$ are model residuals.

It is important to note that $u_{i1}$ ($u_{i2}$) above can be construed as an additive genetic effect affecting only production (disease) if and only if $\lambda_{12}$ ($\lambda_{21}$) $= 0$; it is shown subsequently that $u_{i1}$ and $u_{i2}$ may affect each of the two traits. On the other hand, it is legitimate to view $u_{i1}$, say, as the additive genetic component of the linear combination $y_{i1} - \lambda_{12}y_{i2}$. This is suggested by rewriting (3) and (4) as

$$y_{i1} - \lambda_{12}y_{i2} = \mathbf{x}'_{i1}\boldsymbol{\beta}_1 + u_{i1} + e_{i1} \quad (5)$$

$$y_{i2} - \lambda_{21}y_{i1} = \mathbf{x}'_{i2}\boldsymbol{\beta}_2 + u_{i2} + e_{i2}. \quad (6)$$

More generally, $u_{i1}$ and $u_{i2}$ are genetic effects "controlling" system (5)–(6).

When a specification such as system (5)–(6) holds, generalized least-squares estimates (or maximum-likelihood estimates under standard normality assumptions) of the structural parameters $\lambda_{12}$ (or $\lambda_{21}$) are biased and inconsistent if obtained from a subset of equations (Johnston 1972). For instance, if a univariate analysis of trait 1 is conducted including $y_{i2}$ as a covariate (but ignoring the submodel for trait 2), $\lambda_{12}$ is estimated with an upward bias (provided $\lambda_{12} < 1$). The effects on model parameters are more cryptic as the dimensionality of the system increases, for example, when a system of five mutually interacting response variables is analyzed with a three-trait model.

The implications of system (5)–(6) on the interpretation of some parameters of interest in quantitative genetic analysis are considered next.

Heritability: Use (4) in (3) and solve for $y_{i1}$, to obtain the "reduced" model (a term from econometrics)

$$y_{i1} = \mu^*_1 + u^*_{i1} + e^*_{i1}, \quad (7)$$

where

$$\mu^*_1 = \frac{\mathbf{x}'_{i1}\boldsymbol{\beta}_1 + \lambda_{12}\mathbf{x}'_{i2}\boldsymbol{\beta}_2}{1 - \lambda_{12}\lambda_{21}}, \quad u^*_{i1} = \frac{u_{i1} + \lambda_{12}u_{i2}}{1 - \lambda_{12}\lambda_{21}}, \quad e^*_{i1} = \frac{e_{i1} + \lambda_{12}e_{i2}}{1 - \lambda_{12}\lambda_{21}},$$

with the random effects $u^*_{i1} \sim N(0, \sigma^2_{u^*_1})$ and $e^*_{i1} \sim N(0, \sigma^2_{e^*_1})$ being independently distributed. Note that both $\boldsymbol{\beta}_1$ and $\boldsymbol{\beta}_2$ intervene in $\mu^*_1$, and that $u^*_{i1}$ is a linear combination of the system genetic effects $u_{i1}$ and $u_{i2}$. An implication of this is that estimates of location parameters and predictions of random effects from standard univariate or multivariate analyses must be interpreted differently if simultaneity holds. Suppose data are missing at random (i.e., that selection is ignorable). In this case the fraction of the variance of trait 1 that can be attributed to additive genetic effects, or coefficient of heritability, is

$$h^2_1 = \frac{\sigma^2_{u^*_1}}{\sigma^2_{u^*_1} + \sigma^2_{e^*_1}} = \frac{\sigma^2_{u_1} + 2\lambda_{12}\sigma_{u_{12}} + \lambda^2_{12}\sigma^2_{u_2}}{\sigma^2_{u_1} + \sigma^2_{e_1} + 2\lambda_{12}(\sigma_{u_{12}} + \sigma_{e_{12}}) + \lambda^2_{12}(\sigma^2_{u_2} + \sigma^2_{e_2})} \quad (8)$$

(the common factor $(1 - \lambda_{12}\lambda_{21})^{-2}$ of numerator and denominator cancels). Observe that the apportionment of variance into genetic and environmental components depends nontrivially on the structural parameter $\lambda_{12}$, but not on $\lambda_{21}$ (the opposite being true for trait 2). If $\sigma_{u_{12}} = \sigma_{e_{12}} = 0$, the linear combinations or "composite" traits $y_{i1} - \lambda_{12}y_{i2}$ and $y_{i2} - \lambda_{21}y_{i1}$ are statistically independent (by virtue of normality); however, $h^2_1$ would still depend on $\lambda_{12}$, as

$$h^2_1 = \frac{\sigma^2_{u_1} + \lambda^2_{12}\sigma^2_{u_2}}{\sigma^2_{u_1} + \sigma^2_{e_1} + \lambda^2_{12}(\sigma^2_{u_2} + \sigma^2_{e_2})}.$$

The corresponding expression for the heritability of trait 2 is

$$h^2_2 = \frac{\sigma^2_{u_2} + \lambda^2_{21}\sigma^2_{u_1}}{\sigma^2_{u_2} + \sigma^2_{e_2} + \lambda^2_{21}(\sigma^2_{u_1} + \sigma^2_{e_1})}.$$

Regression of offspring on parent: Let $y_{i'1}$ be a record for trait 1 measured on the offspring of an individual with phenotype $y_{i1}$. The offspring-parent covariance is

$$\mathrm{Cov}(y_{i'1}, y_{i1}) = \mathrm{Cov}\left\{\frac{(\mathbf{x}'_{i'1}\boldsymbol{\beta}_1 + \lambda_{12}\mathbf{x}'_{i'2}\boldsymbol{\beta}_2) + (u_{i'1} + \lambda_{12}u_{i'2}) + (e_{i'1} + \lambda_{12}e_{i'2})}{1 - \lambda_{12}\lambda_{21}}, \frac{(\mathbf{x}'_{i1}\boldsymbol{\beta}_1 + \lambda_{12}\mathbf{x}'_{i2}\boldsymbol{\beta}_2) + (u_{i1} + \lambda_{12}u_{i2}) + (e_{i1} + \lambda_{12}e_{i2})}{1 - \lambda_{12}\lambda_{21}}\right\}.$$

Assuming zero covariances between environmental effects affecting records taken on different individuals, under additive inheritance one has that

$$\mathrm{Cov}(y_{i'1}, y_{i1}) = \frac{\tfrac{1}{2}(\sigma^2_{u_1} + \lambda^2_{12}\sigma^2_{u_2}) + \lambda_{12}\sigma_{u_{12}}}{(1 - \lambda_{12}\lambda_{21})^2},$$

which is half of the numerator of the expression leading to (8). In the absence of feedback ($\lambda_{12} = 0$), the offspring-parent covariance is always positive and equal to $\sigma^2_{u_1}/2$. Under simultaneity, however, this covariance could be negative, provided that (taking $\lambda_{12} < 0$, as expected here)

$$\sigma_{u_{12}} > -\frac{\sigma^2_{u_1} + \lambda^2_{12}\sigma^2_{u_2}}{2\lambda_{12}}.$$

The implication is that a variance component analysis based on the reduced model would, necessarily (because of the parameterization), return a positive fitted value of the offspring-parent covariance. On the other hand, this may not be so under a simultaneous equations model. If the observed covariance is negative, this should be construed as evidence against a specification failing to accommodate simultaneity, although there may be other reasons (e.g., maternal effects) for a negative offspring-parent covariance.

The regression of offspring on parent is

$$b_{OP} = \frac{\tfrac{1}{2}(\sigma^2_{u_1} + \lambda^2_{12}\sigma^2_{u_2}) + \lambda_{12}\sigma_{u_{12}}}{\sigma^2_{u_1} + \sigma^2_{e_1} + 2\lambda_{12}(\sigma_{u_{12}} + \sigma_{e_{12}}) + \lambda^2_{12}(\sigma^2_{u_2} + \sigma^2_{e_2})}, \quad (9)$$

yielding the usual $\sigma^2_{u_1}/[2(\sigma^2_{u_1} + \sigma^2_{e_1})]$ in the absence of feedback ($\lambda_{12} = \lambda_{21} = 0$).
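As a numerical check on Equations 8 and 9 (an added illustration with hypothetical parameter values, not part of the original article), the following evaluates how a nonnull $\lambda_{12}$ moves heritability and the offspring-parent regression of trait 1 away from their no-feedback values.

```python
# Hypothetical variance components and structural coefficient for the
# two-trait simultaneous system (Equations 3 and 4):
s2u1, s2u2, su12 = 1.0, 1.0, 0.45   # genetic variances and covariance
s2e1, s2e2, se12 = 1.0, 1.0, 0.0    # residual variances and covariance
lam12 = -0.8                        # rate of change of trait 1 w.r.t. trait 2

num = s2u1 + 2*lam12*su12 + lam12**2 * s2u2
den = s2u1 + s2e1 + 2*lam12*(su12 + se12) + lam12**2 * (s2u2 + s2e2)
h2_1 = num / den                    # heritability of trait 1, Equation 8

b_OP = (0.5*(s2u1 + lam12**2 * s2u2) + lam12*su12) / den   # Equation 9

h2_nofb = s2u1 / (s2u1 + s2e1)      # value when lam12 = lam21 = 0
print(round(h2_1, 4), round(b_OP, 4), h2_nofb)
```

With these values, the fitted heritability under feedback differs markedly from the no-feedback figure of 0.5, illustrating the distortion discussed in the text.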

Regression of one variable on another: If $\mathbf{x}'_{i1} = \mathbf{x}'_{i2} = \mathbf{x}'_i$, the reduced models as in (7) become

$$y_{i1} = \mathbf{x}'_i\left(\frac{\boldsymbol{\beta}_1 + \lambda_{12}\boldsymbol{\beta}_2}{1 - \lambda_{12}\lambda_{21}}\right) + \left(\frac{u_{i1} + \lambda_{12}u_{i2}}{1 - \lambda_{12}\lambda_{21}}\right) + \left(\frac{e_{i1} + \lambda_{12}e_{i2}}{1 - \lambda_{12}\lambda_{21}}\right) = \mathbf{x}'_i\boldsymbol{\beta}^*_1 + u^*_{i1} + e^*_{i1} \quad (10)$$

and

$$y_{i2} = \mathbf{x}'_i\left(\frac{\boldsymbol{\beta}_2 + \lambda_{21}\boldsymbol{\beta}_1}{1 - \lambda_{12}\lambda_{21}}\right) + \left(\frac{u_{i2} + \lambda_{21}u_{i1}}{1 - \lambda_{12}\lambda_{21}}\right) + \left(\frac{e_{i2} + \lambda_{21}e_{i1}}{1 - \lambda_{12}\lambda_{21}}\right) = \mathbf{x}'_i\boldsymbol{\beta}^*_2 + u^*_{i2} + e^*_{i2}. \quad (11)$$

Under normality, the regression function of trait 2 on trait 1 is

$$E(y_{i2}|y_{i1}) = E(y_{i2}) + \frac{\mathrm{Cov}(y_{i2}, y_{i1})}{\mathrm{Var}(y_{i1})}[y_{i1} - E(y_{i1})] = \mathbf{x}'_i\boldsymbol{\beta}^*_2 + \frac{(1 + \lambda_{12}\lambda_{21})(\sigma_{u_{12}} + \sigma_{e_{12}}) + \lambda_{21}(\sigma^2_{u_1} + \sigma^2_{e_1}) + \lambda_{12}(\sigma^2_{u_2} + \sigma^2_{e_2})}{\sigma^2_{u_1} + \sigma^2_{e_1} + 2\lambda_{12}(\sigma_{u_{12}} + \sigma_{e_{12}}) + \lambda^2_{12}(\sigma^2_{u_2} + \sigma^2_{e_2})}(y_{i1} - \mathbf{x}'_i\boldsymbol{\beta}^*_1).$$

In the absence of simultaneity, this reduces to the usual formula,

$$E(y_{i2}|y_{i1}) = \mathbf{x}'_i\boldsymbol{\beta}_2 + \frac{\sigma_{u_{12}} + \sigma_{e_{12}}}{\sigma^2_{u_1} + \sigma^2_{e_1}}(y_{i1} - \mathbf{x}'_i\boldsymbol{\beta}_1).$$

Genetic and environmental correlations: The reduced models (10) and (11) lead to

$$\mathrm{Cov}(y_{i1}, y_{i2}) = \mathrm{Cov}(u^*_{i1}, u^*_{i2}) + \mathrm{Cov}(e^*_{i1}, e^*_{i2}),$$

where

$$\mathrm{Cov}(u^*_{i1}, u^*_{i2}) = \frac{(1 + \lambda_{12}\lambda_{21})\sigma_{u_{12}} + \lambda_{21}\sigma^2_{u_1} + \lambda_{12}\sigma^2_{u_2}}{(1 - \lambda_{12}\lambda_{21})^2}.$$

Similarly,

$$\mathrm{Cov}(e^*_{i1}, e^*_{i2}) = \frac{(1 + \lambda_{12}\lambda_{21})\sigma_{e_{12}} + \lambda_{21}\sigma^2_{e_1} + \lambda_{12}\sigma^2_{e_2}}{(1 - \lambda_{12}\lambda_{21})^2}.$$

The genetic and environmental covariances depend on the $\lambda$ coefficients and on the appropriate variances and covariances of each of the two composite traits. The genetic correlation is

$$\mathrm{Corr}(u^*_{i1}, u^*_{i2}) = \frac{(1 + \lambda_{12}\lambda_{21})\sigma_{u_{12}} + \lambda_{21}\sigma^2_{u_1} + \lambda_{12}\sigma^2_{u_2}}{\sqrt{(\sigma^2_{u_1} + 2\lambda_{12}\sigma_{u_{12}} + \lambda^2_{12}\sigma^2_{u_2})(\sigma^2_{u_2} + 2\lambda_{21}\sigma_{u_{12}} + \lambda^2_{21}\sigma^2_{u_1})}}, \quad (12)$$

and the expression for the residual correlation is similar. If $\sigma_{u_{12}} = 0$ (i.e., when the composite traits are genetically uncorrelated), (12) becomes

$$\mathrm{Corr}(u^*_{i1}, u^*_{i2}) = \frac{\lambda_{21}\sigma^2_{u_1} + \lambda_{12}\sigma^2_{u_2}}{\sqrt{(\sigma^2_{u_1} + \lambda^2_{12}\sigma^2_{u_2})(\sigma^2_{u_2} + \lambda^2_{21}\sigma^2_{u_1})}}.$$

GENETIC CONSEQUENCES OF RECURSIVENESS IN A TWO-TRAIT SYSTEM

A recursive specification postulates, for instance, that $y_{i1}$ affects $y_{i2}$ but that the latter variable has no effect on $y_{i1}$. An example is the maternal-effects model proposed by Falconer (1965) and examined by Koerhuis and Thompson (1997). This model postulates that the phenotype of an offspring is affected by the phenotype of its dam. In pigs, for instance, it is known that females born in larger litters tend to produce smaller litters (leading to negative values of the $\lambda$ coefficients); conversely, females born in smaller litters are expected to produce larger litters, etc. A recursive specification can be obtained from the "full model" by setting $\lambda_{12} = 0$ in (3) or (5), so that the system is now

$$y_{i1} = \mathbf{x}'_{i1}\boldsymbol{\beta}_1 + u_{i1} + e_{i1} \quad (13)$$

and

$$y_{i2} = \lambda_{21}y_{i1} + \mathbf{x}'_{i2}\boldsymbol{\beta}_2 + u_{i2} + e_{i2}. \quad (14)$$

Assuming that the incidence vectors are such that $\mathbf{x}'_{i1} = \mathbf{x}'_{i2} = \mathbf{x}'_i$, use of (13) in (14) gives a reduced model for $y_{i2}$,

$$y_{i2} = \mathbf{x}'_i(\boldsymbol{\beta}_2 + \lambda_{21}\boldsymbol{\beta}_1) + (u_{i2} + \lambda_{21}u_{i1}) + (e_{i2} + \lambda_{21}e_{i1}) = \mathbf{x}'_i\boldsymbol{\beta}^*_2 + u^*_{i2} + e^*_{i2},$$

where

$$u^*_{i2} \sim N(0, \sigma^2_{u_2} + 2\lambda_{21}\sigma_{u_{12}} + \lambda^2_{21}\sigma^2_{u_1})$$

and

$$e^*_{i2} \sim N(0, \sigma^2_{e_2} + 2\lambda_{21}\sigma_{e_{12}} + \lambda^2_{21}\sigma^2_{e_1}),$$

so that

$$\mathrm{Var}(y_{i2}) = \sigma^2_{u_2} + \sigma^2_{e_2} + 2\lambda_{21}(\sigma_{u_{12}} + \sigma_{e_{12}}) + \lambda^2_{21}(\sigma^2_{u_1} + \sigma^2_{e_1}).$$

Heritability: The heritability of trait 1 is the usual $h^2_1 = \sigma^2_{u_1}/(\sigma^2_{u_1} + \sigma^2_{e_1})$. Here, $\sigma^2_{u_1}$ and $\sigma^2_{e_1}$ are the additive genetic and residual variances of trait 1, respectively, contrary to the simultaneity situation, where these dispersion parameters pertain to the variation of genetic and residual effects affecting the distribution of $y_{i1} - \lambda_{12}y_{i2}$. The coefficient of heritability of trait 2 has the form of (8), but with $\lambda_{21}$ instead of $\lambda_{12}$:

$$h^2_2 = \frac{\sigma^2_{u_2} + 2\lambda_{21}\sigma_{u_{12}} + \lambda^2_{21}\sigma^2_{u_1}}{\sigma^2_{u_2} + \sigma^2_{e_2} + 2\lambda_{21}(\sigma_{u_{12}} + \sigma_{e_{12}}) + \lambda^2_{21}(\sigma^2_{u_1} + \sigma^2_{e_1})}. \quad (15)$$

Regression of offspring on parent: The regression of offspring on parent depends on the trait or pair of traits involved. Using the same notation as in the section for simultaneity, the offspring-parent covariance for trait 1 is

$$\mathrm{Cov}(y_{i'1}, y_{i1}) = \mathrm{Cov}(\mathbf{x}'_{i'1}\boldsymbol{\beta}_1 + u_{i'1} + e_{i'1}, \mathbf{x}'_{i1}\boldsymbol{\beta}_1 + u_{i1} + e_{i1}) = \tfrac{1}{2}\sigma^2_{u_1}.$$

Hence the regression of offspring on parent for trait 1 is simply $h^2_1/2$, the standard result from assuming additive inheritance.

Let $y_{i'2}$ be a record measured for trait 2 on the offspring of an individual with phenotype $y_{i2}$. The offspring-parent regression, assuming that between-generation environmental effects are uncorrelated, has a similar form to (9). Consider now the covariance between $y_{i'2}$, a record for trait 2 measured on the offspring of an individual with phenotype $y_{i1}$. The offspring-parent covariance between such records is now

$$\mathrm{Cov}(u_{i'2} + \lambda_{21}u_{i'1}, u_{i1}) = \frac{\sigma_{u_{12}} + \lambda_{21}\sigma^2_{u_1}}{2},$$

and the regression coefficient is

$$\lambda_{O_2P_1} = \frac{1}{2}\left(\lambda_{21}h^2_1 + \frac{\sigma_{u_{12}}}{\sigma^2_{u_1} + \sigma^2_{e_1}}\right). \quad (16)$$

Conversely, with $y_{i'1}$ being now a record for trait 1 measured on the offspring of an individual with phenotypic value $y_{i2}$, the offspring-parent regression is

$$\lambda_{O_1P_2} = \frac{\sigma_{u_{12}} + \lambda_{21}\sigma^2_{u_1}}{2[\sigma^2_{u_2} + \sigma^2_{e_2} + 2\lambda_{21}(\sigma_{u_{12}} + \sigma_{e_{12}}) + \lambda^2_{21}(\sigma^2_{u_1} + \sigma^2_{e_1})]}. \quad (17)$$

Regression of one variable on another: Recall that $y_{i1} = \mathbf{x}'_{i1}\boldsymbol{\beta}_1 + u_{i1} + e_{i1}$ and that $y_{i2} = \mathbf{x}'_i(\boldsymbol{\beta}_2 + \lambda_{21}\boldsymbol{\beta}_1) + (u_{i2} + \lambda_{21}u_{i1}) + (e_{i2} + \lambda_{21}e_{i1}) = \mathbf{x}'_i\boldsymbol{\beta}^*_2 + u^*_{i2} + e^*_{i2}$. Then

$$E(y_{i1}|y_{i2}) = \mathbf{x}'_i\boldsymbol{\beta}_1 + \frac{\sigma_{u_{12}} + \sigma_{e_{12}} + \lambda_{21}(\sigma^2_{u_1} + \sigma^2_{e_1})}{\sigma^2_{u_2} + \sigma^2_{e_2} + 2\lambda_{21}(\sigma_{u_{12}} + \sigma_{e_{12}}) + \lambda^2_{21}(\sigma^2_{u_1} + \sigma^2_{e_1})}[y_{i2} - \mathbf{x}'_i(\boldsymbol{\beta}_2 + \lambda_{21}\boldsymbol{\beta}_1)]. \quad (18)$$

Conversely, the regression function of $y_{i2}$ on $y_{i1}$ is

$$E(y_{i2}|y_{i1}) = \mathbf{x}'_i(\boldsymbol{\beta}_2 + \lambda_{21}\boldsymbol{\beta}_1) + \left(\lambda_{21} + \frac{\sigma_{u_{12}} + \sigma_{e_{12}}}{\sigma^2_{u_1} + \sigma^2_{e_1}}\right)(y_{i1} - \mathbf{x}'_i\boldsymbol{\beta}_1). \quad (19)$$

The two preceding expressions reduce to the usual formulas under bivariate normality by letting $\lambda_{21} = 0$.

Genetic and environmental correlations: The genetic covariance between the two traits is

$$\mathrm{Cov}(u_{i1}, u^*_{i2}) = \mathrm{Cov}(u_{i1}, u_{i2} + \lambda_{21}u_{i1}) = \sigma_{u_{12}} + \lambda_{21}\sigma^2_{u_1},$$

so that the genetic correlation is

$$\mathrm{Corr}(u_{i1}, u^*_{i2}) = \frac{\sigma_{u_{12}} + \lambda_{21}\sigma^2_{u_1}}{\sqrt{\sigma^2_{u_1}(\sigma^2_{u_2} + 2\lambda_{21}\sigma_{u_{12}} + \lambda^2_{21}\sigma^2_{u_1})}}. \quad (20)$$

If $\sigma_{u_{12}} = 0$,

$$\mathrm{Corr}(u_{i1}, u^*_{i2}) = \frac{\lambda_{21}}{\sqrt{\lambda^2_{21} + (\sigma^2_{u_2}/\sigma^2_{u_1})}},$$

in which case the sign of the genetic correlation depends on the sign of $\lambda_{21}$. Further, if the traits are scaled such that $\sigma^2_{u_1} = \sigma^2_{u_2} = 1$, the correlation is $\lambda_{21}/\sqrt{1 + \lambda^2_{21}}$. It is interesting to observe that when genotypes are expressed in units of standard deviation, the genetic correlation is driven entirely by $\lambda_{21}$, which is a gradient operating at the phenotypic level. Likewise, the environmental covariance and correlation are

$$\mathrm{Cov}(e_{i1}, e^*_{i2}) = \mathrm{Cov}(e_{i1}, e_{i2} + \lambda_{21}e_{i1}) = \sigma_{e_{12}} + \lambda_{21}\sigma^2_{e_1}$$

and

$$\mathrm{Corr}(e_{i1}, e^*_{i2}) = \frac{\sigma_{e_{12}} + \lambda_{21}\sigma^2_{e_1}}{\sqrt{\sigma^2_{e_1}(\sigma^2_{e_2} + 2\lambda_{21}\sigma_{e_{12}} + \lambda^2_{21}\sigma^2_{e_1})}}, \quad (21)$$

respectively.

MATRIX REPRESENTATIONS

Many possible models: A multivariate system may involve many response variables, as well as different levels of simultaneity and recursiveness. When the models comprise more than two traits, the issues and principles are as discussed above, but the algebra is awkward. For example, consider the simultaneous-equations model for three traits given in Figure 3. Here, the three response variables Y1, Y2, and Y3 have mutually reciprocal effects, so that there are six $\lambda$ coefficients or structural parameters in the model.

Several different models can be derived as special cases of the specification given in Figure 3. There are 64 models that can be viewed as "nested" within the diagram depicted. In general, for K response variables there are $K(K-1)$ structural coefficients ($\lambda$'s) in a fully simultaneous model. Since, in a given model, each coefficient can take the value $\lambda_{ij}$ or be constrained to be 0 (when there is no "effect" of variable j on variable i in the latter case), there can be as many as $2^{K(K-1)}$ possible models for explaining relationships between the phenotypic variables; in practice, however, many of the models can be discarded on mechanistic grounds. For example, if all $\lambda$'s are set equal to 0 in Figure 3, this yields the standard trivariate model used for quantitative genetic analysis of three traits. Some other models that can arise are illustrated in Figures 4–6. A "cyclically recursive" model is depicted in Figure 4. Here, the causal relationship modeled is Y1 → Y2 → Y3 → Y1 → .... For instance, consider a hypothetical situation where Y1, Y2, and Y3 are phenotypic values in sibships of size 3, with the subscript indicating birth order. It may be that the chain of influences is such that the older sib (with phenotype Y1) affects the second sib (Y2) and so on, with the loop closing via an influence of the youngest on the oldest sib.

In Figure 5, it is hypothesized that Y1 has an effect of the recursive type on both Y2 and Y3, but that there is simultaneity between Y2 and Y3; this is referred to as a recursive-simultaneous model. Here, Y1 might be the concentration of a hormone regulating the production of two metabolites that are involved in a feedback relationship. In Figure 6, there is simultaneity between Y2 and Y3, with these two variables affecting Y1; this is a simultaneous-recursive model: two biochemical products (Y2 and Y3) involved in the regulation of Y1 may interact reciprocally toward the establishment of some equilibrium.
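The recursive-system formulas lend themselves to the same kind of numerical check (an added illustration, not from the article; all parameter values are hypothetical): Equations 15–17 and 20 can be evaluated directly from the variance components and $\lambda_{21}$.

```python
import math

# Hypothetical parameters for the recursive system (Equations 13 and 14):
s2u1, s2u2, su12 = 1.0, 0.5, 0.2    # genetic variances and covariance
s2e1, s2e2, se12 = 1.5, 1.0, 0.1    # residual variances and covariance
lam21 = 0.4                         # gradient of trait 2 on trait 1

h2_1 = s2u1 / (s2u1 + s2e1)         # usual heritability of trait 1

num2 = s2u2 + 2*lam21*su12 + lam21**2 * s2u1
den2 = s2u2 + s2e2 + 2*lam21*(su12 + se12) + lam21**2 * (s2u1 + s2e1)
h2_2 = num2 / den2                  # heritability of trait 2, Equation 15

lam_O2P1 = 0.5*(lam21*h2_1 + su12/(s2u1 + s2e1))   # Equation 16
lam_O1P2 = (su12 + lam21*s2u1) / (2*den2)          # Equation 17

# genetic correlation between u_i1 and u*_i2, Equation 20
r_g = (su12 + lam21*s2u1) / math.sqrt(s2u1 * num2)
print(round(h2_2, 4), round(lam_O2P1, 4), round(lam_O1P2, 4), round(r_g, 4))
```

Setting lam21 to 0 recovers the standard bivariate results, which is a convenient sanity check on any implementation of these formulas.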

Figure 3.—Simultaneous-equations model for three variables: Y1, Y2, and Y3 are the phenotypic values; U1, U2, and U3 are additive genetic effects acting on the system; E1, E2, and E3 are residual effects. A single-headed arrow (e.g., A → B) indicates that variable A affects variable B. Double-headed arrows denote correlations between pairs of variables. $\lambda_{ij}$ indicates the rate of change of variable i with respect to variable j.

Clearly, there is a constellation of modeling alternatives.

Figure 4.—Fully recursive model for three variables: Y1, Y2, and Y3 are the phenotypic values; U1, U2, and U3 are additive genetic effects acting on the system; E1, E2, and E3 are residual effects. A single-headed arrow (e.g., A → B) indicates that variable A affects variable B. Double-headed arrows denote correlations between pairs of variables. $\lambda_{ij}$ indicates the rate of change of variable i with respect to variable j.

Figure 5.—Recursive simultaneous model for three variables: Y1, Y2, and Y3 are the phenotypic values; U1, U2, and U3 are additive genetic effects acting on the system; E1, E2, and E3 are residual effects. A single-headed arrow (e.g., A → B) indicates that variable A affects variable B. Double-headed arrows denote correlations between pairs of variables. $\lambda_{ij}$ indicates the rate of change of variable i with respect to variable j.

The preceding discussion illustrates that a general representation is needed for describing the full range of possibilities. For example, the two-variate simultaneous-equations system of Equations 3 and 4 can be put in matrix form as

$$\begin{bmatrix} 1 & -\lambda_{12} \\ -\lambda_{21} & 1 \end{bmatrix}\begin{bmatrix} y_{i1} \\ y_{i2} \end{bmatrix} = \begin{bmatrix} \mathbf{x}'_{i1} & \mathbf{0} \\ \mathbf{0} & \mathbf{x}'_{i2} \end{bmatrix}\begin{bmatrix} \boldsymbol{\beta}_1 \\ \boldsymbol{\beta}_2 \end{bmatrix} + \begin{bmatrix} u_{i1} \\ u_{i2} \end{bmatrix} + \begin{bmatrix} e_{i1} \\ e_{i2} \end{bmatrix}. \quad (22)$$

This representation embeds four models [$K = 2$, so $2^{K(K-1)} = 4$], including the simultaneous one. The other three models are the standard bivariate specification ($\lambda_{12} = \lambda_{21} = 0$) and two recursive models ($\lambda_{12} = 0$ when $y_1$ "affects" $y_2$, but without a reciprocal effect; $\lambda_{21} = 0$ when $y_2$ affects $y_1$).

Statistical structure: Let there be K traits observed on individual or "cluster" (e.g., a family) i ($i = 1, 2, \ldots, N$), and write the system as

$$\boldsymbol{\Lambda}\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{u}_i + \mathbf{e}_i, \quad (23)$$

where $\mathbf{y}_i$ is a $K \times 1$ vector of phenotypic measurements on the K traits of individual i; $\boldsymbol{\Lambda}$ is a $K \times K$ matrix containing at most $K(K-1)$ unknown $\lambda$ coefficients (all diagonal elements are equal to 1); and $\mathbf{X}_i$ is a $K \times \sum_{j=1}^K p_j$ known incidence matrix with the form

$$\mathbf{X}_i = \begin{bmatrix} \mathbf{x}'_{i1} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{x}'_{i2} & \cdots & \mathbf{0} & \mathbf{0} \\ \vdots & & \ddots & & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{x}'_{i(K-1)} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{x}'_{iK} \end{bmatrix},$$

where $\mathbf{x}'_{ij}$ is a row vector with $p_j$ elements. Without loss of generality, it is assumed that $\mathbf{X}_i$ has full-column rank. Further, the location vector $\boldsymbol{\beta}$ is such that

$$\boldsymbol{\beta} = \begin{bmatrix} \boldsymbol{\beta}_1 \\ \boldsymbol{\beta}_2 \\ \vdots \\ \boldsymbol{\beta}_K \end{bmatrix},$$

where $\boldsymbol{\beta}_j$ ($j = 1, 2, \ldots, K$) is $p_j \times 1$. The vector $\mathbf{u}_i$ contains additive genetic effects of individual i for the K traits and, similarly, $\mathbf{e}_i$ is a vector of residual effects, distributed independently of $\mathbf{u}_i$.

If $\boldsymbol{\Lambda}$ has full rank, the reduced form of the model is

$$\mathbf{y}_i = \boldsymbol{\Lambda}^{-1}\mathbf{X}_i\boldsymbol{\beta} + \boldsymbol{\Lambda}^{-1}\mathbf{u}_i + \boldsymbol{\Lambda}^{-1}\mathbf{e}_i = \boldsymbol{\mu}^*_i + \mathbf{u}^*_i + \mathbf{e}^*_i, \quad (24)$$

where $\boldsymbol{\mu}^*_i = \boldsymbol{\Lambda}^{-1}\mathbf{X}_i\boldsymbol{\beta}$, $\mathbf{u}^*_i = \boldsymbol{\Lambda}^{-1}\mathbf{u}_i$, and $\mathbf{e}^*_i = \boldsymbol{\Lambda}^{-1}\mathbf{e}_i$. For example, in a two-trait simultaneous model, the elements of $\boldsymbol{\mu}^*_i$, $\mathbf{u}^*_i$, and $\mathbf{e}^*_i$ take the form given in (7).

Assume now that

$$\mathbf{u}_i|\mathbf{G}_0 \sim N(\mathbf{0}, \mathbf{G}_0); \quad i = 1, 2, \ldots, N, \quad (25)$$

and

$$\mathbf{e}_i|\mathbf{R}_0 \sim N(\mathbf{0}, \mathbf{R}_0) \quad (26)$$

are mutually independent.
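The matrix formulation can be made concrete with a small NumPy sketch (added here as an illustration; $\boldsymbol{\Lambda}$, $\mathbf{G}_0$, and $\mathbf{R}_0$ are hypothetical). It builds a two-trait $\boldsymbol{\Lambda}$, forms the reduced model of Equation 24, and derives the covariance matrices of $\mathbf{u}^*_i$, $\mathbf{e}^*_i$, and $\mathbf{y}_i$.

```python
import numpy as np

# Hypothetical two-trait system: Lambda y_i = X_i beta + u_i + e_i (Equation 23)
lam12, lam21 = -0.3, 0.5
Lam = np.array([[1.0,   -lam12],
                [-lam21, 1.0 ]])            # diagonal elements fixed at 1

G0 = np.array([[1.0, 0.2], [0.2, 0.5]])     # genetic covariance matrix
R0 = np.array([[1.5, 0.1], [0.1, 1.0]])     # residual covariance matrix

Li = np.linalg.inv(Lam)                     # Lambda must have full rank
G_star = Li @ G0 @ Li.T                     # Cov(u*_i) = Lambda^-1 G0 Lambda'^-1
R_star = Li @ R0 @ Li.T                     # Cov(e*_i) = Lambda^-1 R0 Lambda'^-1
P_star = Li @ (G0 + R0) @ Li.T              # Cov(y_i) under the reduced form (24)
print(np.round(P_star, 4))
```

Because $\mathbf{u}^*_i$ and $\mathbf{e}^*_i$ remain independent, the phenotypic covariance is the sum of the two transformed components, and the full rank of $\boldsymbol{\Lambda}$ guarantees the inversion is well defined.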

Figure 6.—Simultaneous-recursive model for three variables: Y1, Y2, and Y3 are the phenotypic values; U1, U2, and U3 are additive genetic effects acting on the system; E1, E2, and E3 are residual effects. A single-headed arrow (e.g., A → B) indicates that variable A affects variable B. Double-headed arrows denote correlations between pairs of variables. $\lambda_{ij}$ indicates the rate of change of variable i with respect to variable j.

in Figures 3–6 follow multivariate normal distributions with covariance matrices G_0 and R_0, respectively, each having order 3 × 3. This implies that

    u_i* | Λ, G_0 ~ N(0, Λ^{-1} G_0 Λ'^{-1})   (27)

and

    e_i* | Λ, R_0 ~ N(0, Λ^{-1} R_0 Λ'^{-1})   (28)

are also independently distributed. Further, the marginal distribution of the phenotypic values for individual i is

    y_i | β, R_0, G_0 ~ N(Λ^{-1} X_i β, Λ^{-1}(R_0 + G_0)Λ'^{-1}).   (29)

Genetic parameters and functions thereof: The "multivariate heritability and coheritability" matrix can be defined as

    H = Λ^{-1} G_0 Λ'^{-1} [Λ^{-1}(R_0 + G_0)Λ'^{-1}]^{-1}
      = Λ^{-1} G_0 (R_0 + G_0)^{-1} Λ.   (30)

In the absence of simultaneity or recursiveness, H = G_0(R_0 + G_0)^{-1}, since Λ would be an identity matrix of order K in this case. Note that the trace of (30),

    tr(H) = tr[G_0(R_0 + G_0)^{-1}],

is free of the λ coefficients.

Now, using the measurements taken on individual i, the best predictor of u_i*, in the sense of minimizing the mean square error of prediction among all possible functions of the data (Henderson 1973), is given by the conditional expectation function

    E(u_i* | y_i) = E(u_i*) + Cov(u_i*, y_i') Var^{-1}(y_i)(y_i − Λ^{-1} X_i β)
                 = H(y_i − Λ^{-1} X_i β).   (31)

In multiple-trait selection, animal and plant breeders are often interested in improving a linear combination of genetic values; e.g., T_i = v'u_i*, where v is a known K × 1 vector of relative economic values (Smith 1936; Hazel 1943), and u_i* contains the "true" genetic values affecting the traits. Recall that the genetic values are u_i only in the absence of feedback or recursiveness. Suppose the N candidates are independently distributed, so that the density of the joint distribution of all genetic and phenotypic values is given by

    p(u*, y | parameters) = ∏_{i=1}^{N} p(u_i*, y_i | parameters).

This is what Henderson (1963, 1973) termed an "equal information" situation. The best predictor of the "merit function" T_i is

    T̂_i = E(v'u_i* | y_i) = v'H(y_i − Λ^{-1} X_i β) = b'(y_i − Λ^{-1} X_i β),   (32)

where

    b = H'v = Λ'(R_0 + G_0)^{-1} G_0 Λ'^{-1} v   (33)

is the classical "selection index" solution to the Smith-Hazel equations

    Var(y_i) b = Cov(y_i, v'u_i*).

Suppose that selection of a truncation type is based on T̂_i in (32), such that a proportion α of the candidates is kept as parents of the following generation. From the forms of (32) and (33), it follows that the mean of the distribution of T̂_i in the unselected individuals is 0, since E(y_i) = Λ^{-1} X_i β. Under normality assumptions, standard theory (e.g., Bulmer 1980; Falconer and Mackay 1996) gives as mean of the selected individuals

    E_S(T̂_i) = i √Var(T̂_i),   (34)

where i = z/α is called the "selection intensity" and z is the ordinate of the standard normal distribution at a point at the right of which there is a probability mass equal to α; S stands for selection. Under additive genetic action, the expected genetic value of the progeny of selected parents is equal to the expected value of the selected parents. Hence, the expected response to selection is given directly by (34).

For example, consider single-trait selection and the merit function T_i = u_i1* (the additive genetic value of individual i), and suppose that the only source of information is y_i1, the phenotypic value for trait 1. In this case, and from the form of (32), it follows that

    T̂_i = [Var(u_i1*) / (Var(u_i1*) + Var(e_i1*))] [y_i1 − E(y_i1)].

For a two-trait simultaneous system, it was seen earlier that

    Var(u_i1*) / [Var(u_i1*) + Var(e_i1*)]
      = (σ²_u1 + 2λ_12 σ_u12 + λ²_12 σ²_u2) / [σ²_u1 + σ²_e1 + 2λ_12(σ_u12 + σ_e12) + λ²_12(σ²_u2 + σ²_e2)]

and

    E(y_i1) = μ_1* = (μ_1 + λ_12 μ_2) / (1 − λ_12 λ_21).

Hence

    E_S(T̂_i) = i (σ²_u1 + 2λ_12 σ_u12 + λ²_12 σ²_u2) / √[σ²_u1 + σ²_e1 + 2λ_12(σ_u12 + σ_e12) + λ²_12(σ²_u2 + σ²_e2)].

When λ_12 = 0, this reduces to the usual E_S(T̂_i) = i σ_u1 h_1 = i h²_1 σ_y1, provided selection is based on T̂_i = h²_1[y_i1 − E(y_i1)].

The covariance matrix between additive genetic values of related individuals i and i' is

    Cov(u_i*, u_{i'}*') = Λ^{-1} Cov(u_i, u_{i'}') Λ'^{-1} = a_{ii'} Λ^{-1} G_0 Λ'^{-1},

where a_{ii'} is twice the coefficient of coancestry between i and i'.

LIKELIHOOD FUNCTION

Consider system (23) in conjunction with the normality assumptions (25) and (26), and regard the vector Λy_i as "data." The model for the entire data vector can be written as

    [Λy_1]   [X_1]       [u_1]   [e_1]
    [Λy_2] = [X_2] β + Z [u_2] + [e_2]
    [  ⋮ ]   [ ⋮ ]       [ ⋮ ]   [ ⋮ ]
    [Λy_N]   [X_N]       [u_N]   [e_N]

           = Xβ + Zu + e,   (35)

where u comprises additive genetic effects for all individuals and all traits (u may include additive genetic effects of individuals without records), and Z is an incidence matrix of appropriate order. If all individuals have records for all traits, Z is an identity matrix of order NK × NK; otherwise, columns of 0's for effects of individuals without phenotypic measurements would be included in Z. In view of the normality assumptions (25) and (26), one can write

    u | G_0 ~ N(0, A ⊗ G_0)

and

    e | R_0 ~ N(0, I ⊗ R_0),

where A is a matrix of additive genetic relationships (or of twice the coefficients of coancestry) between individuals in a genealogy, and ⊗ indicates the Kronecker product. Note that I ⊗ R_0 reflects the assumption that all individuals with records possess phenotypic values for each of the K traits. This is not a requirement, but it simplifies somewhat the treatment that follows.

Given u, the vectors Λy_i are mutually independent (since all e_i vectors are independent of each other), so the joint density of all Λy_i is

    p(Λy_1, Λy_2, ..., Λy_N | Λ, β, u, R_0)
      ∝ |R_0|^{-N/2} exp[−(1/2) Σ_{i=1}^{N} (Λy_i − X_i β − Z_i u)' R_0^{-1} (Λy_i − X_i β − Z_i u)],   (36)

where Z_i is an incidence matrix that "picks up" the K breeding values of individual i (u_i) and relates these to its phenotypic records y_i. Making a change of variables from Λy_i to y_i (i = 1, 2, ..., N), the determinant of the Jacobian of the transformation is |Λ|. Hence, the density of y = [y_1', y_2', ..., y_N']' is

    p(y | Λ, β, u, R_0)
      ∝ (|Λ|^N / |R_0|^{N/2}) exp{−(1/2) Σ_{i=1}^{N} [y_i − Λ^{-1}(X_i β + Z_i u)]' Λ'R_0^{-1}Λ [y_i − Λ^{-1}(X_i β + Z_i u)]}
      ∝ |Λ^{-1} R_0 Λ'^{-1}|^{-N/2} exp{−(1/2) Σ_{i=1}^{N} [y_i − Λ^{-1}(X_i β + Z_i u)]' Λ'R_0^{-1}Λ [y_i − Λ^{-1}(X_i β + Z_i u)]}.   (37)

This is the density of the product of the N normal distributions

    y_i | Λ, β, u, R_0 ~ N(Λ^{-1}(X_i β + Z_i u), Λ^{-1} R_0 Λ'^{-1}),

highlighting that the data generation process can be represented in terms of the reduced model (24), the only novelty here being the presence of the incidence matrix Z_i, with the latter being a K × K identity matrix in (24). Hence, the entire data vector can be modeled as

    [y_1]   [Λ^{-1}X_1]     [Λ^{-1}Z_1     0        ⋯      0    ] [u_1]   [e_1*]
    [y_2] = [Λ^{-1}X_2] β + [    0     Λ^{-1}Z_2    ⋯      0    ] [u_2] + [e_2*]
    [ ⋮ ]   [    ⋮    ]     [    ⋮         ⋮        ⋱      ⋮    ] [ ⋮ ]   [  ⋮ ]
    [y_N]   [Λ^{-1}X_N]     [    0         0        ⋯  Λ^{-1}Z_N] [u_N]   [e_N*]

          = X_Λ β + Z_Λ u + e*,   (38)

where X_Λ is an NK × Σ_{j=1}^{K} p_j matrix (again, assuming that each of the N individuals has measurements for the K traits), and Z_Λ has order NK × (N + P)K, where P is the number of individuals in the genealogy lacking phenotypic records (the corresponding columns of Z_Λ being null).
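The algebra in (27)–(30) is easy to check numerically. The following is a minimal sketch in Python/NumPy (not part of the original treatment; the two-trait covariance matrices and λ values are hypothetical), verifying the two equivalent expressions for H in (30) and the invariance of its trace to the λ coefficients.

```python
import numpy as np

# Two-trait simultaneous system (K = 2); all numeric values below are
# hypothetical, chosen only to illustrate Equations 27-30.
lam12, lam21 = 0.4, -0.2
Lam = np.array([[1.0, -lam12],
                [-lam21, 1.0]])          # Lambda, after normalization
G0 = np.array([[1.0, 0.3], [0.3, 0.8]])  # "system" genetic covariance
R0 = np.array([[2.0, 0.5], [0.5, 1.5]])  # "system" residual covariance

Li = np.linalg.inv(Lam)
# Covariances of u* and e* in the reduced model (Equations 27 and 28)
G_star = Li @ G0 @ Li.T
R_star = Li @ R0 @ Li.T
# Phenotypic covariance (Equation 29) and the heritability matrix (30)
P_star = Li @ (R0 + G0) @ Li.T
H = G_star @ np.linalg.inv(P_star)

# H also equals Lambda^{-1} G0 (R0 + G0)^{-1} Lambda ...
H_alt = Li @ G0 @ np.linalg.inv(R0 + G0) @ Lam
assert np.allclose(H, H_alt)
# ... and its trace is free of the lambda coefficients:
assert np.isclose(np.trace(H), np.trace(G0 @ np.linalg.inv(R0 + G0)))
print(np.round(H, 3))
```

Changing lam12 or lam21 alters the individual entries of H but, as the second assertion shows, leaves tr(H) unchanged.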
Observe that (38) is in the form of a standard multiple-trait mixed-effects linear model, save for the fact that the incidence matrices depend on the unknown structural coefficients contained in Λ. Hence

    p(y | Λ, β, u, R_0) ∝ |R_Λ|^{-1/2} exp[−(1/2)(y − X_Λ β − Z_Λ u)' R_Λ^{-1} (y − X_Λ β − Z_Λ u)],   (39)

where

    Var(e*) = R_Λ = I_N ⊗ Λ^{-1} R_0 Λ'^{-1}

is a block-diagonal matrix consisting of N blocks of order K × K, all such blocks being equal to Λ^{-1} R_0 Λ'^{-1}. It follows that y | Λ, β, u, R_0 ~ N(X_Λ β + Z_Λ u, R_Λ). Hence, if simultaneity or recursiveness holds, the estimator of the residual variance-covariance matrix from a reduced model analysis is actually estimating Λ^{-1} R_0 Λ'^{-1}; this has a bearing on the interpretation of the parameter estimates.

Since it is assumed that u | G_0 ~ N(0, A ⊗ G_0), the likelihood function is given by

    l(Λ, β, R_0, G_0) ∝ ∫ N(X_Λ β + Z_Λ u, R_Λ) N(0, A ⊗ G_0) du
                      ∝ N(X_Λ β, R_Λ + Z_Λ (A ⊗ G_0) Z_Λ').   (40)

This likelihood has the same form as that for a standard multivariate mixed-effects model, except that, here, additional parameters (the nonnull elements of Λ) appear in both the location and dispersion structures of the reduced model (38). A pertinent issue, then, is whether or not all parameters in the model, that is, Λ, β, R_0, and G_0, can be identified (i.e., estimated uniquely) from the likelihood. This is discussed in the following section.

IDENTIFICATION OF PARAMETERS

This is dealt with only briefly here, as extensive treatments can be found in econometrics treatises such as Johnston (1972) and Judge et al. (1988); a readable account is in Goldberger (1998).

Consider the system of K response variables (23), and reorganize it as

    Λy_i + X_i β→ = ε_i,   (41)

where β→ = −β and ε_i = u_i + e_i is a residual. It is convenient to lump the sum of the two random effects into a single residual for the treatment that follows. Rewrite

    X_i β→ = diag(x_i1', x_i2', ..., x_iK') [β→_1; β→_2; ...; β→_K]
           = diag(β→_1', β→_2', ..., β→_K') [x_i1; x_i2; ...; x_iK]
           = B x_i,

where x_i now is a column vector of order Σ_{j=1}^{K} p_j × 1, and B is K × Σ_{j=1}^{K} p_j. In practice, it suffices to keep the distinct explanatory variables in x_i; e.g., if herd effects affect all traits in the system, only a single set of incidence variables needs to be considered. With this notation, (41) can be put as

    [λ_1'y_i + b_1'x_i]   [ε_i1]
    [λ_2'y_i + b_2'x_i]   [ε_i2]
    [        ⋮        ] = [ ⋮  ],   (42)
    [λ_K'y_i + b_K'x_i]   [ε_iK]

where λ_j' and b_j' (j = 1, 2, ..., K) are the jth rows of Λ and B, respectively. The specification

    λ_j'y_i + b_j'x_i = ε_ij;   j = 1, 2, ..., K, i = 1, 2, ..., N,

constitutes the jth equation of the system. Compactly, the system is

    Λy_i + Bx_i = ε_i.   (43)

The reduced model is expressible as

    y_i = −Λ^{-1}Bx_i + Λ^{-1}ε_i = Πx_i + ε→_i,   (44)

where Π = −Λ^{-1}B is a K × Σ_{j=1}^{K} p_j matrix of reduced model parameters, and ε→_i = Λ^{-1}ε_i.

The system in (43) contains, at least potentially, the following number of parameters: K² (all elements of Λ, including the 1's in the diagonal), K Σ_{j=1}^{K} p_j (all elements of B, including the null ones), plus K(K + 1) (the distinct elements of R_0 and G_0). It is assumed that these two variance-covariance matrices can be separated in the estimation procedure, which depends on the genetic structure of the data set. Letting p̄ = Σ_{j=1}^{K} p_j / K, the total number of parameters in the system is S = K²(2 + p̄) + K. In the reduced model, on the other hand, the number of potential parameters is K²p̄ (the order of Π), plus K(K + 1) (the elements of G_0* = Λ^{-1}G_0Λ'^{-1} and those of R_0* = Λ^{-1}R_0Λ'^{-1}), yielding R = K²(1 + p̄) + K as the total number of parameters. To obtain unique estimates of the parameters in Λ, B, G_0, and R_0, S − R = K² restrictions are needed. These can be of four types (Judge et al. 1988), as follows.

1. "Normalization" restrictions: set the diagonal elements of Λ to 1, so that the parameters in equation j are expressed relative to this constant of proportionality. This yields K restrictions, so an additional K² − K = K(K − 1) restrictions are still needed.
2. Exclusion restrictions: some of the λ coefficients may be 0, as in a recursive model, or the elements of β may not appear in each of the equations.
3. Restrictions in the form of a linear combination of parameters in the same equation or across equations.
4. Restrictions on the variance-covariance matrices G_0 and R_0 (typically, such restrictions are not employed in quantitative genetic analysis).

Formal procedures for evaluation of identification of equations are described by Johnston (1972) and Judge et al. (1988). Suppose that Π is given and that one wishes to estimate uniquely (identify) the parameters in Λ and in B. Briefly, note that the parameters of the reduced model, Π = −Λ^{-1}B, satisfy ΛΠ + B = 0 or, equivalently,

    [Λ  B] [Π; I] = 0.   (45)

Consider now row j of (45) and write it as

    λ_j'Π + b_j' = 0.

Transposing, this yields

    [Π'  I_{Σp_j × Σp_j}] [λ_j; b_j] = 0.   (46)

This defines a system of equations on K + Σ_{j=1}^{K} p_j unknowns in which the rank of the known coefficient matrix is Σ_{j=1}^{K} p_j. Hence, K restrictions are needed to identify the unknown parameters λ_j and b_j of equation j of the system. The restrictions can be denoted (Judge et al. 1988) as

    R_j [λ_j; b_j] = 0,   (47)

where R_j is a J × (K + Σ_{j=1}^{K} p_j) matrix of rank J < K + Σ_{j=1}^{K} p_j. For example, an exclusion restriction can be indicated by filling the appropriate row of R_j with 0's, save for a 1 in the position corresponding to the element of [λ_j', b_j']' to be excluded from equation j. Since the normalization restrictions set all diagonal elements of Λ equal to 1, and then λ_jj = 1 (λ_jj is the jth element of λ_j), this implies that (47) must provide K − 1 linearly independent relationships, so that one can arrive at the K restrictions needed. Now, combine (46) and (47) to arrive at the system

    [Π'_{Σp_j × K}   I_{Σp_j × Σp_j}] [λ_j]
    [R_j^λ_{J × K}   R_j^b_{J × Σp_j}] [b_j] = 0,   (48)

where R_j = [R_j^λ  R_j^b] is given in partitioned form, and the coefficient matrix must have rank K − 1 + Σ_{j=1}^{K} p_j to obtain unique estimates of λ_j and b_j. Johnston (1972) and Judge et al. (1988) state that the rank of the coefficient matrix is K − 1 + Σ_{j=1}^{K} p_j if and only if the rank of

    R_j [Λ'; B'] = [R_j^λ  R_j^b] [Λ'; B']   (49)

is K − 1. Note that the preceding matrix has order J × K and that column j is null by virtue of (47). Hence, for (49) to possess rank K − 1, it must be that J ≥ K − 1; i.e., a condition for identification of equation j is that the number of restrictions J must be at least K − 1 (recall that K is the number of traits in the system). However, this is not sufficient: as stated, the rank of (49) must be K − 1.

In short, if the rank of (49) is K − 1, equation j is just identified, meaning that the relationship between the reduced model parameters and the λ's and β's in the equation is unique. If the rank is larger than K − 1, the equation is overidentified, meaning that there are many ways in which the structural model parameters can be expressed as a function of the elements of Π. In these two cases, the λ's and β's may be inferred efficiently, using methods that employ all information available in the data, e.g., maximum-likelihood or Bayesian procedures. Finally, if the rank of (49) is smaller than K − 1, equation j is underidentified, and the structural parameters cannot be solved as a function of the reduced model parameters (Dreze and Richard 1983).

The preceding developments are illustrated with a two-trait simultaneous model.
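The parameter bookkeeping leading to S − R = K² can be sketched as follows; the snippet below is an illustration only (K and the p_j values are hypothetical), not part of the original development.

```python
import numpy as np

# Order-condition bookkeeping for the structural system (43):
# K traits, p_j explanatory variables in equation j (hypothetical values).
K = 3
p = [4, 2, 3]                  # p_1, ..., p_K
sum_p = sum(p)
p_bar = sum_p / K

# Structural-model count: Lambda (K^2 elements), B (K * sum_p elements),
# plus K(K+1) distinct elements of R0 and G0 combined.
S = K**2 + K * sum_p + K * (K + 1)
assert S == K**2 * (2 + p_bar) + K

# Reduced-model count: Pi (K * sum_p elements) plus K(K+1) for
# R0* = Lambda^{-1} R0 Lambda'^{-1} and G0* = Lambda^{-1} G0 Lambda'^{-1}.
Rr = K * sum_p + K * (K + 1)
assert Rr == K**2 * (1 + p_bar) + K

# Exactly S - R = K^2 restrictions are needed for uniqueness.
assert S - Rr == K**2
print(S, Rr, S - Rr)   # 48 39 9
```

With K = 3 and Σp_j = 9, the structural system carries 48 potential parameters against 39 in the reduced model, so K² = 9 restrictions (normalization plus, e.g., exclusions) are required.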
Suppose that y_i1 and y_i2 are measurements of systolic and diastolic blood pressure, respectively, taken on individual i; assume that physiological knowledge postulates a feedback between the two variables. Let the models be

    y_i1 = λ_12 y_i2 + β_11 + β_12 Age_i + β_13 Smoking_i + u_i1 + e_i1

and

    y_i2 = λ_21 y_i1 + β_21 + β_22 Age_i + β_24 Drinking_i + β_25 Exer_i + u_i2 + e_i2,

where Age is the age of i in years; Smoking is a binary variable (0 represents no smoking during the year prior to measurement and 1 represents smoking); Drinking is an estimate of the amount of alcohol i consumed in the year previous to the blood pressure test, ignoring a possible error of measurement; and Exer measures the extent to which i exercises. The u and e variables are additive genetic and residual effects, as before, and the λ's and β's are the structural model parameters. Here, K = 2 and the number of x variables is 5, since the two intercepts β_11 and β_21 are related to the measurements via the same incidence variate, which takes the value 1 for all i. The first equation has three "beta coefficients" (β_11, β_12, and β_13) and the second has four (β_21, β_22, β_24, and β_25). Before normalization,

    Λ_{2×2} = [  λ_11  −λ_12 ]
              [ −λ_21   λ_22 ],

and

    B_{2×5} x_i = [ −β_11  −β_12  −β_13    0      0   ] [ 1         ]
                  [ −β_21  −β_22    0    −β_24  −β_25 ] [ Age_i     ]
                                                        [ Smoking_i ]
                                                        [ Drinking_i]
                                                        [ Exer_i    ].

Equation 1 of the system uses the two exclusions β_14 = β_15 = 0. Hence, (49) is

    [0 0 0 0 0 1 0] [  λ_11   −λ_21 ]
    [0 0 0 0 0 0 1] [ −λ_12    λ_22 ]   [ 0  −β_24 ]
                    [ −β_11   −β_21 ] = [ 0  −β_25 ].
                    [ −β_12   −β_22 ]
                    [ −β_13     0   ]
                    [   0     −β_24 ]
                    [   0     −β_25 ]

The rank of this matrix is 1 (which is K − 1), so the equation is identified. Equation 2 of the system employs the exclusion β_23 = 0, so that (49) is

    [0 0 0 0 1 0 0] [  λ_11   −λ_21 ]
                    [ −λ_12    λ_22 ]
                    [ −β_11   −β_21 ] = [ −β_13  0 ].
                    [ −β_12   −β_22 ]
                    [ −β_13     0   ]
                    [   0     −β_24 ]
                    [   0     −β_25 ]

Since the rank of this matrix is 1, the second equation is identified as well. Hence, λ_12, λ_21, and the elements of β_1 = [β_11 β_12 β_13]' and of β_2 = [β_21 β_22 β_24 β_25]' can be estimated uniquely.

BAYESIAN MODEL

General: The form of the likelihood function given in (40) suggests that obtaining maximum-likelihood estimates of the structural model parameters Λ, β, R_0, and G_0 is not an easy matter, with a main difficulty being the fact that Λ is unknown. On the other hand, if the elements of this matrix were given, the setting would be as in a multivariate mixed-effects linear model, so standard procedures, such as the expectation-maximization (EM) algorithm, could be employed for computing the likelihood-based estimates. Another complication is that, typically, highly nonlinear functions of the parameters must be inferred. For example, see the forms of the mean μ_1* in (7) and of the coefficient of heritability in (8). Intuitively, asymptotic approximations to the sampling distribution of the maximum-likelihood estimates may be relatively less accurate at a given sample size when the parametric function of interest is nonlinear than when it is linear. Note, however, that μ_1* and (8) may be inferred from the reduced model, via the standard multivariate parameterization. In special circumstances, one can form estimators of the structural parameters from statistics derived from the reduced model. These are called "indirect" procedures in econometrics (Johnston 1972).

Also, inferring random effects is of great importance in applied quantitative genetics (e.g., animal, plant, or tree breeding), and their best predictor would take a form such as in (32). In practice, however, calculations require replacing the unknown structural parameters by their maximum-likelihood estimates, that is, computing

    Ê(u_i* | y_i) = Λ̂^{-1} Ĝ_0 (R̂_0 + Ĝ_0)^{-1} Λ̂ (y_i − Λ̂^{-1} X_i β̂).   (50)

If interest focuses on the "system" genetic effects, the statistic would be

    Ê(u_i | y_i) = Ĝ_0 (R̂_0 + Ĝ_0)^{-1} Λ̂ (y_i − Λ̂^{-1} X_i β̂).

The finite sample properties of the resulting empirical predictors are unknown. A common Bayesian criticism (Box and Tiao 1973; Gianola and Fernando 1986; Sorensen and Gianola 2002) is that (50) does not take the uncertainty (error of estimation) of the estimates of the parameters into account.

An alternative is to adopt a Bayesian approach, where inferences about structural parameters, random effects, or functions thereof are made from their marginal posterior distributions (Zellner 1971, 1979; Box and Tiao 1973; Gelman et al. 1995; Carlin and Louis 2000; Sorensen and Gianola 2002). A review of some of the issues in simultaneous models from an econometric perspective is in Zellner (1979), Dreze and Richard (1983), Judge et al. (1985), and Koop (2003). A salient feature of the Bayesian analysis is its ability to produce exact finite sample inference, as well as to override potential underidentification of parameters. If proper priors are adopted for all parameters in a model, all posterior distributions are proper as well (Bernardo and Smith 1994; O'Hagan 1994). However, unless the parameters are identifiable in the likelihood, the influence of the prior does not dissipate asymptotically. An example of this is in Carlin and Louis (2000) and in Gianola and Sorensen (2002). These authors discuss a situation where two random variables have the distributions X_i ~ N(μ, σ²) and Y_i ~ N(η, σ²), say, but Z_i = X_i + Y_i (i = 1, 2, ..., N) is observed. Here, maximum likelihood can estimate μ + η, but not μ or η separately. The individual means cannot be inferred separately because an infinite number of combinations of the elementary parameters confer the same likelihood to μ + η. On the other hand, a Bayesian analysis with proper priors assigned to both μ and η gives distinct, proper, marginal posterior distributions of μ and η. However, the usual asymptotic domination of the prior by the likelihood does not occur.
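The rank condition (49) for the two blood-pressure equations can also be checked numerically. The sketch below is not from the original article: the structural coefficient values are arbitrary nonzero placeholders, and any nonzero choices behave the same way. It builds [Λ'; B'] after normalization and verifies that both products have rank K − 1 = 1.

```python
import numpy as np

# Rank condition (49) for the blood-pressure system; coefficient
# values are hypothetical, nonzero placeholders.
l12, l21 = 0.3, 0.5
b11, b12, b13 = 1.0, 0.02, 0.4              # equation 1
b21, b22, b24, b25 = 0.8, 0.01, 0.2, -0.3   # equation 2

# [Lambda'; B']: a (K + sum p_j) x K = 7 x 2 matrix, after normalization.
LB = np.array([[1.0,  -l21],
               [-l12,  1.0],
               [-b11, -b21],
               [-b12, -b22],
               [-b13,  0.0],
               [0.0,  -b24],
               [0.0,  -b25]])

# Equation 1 excludes Drinking and Exer: rows 6 and 7 of [lambda_1', b_1']'.
R1 = np.zeros((2, 7)); R1[0, 5] = 1.0; R1[1, 6] = 1.0
# Equation 2 excludes Smoking: row 5 of [lambda_2', b_2']'.
R2 = np.zeros((1, 7)); R2[0, 4] = 1.0

# Both products must have rank K - 1 = 1 for identification.
print(np.linalg.matrix_rank(R1 @ LB), np.linalg.matrix_rank(R2 @ LB))  # 1 1
```

Setting β_24 = β_25 = 0 in this sketch drops the first rank to 0, reproducing the underidentified case discussed above.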
Even when sample sizes are of infinite size, the prior matters; see Dreze and Richard (1983) and O'Hagan (1994). For this reason, it is always important to investigate parameter identification, as discussed in the preceding section. A Bayesian analysis of the structural model (35) is presented subsequently.

Prior and posterior distributions: Represent all unknown parameters of the model by θ = [λ, β, u, R_0, G_0], where λ is a vector containing the unknown elements of Λ, after normalization; for example, in a fully simultaneous model with K = 3, λ would contain six coefficients. No distinction is made here between fixed and random elements in the parameter vector, since all unknowns are treated as unobservable random variables in the Bayesian setting. Bayes' theorem gives as density of the posterior distribution

    p(θ | y, H) ∝ p(y | θ) p(θ | H),

where p(θ | H) is the prior density of θ, H represents the collection of all known hyperparameters, and p(y | θ) is the density of the data. Note that p(y | θ) = p(y | Λ, β, u, R_0), since, given u, the data-generation model does not depend on the genetic covariance matrix G_0. Take as joint prior density of all parameters

    p(θ | H) = p(λ | H_λ) p(β | H_β) p(u | H_u) p(R_0 | H_R0) p(G_0 | H_G0),   (51)

where, for example, p(λ | H_λ) is the density of the prior distribution of λ and H_λ denotes a set of known hyperparameters. We take as prior distribution of λ the Gaussian process

    λ | λ_0, τ² ~ N(1λ_0, Iτ²),   (52)

where 1 is a vector of ones, λ_0 is a known prior mean, common to all λ's, and τ² is a tuning parameter, adjusting the degree of sharpness of the prior. Further, we assign the priors

    β | β_0, b² ~ N(1β_0, Ib²),   (53)

    u | G_0 ~ N(0, A ⊗ G_0),   (54)

    R_0 | ν_R, V_R ~ IW(ν_R, V_R),   (55)

and

    G_0 | ν_G, V_G ~ IW(ν_G, V_G),   (56)

where IW(ν_R, V_R) and IW(ν_G, V_G) denote K-dimensional inverse Wishart distributions with "degrees of freedom" parameters ν_R and ν_G, respectively, and scale matrices V_R and V_G. Collecting (52)–(56), the joint prior density of all parameters in (51) is

    p(λ, β, u, R_0, G_0 | λ_0, τ², β_0, b², ν_R, V_R, ν_G, V_G)
      = N(λ | 1λ_0, Iτ²) N(β | 1β_0, Ib²) N(u | 0, A ⊗ G_0) IW(R_0 | ν_R, V_R) IW(G_0 | ν_G, V_G).   (57)

Combination of (57) with the data density (37) gives as joint posterior density of all estimands

    p(θ | y, H) ∝ ∏_{i=1}^{N} N(y_i | Λ^{-1}(X_i β + Z_i u), Λ^{-1} R_0 Λ'^{-1})
      × N(λ | 1λ_0, Iτ²) N(β | 1β_0, Ib²) N(u | 0, A ⊗ G_0) IW(R_0 | ν_R, V_R) IW(G_0 | ν_G, V_G).   (58)

FULLY CONDITIONAL POSTERIOR DISTRIBUTIONS

Often, the fully conditional posterior distributions of the parameters of a model can be ascertained from the joint posterior density, (58) in this case, by retaining the parts varying with the parameter (or group of parameters) of interest and treating the remaining parts as known (e.g., Sorensen and Gianola 2002). This procedure is employed in what follows.

Distributions of β and u: Using the preceding concept in (58), one has that

    p(β, u | R_0, G_0, λ, y, H)
      ∝ ∏_{i=1}^{N} N(y_i | Λ^{-1}(X_i β + Z_i u), Λ^{-1} R_0 Λ'^{-1}) N(β | 1β_0, Ib²) N(u | 0, A ⊗ G_0).

Now, since the value of λ is given in this conditional distribution, one can treat this vector as known and form the pseudo-data vectors Λy_i (i = 1, 2, ..., N). Thus

    p(β, u | R_0, G_0, λ, y, H) ∝ p(Λy_1, Λy_2, ..., Λy_N | Λ, β, u, R_0) N(β | 1β_0, Ib²) N(u | 0, A ⊗ G_0),

where p(Λy_1, Λy_2, ..., Λy_N | Λ, β, u, R_0) is as in (36). Retaining only the terms that depend on β and u, the explicit form of the conditional posterior density above is

    p(β, u | R_0, G_0, λ, y, H)
      ∝ exp[−(1/2) Σ_{i=1}^{N} (Λy_i − X_i β − Z_i u)' R_0^{-1} (Λy_i − X_i β − Z_i u)]
      × exp[−(β − 1β_0)'(β − 1β_0) / 2b²] exp[−(1/2) u'(A^{-1} ⊗ G_0^{-1}) u].   (59)

Note in (59) that the influence of the prior distribution of β on conditional (or marginal) inferences can be tempered by taking a large value of the spread parameter b². The preceding expression can be recognized as the posterior density of the location parameters in a Gaussian linear model with proper priors and known variance components (Lindley and Smith 1972; Gianola and Fernando 1986; Sorensen and Gianola 2002). It is well known that the corresponding distribution is

    β, u | R_0, G_0, λ, y, H ~ N( [β̂; û], [C_ββ  C_βu; C_uβ  C_uu] ),   (60)

where

    [β̂; û] = [C_ββ  C_βu; C_uβ  C_uu] [X'(I ⊗ R_0^{-1}) y_Λ + b^{-2} 1β_0;  Z'(I ⊗ R_0^{-1}) y_Λ],   (61)

    [C_ββ  C_βu]   [X'(I ⊗ R_0^{-1})X + b^{-2}I    X'(I ⊗ R_0^{-1})Z                 ]^{-1}
    [C_uβ  C_uu] = [Z'(I ⊗ R_0^{-1})X              Z'(I ⊗ R_0^{-1})Z + A^{-1} ⊗ G_0^{-1}]   (62)

and

    y_Λ = [Λy_1; Λy_2; ...; Λy_N].
(67) Samples from (60) can be obtained directly, or by the 2␶2 method of Garcia-Corte´s and Sorensen (1996), or via Hence, contrary to (60), (64), and (66), the conditional Gibbs sampling, element-wise or block-wise (e.g., Wang posterior distribution of ␭ is not standard, except in a et al. 1993, 1994; Sorensen and Gianola 2002). fully recursive system (Zellner 1971), as noted below. Distributions of G0 and R0: Retaining in (58) the parts Subsequently, a Metropolis scheme (Tanner 1993; Gel- that vary with G0 gives man et al. 1995) is developed for drawing samples from ␤ ␭ ϰ ␯ the distribution having (67) as density. p(G0| , u, R0, , y, H) N(u|0, G0)IW(G0| G , VG), ⌳ ϭ Consider yi , and rewrite it (put K 3, to illustrate) as taking the explicit form Ϫ␭ Ϫ␭ yi1 12yi2 13yi3 ␤ ␭ ϰ  Ϫ1/2 Ϫ 1 Ј Ϫ1  Ϫ1 p(G0| , u, R0, , y, H ) |A G0| exp΄ u (A G0 )u΅ ⌳ ϭ y Ϫ␭ y Ϫ␭ y 2 yi ΄ i2 21 i1 23 i3 ΅ Ϫ␭ Ϫ␭ ϫ Ϫ(1/2)(␯ ϩ3) Ϫ 1 Ϫ1 yi3 31yi1 32yi2 |G0| G exp΄ tr(G0 VG)΅ 2 ␭ q q  12  ϰ Ϫ(1/2)(qϩ␯ ϩ3) Ϫ 1 Ϫ1 ij Ј ϩ |G0| G expΆ tr΄G0 ΂ ͚ ͚a uiu j VG ΃΅· , 2 iϭ1 jϭ1 ␭13    (63) yi1 yi2 yi3 0000␭  21  ij Ϫ1 ϭ yi2 Ϫ 00yi1 yi3 00  where a is the element in row i and column j of A . ΄ ΅ ΄ ΅ ␭23   This reveals that the conditional posterior distribution yi3 0000yi1 yi2 ␭  31  of G0 is the inverse Wishart process ␭32  q q ϩ␯ ij Ј ϩ ف ␭ ␤ G0| , u, R0, , y, H IW΂ G0|q G , ͚ ͚a uiuj VG ΃, ϭ yι Ϫ Yι␭, iϭ1 jϭ1 (64) where

where q is the order of A. Similarly, the density of the con- yi2 yi3 0000 ditional posterior distribution of R0 is ϭ 00y y 00 Yi ΄ i1 i3 ΅. ␤ ␭ p(R0| , u, G0, , y, H) 0000yi1 yi2 N ϰ ⌳Ϫ1 ␤ ϩ ⌳Ϫ1 ⌳ЈϪ1 ␯ ͟N(yi| (Xi Ziu), R0 )IW(R0| R , VR). Using this representation in (67) above gives iϭ1 N ␭ ␤ ϰ ⌳ N/2 Ϫ 1 Ϫ ␭ Ј Ϫ1 Ϫ ␭ p( | , u, R0, G0, y, H) | | exp΄ ͚(wi Yi ) R0 (wi Yi )΅ Since the elements of ⌳ are given in this distribution, 2 iϭ1 1 one can also write ϫ exp΄Ϫ (␭ Ϫ 1␭ )Ј(␭ Ϫ 1␭ )΅, 2␶2 0 0 N ␤ ␭ ϰ ⌳ ␤ ϩ ␯ p(R0| , u, G0, , y, H ) ͟N( yi|Xi Ziu, R0)IW(R0| R , VR) ϭ Ϫ ␤ Ϫ iϭ1 where wi yi Xi Ziu. The two quadratic forms on N ␭ ϰ ϪN/2 Ϫ 1 ⌳ Ϫ ␤ Ϫ Ј Ϫ1 ⌳ Ϫ ␤ Ϫ appearing in the exponents can be combined using |R0| exp΄ ͚( yi Xi Ziu) R 0 ( yi Xi Ziu)΅ 2 iϭ1 standard results (e.g., Box and Tiao 1973; Sorensen

ϫ Ϫ(1/2)(␯ ϩ3) Ϫ 1 Ϫ1 |R0| R exp΄ tr(R 0 VR)΅ and Gianola 2002), yielding 2 N N/2 1 ^ Ϫ1 ^ Ϫ(1/2)(Nϩ␯ ϩ3) 1 Ϫ1 ␭ ␤ ϰ ⌳ Ϫ ␭ Ϫ ␭ Ј ␭ Ϫ ␭ ϰ R Ϫ Ј ϩ p( | , u, R , G , y, H) | | exp΄ ( ) V␭ ( )΅, |R0| expΆ tr΄R 0 ΂͚ ri ri VR΃΅· , 0 0 2 iϭ1 (65) 2 ϭ ⌳ Ϫ ␤ Ϫ (68) where ri yi Xi Ziu. This is the density of the inverse Wishart distribution where 1422 D. Gianola and D. Sorensen Ϫ N 1 N ␭ˆ ϭ Ј Ϫ1 ϩ␶Ϫ2 Ј Ϫ1 ϩ␶Ϫ2 ␭ until a sufficiently small Monte Carlo error of estimation ΂ ͚Yi R0 Yi I΃ ΂ ͚Yi R0 wi 1 0΃, iϭ1 iϭ1 of features of the posterior has been attained. Typically, the estimators of the features are ergodic averages. A dis- and Ϫ cussion of convergence analysis is in Cowles and Car- N 1 ϭ Ј Ϫ1 ϩ␶Ϫ2 lin (1996). V␭ ΂ ͚Yi R0 Yi I΃ . iϭ1 To illustrate, consider inferring the offspring-parent regression in a simultaneous two-trait model, as dis- An important simplification occurs in a fully recursive ϭ cussed in genetic consequences of simultaneity in system, e.g., when y2 is a linear function of y1, and y3 ⌳ ⌳ ϭ a two-trait system. The Monte Carlo estimator of the f(y1, y2); here, is a triangular matrix, so that | | 1. regression is In this case, the conditional posterior distribution of ␭ m 1 ͑␴ 2( j ) ϩ␭2( j )␴ 2( j )͒ ϩ␭( j )␴( j ) ͑␭ˆ ͒ ^ ⁄2 u1 12 u2 12 u12 is exactly the normal process N , V␭ , so sampling is ϭ 1 ͚ , bOP 2( j ) 2( j ) 2( j ) ( j ) ( j ) 2( j ) 2( j ) 2( j ) m ϭ ␴ ϩ␴ ϩ ␭ ͑␴ ϩ␴ ͒ ϩ␭ ͑␴ ϩ␴ ͒ straightforward. j 1 u1 e1 2 12 u12 e12 12 u2 e2 For the general case, our Metropolis algorithm for ␴( j ) ␭( j ) where, for example, u12 and 12 are samples from the drawing samples from (68) uses the multivariate normal ␴ ␭ marginal posterior distributions of u12 and 12, respec- distribution N͑␭ˆ, V␭͒ for generating candidates. The tively. A second example is that of inferring the additive algorithm proceeds as outlined below: genetic value of an individual for trait 1 in connection Draw a candidate ␭* from N͑␭ˆ, V␭͒, with ␭ˆ and V␭ with model (7). 
The mean of the posterior distribution ␤ is estimated as computed from the given , u, R0, G0, y, and H values, which we refer to as state of the system at time t. 1 m u( j ) ϩ␭( j )u( j ) Eˆ(u*|y, H) ϭ ͚ i1 12 i2 . i1 Ϫ␭( j )␭( j ) Calculate the acceptance probability m jϭ1 1 12 21 Further, in the context of multivariate selection, the (t ) Ј Ϫ1 (t ) ⌳ N/2 Ϫ 1 ␭ Ϫ ␭ˆ ͑ (t )͒ ␭ Ϫ ␭ˆ estimate of the posterior mean of the merit of candidate | *| exp΄ ⁄2΂ * ΃ V␭ ΂ * ΃΅ ␣ϭ . i would be, using (32) and (33), Ј Ϫ Ϫ Ϫ (t ) (t ) 1 Ϫ (t ) ^ Ј m Ϫ1 |⌳(t 1)|N/2exp΄Ϫ 1⁄ ΂␭(t 1) Ϫ ␭ˆ ΃ ͑V␭ ͒ ΂␭(t 1) Ϫ ␭ˆ ΃΅ ^ ϭ v ⌳( j )Ϫ1 ( j )͑ ( j ) ϩ ( j )͒ ͑⌳( j ) Ϫ ␤( j )͒ 2 Ti ͚ G0 R0 G0 yi Xi . m jϭ1 Set ␭* with probability min (␣,1) CONCLUSION (t ) ␭ Ά Ϫ ␭(t 1) otherwise. This article proposes an extension of the standard linear models for analysis of quantitative traits in genet- ics to situations in which there is feedback or recursive- MARKOV CHAIN MONTE CARLO AND ness between phenotypic values. In addition, techniques INFERENCES FROM THE SAMPLES for parameter inference are described, with a focus on Drawing marginal inferences about unknown aspects Bayesian analysis via Markov chain Monte Carlo meth- of the joint posterior distribution (58) is impossible by ods. The developments represent a merger of standard analytical means, since this requires high-dimensional quantitative genetics theory and modern tools from integration of unwieldy expressions. Instead, we pro- Bayesian inference with existing knowledge from econo- pose to infer features of interest by iterative sampling metrics and sociology, i.e., simultaneous systems and via Markov chain Monte Carlo methods. The requisite structural equations models (Joresko¨g and Sorbo¨m theory is presented in Gilks et al. (1996), Robert and 2001). Casella (1999), and Sorensen and Gianola (2002). 
Briefly, the idea is to create a Markov chain with (58) as equilibrium distribution. In our context, the sampling starts from some initial point θ^(0) inside the parameter space, with updates obtained by successive looping through the fully conditional posterior distributions (60), (64), (66), and (68). This defines what has been termed a "Metropolis-within-Gibbs" algorithm: the draws from the known fully conditional distributions constitute the Gibbs proposals (accepted with probability equal to 1), with the Metropolis step for λ completing a loop of the algorithm (if the system is recursive, this becomes a Gibbs step as well). The early samples are discarded as burn-in, i.e., prior to declaration of convergence, and successive samples (m, say) are collected subsequently.

Wright (1925) pioneered this type of treatment of multivariate systems, beginning with his "corn and hog correlations," as noted by Goldberger (1972), and reaching a climax in Wright (1960), where a difficult path analysis of "reciprocal interaction with or without lag" is presented. Curiously, although Wright's ideas were foundational in animal breeding and quantitative genetics, his work on feedback and on joint determination of systems of variables received little attention (if any) in these fields, even though biological systems typically display reciprocity between variables, with instantaneous or delayed feedbacks. Thus, it is perplexing that this type of work has been ignored in quantitative genetics. Path analysis (Wright 1921) was not favored by influential statistical geneticists such as Charles Henderson (C. Henderson, personal communication) and Oscar Kempthorne (Kempthorne 1969), who preferred variance component models and matrix treatments instead. For instance, calculation of additive relationships and of inbreeding coefficients is straightforward with tabular (matrix) methods (Henderson 1976; Quaas 1976), and the inverse of large relationship matrices is crucial in genetic evaluation via the mixed-model algorithm (see Gianola 2001 for a vignette). Similar calculations are awkward, at best, if done with the method of path coefficients, and it is not obvious how the needed inverse can be generated directly from the paths. Also, animal breeders often need to account for nuisance fixed effects (the explanatory variables are often discrete) and for interaction variance components in their models. This also contributed to the fading away of path analysis in animal breeding, since it was not clear how these problems could be treated in the standard framework of path analysis.

Arguably, a heavy emphasis on large-scale computations may have distracted animal and plant breeders away from the modeling process, where a path diagram has an unparalleled eloquence. On the other hand, path-analytic methods have had an important impact in sociometrics. For example, see Duncan (1975) for an account of structural equation models. Further, the LISREL software (now in version 8; Jöreskog and Sörbom 2001) for estimation of structural coefficients has been used in sociology since the mid-1970s. A discussion of the use of structural equation models in biology is in Shipley (2002), with emphasis on causal inference; an abundant literature on estimation and testing in structural models can be found in this text. In this context, our article can be viewed as an attempt to reclaim, for quantitative genetics, the heritage of Wright's modeling ideas, although in the light of modern machinery developed mainly by econometricians.

As pointed out in matrix representations, there can be many competing models for explaining relationships in a multivariate system. Thus, model selection issues are paramount. The standard Bayesian tool kit for model selection and criticism, including the Bayes factor (Bernardo and Smith 1994; O'Hagan 1994) and posterior predictive checks (Gelman et al. 1995; Sorensen and Waagepetersen 2003), can be employed in connection with simultaneity and recursiveness without extra difficulties, other than those related to the computations.

The methods presented here can be extended in several directions. For example, some of the response variables involved may be discrete. Here, one could introduce latent variables, such as the typical liability variate of quantitative genetics (Thompson 1979; Gianola 1982), and then model feedback or recursiveness at that level. Also, nonlinearity (in the parameters) may be dictated by mechanistic considerations. For example, a gene product may accumulate following the trajectory of some nonlinear growth curve, and one may wish to capture such behavior in the model. Lack of linearity makes identification and computation more difficult, but the Bayesian Markov chain Monte Carlo machinery works satisfactorily with nonlinear systems (Gilks et al. 1996). Another extension consists of hierarchical modeling of the λ coefficients controlling feedback or recursion. Here, we have assumed that the model depends on a single set of λ parameters. However, these may be clustered, e.g., family aggregation of parameters, or be affected by a locus with major effects. This sort of multitier specification can be handled via the usual Bayesian modeling of hierarchies, in the sense of Lindley and Smith (1972).

We have focused on feedback or recursiveness at the phenotypic level. However, these effects might also take place at the level of the genes, in which case the model could be written as

Λ_y y_i = X_i β + Λ_u u_i + e_i,

with a "purely genetic" (phenotypic) feedback obtained by setting Λ_y (Λ_u) equal to an identity matrix. It would be of interest to study parameter identification issues in this more general model.

There are situations specific to animal breeding where introducing simultaneity in the model may alleviate some inadequacies of standard treatments. Consider, for example, dairy cows with first- and second-lactation milk yield records. It is known that if a cow has a good first record of performance, then there is a chance that she will receive preferential nutrition and management relative to her herd mates. One can think of an effect of the first record on the second, even in the absence of a mechanistic basis for this, when the breeding value and permanent environmental effect of the cow are included in the statistical models for genetic evaluation. Similarly, one may consider including an effect of second on first records on similar grounds, as if the future had an influence on the present.

In short, there is a plethora of potential scenarios in biology where recursiveness or simultaneity enters the realm of the possible. In spite of the towering influence of Sewall Wright, these phenomena have been essentially ignored in quantitative genetics. It is hoped that the techniques described in this article will contribute toward a deeper understanding of complex traits.

The authors thank James Crow, Arthur Goldberger, Guilherme Rosa, and Arnold Zellner for useful comments. Research was supported by the Wisconsin Agriculture Experiment Station and the Danish Institute of Agricultural Sciences and by grants from the National Research Initiatives Competitive Grants Program/U.S. Department of Agriculture (99-35205-8162 and 2003-35205-12833) and the National Science Foundation (DEB-0089742).
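To make the sampling scheme summarized above concrete, the loop (Gibbs draws from known full conditionals, a Metropolis update where the conditional is not of recognizable form) can be sketched on a toy two-variable target. This is a generic illustration, not the authors' implementation: the bivariate normal target, correlation, random-walk step size, and chain length are all illustrative assumptions.

```python
import math
import random

random.seed(1)

# Toy stand-in for the joint posterior: a bivariate normal with
# correlation RHO (an illustrative assumption, not the article's model).
RHO = 0.6

def gibbs_step_x(y):
    # x | y ~ N(RHO * y, 1 - RHO^2): a draw from a known full conditional,
    # i.e., a "Gibbs proposal" accepted with probability 1.
    return random.gauss(RHO * y, math.sqrt(1.0 - RHO ** 2))

def log_cond_y(y, x):
    # Log full conditional of y given x, up to an additive constant.
    return -0.5 * (y - RHO * x) ** 2 / (1.0 - RHO ** 2)

def metropolis_step_y(y, x, step=1.0):
    # Random-walk Metropolis update, playing the role of the update for a
    # parameter (such as lambda) whose conditional is not recognizable.
    proposal = y + random.uniform(-step, step)
    if math.log(random.random()) < log_cond_y(proposal, x) - log_cond_y(y, x):
        return proposal
    return y

def metropolis_within_gibbs(n_iter=20000, burn_in=2000):
    x, y = 0.0, 0.0  # initial point inside the parameter space
    kept = []
    for it in range(n_iter):
        x = gibbs_step_x(y)          # Gibbs step
        y = metropolis_step_y(y, x)  # Metropolis step completes one loop
        if it >= burn_in:            # early samples discarded as burn-in
            kept.append((x, y))
    return kept

samples = metropolis_within_gibbs()
```

In the model of this article, the Gibbs draws would instead cycle through the conditionals (60), (64), (66), and (68), with the Metropolis update applied to the structural parameters λ (and, in a fully recursive system, that update collapses to a Gibbs step as well).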
LITERATURE CITED

Bernardo, J. M., and A. F. M. Smith, 1994 Bayesian Theory. John Wiley & Sons, New York.
Box, G. E. P., and G. C. Tiao, 1973 Bayesian Inference in Statistical Analysis. Addison-Wesley, Reading, MA.
Bulmer, M. G., 1980 The Mathematical Theory of Quantitative Genetics. Oxford University Press, New York.
Carlin, B. P., and T. A. Louis, 2000 Bayes and Empirical Bayes Methods for Data Analysis. Chapman & Hall/CRC Press, Boca Raton, FL.
Cheverud, J. M., 1984 Quantitative genetics and developmental constraints on evolution by selection. J. Theor. Biol. 110: 155–171.
Cowles, M. K., and B. P. Carlin, 1996 Markov chain Monte Carlo convergence diagnostics: a comparative review. J. Am. Stat. Assoc. 91: 883–904.
Dreze, J., and J. F. Richard, 1983 Bayesian analysis of simultaneous equations systems, pp. 519–598 in Handbook of Econometrics, Vol. 1, edited by Z. Griliches and M. Intriligator. North Holland, Amsterdam.
Duncan, O. D., 1975 Introduction to Structural Equation Models. Academic Press, San Diego.
Falconer, D. S., 1965 Maternal effects and selection response, pp. 763–774 in Genetics Today, edited by S. J. Geerts. Pergamon Press, New York.
Falconer, D. S., and T. F. C. Mackay, 1996 Introduction to Quantitative Genetics. Prentice-Hall, Harlow, UK.
Fisher, R. A., 1918 The correlation between relatives on the supposition of Mendelian inheritance. Trans. R. Soc. Edinb. 52: 399–433.
Garcia-Cortés, L. A., and D. Sorensen, 1996 On a multivariate implementation of the Gibbs sampler. Genet. Sel. Evol. 28: 121–126.
Gelman, A., J. B. Carlin, H. S. Stern and D. B. Rubin, 1995 Bayesian Data Analysis. Chapman & Hall, London.
Gianola, D., 1982 Theory and analysis of threshold characters. J. Anim. Sci. 54: 1079–1096.
Gianola, D., 2001 Inference about breeding values, pp. 645–672 in Handbook of Statistical Genetics, edited by D. J. Balding, M. Bishop and C. Cannings. John Wiley & Sons, Chichester, UK.
Gianola, D., and R. L. Fernando, 1986 Bayesian methods in animal breeding theory. J. Anim. Sci. 63: 217–244.
Gilks, W. R., S. Richardson and D. J. Spiegelhalter, 1996 Markov Chain Monte Carlo in Practice. Chapman & Hall, London.
Goldberger, A. S., 1972 Structural equation methods in the social sciences. Econometrica 40: 979–1001.
Goldberger, A. S., 1998 Introductory Econometrics. Harvard University Press, Cambridge, MA.
Goldberger, A. S., and O. D. Duncan (Editors), 1973 Structural Equation Models in the Social Sciences. Seminar Press, New York.
Haavelmo, T., 1943 The statistical implications of a system of simultaneous equations. Econometrica 11: 1–12.
Haldane, J. B. S., and J. G. M. Priestley, 1905 The regulation of the lung-ventilation. J. Physiol. 32: 225–266.
Hazel, L. N., 1943 The genetic basis for constructing selection indexes. Genetics 28: 476–490.
Henderson, C. R., 1963 Selection index and expected genetic advance, pp. 141–163 in Statistical Genetics and Plant Breeding, edited by W. D. Hanson and H. F. Robinson. Pub. 982, National Academy of Sciences-National Research Council, Washington, DC.
Henderson, C. R., 1973 Sire evaluation and genetic trends, pp. 10–41 in Proceedings of the Animal Breeding and Genetics Symposium in Honor of Dr. Jay L. Lush. American Society of Animal Science and American Dairy Science Association, Champaign, IL.
Henderson, C. R., 1976 A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics 32: 69–83.
Johnston, J. M., 1972 Econometric Methods. McGraw-Hill, New York.
Jöreskog, K., and D. Sörbom, 2001 LISREL 8: User's Reference Guide. Scientific Software International, Lincolnwood, IL.
Judge, G. G., W. E. Griffiths, R. Carter Hill, H. Lutkepohl and T. C. Lee, 1985 The Theory and Practice of Econometrics. John Wiley & Sons, New York.
Judge, G. G., R. Carter Hill, W. E. Griffiths, H. Lutkepohl and T. C. Lee, 1988 Introduction to the Theory and Practice of Econometrics. John Wiley & Sons, New York.
Kacser, H., and J. A. Burns, 1981 The molecular basis of dominance. Genetics 97: 639–666.
Kempthorne, O., 1969 An Introduction to Genetic Statistics. Iowa State University Press, Ames, IA.
Koerhuis, A. N. M., and R. Thompson, 1997 Models to estimate maternal effects for juvenile body weight in broiler chickens. Genet. Sel. Evol. 29: 225–249.
Koop, G., 2003 Bayesian Econometrics. John Wiley & Sons, Chichester, UK.
Lewin, B., 1985 Genes. John Wiley & Sons, New York.
Lindley, D. V., and A. F. M. Smith, 1972 Bayes estimates for the linear model. J. R. Stat. Soc. B 34: 1–41.
O'Hagan, A., 1994 Kendall's Advanced Theory of Statistics, Vol. 2B: Bayesian Inference. Arnold, London.
Quaas, R. L., 1976 Computing the diagonal elements of a large numerator relationship matrix. Biometrics 32: 949–953.
Robert, C. P., and G. Casella, 1999 Monte Carlo Statistical Methods. Springer, New York.
Searle, S. R., 1971 Linear Models. John Wiley & Sons, New York.
Shipley, B., 2002 Cause and Correlation in Biology. Cambridge University Press, Cambridge, UK.
Smith, H. F., 1936 A discriminant function for plant selection. Ann. Eugen. 7: 240–250.
Sorensen, D., and D. Gianola, 2002 Likelihood, Bayesian, and MCMC Methods in Quantitative Genetics. Springer-Verlag, New York.
Sorensen, D., and R. Waagepetersen, 2003 Normal linear models with genetically structured residual variance heterogeneity: a case study. Genet. Res. 82: 207–222.
Tanner, M. A., 1993 Tools for Statistical Inference. Springer-Verlag, New York.
Thompson, R., 1979 Sire evaluation. Biometrics 35: 339–353.
Turner, M. E., and C. E. Stevens, 1959 The regression analysis of causal paths. Biometrics 15: 236–258.
Walsh, B., 2003 Evolutionary quantitative genetics, pp. 380–442 in Handbook of Statistical Genetics, Vol. 1, Ed. 2, edited by D. J. Balding, M. Bishop and C. Cannings. John Wiley & Sons, Chichester, UK.
Wang, C. S., J. J. Rutledge and D. Gianola, 1993 Marginal inferences about variance components in a mixed linear model using Gibbs sampling. Genet. Sel. Evol. 25: 41–62.
Wang, C. S., J. J. Rutledge and D. Gianola, 1994 Bayesian analysis of mixed linear models via Gibbs sampling with an application to litter size in Iberian pigs. Genet. Sel. Evol. 26: 91–115.
Warren, R. D., J. K. White and W. A. Fuller, 1974 An errors-in-variables analysis of managerial role performance. J. Am. Stat. Assoc. 69: 886–893.
Wright, S., 1921 Correlation and causation. J. Agric. Res. 20: 557–585.
Wright, S., 1925 Corn and Hog Correlations. Bull. 1300, U.S. Department of Agriculture, Washington, DC.
Wright, S., 1960 The treatment of reciprocal interaction, with or without lag, in path analysis. Biometrics 16: 423–445.
Zellner, A., 1971 An Introduction to Bayesian Inference in Econometrics. John Wiley & Sons, New York.
Zellner, A., 1979 Statistical analysis of econometric models. J. Am. Stat. Assoc. 74: 628–643.

Communicating editor: J. B. Walsh