File S1
Mathematical derivations
Identity by descent (IBD) and identity by state (IBS)
Consider the process g in the main text. As we assume that the genes under consideration are neutral, the expected allele frequency for any individual equals to that in the ancestral generation, = . As ๐น๐น is an indicator function with values 0 or 1, it further holds that = , and = 0 for ๐ธ๐ธ๏ฟฝ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ ๏ฟฝ ๐๐๐๐๐๐ ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ . The probability that the alleles and in a gene j are of the same type u for the individuals i and i' is ๐ธ๐ธ๏ฟฝ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ ๏ฟฝ ๐๐๐๐๐๐ ๐ธ๐ธ๏ฟฝ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ โฒ ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ ๏ฟฝ given by . The probability that two randomly chosen alleles in a gene j are of the same type u for ๐ข๐ข โ ๐ข๐ขโฒ ๐๐ ๐๐โฒ the individuals i and i' is given by /4. This probability can be decomposed into two ๐ธ๐ธ๏ฟฝ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ ๐ฅ๐ฅ๐๐โฒ๐๐๐๐ โฒ๐ข๐ข ๏ฟฝ , components. First, the alleles may be identical by descent, the ancestral allele being of type u. The probability โ๐๐ ๐๐โฒ ๐ธ๐ธ๏ฟฝ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ ๐ฅ๐ฅ๐๐โฒ๐๐๐๐ โฒ๐ข๐ข ๏ฟฝ of this event is . The second option is that the alleles are not identical by descent, but the alleles in the ancestral generation were both of type . As the probability of the latter event is (1 ) 2 , we obtain ๐๐๐๐๐๐โฒ ๐๐๐๐๐๐ 1 ๐๐๐๐๐๐ โ ๐๐๐๐๐๐โฒ ๐๐๐๐๐๐ E = + (1 ) 2 . (Eq. A1) 4 , ๏ฟฝ ๏ฟฝ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ ๐ฅ๐ฅ๐๐โฒ๐๐๐๐ โฒ๐ข๐ข ๏ฟฝ ๐๐๐๐๐๐โฒ ๐๐๐๐๐๐ โ ๐๐๐๐๐๐โฒ ๐๐๐๐๐๐ Similarly, for , ๐๐ ๐๐โฒ
โ ๐ข๐ขโฒ 1 E = (1 ) . (Eq. A2) 4 , โฒ ๏ฟฝ ๏ฟฝ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ ๐ฅ๐ฅ๐๐โฒ๐๐๐๐ โฒ๐ข๐ขโฒ ๏ฟฝ โ ๐๐๐๐๐๐โฒ ๐๐๐๐๐๐ ๐๐๐๐ ๐ข๐ข Thus, ๐๐ ๐๐โฒ
Cov[ , ] = E 4 = 4 , , , (Eq. A3) , , โฒ โฒ ๏ฟฝ ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ ๐ฅ๐ฅ๐๐โฒ๐๐๐๐ โฒ๐ข๐ขโฒ ๏ฟฝ๏ฟฝ ๏ฟฝ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ ๐ฅ๐ฅ๐๐โฒ๐๐๐๐ โฒ๐ข๐ขโฒ ๏ฟฝ๏ฟฝ โ ๐๐๐๐๐๐ ๐๐๐๐๐๐ โฒ ๐๐๐๐๐๐ โ๐๐ ๐ข๐ข ๐ข๐ข ๐๐ ๐๐โฒ ๐๐ ๐๐โฒ where , , = , . As we assume no linkage among the genes, it holds that Cov , โฒ โฒ= 0 for โฒ. โ๐๐ ๐ข๐ข ๐ข๐ข ๐ฟ๐ฟ๐ข๐ข ๐ข๐ข ๐๐๐๐๐๐ โ ๐๐๐๐๐๐ ๐๐๐๐ ๐ข๐ข โฒ โฒ โฒ โฒ ๏ฟฝ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ ๐ฅ๐ฅ๐๐ ๐๐ ๐๐ ๐ข๐ข ๏ฟฝ ๐๐ โ ๐๐โฒ
IBD and IBS based definitions for FST, FIS and FIT
We denote by 0 the IBS probability for two alleles (same or different copy) in a randomly sampled gene j in a randomly sampled individual i in a randomly sampled subpopulation X. We denote by 1 the IBS probability for ๐๐ 2 SI O. Ovaskainen et al. ๐๐ two alleles that are obtained by sampling two individuals (same or different) from a randomly chosen subpopulation, and then sampling alleles from a randomly chosen gene j of these two individuals. We denote by 2 the IBS probability for two alleles that are obtained by randomly selecting two subpopulations (same or different), then two individuals from these, and then two alleles of the same randomly selected gene j. ๐๐ 2 We denote by = (1/ ) , the probability that two alleles from a randomly chosen gene j that are not identical by descent are identical by state. Note that IBS is possible when IBD does not hold although we ignore ๐๐ ๐๐โ โ๐๐ ๐ข๐ข ๐๐๐๐๐๐ mutation, because multiple copies are available in the ancestral population. By Eq. A3 and the formulae for , , and (see Table I in the main text), ๐ฎ๐ฎ ๐๐ ๐ซ๐ซ ๐๐ ๐๐ 1 1 1 1 = E = + 1 , (Eq. A4) 0 4 , โฒ ๐ฎ๐ฎ ๐ฎ๐ฎ ๐๐๐๐๐๐๐๐ ๐๐๐๐ ๐๐ ๐ข๐ข ๐๐ ๐ซ๐ซ ๏ฟฝ ๐๐ ๏ฟฝ โ ๏ฟฝ ๏ฟฝ ๏ฟฝโฒ ๏ฟฝ๐ฅ๐ฅ ๐ฅ๐ฅ ๏ฟฝ ๐๐ ๏ฟฝ โ ๐๐ ๏ฟฝ ๐๐ ๐๐ 1๐๐โ๐ซ๐ซ ๐๐ 1๐๐โ๐๐ ๐๐ 1๐๐ ๐ข๐ข ๐๐1๐๐ = E = + (1 ) , (Eq. A5) 1 2 4 , , ๐๐๐๐๐๐๐๐ ๐๐โฒ๐๐๐๐ โฒ๐ข๐ข ๐๐ ๐ซ๐ซ ๏ฟฝ ๐๐ ๏ฟฝ โ ๏ฟฝ ๏ฟฝ ๏ฟฝ ๏ฟฝ๐ฅ๐ฅ ๐ฅ๐ฅ ๏ฟฝ ๐๐ โ ๐๐ ๐๐ 1 ๐๐ ๐๐โ๐ซ๐ซ1 ๐๐ ๐๐ ๐๐โฒโ๐๐ ๐๐ 1 ๐๐ ๐ข๐ข 1๐๐ ๐๐โฒ = E = + 1 . (Eq. A6) 2 2 4 , , , ๐ซ๐ซ ๐ซ๐ซ ๐๐๐๐๐๐๐๐ ๐๐โฒ๐๐๐๐ โฒ๐ข๐ข ๐๐ ๏ฟฝ ๐๐ ๐๐ ๏ฟฝ ๏ฟฝ ๏ฟฝ ๏ฟฝ ๏ฟฝ๐ฅ๐ฅ ๐ฅ๐ฅ ๏ฟฝ ๐๐ ๏ฟฝ โ ๐๐ ๏ฟฝ ๐๐ ๐๐๐ซ๐ซ ๐๐ ๐๐โ๐ต๐ต ๐๐ ๐๐ ๐๐โ๐๐ ๐๐โฒโ๐๐ ๐๐โ ๐๐ ๐ข๐ข ๐๐ ๐๐โฒ We note that in the notation of COCKERHAM and WEIR (1987), 1 = 0, 2 = 1, 3 = 2. COCKERHAM and WEIR 2 2 (1987) defined 0 = 1 1 as the variance within individuals, 1 = 1 2 as the variance among individuals 2 ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ within subpopulations, and 2 = 2 3 as the variance among subpopulations within the population. ๐๐ โ ๐๐ ๐๐ ๐๐ โ ๐๐ Defining as the ratio of subpopulation variance to the total variance, we obtain ๐๐ ๐๐ โ ๐๐ ๐๐๐๐ 2 ๐น๐น 2 1 2 = 2 2 2 = (Eq. A7) + + 1 2 0 ๐๐1 2 ๐๐ โ ๐๐ ๐น๐น๐๐๐๐ Similarly, ๐๐ ๐๐ ๐๐ โ ๐๐
0 1 0 2 = , = . 1 1 1 2 ๐๐ โ ๐๐ ๐๐ โ ๐๐ ๐น๐น๐ผ๐ผ๐ผ๐ผ ๐น๐น๐ผ๐ผ๐ผ๐ผ measure the variance among individuals relativeโ to๐๐ that of the subpopulationsโ ๐๐ or the total variance. Combining the above, we obtain the IBD-based analogues
+ (1 ) 1 (1 ) = ๐ซ๐ซ ๐ซ๐ซ = ๐ซ๐ซ = , (Eq. A8) 1 1 (1 )(1 ) 1 ๐ซ๐ซ ๐๐ โ ๐๐ ๐๐ โ ๐๐ โ ๏ฟฝ โ ๐๐ ๏ฟฝ ๐๐ ๏ฟฝ๐๐ โ ๐๐ ๏ฟฝ โ ๐๐ ๐๐ โ ๐๐ ๐น๐น๐๐๐๐ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ โ ๐๐ โ ๏ฟฝ โ ๐๐ ๏ฟฝ ๐๐ โ ๐๐ โ ๐๐ โ ๐๐ = , (Eq. A9) 1๐ฎ๐ฎ ๐๐ โ ๐๐ ๐น๐น๐ผ๐ผ๐ผ๐ผ โ ๐๐
O. Ovaskainen et al. 3 SI
= . (Eq. A10) 1๐ฎ๐ฎ ๐ซ๐ซ ๐๐ โ ๐๐ ๐น๐น๐ผ๐ผ๐ผ๐ผ ๐ซ๐ซ โ ๐๐ Additive genetic covariance among individuals
The covariance in the genetic value between traits m and in individuals i and is given by โฒ โฒ ๐๐ ๐๐ Cov[ , , ] = Cov , = 4 , , (Eq. A11) , , , , โฒ , , โฒ ๐๐๐๐ ๐๐โฒ ๐๐โฒ ๐๐๐๐๐๐ ๐๐๐๐ โฒ๐๐โฒ ๐๐๐๐๐๐๐๐ ๐๐โฒ๐๐๐๐ โฒ๐ข๐ขโฒ ๐๐๐๐ ๐๐๐๐๐๐ ๐๐๐๐ โฒ๐๐โฒ ๐๐ ๐ข๐ข ๐ข๐ข ๐๐ ๐๐ ๏ฟฝโฒ ๐ฃ๐ฃ ๐ฃ๐ฃ ๏ฟฝ๐ฅ๐ฅ ๐ฅ๐ฅ ๏ฟฝ ๐๐ ๏ฟฝโฒ ๐ฃ๐ฃ ๐ฃ๐ฃ โ ๐๐ ๐๐ ๐ข๐ข ๐๐ ๐ข๐ขโฒ ๐๐ ๐ข๐ข ๐ข๐ข To define the matrix, we construct a hypothetical individual representing the ancestral allele frequencies. We thus let ๐๐be a set of random vectors (independently for each j) that follow the multinomial distribution ๐๐ with parameters 1 and , so that E = . Let = โฆ , be a random vector with = ๐ฅ๐ฅ๐๐๐๐ 1, . Let be the covariance matrix with = 2 Cov[ โ , ]. We have , ๐๐๐๐๐๐ ๏ฟฝ๐ฅ๐ฅ๐๐๐๐ ๏ฟฝ ๐๐๐๐๐๐ ๐ผ๐ผ ๏ฟฝ๐ผ๐ผ ๐ผ๐ผ๐๐ ๏ฟฝ ๐ผ๐ผ๐๐ ๐๐ ๐๐ โ๐๐ ๐ข๐ข ๐ฅ๐ฅ๐๐๐๐ ๐ฃ๐ฃ๐๐๐๐๐๐ ๐๐ ๐บ๐บ๐๐๐๐ โฒ ๐ผ๐ผ๐๐ ๐ผ๐ผ๐๐โฒ = 2 Cov , = 2 , , (Eq. A12) ๐๐ , , , , โฒ ๐๐๐๐ โฒ ๐๐๐๐๐๐ ๐๐๐๐โฒ๐๐โฒ ๐๐๐๐ ๐๐๐๐โฒ ๐๐๐๐๐๐ ๐๐๐๐โฒ๐๐โฒ ๐๐ ๐ข๐ข ๐ข๐ข ๐บ๐บ ๏ฟฝ ๐ฃ๐ฃ ๐ฃ๐ฃ ๏ฟฝ๐ฅ๐ฅ ๐ฅ๐ฅ ๏ฟฝ ๏ฟฝโฒ ๐ฃ๐ฃ ๐ฃ๐ฃ โ ๐๐ ๐ข๐ข ๐ข๐ขโฒ ๐๐ ๐ข๐ข ๐ข๐ข
Combining the above shows that
Cov[ , ] = 2 (Eq. A13) โฒ โฒ ๐๐ ๐๐๐๐ ๐๐๐๐ ๐๐๐๐๐๐ ๐๐
We may consider Eq. A14 as the definition for the matrix. depends on the allele frequencies in the ancestral population ( ) and additive values of the๐๐ alleles ( )๐๐. Equivalently, as discussed above, we can ๐๐ ๐๐ define. by constructing a hypothetical individual i that represents the allele frequencies in the focal ๐๐๐๐๐๐ ๐ฃ๐ฃ๐๐๐๐ population,๐๐ i.e. by randomizing the alleles according to the allele frequencies independently for each allele ๐๐ in each gene. This leads to = 1/2, so the interpretation that is the variance in the additive value in ๐๐๐๐๐๐ such a hypothetical individual is consistent with Eq. A13. ๐๐ ๐๐๐๐๐๐ ๐๐
G in a local population
To compute for a particular population, we apply Eq. A12 with the allele frequencies corresponding to those present in that population. We denote the amount of additive variance present in ๐๐๐๐population by , ๐๐ 1 ๐๐ and the allele frequencies present in the population by = . ( ) 2 , ๐๐ ๐๐๐๐ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐๐๐ โ๐๐โ๐๐ ๐๐ ๐ฅ๐ฅ๐๐๐๐๐๐๐๐ By the above it holds that
4 SI O. Ovaskainen et al.
, = 2 , ( ) ( ) ( ) โฒ , , โฒ โฒ โฒ โฒ ๐๐๏ฟฝ๐๐ ๐๐ ๏ฟฝ ๐๐๐๐๐๐ ๐๐ ๐ข๐ข ๐๐ ๐ข๐ข ๐ข๐ข ๐๐ ๐๐๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐๐๐ข๐ข ๐๐ ๏ฟฝโฒ ๐ฃ๐ฃ ๐ฃ๐ฃ ๏ฟฝ๐ฟ๐ฟ ๐๐ โ ๐๐ ๐๐ ๏ฟฝ ๐๐ ๐ข๐ข ๐ข๐ข = 2 ( ) ( ) ( ) . (Eq. A14) โฒ , โฒ ๏ฟฝ ๏ฟฝ๏ฟฝ ๐๐ ๐๐ ๐๐๐๐ ๐ฃ๐ฃ๐๐๐๐๐๐ ๐ฃ๐ฃ๐๐๐๐ ๐๐ โ ๏ฟฝ ๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐๐ ๐๐๐ข๐ข ๐ฃ๐ฃ๐๐๐๐๐๐ ๐ฃ๐ฃ๐๐๐๐ โฒ๐๐โฒ ๏ฟฝ ๐๐ ๐ข๐ข ๐ข๐ข ๐ข๐ขโฒ The allele frequencies in the population and thus also depend on the realization of the process g. To compute the expectation of , we note that ๐๐ ๐๐๐๐ ๐น๐น ๐๐๐๐ 1 E[ ] = E[ ] = (Eq. A15) ( ) 2 , ๐๐ ๐๐๐๐ ๐๐๐๐๐๐๐๐ ๐๐๐๐ ๐๐ ๐๐ ๏ฟฝ ๐ฅ๐ฅ ๐๐ ๐๐ ๐๐โ๐๐ ๐๐ and
1 E[ ] = E[ ]. (Eq. A16) ( ) ( ) 4 2 , , , ๐๐ ๐๐๐๐ ๐๐ ๐๐๐๐ โฒ ๐๐๐๐๐๐๐๐ ๐๐โฒ๐๐๐๐ โฒ๐ข๐ขโฒ ๐๐ ๐๐ โฒ ๏ฟฝ ๐ฅ๐ฅ ๐ฅ๐ฅ ๐๐๐๐ ๐๐ ๐๐ โ๐๐ ๐๐ ๐๐โฒ By Eqs. A2 and A3,
1 2 2 E[ ( ) ( ) ] = 2 + (1 ) = + 1 (Eq. A17) , ๐ซ๐ซ ๐ซ๐ซ ๐๐ ๐๐๐๐ ๐๐ ๐๐๐๐ ๐๐๐๐โฒ ๐๐๐๐ ๐๐๐๐โฒ ๐๐๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐๐ ๏ฟฝโฒ ๐๐ ๐๐ โ ๐๐ ๐๐ ๐๐ ๐๐ ๏ฟฝ โ ๐๐ ๏ฟฝ ๐๐ ๐๐๐๐ ๐๐ ๐๐ โ๐๐ and for
๐ข๐ข โ ๐ข๐ขโฒ 1 E[ ( ) ( ) ] = 2 (1 ) = (1 ) . (Eq. A18) , ๐ซ๐ซ ๐๐ ๐๐๐๐ ๐๐ ๐๐๐ข๐ขโฒ ๐๐๐๐โฒ ๐๐๐๐ ๐๐๐๐ โฒ ๐๐ ๐๐๐๐ ๐๐๐๐ โฒ ๐๐ ๐๐ ๏ฟฝโฒ โ ๐๐ ๐๐ ๐๐ โ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐๐ โ๐๐ Thus
E[ ( ) ( ) ] = + 1 , (Eq. A19) , ๐ซ๐ซ ๐ซ๐ซ , โฒ ๐๐ ๐๐๐๐ ๐๐ ๐๐๐๐ โฒ ๐๐๐๐๐๐ ๐๐๐๐ โฒ๐๐โฒ ๐๐ ๐๐๐๐ ๐๐๐๐๐๐ ๐๐๐๐๐๐ โฒ ๐๐ ๐๐๐๐ ๐๐ ๐ข๐ข ๐๐๐๐๐๐ ๐๐๐๐ โฒ๐๐โฒ ๏ฟฝ ๐๐ ๐๐ ๐ฃ๐ฃ ๐ฃ๐ฃ ๐๐ ๏ฟฝ ๐๐ ๐ฃ๐ฃ ๐ฃ๐ฃ ๏ฟฝ โ ๐๐ ๏ฟฝ ๏ฟฝโฒ ๐๐ ๐๐ ๐ฃ๐ฃ ๐ฃ๐ฃ ๐ข๐ข ๐ข๐ขโฒ ๐ข๐ข ๐ข๐ข ๐ข๐ข and the expectation of is
๐๐๐๐
E[ ( , )] = 2 2 2 1 โฒ โฒ ๐ซ๐ซ ๐ซ๐ซ , โฒ โฒ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐๐๐๐๐ ๐๐๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐๐๐๐๐ ๐๐๐๐๐๐ โฒ ๐๐ ๐๐๐๐ ๐๐ ๐ข๐ข ๐๐๐๐๐๐ ๐๐๐ข๐ข ๐๐โฒ ๐๐ ๏ฟฝ ๏ฟฝ๏ฟฝ ๐๐ ๐ฃ๐ฃ ๐ฃ๐ฃ โ ๐๐ ๐๐ ๐ฃ๐ฃ ๐ฃ๐ฃ โ ๏ฟฝ โ ๐๐ ๏ฟฝ ๏ฟฝโฒ ๐๐ ๐๐ ๐ฃ๐ฃ ๐ฃ๐ฃ ๏ฟฝ ๐๐ ๐ข๐ข ๐ข๐ข ๐ข๐ข = 2 1 = 1 . (Eq. A20) ๐ซ๐ซ โฒ , โฒ โฒ ๐ซ๐ซ ๐๐ ๐๐ ๐๐๐๐ ๐๐๐๐๐๐ ๐๐๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐ข๐ข ๐๐๐๐๐๐ ๐๐ ๐ข๐ข ๐๐โฒ ๐๐ ๐๐๐๐โฒ ๏ฟฝ โ ๐๐ ๏ฟฝ ๏ฟฝ ๏ฟฝ๏ฟฝ ๐๐ ๐ฃ๐ฃ ๐ฃ๐ฃ โ ๏ฟฝโฒ ๐๐ ๐๐ ๐ฃ๐ฃ ๐ฃ๐ฃ ๏ฟฝ ๏ฟฝ โ ๐๐ ๏ฟฝ ๐บ๐บ ๐๐ ๐ข๐ข ๐ข๐ข ๐ข๐ข We note in passing that can be technically defined also for a "population" consisting of a single individual. Denoting for individual i by , we have E[ ] = (1 ) . Thus, the expected amount of additive ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐๐๐ โ ๐๐๐๐๐๐ ๐๐ O. Ovaskainen et al. 5 SI variation present in a local population represented by a single non-inbred individual ( = 1/2) is half of the additive variation that is present in the ancestral population. ๐๐๐๐๐๐
Additive genetic variation among populations
The additive variance-covariance among the populations is a random variable over the process g, and we next compute its expectation. We have ๐๐ ๐น๐น 1 1 Cov , = Cov[ , ] = 2 = 2 , (Eq. A21) ๐ซ๐ซ ๐ซ๐ซ , , ๐๐ ๐ซ๐ซ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐๐๐ ๏ฟฝ๐๐ ๐๐ ๏ฟฝ ๐๐ ๐๐ ๏ฟฝ ๐๐ ๐๐ ๐๐ ๐๐ ๏ฟฝ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐โ๐๐ ๐๐ โ๐๐1 ๐๐ ๐๐ ๐๐โ๐๐2๐๐ โ๐๐ Cov , = Cov , = , (Eq. A22) ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๏ฟฝ๐๐ ๐๐ ๏ฟฝ ๐ซ๐ซ ๏ฟฝ ๏ฟฝ๐๐ ๐๐ ๏ฟฝ ๐ซ๐ซ ๏ฟฝ ๐๐ ๐๐ 1 ๐๐ ๐๐โ๐ซ๐ซ 2 ๐๐ ๐๐โ๐ซ๐ซ Var = Cov[ , ] = 2 = 2 . (Eq. A23) ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ , ๐ซ๐ซ ๐๐ ๐ซ๐ซ ๐๐ ๐๐ ๐๐๐๐ ๏ฟฝ๐๐ ๏ฟฝ ๐ซ๐ซ ๏ฟฝ ๐๐ ๐๐ ๏ฟฝ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐โ๐ซ๐ซ ๐๐๐ซ๐ซ ๐๐ ๐๐โ๐ซ๐ซ As the expectation of additive value is the same for each individual (and thus for and for ), it holds that ๐ซ๐ซ ๐ซ๐ซ 1 1 ๐๐4๐๐ ๐๐ [ ] = Var 2Cov[ , ] + Var = 2 + 2 ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐ธ๐ธ ๐๐ ๐ซ๐ซ ๏ฟฝ๏ฟฝ ๏ฟฝ๐๐ ๏ฟฝ โ ๐๐ ๐๐ ๏ฟฝ๐๐ ๏ฟฝ๏ฟฝ ๐ซ๐ซ ๏ฟฝ ๏ฟฝ ๐๐ โ ๐ซ๐ซ ๏ฟฝ ๐๐ ๐๐ ๏ฟฝ ๐๐ ๐๐ ๐๐โ๐ซ๐ซ = 2 . (Eq. A24) ๐๐ ๐๐โ๐ซ๐ซ ๐๐ ๐๐โ๐ซ๐ซ ๐ซ๐ซ ๐๐ ๏ฟฝ๐๐ โ ๐๐ ๏ฟฝ๐๐ Covariance structure of breeding experiment data
The four random processes ( g , gโฒ, s, e) described in the main text are disjoint, so that each of them can be applied in isolation or in combination with the others. We denote expectations (and related variances and ๐น๐น ๐น๐น ๐น๐น ๐น๐น covariances) over combinations of these random processes by listing the processes over which the expectations are taking in the subscript, e.g. Cov(s,e)[ ]. When an expectation is taken over some of the random processes while others are kept at a fixed realization, the result is conditional on the realizations kept โ fixed.
We assume that the environmental effects are independent between individuals, and are independent of additive genetic effects. This gives Cov(e) , = E, where E denotes the environmental covariance matrix. The definition in left-hand side is conditional on the realizations of the processes , , , but in this ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ ๐ฟ๐ฟ๐๐๐๐ ๐๐ ๐๐ g gโฒ s case the result in the right-hand side of is independent of these realizations. Similarly, we have Cov , = ๐น๐น ๐น๐น ๐น๐น(g) Cov , = Cov , = 0. (g ) (s) ๏ฟฝ๐๐๐๐ ๐๐๐๐๏ฟฝ
โฒ ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ
6 SI O. Ovaskainen et al.
To proceed, we need to introduce some more notation. We denote co-ancestry coefficients for individual- 1 1 population pairs as = , so that = . We denote by s( ) and d( ) the sire and the ๐พ๐พ๐พ๐พ ๐ซ๐ซ ๐พ๐พ๐พ๐พ ๐๐๐๐ ๐๐ โ๐๐ ๐๐๐๐ ( ( ๐๐โ๐๐ ๐๐๐๐ dam of the individual๐๐ i, and๐๐ ๐๐recallโ that๐๐ ) and๐๐ ๐๐๐๐ ) denote๐๐๐๐ โ the๐๐ local populations from๐๐ which๐๐ the sire and the dam of the individual i originate. ๐๐ ๐๐ ๐ท๐ท ๐๐
As shown by Eq. A13, the covariance among genetic effects for a pair of individuals is given by
Cov(g,g ) , = 2 , where i and j refer to any individuals in the field populations or the laboratory populationโฒ . Thus, utilizing๐๐ the recursive formula for coancestry coefficients, 2 = + , we obtain ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ ๐๐๐๐๐๐ ๐๐ ( ) ( ) for laboratory individuals i and j ๐๐๐๐๐๐ ๐๐๐ ๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐๐
1 1 Cov g,g , = Cov g,g [ , ] + Cov g,g [ , ] 2 ( ) 2 ( ) โฒ ( ) โฒ ( ) โฒ ๏ฟฝ ๏ฟฝ ๐๐ ๐๐ ๏ฟฝ ๏ฟฝ ๐๐ ๐๐โฒ ๏ฟฝ ๏ฟฝ ๐๐ ๐๐โฒ ๏ฟฝ๐๐ ๐๐ ๏ฟฝ 1๐๐ ๐๐ ๏ฟฝ ๐๐ ๐๐1 ๐ท๐ท ๐๐ ๏ฟฝ ๐๐ ๐๐ = ๐๐ ๐๐โฒโ๐๐ ๐๐ 2 + ๐๐ 2 ๐๐โฒโ๐ท๐ท ๐๐ 2 ( ) 2 ( ) ( ) ๐๐ ( ) ๐๐ ๐๐๐๐โฒ ๐๐๐๐โฒ 1๐๐ ๐๐ ๏ฟฝ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ 1๐๐ ๐ฎ๐ฎ ๐๐ ๐๐โฒโ๐๐ ๐๐ ๐๐ ๐๐โฒโ๐ท๐ท ๐๐ = [ ( ) + ( ) ] + [ ( ) + ( ) ] 2 ( ) 2 ( ) ( ) โฒ โฒ ๐๐ ( ) โฒ โฒ ๐๐ ๐ ๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐ ๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๏ฟฝ ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐โฒโ๐๐ ๐๐ ๐๐ ๐๐โฒโ๐ท๐ท ๐๐ = + + + . (Eq. A25) 2๐๐ ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ๐๐ ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๏ฟฝ๐๐๐ ๐ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐๐ ๐๐ ๐๐๐ ๐ ๐๐ ๐ท๐ท ๐๐ ๐๐๐๐ ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ For the field individual iโ and lab-individual j, it holds that
1 1 Cov g,g , = Cov g,g [ , ] + Cov g,g [ , ] 2 ( ) 2 ( ) โฒ ( ) โฒ ( ) โฒ ๏ฟฝ ๏ฟฝ ๐๐โฒ ๐๐ ๏ฟฝ ๏ฟฝ ๐๐โฒ ๐๐โฒ ๏ฟฝ ๏ฟฝ ๐๐โฒ ๐๐โฒ ๏ฟฝ๐๐ ๐๐ ๏ฟฝ 1 ๐๐ ๐๐ ๏ฟฝ ๐๐1 ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ ๐๐ ๐๐ ๐๐ ๐๐โฒโ๐๐ ๐๐ ๐๐ ๐๐โฒโ๐ท๐ท ๐๐ = + = ( ) + ( ) . (Eq. A26) ( ) ( ) ( ) ๐๐ ( ) ๐๐ ๐พ๐พโฒ๐พ๐พ ๐พ๐พโฒ๐พ๐พ ๐๐ ๐๐โฒ๐๐โฒ ๐๐โฒ๐๐โฒ ๐๐ ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐๐ ๐๐ ๏ฟฝ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ ๐๐ ๐๐ ๏ฟฝ๐๐ ๐๐ ๏ฟฝ ๐๐ ๐๐ ๐๐โฒโ๐๐ ๐๐ ๐๐ ๐๐โฒโ๐ท๐ท ๐๐ Thus, we obtain for the laboratory individuals i and j
1 1 Cov g,g , = Cov g,g , + Cov g,g , 2 ( ) 2 ( ) โฒ ( ) โฒ โฒ ( ) โฒ โฒ ๏ฟฝ ๏ฟฝ ๐๐ ๐๐ ๏ฟฝ ๏ฟฝ ๐๐ ๐๐ ๏ฟฝ ๏ฟฝ ๐๐ ๐๐ ๏ฟฝ๐๐ ๐๐ ๏ฟฝ 1๐๐ ๐๐ โฒ๏ฟฝ ๏ฟฝ๐๐ ๐๐ ๏ฟฝ ๐ท๐ท ๐๐ 1โฒ ๏ฟฝ ๏ฟฝ๐๐ ๐๐ ๏ฟฝ ๐๐ ๐๐ โ๐๐ ๐๐ ๐๐ ๐๐ โ๐ท๐ท ๐๐ = ( ) + ( ) + ( ) + ( ) 2 ( ) 2 ( ) ( ) ๐พ๐พโฒ๐พ๐พ ๐พ๐พโฒ๐พ๐พ ๐๐ ( ) ๐พ๐พโฒ๐พ๐พ ๐พ๐พโฒ๐พ๐พ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ 1 ๐๐ ๐๐ โฒ๏ฟฝ ๏ฟฝ๐๐ ๐๐ ๏ฟฝ ๐๐ ๐ท๐ท ๐๐ โฒ ๏ฟฝ ๏ฟฝ๐๐ ๐๐ ๏ฟฝ ๐๐ = ๐๐ ๐๐ โ๐๐ ๐๐+ + + ๐๐ ๐๐ โ๐ท๐ท ๐๐. (Eq. A27) 2 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐๐ ๏ฟฝ๐๐๐๐ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐ท๐ท ๐๐ ๐๐๐ท๐ท ๐๐ ๐๐ ๐๐ ๐๐๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ๐๐ Combining the above, for lab-individuals i and j it further holds that
O. Ovaskainen et al. 7 SI
Cov g,g , = Cov g,g , Cov g,g , โฒ 1 โฒ โฒ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ= ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐+๐๐ ๐๐๐๐ ๏ฟฝ โ +๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐+๐๐ ๏ฟฝ 2 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๐๐ 1 ๐ ๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐ ๐ ๐๐ ๐ท๐ท ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ๐๐ + ๐๐ + ๐๐ + ๐๐ ๏ฟฝ๐๐ (Eq. A28) 2 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐๐ โ ๏ฟฝ๐๐๐๐ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐ท๐ท ๐๐ ๐๐๐ท๐ท ๐๐ ๐๐ ๐๐ ๐๐๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ๐๐ and that
Cov g,g , = Cov g,g , โฒ = Cov โฒ , Cov , Cov , + Cov , ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ ๏ฟฝg,g ๏ฟฝ๏ฟฝ๐๐๐๐ โ ๐๐๐๐ ๐๐๐๐ โ ๐๐g,๐๐g๏ฟฝ g,g g,g 1 โฒ 1 โฒ โฒ โฒ = ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ โ +๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐+๏ฟฝ โ ๏ฟฝ +๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ 2 2 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) 1 ๐๐ ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๐๐ ๐๐๐๐๐๐ ๐๐ โ+๏ฟฝ๐๐๐ ๐ ๐๐ ๐๐ ๐๐ + ๐๐๐๐ ๐๐ ๐๐ ๐๐ + ๐๐๐ ๐ ๐๐ ๐ท๐ท ๐๐ ๐๐๐๐ ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ๐๐ 2 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๐๐ 1 ๐ ๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐ ๐ ๐๐ ๐ท๐ท ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ +โ ๏ฟฝ๐๐ + ๐๐ + ๐๐ +๐๐ ๏ฟฝ๐๐ . (Eq. A29) 2 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐๐ ๏ฟฝ๐๐๐๐ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐ท๐ท ๐๐ ๐๐๐ท๐ท ๐๐ ๐๐ ๐๐ ๐๐๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ๐๐ We next treat the randomness associated to sampling individuals from the field populations to form the parental generation for the laboratory population. As we assume that the individuals used in the breeding design are a random sample from the field populations, we obtain for laboratory individuals i and j 1 1 E( ) ( ) ( ) = E( ) ( ) โฒ = โฒ โฒ = ( ) ( ) (Eq. A30) ( ) ( ) ( ) ๐พ๐พ๐พ๐พ ( ) ( ), ( ) ๐ซ๐ซ ๐ ๐ ๐ ๐ ๐๐ ๐๐ ๐๐ ๐ ๐ ๐ ๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๏ฟฝ๐๐ ๏ฟฝ ๏ฟฝ ๐๐ ๐๐ ๏ฟฝ ๐๐ ๏ฟฝ ๐๐ ๐๐ ๐๐ ๐๐ ๏ฟฝ ๐๐ ๐๐ ๐๐ ๐๐โฒโ๐๐ ๐๐ ๐๐ ๐๐ ๐๐โฒโ๐๐ ๐๐ ๐๐โฒโ๐๐ ๐๐ Similarly,
E( ) ( ) ( ) = ( ) ( ), E( ) ( ) ( ) = ( ) ( ), E( ) ( ) ( ) = ( ) ( ). (Eq. A31) ๐พ๐พ๐พ๐พ ๐ซ๐ซ ๐พ๐พ๐พ๐พ ๐ซ๐ซ ๐พ๐พ๐พ๐พ ๐ซ๐ซ ๐ ๐ ๐ ๐ ๐๐ ๐ท๐ท ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐ ๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐๐ ๐๐ ๐ ๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๐ท๐ท ๐๐ The expected๏ฟฝ๐๐ sire-๏ฟฝdam๐๐ and dam-sire coancestry๏ฟฝ๐๐ coefficients๏ฟฝ ๐๐ are given ๏ฟฝby๐๐ ๏ฟฝ ๐๐
E( ) ( ) ( ) = ( ) ( ), E( ) ( ) ( ) = ( ) ( ). (Eq. A32) ๐ซ๐ซ ๐ซ๐ซ ๐ ๐ ๐ ๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐ ๐ ๐๐ ๐๐ ๐ ๐ ๐๐ ๐ท๐ท ๐๐ ๐๐ ๐๐ In case of sire-sire (or dam๏ฟฝ๐๐ -dam)๏ฟฝ combinations,๐๐ the two๏ฟฝ๐๐ sires (dams)๏ฟฝ ๐๐ may represent the same individual or different individuals. Whether they represent the same or different individuals is part of the study design and thus independent of the sampling process, giving
( ) ( ) ( ) ( ) E( ) ( ) ( ) = (Eq. A33) ๐ซ๐ซ ( ) = ( ), ๐๐๐๐(๐๐)๐๐ ๐๐ ๐ ๐ ๐๐ โ ๐ ๐ ๐๐ and ๐ ๐ ๏ฟฝ๐๐๐ ๐ ๐๐ ๐ ๐ ๐๐ ๏ฟฝ ๏ฟฝ ๐ฎ๐ฎ ๐๐๐๐ ๐๐ ๐ ๐ ๐๐ ๐ ๐ ๐๐ ( ) ( ) ( ) ( ) E( ) ( ) ( ) = (Eq. A34) ๐ซ๐ซ ( ) = ( ). ๐๐๐ท๐ท(๐๐)๐ท๐ท ๐๐ ๐๐ ๐๐ โ ๐๐ ๐๐ ๐ ๐ ๐๐ ๐๐ ๐๐ ๐๐ As ๏ฟฝ๐๐ ๏ฟฝ ๏ฟฝ ๐ฎ๐ฎ ๐๐๐ท๐ท ๐๐ ๐๐ ๐๐ ๐๐ ๐๐
8 SI O. Ovaskainen et al.
1 1 + , = 2 2 ( ) ( ) = (Eq. A35) 1 ๐ ๐ ๐๐ ๐๐ ๐๐ ( ) ( ) + ( ) ( )๐๐+ ( ) ( )๐๐+ ๐๐ ( ) ( ) , , ๐๐๐๐๐๐ ๏ฟฝ4 ๏ฟฝ๐๐๐ ๐ ๐๐ ๐ ๐ ๐๐ ๐๐๐ ๐ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐ ๐ ๐๐ ๐๐๐๐ ๐๐ ๐๐ ๐๐ ๏ฟฝ ๐๐ โ ๐๐ we obtain
1 1 + = 2 2 ( ) ( ) 1 ๐ซ๐ซ โง +๐๐๐๐ ๐๐ ๐ท๐ท ๐๐ + + ๐๐ ๐๐ , ( ) ( ), ( ) ( ) 4 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) โช ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ โช1 ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐ท๐ท ๐๐ E( )[ ] = โช ๏ฟฝ๐๐ ( ) + ( ๐๐) ( ) + ( ๐๐) ( ) + ( ๐๐) ( ) ๏ฟฝ ๐๐ โ ๐๐,๐ ๐ (๐๐) โ = ๐ ๐ (๐๐),๐๐(๐๐) โ ๐๐(๐๐) (Eq. A36) โช4 ๐ฎ๐ฎ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ ๐ ๐๐๐๐ 1 ๐๐ ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๐๐ ๏ฟฝ๐๐ ๐๐+ ๐๐+ ๐๐+ ๏ฟฝ ๐๐ โ ๐๐, ๐ ๐ (๐๐) ๐ ๐ (๐๐), ๐๐(๐๐) =โ ๐๐(๐๐) โจ4 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ฎ๐ฎ โช1 ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ โช ๏ฟฝ๐๐ ( ) + ( ๐๐) ( ) + (๐๐) ( ) + (๐๐) ๏ฟฝ ๐๐ โ ๐๐,๐ ๐ (๐๐) โ = ๐ ๐ (๐๐),๐๐(๐๐) = ๐๐(๐๐) โช4 ๐ฎ๐ฎ ๐ซ๐ซ ๐ซ๐ซ ๐ฎ๐ฎ โช ๐๐ ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ โฉ ๏ฟฝ๐๐ ๐๐ ๐๐ ๐๐ ๏ฟฝ ๐๐ โ ๐๐ ๐ ๐ ๐๐ ๐ ๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ We next are now ready to extend the covariances over (g, g ) to covariances over (g, g , s). Since the sampling โฒ โฒ process s is independent of the processes g and gโฒ, it holds for any random variables and ,
๐น๐น ๐น๐น ๐น๐น ๐ด๐ด ๐ต๐ต
Cov g,g ,s [ , ] = E g,g ,s [ ] E g,g ,s [ ]E g,g ,s [ ] โฒ โฒ โฒ โฒ ๏ฟฝ ๏ฟฝ ๐ด๐ด ๐ต๐ต = E(๏ฟฝs) E ๏ฟฝg,๐ด๐ดg ๐ด๐ด[ โ] ๏ฟฝ E g๏ฟฝ,g ๐ด๐ด[ ๏ฟฝ]E g,g๏ฟฝ ๐ต๐ต[ ] + E(s) E g,g [ ]E g,g [ ] E g,g ,s [ ]E g,g ,s [ ] โฒ โฒ โฒ โฒ โฒ โฒ โฒ = E(s) ๏ฟฝCov๏ฟฝ g๏ฟฝ,g๐ด๐ด๐ด๐ด( ,โ) ๏ฟฝ+ E๏ฟฝ(s๐ด๐ด) E๏ฟฝ g,g ๏ฟฝ [๐ต๐ต]๏ฟฝE g,g [ ๏ฟฝ] ๏ฟฝ E๏ฟฝ g๐ด๐ด,g ,s๏ฟฝ [ ]๏ฟฝE๐ต๐ตg,g๏ฟฝ โ,s [ ๏ฟฝ] (Eq๏ฟฝ ๐ด๐ด. A37๏ฟฝ ) ๏ฟฝ ๐ต๐ต โฒ โฒ โฒ โฒ โฒ ๏ฟฝ ๏ฟฝ ๏ฟฝ ๐ด๐ด ๐ต๐ต ๏ฟฝ ๏ฟฝ ๏ฟฝ ๏ฟฝ ๐ด๐ด ๏ฟฝ ๏ฟฝ ๐ต๐ต ๏ฟฝ โ ๏ฟฝ ๏ฟฝ ๐ด๐ด ๏ฟฝ ๏ฟฝ ๐ต๐ต For any realization of the process s and any laboratory individual i,
๐น๐น
E g,g [ ] = E g,g [ ] = E g,g [ ] = 0, (Eq. A38) โฒ โฒ โฒ ๏ฟฝ ๏ฟฝ ๐๐ ๏ฟฝ ๏ฟฝ ๐๐ ๏ฟฝ ๏ฟฝ ๐๐ and thus, for any laboratory individuals๐๐ i and j, ๐๐ ๐๐
1 Cov , = E Cov , = + + + , (Eq. A39) g,g ,s (s) g,g 2 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) โฒ โฒ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐๐ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ ๏ฟฝ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ๏ฟฝ ๏ฟฝ๐๐๐๐ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐ท๐ท ๐๐ ๐๐๐ท๐ท ๐๐ ๐๐ ๐๐ ๐๐๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ๐๐ Cov g,g ,s , = E(s) Cov g,g , = E(s) ( ) + ( ) โฒ 1 โฒ ๐พ๐พ๐พ๐พ ๐พ๐พ๐พ๐พ ๐๐ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐=๏ฟฝ ๏ฟฝ ๏ฟฝ+ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐+๏ฟฝ๏ฟฝ ๏ฟฝ๐๐+๐๐๐๐ ๐๐ ๐๐๐๐๐๐ ๐๐ ๏ฟฝ๐๐ = Cov , , (Eq. A40) 2 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) g,g ,s ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐ซ๐ซ ๐๐ โฒ ๏ฟฝ๐๐๐๐ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐๐ ๐ท๐ท ๐๐ ๐๐๐ท๐ท ๐๐ ๐๐ ๐๐ ๐๐๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ๐๐ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ Cov g,g ,s , = Cov g,g ,s , = 0, (Eq. A41) โฒ โฒ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ โ ๐๐๐๐ ๐๐๐๐ ๏ฟฝ Cov g,g ,s , = E(s) Cov g,g , = 2E(s) ij , (Eq. A42) โฒ โฒ ๐๐ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ ๏ฟฝ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ๏ฟฝ ๏ฟฝ๐๐ ๏ฟฝ๐๐
O. Ovaskainen et al. 9 SI and thus
Cov g,g ,s , = Cov g,g ,s , = Cov g,g ,s , Cov g,g ,s , โฒ = 2 โฒ (Eq. A43) โฒ โฒ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ โ ๐๐๐๐ ๐๐๐๐ โ ๐๐๐๐ ๏ฟฝ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ โ ๏ฟฝ ๏ฟฝ๏ฟฝ๐๐๐๐ ๐๐๐๐ ๏ฟฝ ๐๐ ๐๐๐๐๐๐ ๐๐ where 1 1 + = 2 4 ( ) ( ) ( ) ( ) 0 ๐ซ๐ซ ๐ซ๐ซ , ( ) ( ), ( ) ( ) โง โ ๏ฟฝ๐๐๐๐ ๐๐ ๐๐ ๐๐ ๐๐๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ ๐๐ ๐๐ 1 โช , ( ) = ( ), ( ) ( ) = โช 4 ( ) ( ) ( ) ๐๐ โ ๐๐ ๐ ๐ ๐๐ โ ๐ ๐ ๐๐ ๐๐ ๐๐ โ ๐๐ ๐๐ (Eq. A44) โช ๐ฎ๐ฎ ๐ซ๐ซ 1 ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๏ฟฝ๐๐ โ ๐๐ ๏ฟฝ ๐๐ โ ๐๐ ,๐ ๐ (๐๐ ) ๐ ๐ (๐๐ ),๐๐ (๐๐ )โ =๐๐ (๐๐ ) ๐๐ 4 ( ) ( ) ( ) โจ1 ๐ฎ๐ฎ ๐ซ๐ซ โช ๏ฟฝ๐๐๐ท๐ท+๐๐ โ ๐๐๐ท๐ท ๐๐ ๐ท๐ท ๐๐ ๏ฟฝ ๐๐ โ ๐๐,๐ ๐ (๐๐) โ = ๐ ๐ (๐๐),๐๐(๐๐) = ๐๐(๐๐) โช4 ( ) ( ) ( ) ( ) ( ) ( ) ๐ฎ๐ฎ ๐ฎ๐ฎ ๐ซ๐ซ ๐ซ๐ซ โช ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐ท๐ท ๐๐ ๐ท๐ท ๐๐ โฉ ๏ฟฝ๐๐ ๐๐ โ ๐๐ โ ๐๐ ๏ฟฝ ๐๐ โ ๐๐ ๐ ๐ ๐๐ ๐ ๐ ๐๐ ๐๐ ๐๐ ๐๐ ๐๐
References
COCKERHAM, C. C., and B. S. WEIR, 1987 Correlations, descent measures - drift with migration and mutation. Proceedings of the National Academy of Sciences of the United States of America 84: 8512-8514.
10 SI O. Ovaskainen et al.
File S2
Parameter estimation
Given neutral molecular data and the breeding experiment data, we fitted a neutral model to the data using Bayesian inference, and then tested for deviation from neutrality using the approach described in the main paper. Here we describe the statistical model, the prior distributions and a Markov Chain Monte Carlo (MCMC) method that was used to parameterize the model.
The statistical model for neutral molecular markers
We model the neutral divergence of the local populations from a common ancestral population as a mixture of independent lineages. The model summarized in this section will be published later in ๐๐๐ซ๐ซ more detail (KARHUNEN and OVASKAINEN 2011). ๐๐โ The allele frequencies in locus in lineage are distributed as
๐๐ ๐๐~Dirichlet .
๐๐๐๐ ๐๐ ๐๐ where denotes the vector of allele frequencies๐๐ in the๏ฟฝ๐๐ ancestral๐๐ ๏ฟฝ population, and measures the amount of drift experienced by lineage . The allele frequencies in locus in local population are ๐๐๐๐ ๐๐๐๐ defined as a mixture of the lineage-specific frequencies, ๐๐ ๐๐ ๐ด๐ด = . ๐ฟ๐ฟ ๐๐=1 ๐ด๐ด ๐ด๐ด ๐๐ ๐๐ ๐๐๐๐ ๐๐ ๏ฟฝ๐๐ ๐ ๐ ๐๐ Here, we assume that the mixture weights sum up to unity over the lineages, =1 = 1, so ๐ฟ๐ฟ that vector is a proper frequency distribution.๐ด๐ด The genotype (in terms of the neutral๐๐ marker๐ด๐ด loci) ๐ ๐ ๐๐ โ๐๐ ๐ ๐ ๐๐ of each individual๐ด๐ด in a population is a multinomial random variable of the allele frequencies, ๐๐๐๐ ~Multinomial 2, . . ๐ด๐ด ๐ด๐ด ๐๐๐๐ ๐๐ ๐ฅ๐ฅThe population-population๏ฟฝ ๐๐ ๏ฟฝ coancestry coefficients depend on the model parameters as
= . ๐ฟ๐ฟ ๐ด๐ด ๐ต๐ต ๐๐=1 + 1 ๐ซ๐ซ ๐ ๐ ๐๐ ๐ ๐ ๐๐ ๐๐๐ด๐ด๐ด๐ด ๏ฟฝ Note that is defined through probability of IBD๐๐ for ๐๐neutral๐๐ loci, and thus, it does not depend on allele frequencies๐ซ๐ซ in the ancestral generation. ๐๐๐ด๐ด๐ด๐ด ๐๐๐๐
O. Ovaskainen et al. 11 SI
The Directed Acyclic Graph (DAG) that describes the dependencies among model variables (including both neutral marker data and quantitative trait data) is shown in Supplementary Figure 1. While the framework develop in the paper and presented in the DAG allows for the inclusion of an arbitrary level of inbreeding ( ), we did not account for inbreeding in the present estimation scheme and thus ๐ฎ๐ฎ assumed = 0.5(1 ๐๐+ ). ๐ฎ๐ฎ ๐๐ ๐ซ๐ซ ๐๐๐๐ ๐๐๐๐ Prior distributions
To parameterize the model with Bayesian inference, prior distributions need to be defined for the primary parameters of the model: , , E , , and . We assume the priors ๐๐ ๐๐ ๐๐ ๐๐ ๐๐~๐๐N(0,100๐ฟ๐ฟ ),
~Wishart๐๐๐๐( ( + 1) 1, + 1), ๐๐ โ ๐๐ 1 ๐๐E~Wishart( ๐๐ ( ๐๐+ 1) , ๐๐+ 1), โ ๐๐ ~Dirichlet๐๐๐๐ ๐๐ , ๐๐ ๐๐ ๐๐ ๐๐log ~N( ๏ฟฝ, ๐ท๐ท2๐๐ )๏ฟฝ,
~๐๐Dirichlet๐๐ ๐๐๐๐(๐๐๐๐ ). ๐ ๐ Here the indices , , and refer to loci, lineages๐ฟ๐ฟ๐ด๐ด and subpopulations,๐ท๐ท๐ด๐ด respectively. In the numerical examples of this paper, we assume the values = , = 1, 2 = 2. We set the number of ๐๐ ๐๐ ๐ด๐ด ๐๐ lineages equal to the number of populations, and assume๐๐ that lineage makes the dominant ๐ท๐ท๐๐ ๐๐๐๐ ๐๐๐๐ ๐๐๐๐ contribution to population , i.e. that the matrix is diagonally dominant. To do so, we let ๐ด๐ด ๐ด๐ด ๐ฟ๐ฟ 0.2 = 0.8, and = for , 1 ๐ ๐ ๐ ๐ ๐ฝ๐ฝ๐ด๐ด๐ด๐ด ๐ฝ๐ฝ๐ด๐ด๐ด๐ด ๐๐ โ ๐ด๐ด and truncate the prior by the requirement that >๐๐๐ฟ๐ฟ โ for all . The latter specification ensures that label switching is not possible, which๐ ๐ property๐ ๐ improves the mixing of Markov Chain ๐ฝ๐ฝ๐ด๐ด๐ด๐ด ๐ฝ๐ฝ๐ด๐ด๐ด๐ด ๐๐ โ ๐ด๐ด Monter Carlo (MCMC) algorithm (GELMAN et al. 2004).
The number of alleles in neutral marker locus in the ancestral generation is generally unknown, as some alleles may have disappeared after the lineages have diverged or are not present in the sampled ๐๐๐๐ ๐๐ individuals. Due to mathematical properties of the Dirichlet distribution, we may pool all such unobserved alleles into one โmeta-alleleโ. Thus, we define as the number of distinct alleles observed in locus plus one. ๐๐๐๐ ๐๐ 12 SI O. Ovaskainen et al.
Parameter estimation through MCMC
Each model parameter was sampled from its full conditional while keeping the other parameters fixed. Most parameters were updated using a random-walk Metropolis-Hastings (MH) algorithm, the proposal distributions given below. For these the variances of the proposal distributions were adjusted during the burn-in following Ovaskainen et al. (2008) to give an accept ratio of 0.44 for a single parameter or 0.22 for a vector of parameters.
โข Sampling the additive genetic variance-covariance matrix was conducted using the MH algorithm. We used the proposal distribution Wishart( /๐๐, ), where the parameter was ๐๐ adjusted during the burn-in. ๐๐ ๐๐ ฮฝ ฮฝ ฮฝ
โข Sampling the residual variance-covariance matrix E was conducted using the MH algorithm. We used the proposal distribution Wishart( E / , ), where the parameter was adjusted during the ๐๐ burn-in. ๐๐ ฮฝ ฮฝ ฮฝ
โข Sampling the overall mean and the population specific means . Conditional on the other parameters, the full conditional joint distribution of the parameters๐ซ๐ซ ( , ) follows a multivariate ๐๐ ๐๐ normal distribution. These parameters were thus updated directly (not using๐ซ๐ซ the MH algorithm) ๐๐ ๐๐ by drawing a random deviate from the full conditional.
โข Sampling drift parameters . We used N( , 2 ) distributions separately for each to draw 2 proposals for log . The variance parameters ๐๐ were adjusted during the burn-in. ๐๐ ๐๐๐๐ ๐ฟ๐ฟ๐๐ ๐๐ ๐๐ ๐๐๐๐ ๐ฟ๐ฟ๐๐ โข Sampling lineage loadings . We used truncated Dirichlet( ) distributions separately for
each and to draw proposals for . The โs are proposal๐ด๐ด parameters that were adjusted ๐ฟ๐ฟ ๐ฟ๐ฟ๐ฟ๐ฟ ๐ฟ๐ฟ๐ด๐ด during the burn-in. ๐ด๐ด ๐ด๐ด ๐๐ ๐ฟ๐ฟ๐ด๐ด ๐ฟ๐ฟ๐ฟ๐ฟ
โข Sampling ancestral allele frequencies . We used truncated Dirichlet( ) distributions
separately for each to draw proposals for . The โs are proposal parameters๐๐ that were ๐๐ ๐ฟ๐ฟ๐๐ ๐๐๐๐ adjusted during the burn-in. ๐๐ ๐๐ ๐๐๐๐ ๐ฟ๐ฟ๐๐
โข Sampling allele frequencies . We used Dirichlet( ) distributions separately for each
lineage and locus . The โs parameters were adjusted๐๐๐๐ during the burn-in. ๐๐ ๐ฟ๐ฟ๐๐ ๐๐๐๐๐๐ ๐๐๐๐ ๐๐ ๐๐ ๐ฟ๐ฟ๐๐
O. Ovaskainen et al. 13 SI
We implemented the algorithm in Mathematica 7.1 (WOLFRAM RESEARCH 2008), and used 1,000 iterations for the burn-in and 2,000 for sampling the posterior distribution. We note that these sample sizes are rather small, and they were chosen due to logistic constrains (the estimation was conducted for 3,600 data sets in total). We tested for the sufficiency for mixing by examining the chains for convergence and for running a sample of the datasets for 10,000 + 20,000 iterations. Both tests (as well as the Results, which separated well the cases with and without selection, see main text) indicated that the length of the MCMC chain was sufficient for the present purpose. For the case of real data analysis, we naturally recommend a much longer MCMC chain.
Supplementary Figure 1. Directed acyclic graph (DAG) for the Bayesian model. The quantities in the ellipses are the parameter values being estimated. In the single boxes are the observations (the neutral data and the quantitative trait data ) and in the double boxes are the fixed quantities. The full arrows give stochastic relationships (i.e. those that have probability distributions associated with ๐ฟ๐ฟ ๐๐ them) and the dashed arrows are for deterministic relationships.
14 SI O. Ovaskainen et al.
References ALBERT, A. Y. K., S. SAWAYA, T. H. VINES, A. K. KNECHT, C. T. MILLER et al., 2008 The genetics of adaptive shape shift in stickleback: Pleiotropy and effect size. Evolution 62: 76-85. GELMAN, A., J. B. CARLIN, H. S. STERN and D. B. RUBIN, 2004 Bayesian Data Analysis. Chapman & Hall/CRC, Boca Raton. KARHUNEN, M., and O. OVASKAINEN, 2011 Defining and estimating FST through coancestry. Manuscript. WOLFRAM RESEARCH, I., 2008 Mathematica Edition: Version 7.1. Wolfram Research, Inc., Champaign, Illinois.
O. Ovaskainen et al. 15 SI
File S3
Generation of individual-based data
We assumed that in each generation of each population the breeding population consists of 20 individuals. We assumed a hermaphrodite species and randomized the two parents for each propagule independently of each other (selfing allowed). To model gene flow among the populations, we assumed that the parent is an immigrant from one of the other populations with probability =0.01.
๐๐ We assume a set of 32 loci determining the genotypic values among these traits, with 5 allelic variants in each locus, each having an equal frequency in the ancestral generation. We randomized the additive values of each allele in each locus from a two-dimensional multivariate normal distribution with mean zero and such a variance-covariance matrix (see Appendix A) that the expected amount of additive genetic variation in the ancestral generation was 1.0 0.9 E[ ] = . 0.9 1.0 ๐๐ ๐๐ ๏ฟฝ ๏ฟฝ The phenotypic value of individual was modeled as the sum of the additive component and the environmental effect, i.e. = + , where the environmental effect was randomized from the ๐๐ two dimensional multivariate normal distribution with mean zero and variance-covariance matrix ๐ณ๐ณ๐๐ ๐๐๐๐ ๐๐๐๐ ๐๐๐๐
0.5 0 = . 0 0.5 ๐๐๐ธ๐ธ ๏ฟฝ ๏ฟฝ If modeling traits under selection, we let the populations become adapted locally to their environments by assuming that the environment experienced by a population favored an optimal phenotype . We assumed that the values are distributed (i.i.d. among the populations) multinormally with mean zero ๐๐ ๐๐๐๐ and variance covariance๐๐ matrix ๐๐ 5 4.5 = , 4.5 5 โ ๐๐๐๐ ๏ฟฝ ๏ฟฝ and we thus have assumed that the expected directionโ of population divergence through the selective gradient is not proportional to . ๐๐ ๐๐ We assume that the populations undergo weak selection, so that the initial number of propagules produced is high (ten times the actual population size), and selection operates through competition influencing survival. We define the fitness of an individual with phenotype in population as
๐ณ๐ณ ๐๐ ( ) = exp( ( ) 1( )) โ ๐๐๐๐ ๐ณ๐ณ โ ๐ณ๐ณ โ ๐๐๐๐ ๐๐๐น๐น ๐ณ๐ณ โ ๐๐๐๐
16 SI O. Ovaskainen et al. where the variance-covariance matrix
0.5 0 = 0 0.5 ๐๐๐น๐น ๏ฟฝ ๏ฟฝ describes the shape of the fitness landscape around the local optimum. We selected the 20 individuals that survived to form next generation among 200 propagules by sampling each individual with a probability that was proportional to its fitness value. In the case of no selection, the weights were set equal for all individuals.
To generate phenotypic data that allows for the estimation of genetic variability among and between populations, we assumed a breeding design in which five pairs of individuals were randomly sampled from each of the eight populations. We assumed a full-sib design with five offspring for each of the pairs, so that the total number of offspring was 8 x 5 x 5 = 200. These individuals were assigned environmental effects as for the wild individuals, and their phenotypes were then recorded.
For neutral molecular markers, we also assumed a set of 32 loci with 5 alleles in each locus with equal frequencies in the ancestral population. We randomized the flow of the neutral alleles from the ancestral to the present generation, sampled ten individuals from each population, and genotyped these for the neutral molecular markers.
O. Ovaskainen et al. 17 SI
File S4
Implementation of the method of Martin et al. (2008)
We analyzed the simulated data sets with Martin et al.โs method (MARTIN et al. 2008b). This method involves two separate tests: Comparing the estimates of the intra-population G matrix and the covariance matrix of population means for proportionality, and secondly, testing the proportionality coefficient of these matrices against the FST of molecular markers. The first test should see selection that has a magnitude comparable to random drift, but a direction incompatible with . The second test should see selection that has a magnitude incompatible with random drift, whether balancing๐๐ or ๐๐ disruptive. Both tests are based on the assumption that
2FST E[ ] = E[ ] 1 FST ๐๐ ๐๐๐๐ under neutrality. We performed these tests followingโ the procedure described by the authors (CHAPUIS et al. 2008; MARTIN et al. 2008b). Thus, we estimated the G matrices using Rโs MANOVA with three covariance components: inter-population, inter-family and inter-individual. The inter-population covariance matrix was used as an estimate of E[ ], whereas the inter-population covariance matrix was multiplied by two to yield an estimate of E[ ], which is appropriate for full-sib study design (LYNCH ๐๐ and WALSH 1998). Because the phenotypic data was known to be multivariate normal, it was not Cox ๐๐๐๐ transformed, but we followed the recommendation of standardization (CHAPUIS et al. 2008).
After this, the proportionality of E[ ] and E[ ] was tested, and a 95 % confidence interval was derived for the proportionality coefficient using code provided by MARTIN et al. (2008a). The Bartlett p ๐๐ ๐๐๐๐ value of the proportionality test was recorded. To perform the second test of MARTIN et al. (2008b), a 95 % confidence interval was estimated for the FST using bootstrap over loci and the Weir-Cockerham estimator (WEIR and COCKERHAM 1984) implemented in Fstat (GOUDET 1995). The confidence interval 2FST of FST was transformed into the confidence interval of . This was compared with the confidence 1 FST interval of the proportionality coefficient obtained from โthe phenotypic data. If these intervals did not overlap, this was taken as a signal of selection.
References
CHAPUIS, E., G. MARTIN and J. GOUDET, 2008 Effects of Selection and Drift on G Matrix Evolution in a Heterogeneous Environment: A Multivariate Q(st)-F-st Test With the Freshwater Snail Galba truncatula. Genetics 180: 2151-2161. GOUDET, J., 1995 FSTAT (Version 1.2): A computer program to calculate F-statistics. Journal of Heredity 86: 485- 486.
18 SI O. Ovaskainen et al.
LYNCH, M., and B. WALSH, 1998 Genetics and analysis of quantitative traits. Sinauer Associates Incorporated, New York. MARTIN, G., E. CHAPUIS and J. GOUDET, 2008a http://www.isem.cnrs.fr/spip.php?article934, pp. Institut des sciences de l'evolution montpellier, Montpellier. MARTIN, G., E. CHAPUIS and J. GOUDET, 2008b Multivariate Q(st)-F-st Comparisons: A Neutrality Test for the Evolution of the G Matrix in Structured Populations. Genetics 180: 2135-2149. WEIR, B. S., and C. C. COCKERHAM, 1984 Estimating F-Statistics for the Analysis of Population-Structure. Evolution 38: 1358-1370.
O. Ovaskainen et al. 19 SI