LINKAGE DISEQUILIBRIUM in SUBDIVIDED POPULATIONS' I N A
Total Page:16
File Type:pdf, Size:1020Kb
LINKAGE DISEQUILIBRIUM IN SUBDIVIDED POPULATIONS’ MASATOSHI NE1 AND WEN-HSIUNG LI Center for Demographic and Population Genetics, University of Texas, Houston, Texas 77025, and Department of Medical Genetics, University of Wisconsin, Madison, Wisconsin 53706 Manuscript received October 12, 1972 ABSTRACT The linkage disequilibrium in a subdivided populaton is shown to be equal to the sum of the average linkage disequilibrium for all subpopulations and the covariance between gene frequencies of the loci concerned. Thus, in a subdivided population the linkage disequilibrium may not be 0 even if the link- age disequilibrium in each subpopulation is 0. If a population is divided into two subpopulations between which migration occurs, the asymptotic rate of approach to linkage equilibrium is equal to either r or 2(m, + m,) - (m, + m,)2, whichever is smaller, where r is the recombination value and m, and m2 are the proportions of immigrants in subpopulations 1 and 2, respectively. Thus, if migration rate is high compared with recombination value, the change of linkage disequilibrium in subdivided populations is similar to that of a single random mating population. On the other hand, if migration rate is low, the approach to lnkage equilibrium may be retarded in subdivided populations. If isolated populations begin to exchange genes by migration, linkage disequi- librium may increase temporarily even for neutral loci. If overdominant selec- tion operates and the equilibrium gene frequencies are different in the two subpopulations, a permanent linkage disequilibrium may be produced without epistasis in each subpopulation. INa single random mating population, linkage disequilibrium may be produced by two factors, namely, epistatic interaction in fitness between the loci con- cerned and genetic random drift due to finite population size. The first factor has been studied by WRIGHT(1952), KIMURA (1956), LEWONTINand KOJIMA (1960) and others, while the second factor by HILLand ROBERTSON( 1968), SVED (1968) , and OHTA and KIMURA(1968a,b). However, most natural populations are not a single random mating unit but subdivided into local populations. In the Iollowing we shall show that subdivision of a population may also create linkage disequilibrium. In the case of two subpopulations, this problem was recently studied by SINNOCKand SING (1972). Although the statistical description of linkage disequilibrium in subdivided populations is simple and holds true for any pattern of migration and selection, it does not give any information on the dynamics of linkage disequilibrium in populations. In order to have a rough idea about the effect of migration on linkage disequilibrium we shall study a simple mathematical model. It will be shown that under certain circumstances migration creates linkage disequilibrium with- Parr of this work was done while the authors were affiliated with Brown University, Providence, A. I. Supported by Public Health Service grants GM-17719 and GM-20293 and National Sclence Foundation grant GB-21224 and GP-20868. Genetics 75: 213-219 September, 19i3 214 M. NE1 AND W.-H. LI out epistatic selection. Our method will be deterministic rather than stochastic: a general stochastic treatment seems to be quite complicated. Linkage Disequilibrium in a Subdivided Population Consider a population of size N which is divided into k subpopulations. Let N, k be the size of the i-th subpopulation with N, = N. We consider two loci A a=1z and B at which pairs of alleles A.a and B:b are segregating respectively. Let the frequencies of the four types of gametes AB, Ab, aB and ab in the i-th subpopu- lation be P,, Q1, R, and S, (P,f Q,+ R,f S, = 1 ) respectively. Then the linkage disequilibrium in the i-th population is given by D, = PJ, - Q,R, (1) The average value of linkage disequilibrium for all subpopulations is k b = E(D) = I: WLD,, (2) r=l where wa= N,/N is the proportion of the i-th subpopulation. In terms of gamete frequencies, D may be expressed as D E(PS - QR) =FS - QR i- COV(P,S)- COV(Q,R) , (3) where Cov denotes the covariance. The frequencies of alleles A and B are given by pA =P -I- Q and pB = P + R, respectively. Thus, the covariance between pA and pBis cov (pA,pB) = cov (PfQ, P+R) COV(P,1-Q-S) f COV(Q,P+R) = -COV (P,S) + COV(Q,R) . (4) Therefore, if we denote the linkage disequilibrium in the total population by D,i.e., DT = pg- QR, then we obtain the following simple formula: DT = fi f cov (pA,pB). (5) Namely, the linkage disequilibrium in the total population is the sum of the average linkage disequilibrium for subpopulations and the covariance of gene frequencies of the two segregating loci. If the number of subpopulations is 2, then formula (5) reduces to DT= WDi f WzD?f WiWz PA^ - PA^) (PB~- PB,) 7 in which pA, and pB are the frequencies of allele A and allele B in the i-th sub- population, respectively. This is identical with SINNOCKand SING’S (1972) formula, though they inadvertently put a negative sign in front of the last term. Note also that formula (5) can be easily extended to the case of multiple alleles if linkage disequilibrium is defined as D,,= PALR,- pa pB,, in which PA’ and PB/ LINKAGE DISEQUILIBRIUM 215 are the frequencies of the i-th allele at the A locus and the j-th allele at the B locus, respectively, and PaiBiis the frequency of the gamete type AiBj. Formula (5) indicates that if Cov(pA,ps) # 0, Dr may be 0 even if D is not 0, or conversely DT may not be 0 even if D is 0. Migration Between Two Populations In a single random mating population with no selection, the linkage disequi- librium is reduced in every generation at a rate of r, where r is the recombination value between the two loci under consideration. Let us now study how this property is affected if the population is not a single random mating unit. For simplicity, we shall consider the simplest case where migration occurs between two populations. Let N, and N, be the sizes of populations 1 and 2, respectively. We assume that N, and Nzare fairly large and remain constant for all genera- tions. If the two populations exchange M individuals per generation, the pro- portions of migration in populations 1 and 2 are given by m, = M/N, and m, = M/N,, respectively. Although mathematically m, and m, can take any value between 0 and 1, we confine our discussion to the case of (m,f m2)/25 0.5, since in nature the mean of m, and m, are unlikely to be larger than 0.5. For example, m, = m, = 0.5 corresponds to the case of random mating in the whole population. Let P1,Q1,R1,and S, be the frequencies of the four gamete types AB, Ab, aB, and ab in a generation in population 1, respectively. We assume that migration occurs after mating. Then, after migration they become (1 - ml)P1 f mlP2, (1 - m,) Q1 + mlQ2, (1 - m,) R, + mlR2 and (1 - ml)S1 + mlSz, respectively, where P,, Q,, R,, and S, are the gamete frequencies in population 2. The gamete frequencies in the next generation, after meiosis, are given by Pi = (1 - m,) (P, - rDll) + ml(Pz - rDZz) Q:= (1 - m,) (Q1+ rDll) + ml(Qz+ rD2d (6) R; = (1 - m,) (R, + rDI1)+ m, (R, + rDzz) S’,= (1 - m,) (S, - rOl1) f m,(S, - rDzz) where Dll = PIS, - Q,R,, D,,= P,S2 - Q,R,, and r is the recombination value between the A and B loci. Similar recurrence equations may be obtained also for the gamete frequencies in population 2. If random mating occurs in each pop- lation, the linkage disequilibria for populations 1 and 2 in the next generation are given by D;, = P<S:- Q:Rf1and D:, = Pis’, - QZR‘,, respectively. Therefore, we obtain the following recurrence equations for linkage disequilibria: D;,= (1 - m,) (1 - m, - r)Dl, + m,(l - m1)D1,3- m,(ml - r)Dzz D:,= [ (1 - m,) (2mz- r) - m,r]Dll + (1 - m, - mz + 2m1m2)D,, + [(l - m,) (2m1- r> - mlrla, D’,,=m, (m,- I) D,, + m, (1 - m,) D,, + (1 - m,) (1 - m, - r)D,, 216 M. NE1 AND W.-H. LI where In matrix notation the above equation may be written as D(t -I- 1) = MD(t) (7) where D (t) is the column vector of D,,(t) , D,,(t) , and D,, (t), in which D,j (t) denotes the value of Dij in the t-th generation, and (1 -ml)(l--ml-r) m,(l - m,) m, (m,- r) M = (1 - m,) (2m, - r) - m,r 1 - m, - m, + 2qm, (1 - m,) (2m, - r) - mlr iIm, (m, - r) m,(l- m2) (1 - m,) (1 - m, - r) 1 The eigenvalues for M are A, = 1 - I, h, = (1 - ml - m,)’, and h3 = (1 - I) (1 - m, - m,), and the corresponding eigenvectors are VI= I]v,= E] v3= [mzrj in which a = m,[ (m,+ 2mz - m1mz - m: - 1)I + ml(2 - 3mi - 3m2) + mi (mi+ m,) ‘1 b = [ (I - mlr - 2m, + 2m1m,+ m:)a + m, (I - ml>c]/[ml (1 - ml)1 c =m2[(em, + m, - m,ma- m:- 1)r + m,(2 - 3ml - 3m2) + m,(m1+ m,)’]. Using the standard method of matrix algebra, it can be shown that the general formulas for D,,(t) , D,,(1) and D,,(t) are D,,(t) = llh; + a1,h; + mlI,h~ Dlz(t) = 21,Xt, + bI,ht, + (ml - mz)13Xt (8) D,,(t) = llhi i-c1,h: -I- ml13hi where I, = C [(c - b)Dll(0) + aD,,(O) - aD,,(O) - ml “z { (2c - b)D,,(O) + (a- C)D,,(O) - (2a - b)D,,(O))l I, = c [Dll(O> - DlZ(0) +D22(0)1 I, = c [ (2c - b)D,,(O) + (a - C)D12(0) - (ea - b)D22(0)I/(m1+ m,> in which C = (a- b + c)-l.