Downloaded by guest on September 24, 2021 ftelvl fsufln ihndfeetcrmsms(50–52), chromosomes different same within shuffling the made of been by levels have produced the comparisons gametes Finally, of 45–49). across 39, and (37, 44] individual ref. in reviewed www.pnas.org/cgi/doi/10.1073/pnas.1817482116 and flies (37–42), in [e.g., humans sex (35–37), same the mice of differ- (30–34), individuals exist also across can shuffling There in (26). informative ences shuffling of be value concomitantly adaptive 27), could the differ- which of 26, these 26–29), explain (14, 14, (13, to shuffling ences theories in evolutionary has several differences the literature prompting testing male/female large to A studied 17). 23–25) distinguishing (15, also 15, domestication from (1, of sex effects ranging (reviewed genomic of implications species advantages across with evolutionary the out 22–24), carried refs. been in have shuffling interest. considerable wide empirical of much of quantity occurs subject the a that genetic been therefore, has shuffling selfish and is, of importance of production amount invasion gamete harmful total in the The from (11–14). par- and complexes evolving separately 10) been rapidly (9, act further from asites has populations to shuffling protecting selection in Genetic implicated (3–8). natural efficiency mutations long-term allowing distinct the on by improves adaptation it envi- addition, of changing or In novel (2). to and adaptation ronments gametes aids among thus diversity and genetic (1), offspring to leads It populations. ual T recombination independent crossovers. is to of entirely which that almost of (ii due (i contribution that is and the that 1/2 shuffling assortment, reveals of of value level female possible high maximum and this its male to analysis close human our is from of humans data Application such sequencing. DNA to either by positions or crossover advantage determining cytologically for selective methods of a emergent Utilization ful suggests interference. which crossover spaced, of evenly more that fact are the implica- include metric Intrinsic this assortment. of independent tions number from crossover and be position from can and measure contributions This separate production. into gamete decomposed during shuffled are loci a Veller Carl law and second positions Mendel’s crossover account into shuffling takes genetic that genome-wide of measure rigorous A osntsfe rmteelmttos edfieteparame- the define We that limitations. shuffling these genome-wide from ter of suffer assort- measure not rigorous independent does Here, a law). from present second (Mendel’s shuffling we chromosomes homologous consider traditional of Moreover, to ment tip. the criti- fail more toward also causes measures crossover chromosome are a a of chromosome than middle shuffling a the toward along crossover a because crossovers cal: inadequate, of qualitatively positions is the This of meiosis. number average per the crossovers counting measures simply Existing by quantity. this crossing-over occurs by of consider measure hampered that direct been a shuffling of have absence on genetic studies the Such of critically production. amount gamete rely Singh) total during Nadia genetics and the Przeworski evolutionary of Molly by evaluation reviewed in 2018; 11, studies October review Comparative for (sent 2018 27, November Kleckner, Nancy by Contributed 02138 MA Cambridge, University, Harvard 02138; MA Cambridge, eateto raimcadEouinr ilg,HradUiest,Cmrde A02138; MA Cambridge, University, Harvard Biology, Evolutionary and Organismic of Department orsodnl,cmaaiesuis(,1–1 fgenome- of 15–21) (1, studies comparative Correspondingly, r uiggmt rdcini nipratpoesi sex- in process genes important inherited an paternally is and production gamete maternally during of shuffling he stepoaiiyta h lee ttornol chosen randomly two at alleles the that probability the as a,b | ac Kleckner Nancy , crossing-over c eateto oeua n ellrBooy avr nvriy abig,M 23;and 02138; MA Cambridge, University, Harvard Biology, Cellular and Molecular of Department | sex | c,1 meiosis n atnA Nowak A. Martin and , r slre hncrossovers when larger is 0tmsgetrthan greater times ∼30 r sealdb power- by enabled is Arabidopsis ) a,b,d (43); r in ) aeit con hfigcue yI.Afut esr of measure fourth A measures IA. above by supposed caused the this shuffling beyond of account shuffling None into to take minimum. contribute num- required the that structurally is COs frequency” to CO of chromosomes “excess each ber its the Since for (55), CO (15). properly one number segregate least COs at chromosome of requires haploid number usually the the bivalent distance is (54). of map gametes used excess of sometimes the 1% in is being in that shuffled cM measure are that Another 1 loci CO either cM, linked average measured two 50 is between length as by of Map multiplied prophase number data. frequency the sequence meiotic from simply during or is cytologically occur frequency CO that only fre- length. COs CO considered map specifically, have more or and shuffling quency crossing-over of of contribution measures the used widely most respectively. shuffling, chromo- chromosomal” homologous of (IA), assortment somes crossovers independent by by both and caused (COs) is Shuffling shuffling. genome-wide of sex new as used be 53). to (26, or chromosomes (13) complexes to genetic susceptible selfish most harboring are chromosomes which for implications with 1073/pnas.1817482116/-/DCSupplemental. y at online information supporting contains article This 1 aadpsto:MTA oefralcluain saalbeo iHb( GitHub on available is calculations all for code MATLAB deposition: Data the under Published interest.y of conflict no declare authors Oregon.y The of University N.S., paper.y and and the University; wrote N.K., Columbia M.A.N. C.V., M.P., and Reviewers: research; N.K., C.V., performed and C.V. data; analyzed research; M.A.N. designed C.V. contributions: Author github.com/cveller/rbar † * owo orsodnesol eadesd mi:[email protected]. y Email: addressed. be should correspondence whom To h nmiuu em“eei hfig o ut“hfig)throughout. “shuffling”) use just we (or confusion, shuffling” avoid “genetic To biology. term molecular unambiguous molecules—in the DNA break- of meaning—the rejoining different and a has age recombination but literature, genetics population below. hfigcue yete rsigoe rI srfre oa rcmiain nthe in “recombination” as to referred is IA or crossing-over either by caused discussed Shuffling as contribution negligible a makes but shuffling causes also conversion Gene ihvlei agl u oidpnetassortment. this independent that to and due one-half largely of is value value high possible maximum its to close htarnol hsnpi flc hfe hi lee na in alleles their shuffles loci of Measuring pair gamete. chosen randomly shuf- features: a these total that account of into measure takes a that the develop fling importantly, We crossovers. and, of assortment positions independent take to account failing crossovers, into of shuffling number genetic average of the count amount simply total chromosomes. the homologous of measures over of Traditional crossing assortment both of by independent shuffling gametes the and in DNA is paternal organisms and sexual maternal in process important An Significance uniaiecmaioslk hs eur rprmeasure proper a require these like comparisons Quantitative ∗ hc opie“nrcrmsml n “inter- and “intrachromosomal” comprise which NSlicense.y PNAS b rga o vltoayDnmc,HradUniversity, Harvard Dynamics, Evolutionary for Program ). y r nhmn,w n htttlsufln is shuffling total that find we humans, in d eateto Mathematics, of Department † www.pnas.org/lookup/suppl/doi:10. NSLts Articles Latest PNAS npeiu tde,the studies, previous In r h probability the , | f10 of 1 https://
EVOLUTION aggregate shuffling, which does take into account IA, is Dar- incorporates CO position in the measurement of genome-wide lington’s “recombination index” (RI) (1, 56, 57), defined as the shuffling has been developed and implemented. sum of the haploid chromosome number and CO frequency. The Here, we present a simple intuitive measure of the genome- rationale for this measure derives from the fact that, given no wide level of shuffling. We define r¯ as the probability that a chromatid interference in meiosis (54, 58), two linked loci sep- randomly chosen pair of loci shuffles their alleles in meiosis, arated by one or more COs shuffle their alleles with probability taking into account CO number and position as well as the con- 1/2 in the formation of a gamete, as if the loci were on sepa- tribution of IA. We have chosen the name “r¯” to echo classic rate chromosomes. The RI is, therefore, the average number of population genetics terminology, where the parameter r for a “freely recombining” segments per meiosis. given pair of loci is the probability that they shuffle their alleles Importantly, none of these existing measures take into account in a gamete. Our parameter r¯ is simply this quantity averaged the specific positions of COs on the chromosomes. Intuitively, across all locus pairs. however, CO position is a critical parameter. For example, a CO r¯ is the most direct metric of the primary effect of genetic at the far tip of a pair of homologous chromosomes does little shuffling, the diversification of gametes, and thus the transmis- work in shuffling the genetic material of those chromosomes, sion of new genetic combinations to offspring (1). r¯ should be while a CO in the middle causes much shuffling (Fig. 1). Addi- used when such diversification is the quantity of interest: for tionally, two COs close together may cancel each other’s effect example, in judging the ability of a population to respond to a (except for the few loci that are between them), and thus result sudden change (or continual fluctuations) in its environment (2). in less allelic shuffling than two COs spaced farther apart. Genetic shuffling also has indirect effects, such as the gradual Therefore, existing measures do not actually define the total reduction of allelic correlations (linkage disequilibrium) within genome-wide amount of shuffling, instead serving only as prox- populations over time (6, 7) and the prevention of invasion of ies for this critical parameter. This is not a trivial concern. There selfish genetic complexes (11). Here, CO positions and IA again is significant heterogeneity in the chromosomal positioning of matter, so standard metrics that simply count COs remain inade- COs at all levels of comparison—between species (16, 59, 60), quate. Therefore, r¯ is likely a better measure in these cases as populations within species (61–65), the sexes (14, 26, 66–69), well, although we suggest (Discussion) modifications of r¯ that individuals (42, 70–73), and different chromosomes (50, 69, 74, more directly measure these quantities of interest. 75), all of which will impact the level of shuffling that these COs In this work, we define the quantity r¯, develop it mathemat- cause. Moreover, there can be major differences in chromosome ically and statistically, and document its intrinsic implications. number and size across species (1, 76, 77), which will seriously We show how it can be decomposed into separate components influence the total amount of shuffling due to IA. deriving from COs and from IA. We then discuss the approaches The fact that the positions of COs matter for the total amount now available to allow chromosome-specific and/or genome-wide of shuffling has been recognized for many years (1, 66, 76, 78–80). measurement of CO positions. With these developments in hand, The need for a measure of total shuffling that accounts for CO we present an application of r¯ to quantitative evaluation of positions has also previously been recognized. Indeed, Burt et al. shuffling in human males and females. (27) give an explicit formula, identical to Eq. 3 below, for “the proportion of the genome which recombines” (in the population Derivation of ¯r genetics sense†). Colombo (81) also gives an explicit character- r¯ is the probability that the alleles at two randomly chosen loci ization of what we call r¯ phrased as a generalization of the RI. are shuffled in the production of a gamete. It can be calculated Finally, Haag et al. (82), after noting that terminal COs cause with knowledge of CO positions on each chromosome in con- little shuffling and that map length is, therefore, an imperfect junction with knowledge of the fraction of the total genomic measure of genome-wide shuffling, suggest a better measure to length (in base pairs) accounted for by each chromosome (chro- be “the average likelihood that a CO occurs between two ran- mosome lengths influence shuffling caused by both IA and COs). domly chosen genes.” However, no mathematical expression that In what follows, we always assume that there are sufficiently many loci so that the difference between sampling them with or without replacement is negligible.
AB Formulas for r. In the ideal situation, CO positions are defined for individual gametes, and the parental origins of the chromo- some segments delimited by these COs are known. In this case, the proportion p of the gamete’s genome that is paternal can be determined, and the probability that the alleles of a randomly chosen pair of loci were shuffled during formation of the gamete (equivalently, the proportion of locus pairs at which the alle- les are shuffled), either by crossing-over or by IA, is simply the product of the probabilities that one locus is of paternal origin (probability p) and the other is of maternal origin (probability 1 − p) in each of two possible combinations:
r¯ = 2p(1 − p). [1]
Such data can emerge directly from sequencing of single gam- etes (Fig. 2B) or of a diploid offspring in which the haploid contribution from a single gamete can be identified (Fig. 2C). Fig. 1. The position of a CO affects the amount of genetic shuffling that In some analyses, while CO positions for a gamete are known, it causes. The figure shows the number of shuffled locus pairs that result the specific parental identities of the chromosome segments that from crossing-over between two chromatids in a one-step meiosis. The chro- these COs delimit are not known, because the gamete producer mosome is arbitrarily divided into eight loci. (A) A CO at the tip of the has been sequenced but its parents have not. In this case, we can chromosome between the seventh and eighth loci causes 7 of 28 locus pairs nonetheless calculate an expected value of r¯ for the gamete. The to be shuffled in each resulting gamete. (B) A CO in the middle of the chro- parental origin of segments will alternate across COs on each mosome between the fourth and fifth loci causes 16 of 28 locus pairs to be chromosome in the gamete, and therefore, we can still define, shuffled in each resulting gamete. The central CO thus causes more shuffling for any given chromosome k, the proportions that originate from than the terminal CO. the two different parents (pk and 1 − pk ). However, it is not
2 of 10 | www.pnas.org/cgi/doi/10.1073/pnas.1817482116 Veller et al. Downloaded by guest on September 24, 2021 Downloaded by guest on September 24, 2021 oiaesufldi eutn aeei / utpidb the by multiplied 1/2 segments: of separate is pair on are gamete chosen loci resulting randomly two the a a that at in probability meiosis alleles shuffled at the are COs that loci of probability configuration the interference the chromatid I, given no Therefore, assumes 57)]. [this (54, alle- 1/2 seg- their probability shuffle different they with on segments, les different situated is on be situated which indeed to of are need probability production loci the the two in ments, alleles the its gamete, shuffle a to of pair locus selected domly oa eoelnt is length segment genome that suppose total and order, (arbitrary) some of into number bivalents haploid Burt the the by divide If given 2A). (Fig. formulas (81) is the chromosomes Colombo and following (27) case al. this et in calculated be 1 elre al. et Veller their way. about quantitative assertions rigorous a previous shuffling), in evaluate contributions (interchromosomal to relative IA us allowing from thereby deriving component and a shuffling) (intrachromosomal COs from deriving component (Discussion). of species 86) of number 85, a formulation the (1, Our frequency for increase CO to way than is rather effective chromosomes shuffling IA, most genome-wide is increase the species to sexual that in correspondingly, shuffling of or, source predominant of the Components that Intrachromosomal and Inter- in especially diversity, of measure (84). used ecology commonly a (83), index E[¯ for value expected An I. prophase meiotic of stage chromosomes bivalent that along at positions CO define to allows somes gamete. the in realized actually value the Eq. by so multiplied probabilistic, chromosome), is that chromosomes, same term different the probability on on the are are is loci term chosen 1 second randomly The two the origin. parental ent k Eq. mosomes in term first The proportion a are there is chromosome fraction and If shuffled a set alle- Law. been haploid the the Second have in Mendel’s chromosomes that chromosomes with separate probability consistent on the loci one-half, that two address at We assuming sets parents. les which by different next or ambiguity same the the this to from chromosome are one segments of from know to possible ‡ fpsil atrso hoai novmn nterslto fCsa eoi and I meiosis at COs of thereafter. distribution resolution chromatids the the of to in patterns respect involvement segregation with chromatid time of this patterns but possible along value of expected COs its observe calculate we we therefore, when I, Similarly, meiosis to at assortments. respect chromosomes possible with bivalent expectation these its of calculate distribution we parental Therefore, the possible segments. the various of assortments the different of for origins then If different known, being “parameter”). value a not realized is are its it with origins here, paternal (i.e., and known maternal precisely the shuffled—is been have alleles of which value realized the segments, chromosome ie aeesqec n nweg fteptra n aenloiiso all of origins maternal and paternal the of knowledge and sequence gamete a Given − − r ial,ctlgclaayi fmitcpcyeechromo- pachytene meiotic of analysis cytological Finally, utpidb h rbblt ht fs,te r fdiffer- of are they so, if that, probability the by multiplied ] p P sdfie yEq. by defined as , k k n so h te,then other, the of is =1 E[¯ k L L r k 2 k httornol hsnlc r nchromosome on are loci chosen randomly two that = ] p ie,1 (i.e., ftegnm’ eghadi edtriethat determine we if and length genome’s the of k n X k fchromosome of =1 n n hr r oa of total a are there and r ¯ 2p E[¯ − 2 losu opriinttlsufln noa into shuffling total partition to us allows r k h rbblt htarno aro loci of pair random a that probability the stepoaiiysme vralchro- all over summed probability the is l = ] (1 i spootoa oteGini–Simpson the to proportional is 3, n with , − ¯ r + 1 2 narsliggmt saanarno variable; random a again is gamete resulting a in 2 p I
k sa xetto of expectation an is )L emns ae h emnsin segments the Label segments. 1 l 1 − k k 2 + ¯ r bigtepooto flcspisat pairs locus of proportion the —being + so n aetloii and origin parental one of is n X 1 i · · · =1 +I − 1 2 l P +
i 2 ‡ ! I 1 l i n ¯ r n =1 +I . − O,te hs COs these then COs, +I eoe admvariable, random a becomes r . X k l ti fe argued often is It 1 = =1 n i 2 hssecond This 1/2. ftetoloci two the If . L o n ran- any For . k i r k 2 ¯ sfato of fraction ’s consfor accounts ! ahrthan rather . r ¯ can [2] [3] n L is length genomic total of length tion genomic total of fractions with is chromosomes lent of number haploid the that pose Eq. as same the is which Eq. in term first the then is retain partition we therefore, and 4 which), is which and origin parental we proportion then a delimit, the calculate they not still that but can segments complement the haploid of origin a parental in specific realizations CO know we If i.2. Fig. (B nagmt rhpodcmlmn fa fsrn.I chromo- If is offspring. decomposition an for appropriate of decomposition complement this and haploid some chromosome or present same gamete first a the We in on alleles. are and is their loci component chromosomes shuffle two intrachromosomal that separate the probability on while the are alleles, their loci shuffle chosen randomly two C B A 1 u oeifrainaottescn em h appropriate The term. second the about information lose but and enwpeettedcmoiinfor decomposition the present now We h necrmsmlcmoetof component interchromosomal The + k DNA ingamete ofpaternal proportion total genomiclength of length asafraction segment bivalent chromosome crossover positionon L C k exhibits 2 acltn h xetdvleof value expected the Calculating (A) Calculating ) Paternal gamete + otisaproportion a contains E[¯ r ¯ · · · = i r ’s genomic = ] X | k + =1 n nrcrm component Intrachrom. I k a | X L k
=1 2p n nrcrm component Intrachrom. O,dvdn tit segments into it dividing COs, n ¯ r 1 = k ngmtso fsrn sn Eq. using offspring or gametes in 2p 1 (1 {z − k hn Eq. Then, . − (1 {z p 2. chromosome 1 p k − k fteohr(lhuhntknowing not (although other the of )L p k L chromosome 2 k 2 } )L p k p + k k = k 2 } fchromosome of | X 3 fptra otn,te the then content, paternal of i l + 6=j k spriinda follows: as partitioned is necrm component Interchrom. ,1 l | 2 1 2p k necrm component Interchrom. ¯ + r ,i NSLts Articles Latest PNAS ,
E[ i r ¯ Chromosome . l chromosome 1 (1 k 1 E[¯ ¯ r ,2 stepoaiiythat probability the is ] {z − tmissIuigEq. using I meiosis at , − r + Other gamete Other {z ] X k chromosome 2 p tmissI Sup- I. meiosis at =1 · · · n Expected valueof Expected j i Maternal gamete 1. )L n 1, = L + k n htbiva- that and i k 2 L ob fone of be to ! l k j } . . . } ,I . , k k +1 | , sfrac- ’s I f10 of 3 with , k
1 + [5] [4] 3. r ¯
EVOLUTION n Ik +1 ! n ! 1 X 2 X 2 1 X 2 Properties and Intrinsic Implications of r [¯r] = L − l + 1 − L . [6] E 2 k k,i 2 k Properties. We note three properties of r¯. First, its minimum k=1 i=1 k=1 value is zero, and its maximum value is one-half, the latter rely- | {z } | {z } Intrachrom. component Interchrom. component ing on our assumption of many loci. The maximum value of r¯ in a gamete can occur by chance equal segregation of mater- Averaging r. The above measures can be aggregated to obtain nal and paternal DNA to the gamete. The maximum value of the average value of r¯ across many gametes (or meiocytes). We E[¯r] at meiosis I requires, unrealistically, that every pair of loci denote the average value of a variable x by hxi. To average r¯ either is on separate chromosomes or experiences at least one given sequence data from many gametes and supposing that, for CO between them in every meiosis. The minimum value of r¯ in each gamete, we can distinguish which sequences are paternal a gamete could result from chance segregation of only CO-less and which are maternal, we can take the average value of Eq. 4, chromatids of one parental origin to the gamete. The minimum noting that the Lk are constants: value of E[¯r] at meiosis I requires crossing-over to be absent [as in Drosophila males and Lepidoptera females (76)] and either a n karyotype of one chromosome or a meiotic process that causes X 2 X hr¯i = h2pk (1 − pk )i Lk + h2pi (1 − pj )i Li Lj . [7] multiple chromosomes to segregate as a single linkage group [as k=1 i=6 j in some species in the evening primrose genus, Oenothera (87)]. | {z } | {z } Second, r¯ satisfies the intuitive property that a CO at the Intrachrom. component Interchrom. component tip of a chromosome causes less shuffling than a CO in the middle (some example calculations of r¯ are given in Fig. 3). If segregation is Mendelian, then in a large sample of gametes, This is easily seen in Eq. 3. Consider a bivalent on which a h2p (1 − p )i ≈ 1/2 i j , and therefore, the interchromosomal com- single CO will be placed. This will divide the chromosome 1 1 − Pn L2 ponent in Eq. 7 will be close to 2 k=1 k . into two segments of length l1 and l2, where l1 + l2 = L is If we have sequence data from many gametes and can deter- the chromosome’s fraction of total genomic length. The con- mine CO positions but not the parental origin of the sequences tribution of this CO to E[¯r] is seen from Eq. 3 to be 1/2 − 5 2 2 that these COs delimit, then we take the average of Eq. , not- (l + l )/2, which simplifies to l1l2 under the constraint l1 + l2 = L 1 2 ing again that the k are constants (which here means that the L. l1l2 is maximized when l1 = l2 = L/2 (i.e., when the CO is interchromosomal component is constant): placed in the middle of the bivalent). It is minimized when l1 = 0 or l2 = 0 (i.e., when the CO is placed at one of the far ends of the n n ! X 2 1 X 2 bivalent). In general, as the CO is moved farther from the mid- h [¯r]i = h2p (1 − p )i L + 1 − L . [8] E k k k 2 k dle of the bivalent, E[¯r] decreases. This is true regardless of the k=1 k=1 positions of the COs on other chromosomes. Importantly, this | {z } | {z } Intrachrom. component Interchrom. component relationship also holds if, instead of considering where to place a single CO on a whole bivalent, we consider where to place a Given data of CO positions along bivalent chromosomes in new CO on a segment already delimited by two COs (or by a CO many meiocytes, we take the average of Eq. 6: and a chromosome end). In general, if we place a fixed number of COs along a single chromosome, E[¯r] is increased if they are n *Ik +1 +! n ! evenly spaced (SI Appendix). 1 X 2 X 2 1 X 2 h [¯r]i = L − l + 1 − L . [9] Third, r¯ will tend to increase sublinearly with both the num- E 2 k k,i 2 k k=1 i=1 k=1 ber of COs and the number of chromosomes (Figs. 3 D–F). | {z } | {z } For example, increasing the number of COs by some factor will Intrachrom. component Interchrom. component increase the intrachromosomal component of r¯ by a lesser factor (and r¯ as a whole by a yet lesser factor, because its interchromo- Eqs. 7–9 mean that we can estimate the average intrachro- somal component is unaffected). Similarly, doubling the number mosomal contribution to r¯ separately for each chromosome, of chromosomes in the haploid set will cause a less than twofold possibly from different sets of data, and then combine these aver- increase in the interchromosomal component of r¯. That is, there ages into a final measure of r¯. This is useful, because often in are “diminishing returns” to genome-wide shuffling from having sequencing or cytological studies of large numbers of gametes increasingly more COs and more chromosomes. These effects or meiosis I nuclei, it is possible (or desired) to obtain accurate are described in mathematical detail in SI Appendix. measurements for only a subset of the chromosomes in each cell so that the sets of cells from which measurements are taken for Intrinsic Implications. We have noted above that, given some two chromosomes will not overlap. This is the case, for exam- number of COs along a chromosome at meiosis I, E[¯r] is max- ple, in the rich cytological data from human pachytene nuclei imized if they are evenly spaced. This observation carries the in ref. 48. interesting implication that positive CO interference—a classical Unrelated to averaging r¯ across multiple measurements of phenomenon (88, 89) where the presence of a CO at some point individual gametes or meiocytes, an average value for r¯ can also along a bivalent chromosome inhibits the formation of nearby be calculated directly given pairwise average rates of shuffling COs—will tend to increase r¯. It thus suggests a possible selective for all loci. These can be estimated from linkage maps generated advantage for this phenomenon (Discussion). from pooled sequence data (see below). Suppose that we have Also, the general diminishing returns to r¯ of having more COs measured, for each locus pair (i, j ), their rate of shuffling rij . hr¯i (above) carry particular implications for differences in total auto- is then simply the average value of rij across all locus pairs (i, j ). somal shuffling in the two sexes of a species. When the number If Λ is the total number of loci, then of COs and/or their localization do not differ grossly between the ! ! ! sexes, then the amount of shuffling will be similar in both sexes X Λ X Λ X Λ hr¯i = rij = rij + rij , (e.g., for male and female humans as discussed below). 2 2 2 i