Copyright  2000 by the Society of America

The Relationship Between Count-Location and Stationary Renewal Models for the Chiasma Process

Sharon Browning Department of Statistics, Texas A&M University, College Station, Texas 77843 Manuscript received September 15, 1999 Accepted for publication April 12, 2000

ABSTRACT It is often convenient to define models for the process of chiasma formation at as stationary renewal models. However, count-location models are also useful, particularly to capture the biological requirement of at least one chiasma per chromosome. The Sturt model and truncated Poisson model are both count-location models with this feature. We show that the truncated Poisson model can also be expressed as a stationary renewal model, while the Sturt model cannot. More generally, we show that there is only one family of count-location models for the chiasma process that can also be expressed as stationary renewal models. The models in this family can exhibit either positive or negative interference.

ECAUSE of its mathematical simplicity, Haldane’s cies for many of the map functions in use. A much more B Poisson process model (Haldane 1919) for the satisfactory approach is presented in Zhao and Speed occurrence of crossovers along chromosomes at meiosis (1996). Zhao and Speed (1996) show that stationary is generally assumed when calculating probabilities on renewal processes give rise to most of the map functions multiple linked genetic loci. However, there is increas- found in the literature and characterize the class of map ing interest in investigating alternative models. Several functions that may be achieved by stationary renewal authors (e.g., Foss et al. 1993; McPeek and Speed 1995; process models. Zhao et al. 1995b) have compared the fit of various Rather than define models for the crossover process models to meiosis and tetrad data. Browning (1999) directly, it is best to start with a model for the underlying presents a method for comparing the fit of models to chiasma process, since the formation of chiasmata is the identity-by-state data from pairs of related individuals. physical event underlying crossing over. At meiosis, each The approach in Browning (1999) can also be used chromosome duplicates to form two sister to incorporate any model for the crossover process into and the two pairs of homologous chromatids line up relationship analysis based on identity-by-state data from into a bundle of four. On this bundle, chiasmata occur, two individuals of uncertain relationship. and each chiasma causes two nonsister chromatids to Regardless of the crossing-over model, the genetic cross over. Figure 1 illustrates the process. Any one of distance in Morgans between two loci on a chromosome the four resulting chromatids may be the one transmit- is defined to be the expected number of crossovers ted to the offspring. It is common to assume no chroma- between them in a single meiosis; hence the rate of cross- tid interference (NCI), so that each is in- ing over along a chromosome will be 1/M. volved in a crossover with probability one-half at each A map function M(d) gives the probability of recombi- chiasma independent of the outcome at other chiasmata nation (an odd number of crossovers) in an interval of on the bundle. This assumption seems to fit the available genetic length d. Until recently, it was common to define data (see Zhao et al. 1995a) and is convenient. A chi- map functions rather than to model the crossover point asma process model with a model for chromatid inter- process directly. However, as Fisher (1947) pointed out, ference or the assumption of NCI defines a correspond- map functions do not uniquely determine multilocus ing crossover process model. recombination probabilities (the probabilities of pat- The two main classes of chiasma process models that terns of recombination and nonrecombination between have been studied are stationary renewal models and multiple loci) for more than three loci. The issue was count-location models (see Lange 1997). Count-loca- complicated by the approach of Liberman and Karlin tion models (Karlin and Liberman 1978) define a dis- (1984) for extending map functions to multilocus re- tribution for the number, N, of chiasmata and assume combination probabilities, which leads to inconsisten- that the locations of the chiasmata given N are indepen- dent. In this article, we characterize the relationship between count-location and stationary renewal models, showing that there is only one family of count-location Address for correspondence: Department of Statistics, 447 Blocker Bldg., Texas A&M University, College Station, TX 77843-3143. models that can also be expressed as stationary renewal E-mail: [email protected] models. In particular, we examine the Sturt and trun-

Genetics 155: 1955–1960 (August 2000) 1956 S. Browning

RESULTS In this section we show that the truncated Poisson model can be expressed as a stationary renewal model while the Sturt model cannot. Moreover, we show that there is only one family of count-location chiasma pro- cess models that can be expressed as stationary renewal models, and we characterize interference for this family. Theorem 1. Count-location chiasma process models with probabilities of the following form can be expressed as stationary renewal models: Figure 1.—The process of chiasma formation. ␤nLneϪ␤L pn ϭ (1 Ϫ p0) for n Ն 1, (1) n!(1 Ϫ eϪ␤L) cated Poisson count-location models, showing that the where p ϭ P(N ϭ n) and ␤Ͼ0 solves truncated Poisson model is a stationary renewal model n Ϫ␤L while the Sturt model is not. (1 Ϫ p0)␤/2 ϭ 1 Ϫ e . (2) Sturt and truncated Poisson models: The formation Conversely, count-location models can be expressed as renewal of at least one chiasma per chromatid bundle seems to models only if they have probabilities of this form. be essential for correct disjunction of chromatids at In this family of models, the probability, p , of no chiasmata meiosis. As a result, several chiasma process models have 0 can take any value between zero and one. But by definition of been proposed that ensure at least one chiasma per genetic length (genetic length, in morgans, equals expected chromatid bundle. The Sturt (1976) and truncated number of crossovers per meiosis, which equals half the expected Poisson (Lange 1997) models are both count-location number of chiasmata on the chromatid bundle), L ϭ E(N)/2, models, with the locations of chiasmata distributed uni- so L and p must satisfy L Ն (1 Ϫ p )/2 since E(N) Ն formly and independently along the chromosome given 0 0 P(N Ն 1) ϭ 1 Ϫ p . N, the number of chiasmata. 0 The models are mixtures of scaled truncated Poisson and For a chromosome of genetic length L M, the Sturt point mass at zero. Let X be the genetic distance to the first model has the number N of chiasmata distributed as 1 1 (leftmost) chiasma (if there is no chiasma on the chromosome, ϩ Poisson(2L Ϫ 1)—that is, the model places one chi- we consider X Ͼ L), and let X be the genetic distance between asma at a uniformly chosen random location on the 1 2 the first chiasma and the second (X Ͼ L Ϫ X if there is only chromatid bundle and then superimposes a Poisson 2 1 one chiasma). Write f (x) for the probability density of X , chiasma process with mean 2L Ϫ 1 so that the overall X1 1 and F (x) ϭ P(X Յ x) for the cumulative distribution func- number of chiasmata has mean 2L as required. (Note X2 2 tion of X .On0 Ͻ x Ͻ L, f (x) ϭ 2eϪ␤x for these models, that a genetic length of L M implies an average of L 2 X1 and F (x) ϭ 1 Ϫ eϪ␤x. Thus f (x) ϭ 2(1 Ϫ F (x)),as crossovers or 2L chiasmata per meiosis). The probability X2 X1 X2 expected for a stationary renewal process of rate 2. The renewal distribution of N under this model is, for n Ն 1, distribution on x Ն L is in general not uniquely determined (2L Ϫ 1)nϪ1eϪ(2LϪ1) by the count-location model. P(N ϭ n) ϭ . (n Ϫ 1)! Proof of Theorem 1 can be found in the appendix. The truncated Poisson model has N, the number of Corollary 1. It follows immediately that the truncated chiasmata, following a Poisson distribution conditional Poisson model can be expressed as a stationary renewal model on N Ն 1. The rate ␣ of the distribution must be chosen and that the truncated Poisson model is the only count-location such that the expected value of N ϭ 2L, hence ␣ solves with P(N ϭ 0) ϭ 0 that can be expressed as a stationary Ϫ␣L Ϫ␣L ␣L/(1 Ϫ e ) ϭ 2L and thus ␣/2 ϭ 1 Ϫ e . For the renewal model. In the appendix we show that the renewal truncated Poisson model, the probability distribution distribution is uniquely determined on x Ն L for this model of N is, for n Ն 1, and has FX2(x) ϭ 1 for x Ն L. L n Ϫ␣L ϭ (␣L) e Note that ͐0 fX1(x) dx 1 so that fX1 is a proper density ϭ ϭ P(N n) Ϫ␣L . with support on (0,L), which ensures at least one chiasma on n!(1 Ϫ e ) Ϫ␣L the chromosome. Also, the distribution of X2 has mass of e Note that under these two models, all chromosomes at L. This mass does not affect the process on [0,L] since we must have L Ն 0.5. The requirement of at least one observe X2 only in the interval [X1,L], with X1 Ͼ 0. chiasma per chromatid bundle guarantees that at least Corollary 2. The Sturt model cannot be expressed as a half of the resulting chromatids show at least one cross- stationary renewal model. over. Thus the genetic length in morgans (the expected number of crossovers per meiosis) of a chromosome is Characterization of interference: A traditional mea- at least 0.5. sure of chiasma interference is the coincidence coeffi- Models for the Chiasma Process 1957 cient, C, which is, for two disjoint intervals on a chromo- If M corresponds to a stationary renewal model, this some, the ratio of the probability of recombination in nondifferentiability at L arises when the interevent dis- both intervals to the product of the marginal probabili- tribution has mass at L, as is the case for the renewal ties of recombination in each of the intervals. Equiva- representation of the truncated Poisson model. The lently, map function for the truncated Poisson model is 1 Ͼ Ͼ ∞ Ϫ␣d Ϫ␣L ⁄4P(NI1 0 and NI2 0) 1 n 1 e Ϫ e C(I , I ) ϭ M(d) ϭ 1 Ϫ pn(1 Ϫ d/L) ϭ 1 Ϫ 1 2 1 Ͼ 1 Ͼ 2΂ ͚ ΃ 2΂ 1 Ϫ eϪ␣L ΃ ⁄2P(NI1 0) ⁄2P(NI2 0) nϭ0 1 (see Lange 1997). It has M(L) ϭ ⁄2 and limd L MЈ(d) ϭ (Risch and Lange 1979), where NI1 and NI2 are the % 1 Ϫ␣L Ϫ␣L Ϫ␣L numbers of chiasmata in the two intervals. If C Ͻ 1, we ⁄2␣e /(1 Ϫ e ) ϭ e Ͼ 0, and hence M does not say that chiasma interference is positive, while if C Ͼ 1, satisfy (B) or (B)Ј. we say that chiasma interference is negative. Risch and Thus, if one is willing to consider only stationary re- Lange (1979) give a partial characterization of interfer- newal processes for which the interevent distribution is ∞ ence for count-location models. absolutely continuous on [0, ) [and thus has a density on [0, ∞), as the statement of the theorem implicitly Theorem 2. For the family of models given in Theorem 1, assumes], the theorem is correct, although it would be C ϭ␤/2 for any pair of disjoint intervals on a chromosome. best to replace MЈ(L) by limd L MЈ(d). However, while % Hence C increases monotonically as a function of p0, with C Ͻ it is, for biological plausibility and computational sim- Ϫ2L Ϫ2L 1 for p0 Ͻ e , and C Ͼ 1 for p0 Ͼ e . plicity, reasonable to require the interevent distribution Proof of Theorem 2 can be found in the appendix. to have a density on [0,L), it seems unnecessary to re- quire the interevent distribution to be absolutely contin- uous on [L,∞) as this part of the distribution has no COMPARISON WITH THE RESULTS relevance or practical application for chromosomes of OF ZHAO AND SPEED length L. Zhao and Speed (1996) investigate properties of ge- netic mapping functions corresponding to stationary DISCUSSION renewal chiasma processes. According to Theorem 2 of their article, the truncated Poisson model does not In some cases, the crossover process model has a correspond to a stationary renewal model, which contra- simple form and is as easy to work with as the chiasma dicts Corollary 1 of this article. process model, while in other cases, it is simplest to We reproduce Zhao and Speed’s theorem and then work directly with the chiasma process. In Monte Carlo resolve the discrepancy. methods that sample possible realizations of either the For a map function M defined on [0, L], where L Ͻ ∞, chiasma process or the crossover process underlying we say that M satisfies condition (B) if the observed data (such as in Browning 1999), Monte M(0) ϭ 0 (B1), MЈ(d) Ն 0, Carlo error is reduced by sampling the crossover process for all d (B2), MЈ(0) ϭ 1 (B3), M″(d) Յ 0, rather than the chiasma process. Assuming NCI, the Ј ϭ ϭ 1 for all d (B4), M (L) 0 (B5), M(L) ⁄2 (B6). crossover process corresponding to a count-location chi- We say that M satisfies condition (B)Ј if it satifies (B1)– (B4) and asma process is also a count-location process. Let M be the number of crossovers on a chromosome. For the Ј Ͼ Ј Ͻ 1 Ј M (L) 0 (B5) , M(L) ⁄2 (B6) . truncated Poisson, Theorem [Zhao and Speed (1996)]. Let M be the map eϪ␣L/2 function for a stationary renewal chiasma process satis- P(M ϭ 0) ϭ fying NCI on a chromosome arm of finite length. Then 1 ϩ eϪ␣L/2 M satisfies (B) or (B)Ј for any L. Conversely, suppose that 1 and a function M from [0, L] into [0, ⁄2] satisfies (B) or (B)Ј. Then there is a stationary renewal chiasma process whose eϪ␣L/2(␣L/2)m Ϫ ″ P(M ϭ m) ϭ for m Ն 1. map function is M and whose renewal density is M Ϫ Ϫ␣L when d Յ L. m!(1 e ) There is a difficulty with this theorem as M is only Derivation of this result, along with corresponding prob- defined on [0, L] and thus MЈ(L) is not actually defined. abilities for the Sturt model, can be found in Browning Even if M can be extended beyond L (as will be the (1999). By thinning, the truncated Poisson crossover case when there is a stationary renewal chiasma process process is a stationary renewal process and has inter- whose map function is M), the extension may not be event density differentiable at L. In particular, if M satisfies (B), then, ␣ Ϫ␣x/2 assuming NCI, the only possible extension is M(L) ϭ g(x) ϭ e for x Ͻ L. 1 2 ⁄2 for d Ն L, and thus if limd L MЈ(d) ϶ 0, then M is not % differentiable at L. The truncated Poisson and Sturt models, along with 1958 S. Browning other count-location models that require at least one Sturt, E., 1976 A mapping function for human chromosomes. Ann. Hum. Genet. 40: 147–163. chiasma per chromosome, are useful models because of Zhao, H., and T. P. Speed, 1996 On genetic map functions. Genetics the biological requirement that they satisfy. As renewal 142: 1369–1377. models have received more attention in the recent statis- Zhao, H., M. S. McPeek and T. P. Speed, 1995a Statistical analysis of chromatid interference. Genetics 139: 1057–1065. tical genetics literature than count-location models, and Zhao, H., T. P. Speed and M. S. McPeek, 1995b Statistical analysis thus methods for working with chiasma process models of crossover interference using the chi-square model. Genetics may be described only for renewal models, it is helpful 139: 1045–1056. to be able to express the truncated Poisson model as a Communicating editor: S. Tavare´ renewal model and to be aware that other models, such as the Sturt model, cannot be expressed in this way. Count-location models do have a severe drawback, in that they cannot incorporate chiasma interference in APPENDIX any meaningful way since the locations of the chiasmata Chiasma process probabilities for count-location mod- are assumed to be independent given the number of els: Let p , X , and X be as in Theorem 1. For count- chiasmata. An ideal model would incorporate the re- n 1 2 location models, P(X Յ x|N ϭ n) ϭ 1 Ϫ (1 Ϫ x/L)n quirement of at least one chiasma, along with a pattern 1 (for x Ͻ L), so of chiasma interference that fits the available data, such ∞ as is found in the Kosambi map function (Kosambi P(X1 Յ x) ϭ ͚ P(N ϭ n)P(X1 Յ x N ϭ n) 1944) and in the chi-square model (for example, see nϭ0 | Zhao et al. 1995b). Goldgar and Fain (1988) present ∞ x n one approach to constructing such a model. The proba- ϭ Ϫ Ϫ ͚ pn΄1 ΂1 ΃ ΅ bility distribution for the number of crossovers is mod- nϭ0 L ∞ eled as for a count-location model, but the locations of x n ϭ Ϫ Ϫ the crossovers are not assumed to be independent but 1 ͚ pn΂1 ΃ nϭ0 L to follow a distribution that allows for crossover interfer- ence. The model does seem to fit data well, but has and the density for X1 is several drawbacks, which are discussed in Zhao et al. ∞ nϪ1 npn x (1995b). fX (x) ϭ 1 Ϫ . (3) 1 ͚ L ΂ L΃ I am grateful to Elizabeth Thompson for many helpful discussions nϭ1 and to Terry Speed, Hongyu Zhao, and the referees for their com- Let f (x|n) be the probability density of X at x given ments. This work was supported in part by the Burroughs Wellcome X1|N 1 Fund through a Program in Mathematics and Molecular Biology train- that N ϭ n. Then ing fellowship. P(N ϭ n X1 ϭ x) ϭ fX N(x n)P(N ϭ n)/fX (x) | 1| | 1 ∞ n x nϪ1 mp x mϪ1 ϭ 1 Ϫ p m 1 Ϫ LITERATURE CITED ΂ ΃ n ͚ ΂ ΃ L L Ͳmϭ1 L L Browning, S., 1999 Monte Carlo Likelihood Calculation for Identity by Descent Data. Ph.D. thesis, University of Washington, Seattle. and for x Ͻ L Ϫ y, Fisher, R. A., 1947 The theory of linkage in polysomic inheritance. Philos. Trans. R. Soc. Lond. Ser. B 233: 55–87. ∞ Foss, E., R. Lande, F. W. Stahl and C. M. Steinberg, 1993 Chiasma P(X2 Յ x X1 ϭ y) ϭ ͚ P(N ϭ n X1 ϭ y)P(X2 Յ x X1 ϭ y, N ϭ n) | nϭ0 | | interference as a function of genetic distance. Genetics 133: 681– ∞ 691. np (1 Ϫ y/L)nϪ1 x nϪ1 ϭ n 1 Ϫ 1 Ϫ Goldgar, D. E., and P. R. Fain, 1988 Models of multilocus recombi- ͚ ∞ mϪ1 ΂Rmϭ1mpm(1 Ϫ y/L) ΃΂ ΂ L Ϫ y΃ ΃ nation: nonrandomness is chiasma number and crossover posi- nϭ1 tions. Am. J. Hum. Genet. 43: 38–45. ∞ nϪ1 nϪ1 Rnϭ1npn((L Ϫ y)/L) ((L Ϫ y Ϫ x)/(L Ϫ y)) Haldane, J. B. S., 1919 The combination of linkage values, and the ϭ 1 Ϫ ∞ R np (1 Ϫ y/L)nϪ1 calculation of distances between the loci of linked factors. J. nϭ1 n Genet. 8: 299–309. ∞ R np (1 Ϫ (x ϩ y)/L)nϪ1 Karlin, S., and U. Liberman, 1978 Classifications and comparisons ϭ 1 Ϫ nϭ1 n . (4) R∞ Ϫ nϪ1 of multilocus recombination distributions. Proc. Natl. Acad. Sci. nϭ1npn(1 y/L) USA 75: 6332–6336. Kosambi, D. D., 1944 The estimation of map distances from recom- For stationary renewal models, X2 is independent of bination values. Ann. Eugen. 12: 172–175. X1. Hence if the count-location model can be expressed Lange, K., 1997 Models of recombination, chapter 12, pp. 206–227 as a stationary renewal model, then the final expression in Mathematical and Statistical Methods for Genetic Analysis. Springer, New York. in Equation 4 must simplify so that it does not depend Liberman, U., and S. Karlin, 1984 Theoretical models of genetic on y. map functions. Theor. Popul. Biol. 25: 331–346. McPeek, M. S., and T. P. Speed, 1995 Modeling interference in Proof of Theorem 1. From Equation 3, for count-location . Genetics 139: 1031–1044. Risch, N., and K. Lange, 1979 An alternative model of recombina- models of the form in Equation 1, the density of X1 on tion and interference. Ann. Hum. Genet. 43: 61–70. 0 Ͻ x Ͻ L is Models for the Chiasma Process 1959

∞ Ϫ␤L n nϪ1 ne (␤L) x Ϫ␤ ϪRi ␤ ϪRi ␤ ϪRiϩ1 f (x) ϭ (1 Ϫ p ) 1 Ϫ ϭ e (L jϭ1xj)(e (L jϭ1xj) Ϫ e (L jϭ1xj)) X1 ͚ 0 Ϫ␤L ΂ ΃ nϭ1 Ln!(1 Ϫ e ) L Ϫ␤ ∞ ϭ 1 Ϫ e xiϩ1 ␤eϪ␤L ␤nϪ1(L Ϫ x)nϪ1 ϭ (1 Ϫ p ) 0 Ϫ␤L ͚ 1 Ϫ e nϭ1 (n Ϫ 1)! so that Xiϩ1 is independent of {Xj : j Յ i}, and the density Ϫ␤x Ϫ␤L of Xiϩ1 is fXiϩ1(x) ϭ␤e as required. Hence the station- ␤e ␤(LϪx) ϭ (1 Ϫ p0) e ary renewal model with interevent distribution F (x) ϭ 1 Ϫ eϪ␤L X2 1 Ϫ eϪ␤x for x Ͻ L is a representation of the count- Ϫ␤x Ϫ␤L ϭ 2e since (1 Ϫ p0)␤/2 ϭ 1 Ϫ e . location model in Equation 1. We now show that count-location models can be ex- From Equation 4, pressed as renewal models only if they have probabilities R∞ (␤L)n(1 Ϫ (x ϩ y)/L)nϪ1/(n Ϫ 1)! following the form in Equation 1. We do so by showing Յ ϭ ϭ Ϫ nϭ1 P(X2 x X1 y) 1 ∞ n nϪ1 | Rnϭ1(␤L) (1 Ϫ y/L) /(n Ϫ 1)! that for any given value of p1, there can be at most one count-location model that can be expressed as a renewal   x ϩ y y ϭ 1 Ϫ exp␤L 1 Ϫ Ϫ␤L 1 Ϫ  model—and hence for any given value of p0 the same  ΂ L ΃ ΂ L΃ is true. For a stationary renewal chiasma model, ϭ 1 Ϫ eϪ␤x. ϭ Ϫ fX1(x) 2(1 FX2(x)). (5) Since P(X2 Յ x X1 ϭ y) does not depend on y, we have Ϫ␤| x FX (x) ϭ 1 Ϫ e . 2 Also, for a stationary renewal process, X2 is independent We can check that the probability distribution of the of X , so that F (x) ϭ P(X Յ x X ϭ y) (and this expres- 1 X2 2 | 1 distance Xiϩ1 between the ith and (i ϩ 1)st chiasmata sion does not depend on y). Hence for a count-location (i Ն 2) is the same as that for X2, and does not depend chiasma model that can be expressed as a stationary on {Xj : j Յ i}, by induction. Suppose X1, X2,...,Xi are model, Equations 3, 4, and 5 imply that Ϫ␤x independent, with probability densities fXj(x) ϭ␤e ∞ Ϫ␤x nϪ1 ∞ nϪ1 for 2 Յ j Յ i and f (x) ϭ 2e . Then npn x 2Rnϭ1npn(1 Ϫ (x ϩ y)/L) X1 1 Ϫ ϭ ͚ ∞ nϪ1 ϭ L ΂ L΃ Rnϭ1npn(1 Ϫ y/L) f(X ϭ x ,...,X ϭ x N ϭ n) n 1 1 1 i i| and hence ϭ f(X1 ϭ x1 N ϭ n) f(X2 ϭ x2 N ϭ n, X1 ϭ x1) | | ∞ ∞ ∞ x nϪ1 y nϪ1 x ϩ y nϪ1 ϭ ϭ ϭ ϭ np 1 Ϫ np 1 Ϫ ϭ 2L np 1 Ϫ . ...f(Xi xi N n, X1 x1,...,XiϪ1 xiϪ1) ΂͚ n΂ ΃ ΃΂͚ n΂ ΃ ΃ ͚ n΂ ΃ | nϭ1 L nϭ1 L nϭ1 L nϪ1 nϪ2 n x n Ϫ 1 x (6) ϭ 1 Ϫ 1 1 Ϫ 2 L΂ L΃ L Ϫ x1 ΂ L Ϫ x1΃ Write

nϪi ∞ n Ϫ i ϩ 1 xi φ nϪ1 Ϫ (s) ϭ npn(1 Ϫ s) . (7) ... iϪ1 1 iϪ1 ͚ L Ϫ Rjϭ1xj ΂ L Ϫ Rjϭ1xj ΃ nϭ1 n!(L Ϫ Ri x )nϪi Then Equation 6 can be written ϭ jϭ1 j (n Ϫ i)!Ln x y x ϩ y φ φ ϭ 2Lφ and, writing Xi for {Xj : j Յ i}, ΂L΃ ΂L΃ ΂ L ΃

P(Xiϩ1 Յ xiϩ1 X1 ϭ x1,...,Xi ϭ xi) and this holds for all {(x,y):x Ն 0, y Ն 0, x ϩ y Ͻ L}. | ∞ Hence, ϭ P(X Յ x N ϭ n,X ϭ x )P(N ϭ n X ϭ x ) ͚ iϩ1 iϩ1| i i | i i nϭi φ(s)φ(t) ϭ 2Lφ(s ϩ t) (8) ∞

ϭ P(Xiϩ1 Յ xiϩ1 N ϭ n, Xi ϭ xi) for all {(s,t):s Ն 0, t Ն 0, s ϩ t Ͻ 1}. Now φ(0) ϭ ͚ | nϭi E(N) and hence φ(0) ϭ 2L by the definition of genetic φ f(Xi ϭ xi N ϭ n)P(N ϭ n) distance. Also lims 1 (s) ϭ p1. ϫ | % f(Xi ϭ xi) Suppose p1 is set at some value, while the {pn, n ϶ 1}

∞ Ϫ can take any values. Then we can show that φ is uniquely x n i n!(L Ϫ Ri x )nϪi ϭ 1 Ϫ 1 Ϫ iϩ1 jϭ1 j ͚ i n determined by Equation 8. To show this by induction, ϭ ΂ ΂ L Ϫ R x ΃ ΃ (n Ϫ i)!L n i jϭ1 j suppose that for some n the value of φ(i/2n) is deter- n Ϫ␤L ϭ n (␤L) e i mined for i 0,1,...,2 (for fixed L and p1 this holds iϪ1 Ϫ␤R ϭ x ϫ (1 Ϫ p0) (2␤ e j 1 i) φ nϩ1 φ n n!(1 Ϫ eϪ␤L)Ͳ for n ϭ 0). Then trivially (0/2 ) ϭ (0/2 ), which is determined. From Equation 8, φ(1/2nϩ1)φ(1/2nϩ1) ϭ ∞ nϪi i nϪi nϪi iϩ1 nϪi φ n φ nϩ1 √ φ n Ϫ␤ ϪRi ␤ (L Ϫ Rjϭ1xj) ␤ (L Ϫ Rjϭ1xj) 2L (1/2 ); thus (1/2 ) ϭ 2L (1/2 ) is determined. ϭ e (L jϭ1xj͚ Ϫ nϭi΂ (n Ϫ i)! (n Ϫ i)! ΃ Further, applying Equation 8 multiple times, [φ(1/ 1960 S. Browning

nϩ1 i iϪ1 φ nϩ1 φ nϩ1 2 )] ϭ (2L) (i/2 ) and hence (i/2 ) ϭ (1/ a chromosome of length L. Let NI be the number of 2L)iϪ1 [φ(1/2nϩ1)]i is determined for i ϭ 2,3,...,2nϩ1. chiasmata in I. Then, for the models in Equation 1, φ n Thus by induction, (i/2 ) is determined for all n and ∞ d n n ϭ ϭ Ϫ i ϭ 0,1,...,2. By continuity of φ (see the definition P(NI 0) ͚ pn΂1 ΃ nϭ0 L of φ in Equation 7), φ(s) is determined for all 0 Յ s Ͻ 1. ∞ Ϫ␤ φ(nϪ1) φ(i) ␤nLn(1 Ϫ d/L)ne L Note that pn ϭ lims 1 (s)/n!, where (s)isthe ϭ p ϩ (1 Ϫ p ) % 0 ͚ 0 Ϫ␤L φ ∞ nϭ1 n!(1 Ϫ e ) ith derivative of at s, for n Ն 1 and p0 ϭ 1 Ϫ Rnϭ1pn. Thus for a given value of p , Equation 5 uniquely deter- (1 Ϫ p )eϪ␤L ∞ ␤nLn(1 Ϫ d/L)neϪ␤L 1 ϭ p Ϫ 0 ϩ (1 Ϫ p ) 0 Ϫ␤L ͚ 0 Ϫ␤L mines pn for all n ϶ 1. Equivalently, given a value of p0, 1 Ϫ e nϭ0 n!(1 Ϫ e ) Equation 5 uniquely determines p for all n ϶ 0. n 1 Ϫ p (1 Ϫ p )eϪ␤d ϭ 1 Ϫ 0 ϩ 0 Hence we have demonstrated that, for a given value 1 Ϫ eϪ␤L 1 Ϫ eϪ␤L of p0, there is only one count-location model that can (1 Ϫ p )(1 Ϫ eϪ␤d) be expressed as a renewal model. In fact, for a given ϭ 1 Ϫ 0 1 Ϫ eϪ␤L value of p0, the unique count-location model that can be expressed as a renewal model has the form given in ϭ 1 Ϫ (2/␤)(1 Ϫ eϪ␤d), Equation 1. substituting Equation 1 for pn in the second line and Proof of extension in Corollary 1: We show that for the applying condition (2) in the last line. truncated Poisson model, FX2 has a unique extension to Hence, for two disjoint intervals with genetic lengths ∞ [0, ). First, we require limx ∞FX (x) ϭ 1. We have d1 and d2, → 2 Ϫ␣L Ϫ␣L limx LFX (x) ϭ 1 Ϫ e , thus we have probability e % 2 Ͼ Ͼ P(NI1 0 and NI2 0) that X2 takes the value L or greater. Second, we require C ϭ 1 P(NI Ͼ 0)P(NI Ͼ 0) E(X2) ϭ ⁄2 since chiasmata occur at rate 2/M. Let 1 2 Ͻ 1{X2 ϽL} equal one if X2 L and zero otherwise. Ϫ ϭ Ϫ ϭ ϩ ϭ 1 P(NI1 0) P(NI2 0) P(NI1ʜI2 0) L ϭ Ϫ ϭ Ϫ ϭ ϭ [1 P(NI1 0)][1 P(NI2 0)] E(X21{X2ϽL}) xfX2(x)dx Ύ0 Ϫ␤(d1ϩd2) Ϫ␤d1 Ϫ␤d2 L (2/␤)(1 ϩ e Ϫ e Ϫ e ) Ϫ␣x ϭ ϭ x␣e dx 2 Ϫ␤d Ϫ␤d (2/␤) (1 Ϫ e 1)(1 Ϫ e 2) Ύ0 1 Ϫ eϪ␣L ϭ␤/2. ϭ Ϫ LeϪ␣L ␣ Ϫ␤L Now, condition (2) gives us p0 ϭ 1 Ϫ 2(1 Ϫ e )/␤. 1 We can check that p0 increases monotonically with ␤ ϭ Ϫ LeϪ␣L. (for ␤Ͼ0) and hence that C increases monotonically 2 Ϫ␤L with p0. Let g(␤) ϭ 1 Ϫ 2(1 Ϫ e )/␤. The derivative of g is gЈ(␤) ϭ 2(e ␤L Ϫ (1 ϩ␤L))/(␤2e ␤L) Ͼ 0 for ␤Ͼ Thus E(X21{X2ՆL}) ϭ E(X2) Ϫ E(X21{X2ϽL}) should equal ␤L ∞ i LeϪ␣L. The two conditions are met by giving the distribu- 0, since e ϭ Riϭ0(␤L) /i! Ͼ 1 ϩ␤L, and hence p0 ϭ Ϫ␣L g(␤) is an increasing function of ␤ for ␤Ͼ0. The no tion of X2 mass e at L. Ϫ2L interference model has p0 ϭ e . Thus C Ͻ 1 for p0 Ͻ Proof of Theorem 2. Let I be an interval, or union of eϪ2L (C 0asp 1 Ϫ 2L) and C Ͼ 1 for p Ͼ eϪ2L → 0 → 0 two disjoint intervals, with total genetic length d on (C ∞ as p 1). → 0 →