<<

bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Parental Population Range Expansion Before Secondary

2 Contact Promotes Heterosis

1,2,∗,† 3 Ailene MacPherson Silu Wang1,3,† Ryo Yamaguchi4,5 Loren H. Rieseberg1 Sarah P. Otto1

4 1. University of British Columbia, Vancouver, BC V6T 1Z4, Canada;

5 2. Current Address: University of Toronto, Toronto, Ontario M5S 1A5, Canada;

6 3. Current Address: University of California and Berkeley, Berkeley, CA 94720;

7 4. Tokyo Metropolitan University, Hachioji, Tokyo, 192-0397, Japan;

8 5. Hokkaido University, Sapporo, Hokkaido, 060-8588, Japan;

9 ∗ Corresponding author; e-mail: [email protected].

10 † Contributed equally to manuscript.

11 Manuscript elements: Figure 1-5, Table 1, Appendix and Supplementary Figures Figure S1-S3

12 and Table S1, Dryad submission.

13 Keywords: zone, allele surfing, range expansion, genetic incompatibilities.

14 Manuscript type: Article.

15 Prepared using the suggested LATEX template for Am. Nat.

1 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

16 Abstract

17 Population genomic analysis of hybrid zones is instrumental to our understanding of the evolu-

18 tion of . Many temperate hybrid zones are formed by the secondary contact

19 between two parental populations that have undergone post-glacial range expansion. Here we

20 show that explicitly accounting for historical parental isolation followed by range expansion prior

21 to secondary contact is fundamental for explaining genetic and fitness patterns in these hybrid

22 zones. Specifically, ancestral population expansion can result in allele surfing where neutral or

23 slightly deleterious drift to high frequency at the expansion front. If these surfed dele-

24 terious alleles are recessive, they can contribute to substantial heterosis in hybrids produced at

25 secondary contact, counteracting negative effects of Bateson-Dobzhansky-Muller incompatibili-

26 ties (BDMIs) hence weakening reproductive isolation. When BDMIs are linked to such recessive

27 deleterious alleles the fitness benefit of introgression at these loci can facilitate introgression at the

28 BDMIs. The extent to which this occurs depends on the strength of selection against the linked

29 deleterious alleles and the distribution of recombination across the . Finally, surfing

30 of neutral loci can alter the expected pattern of population ancestry, thus accounting for historical

31 population expansion is necessary to develop accurate null genomic models of secondary-contact

32 hybrid zones.

2 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

33 Population expansions often lead to secondary contact and hybridization with closely related

34 (Arntzen et al., 2017; Duvernell et al., 2019; Szymura, 1976; Weir and Schluter, 2004). This

35 is most commonly reported for population expansions out of separate glacial refugia, which has

36 led to many of the hybrid zones we observe today across both aquatic and terrestrial habitats

37 (April et al., 2013; Avise et al., 1998; Duvernell et al., 2019; Gao et al., 2015; Haffer, 1969; Hewitt,

38 1999; Licona-Vera et al., 2018; Taberlet et al., 1998; Weir and Schluter, 2004). The well-documented

39 association between invasive species and hybridization provides another example in which the

40 rapid expansion of previously isolated species has led to widespread genetic mixing (Edmonds

41 et al., 2004).

42 Range expansion can have significant evolutionary and ecological consequences (Edmonds

43 et al., 2004; Excoffier et al., 2009). One important genetic consequence of range expansion is

44 an increase in the fixation rate of alleles, including deleterious ones, at the range edge due

45 to an increase in . This phenomena, termed “gene (allele) surfing” by Klopfstein

46 et al. (2006), was first characterized by Edmonds (2004). While all alleles, beneficial, neutral,

47 and deleterious can surf, surfing of deleterious alleles can lead to a substantial reduction in

48 population mean fitness at the range edge, termed “expansion load” (Peischl et al., 2013), and

49 may even limit the extent of range expansion (Peischl and Excoffier, 2015).

50 Here we investigate the effect of post-expansion allele surfing on hybridization at secondary

51 contact. Upon hybridization, deleterious alleles that have increased in frequency on the edge of

52 one parental range can be masked by their beneficial counterparts present in the other parental

53 population. Such masking may result in substantial heterosis (hybrid fitness advantage over the

54 parental lineages) and alter the introgression dynamics that ultimately determines whether the

55 diverged lineages could become independent evolutionary trajectories or not. Here we explore

56 the effect of parental population range expansion before secondary contact on the fitness regime

57 in hybrids both initially (among F1 crosses) and on the long-term consequences for introgression.

58 There are a range of possible evolutionary outcomes following secondary contact of closely

59 related lineages: (1) fusion of the hybridizing lineages, (2) emergence of a hybrid species or,

3 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

60 (3) the maintenance and strengthening of the species boundary. Between these three distinct

61 outcomes is the formation of a , a geographical region where divergent lineages

62 hybridize (Barton, 2001; Barton and Hewitt, 1985; Endler, 1977; May et al., 1975; Slatkin, 1973).

63 Among hybrid zones, there is extensive variation in the direction and strength of selection as

64 well as the rate at which reproductive isolation evolves within them (Barton and Hewitt, 1985;

65 Roux et al., 2016; Servedio and Noor, 2003). Models of hybrid zones typically assume that

66 gene flow and selection are at equilibrium (Barton and Gale, 1993; Barton and Hewitt, 1989;

67 Endler, 1977; May et al., 1975; Slatkin, 1973). A history of parental population expansion prior to

68 secondary contact violates such assumptions. Hence, understanding the effect of allele surfing

69 on hybridization will fill a gap in our understanding of .

70 Theoretical Background

71 Models of Secondary Contact

72 The tension-zone model (Barton, 1979; Barton and Hewitt, 1989) explicitly examines the impact

73 of population structure on secondary-contact. Ancestral populations are initially separated by a

74 geographical barrier. Secondary-contact of the parental populations following the removal of this

75 barrier results in the formation of genetic clines due to the balance of migration and selection.

76 The shape and dynamics of the resulting clines hence depend on the nature of selection and

77 introgression at each genetic locus. Motivated by analysis of transects of hybrid zones and

78 the observation of concordant genetic clines along these transects (Barton and Hewitt, 1985),

79 the classic tension-zone model assumes populations are arrayed along a linear habitat. Here we

80 consider a variant of this model where parental populations are initially isolated at opposite ends

81 of this linear habitat separated by a wide geographic barrier and must undergo range expansion

82 prior to coming into secondary contact.

83 The selection regimes that are often considered within hybrid populations include the null

84 case of neutral loci and universal selection against deleterious alleles. In addition, there is se-

4 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

85 lection specifically against hybrid genotypes as a result of underdominance or negative epistasis

86 in the form of Bateson-Dobzhansky-Muller incompatibilities (BDMIs). In the model presented

87 below we will focus on the dynamics at neutral, universally deleterious, and pairs of epistatic

88 BDMI loci. Our model will contrast two different types of BDMI’s, a dominant BDMI for which

89 all recombinant genotypes are less fit, affecting both F1 and later hybrids, and a recessive BDMI

90 for which breakdown is only experienced in F2, backcross, and later crosses (Turelli and Orr,

91 2000).

92 Range Expansion and Allele Surfing

93 Our expectations for the effect of range expansion, model design, and parameter choices are

94 grounded in the results of Peischl and colleagues (Peischl et al., 2013; Peischl and Excoffier,

95 2015; Peischl et al., 2015). Using a combination of individual-based simulations and analytical

96 results, they modeled the decline in population mean fitness on the edge of an expanding range

97 as a result of the accumulation of deleterious mutations by drift. While we do not analyze the

98 dynamics of edge mean fitness directly, this quantity is directly related to heterosis observed

99 among F1 hybrids formed by crossing individuals at the edge of two parental ranges. Peischl

100 and Excoffier (2015), as in our model below, consider deleterious (partially) recessive mutations

101 initially at -selection balance in the range core. As expansion progresses, population

102 mean fitness initially declines rapidly as a result of the fixation of segregating variation. This

103 rapid decline is followed by a period of gradual decline as a result of the continued fixation

104 of new mutations (Peischl and Excoffier, 2015). Analogously we expect F1 heterosis to initially

105 increase rapidly as a function of the distance travelled by parental populations prior to secondary

106 contact, followed by a continued increase in heterosis at a reduced rate.

107 Peischel and Excoffier (2015) found that the decline in mean fitness on the range edge is pre-

108 dominantly the result of alleles with small to intermediate effects for which Ns ≤ 2. In addition

109 to the strength of selection (s) and the population size (N), the accumulation of mutations is

110 also impacted by selective interference between non-recombining loci and hence by the pattern

5 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

111 of recombination and segregation across the genome (Peischl et al., 2015). Specifically, they com-

112 pare expansion dynamics of a completely asexual population to the case of 20 freely-segregating

113 but otherwise non-recombining . While segregation has only mild effects on mean

114 fitness on the range edge itself, it reduces selective interference among chromosomes and allows

115 beneficial core alleles to spread more readily toward the range edge, hence narrowing the region

116 of reduced fitness on the range edge.

117 Introgression and Heterosis

118 Introgression in the tension-zone model (Barton, 1979; Barton and Hewitt, 1989) is characterized

119 by clines in allele frequency across the transect. Relatively insensitive to the nature of selection

120 against hybrids these clines are expected to be sigmoidal in shape at selection-migration balance

121 with allele frequency at location x given by:

1 122 p(x) = . 1 + e−4d(x−c)

123 Parameterized in this manner, the transition of the allele frequency from fixed in one parental

124 population to absent in the other population is gradual and centered around the location c. The

125 slope of the at c is given by d from which the width of the cline is defined as W = 1/d.

126 The transient dynamics of allele frequency at neutral and selected loci are also expected to be

127 sigmoidal with a time dependent center and width. In contrast to selected loci, the neutral alleles

128 will continue to introgress eventually equilibrating at an expected frequency of 0.5 across the

129 range.

130 Both the magnitude and heterogeneity of recombination rate are known to shape the extent

131 of introgression across the genome. Introgression is usually suppressed in regions of low recom-

132 bination rate due to genetic linkage to loci that reduce hybrid fitness (Aeschbacher et al., 2017;

133 Janousekˇ et al., 2015; Juric et al., 2016; Kirkpatrick, 2017; Martin et al., 2019; Navarro and Barton,

6 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

134 2003; Noor et al., 2001; Rieseberg, 2001; Schumer et al., 2017).

135 Methods

136 Allele surfing and range expansion are complex eco-evolutionary processes (Peischl and Ex-

137 coffier, 2015; Peischl et al., 2015). We begin here by describing the genetics, life cycle, and geog-

138 raphy of our eco-evolutionary model of secondary contact. Specifically, we consider secondary

139 contact between two populations that were initially isolated in two distant refugia (of ncore = 5

140 demes each) followed by a subsequent period of range expansion before finally coming into

141 secondary contact (Figure 1).

142 The life cycle of the species is characterized by discrete non-overlapping generations with

143 three stages: population census, reproduction, and migration. Selection occurs during reproduc-

144 tion such that an individual with genotype i reproduces at a density-dependent rate, producing    NT 145 a Poisson distributed number of gametes with mean 2 1 + ρ 1 − K Vi, which combine ran-

146 domly to produce diploid offspring. Here ρ = 0.01 is the intrinsic growth rate, NT is the total

147 number of individuals in the local population, and K = 200 is a constant determining the strength

148 of density-dependence in each deme. Finally, Vi is a measure of the viability of genotype i.

149 An individual’s genotype consists of nN = 80 neutral loci, nD = 100 ‘selected’ loci experi-

150 encing purifying selection, and nB = 4 two-locus BDMI pairs, and nFD = 12 neutral loci that

151 are artificially forced to exhibit fixed differences between parental populations as in the classic

152 tension-zone model for a total of nLoci = nN + nD + 2nB + nFD = 200 loci. All loci are assumed

153 to be biallelic and are arranged randomly along the genome. Selected loci are either fully reces-

154 sive hD = 0 or partially recessive hD = 0.25. We examine three values of directional selection

155 sD = 0.001, sD = 0.003, and sD = 0.005, which correspond to values of Ks, a quantity equivalent

156 to Ns when Vi = 1, of 0.2, 0.6 and 1 respectively. Hence the strength of selection is weak to

157 intermediate across the parameter range explored such that all selected loci are expected to ex-

158 perience extensive surfing during range expansion as predicted by Peischl and Excoffier (2015).

7 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

159 For reference purposes we also examine the special case of neutral (sD = 0.0).

160 In addition to the selected loci, we consider epistatic selection between pairs of BDMI loci.

161 Following the model and notation of Turelli and Orr (2000), for each pair of BDMI loci, B and

162 C, we assume the parental populations are fixed for either the bbCC or BBcc genotype. The

163 BDMI between the pair of loci B and C is characterized by three specific incompatibilities: H0

164 the fitness of the double heterozygous genotypes (BbCc), H1 the fitness of the heterozygous-

165 homozygous genotypes (bbCc, Bbcc), and H2 the fitness of the double homozygous genotype

166 (bbcc). Breakdown in F1 hybrids is determined by the value of H0 whereas F2 and backcross

167 hybrids are affected by all three forms of incompatibilities. We consider two different types of

168 BDMI’s: “dominant BDMIs” for which H0 = H1 = H2 = 1 − sB and “recessive BDMIs” where

169 H0 = H1 = 1 and H2 = 1 − sB. We consider two strengths of epistatic selection at the BDMI loci:

170 weakly incompatible case (sB = 0.01) and strongly incompatible case (sB = 0.05). See Table 1 for

171 a summary of model parameters.

172 Viability across loci is multiplicative with a baseline viability of V0 such that the viability of

173 genotype i is given by the product:

  1 Aj Aj  nN +nFD nD  ∗ ∗ ∗ Vi =V0 VB ∏ 1 ∏ 1 − hDsD Ajaj |{z} j=1 j=1  BDMI  | {z }  neutral 1 − sD ajaj | {z } selected   174 1 BjBj/cjcj (1)  nB  VB = ∏ 1 bjbj/CjCj Dominant BDMI j=1    1 − sB otherwise   nB 1 − sB bjbj/cjcj VB = ∏ Recessive BDMI  j=1 1 otherwise

8 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

175 As there are strong eco-evolutionary feedbacks in the model and we wanted to limit the

176 likelihood of of core populations during burn-in, we set V0 to ensure that the core

177 populations would have an expected size of K at mutation-selection balance. Specifically, we set:

1 178 V0 = (2) 1− 1 R 2K 2 nD 1 [x φ(x,sD,hD,µ,ncore∗K)]dx (1 + sD) 2K

179 where φ (x, s, h, µ, Ne) is Wright’s distribution for mutation-selection-drift equilibrium. We

180 initialize the simulations assuming Ne = ncore ∗ K as an approximation for the effective population

181 size in the range core. As shown in Figure S1, burn-in simulations indicate that this definition of

182 V0 provides an equilibrium deme size near K.

183 The absolute fitness, Wi, and relative fitness, wi, of an individual with genotype i is deter-

184 mined by its viability in the following manner:

  N  W = 1 + ρ 1 − T V i K i 185

wi =Vi/V¯

186 where V¯ is the mean population viability. Even though the absolute fitness, Wi, depends on the

187 local population size, NT, an individual’s relative fitness, wi, does not. We will use Vi as a proxy

188 for fitness throughout, as it is proportional to absolute fitness and determines relative fitness.

189 Parents of genotype i produce a random Poisson-distributed number of gametes with mean

190 2Wi. Given the previously identified importance of recombination to both the dynamics of expan-

191 sion (Peischl et al., 2015) and genome-wide patterns of introgression (Noor and Bennett, 2009),

192 we consider five different recombination scenarios (Figure 1). In the first two cases the recom-

193 bination rate is uniform across the genome with either free (r = 0.5) or constrained (r = 0.005)

194 recombination between any two consecutive loci. In the third case the genome is divided into

195 four segments of equal length with recombination occurring freely, r = 0.5, between them at

196 three recombination “hot-spots” and no recombination within each segment. This third scenario

9 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

197 resembles the genomic structure used by Peischel et al. (2015) and can represent either highly

198 heterogeneous recombination on a single chromosome or independent segregation among four

199 otherwise non-recombining chromosomes. The resulting genome-wide average recombination

200 rate in this case is given by r¯ = 0.0075. In the fourth and fifth scenarios we model the evolution

201 of a single chromosome where the recombination rate varies continuously across the genome as

202 given by the average pattern observed across a wide range of plant and animal taxa respectively,

203 (see Haenel et al. (2018) Figure 1, Table 1). The genome-wide average recombination rate in these

204 latter two scenarios was r¯ = 0.005 to facilitate comparison to the uniformly constrained and chro-

205 mosomal segregation cases. Given this average rate, there is an expected ≈ 1 cross-over events

206 per genome per generation, on par with expectations for many diploid organisms (Stapley et al.,

207 2017).

208 In addition to recombination, gametes carry novel mutations. The per-generation per-site

−4 209 mutation rate, with both forward and backward mutations allowed, was set to µ = 10 at

210 both the neutral and selected loci. To simplify the tracking of reproductive isolation we do

211 not allow mutation at the BDMI loci. To provide an analogous neutral comparison, mutation

212 is also suppressed at the 12 “fixed difference” loci. Following meiosis, mating is random with

213 gametes combining randomly to produce diploid offspring. These offspring then migrate at a

214 rate m = 0.01 to neighbouring demes as described in more detail below. Following migration,

215 each population is censused, completing the generation.

216 We consider the demographic model depicted in Figure 1, modelling first evolution in al-

217 lopatry of populations A and B, followed by their expansion, and finally introgression between

218 the two populations initially occupying opposite ends of a finite linear “stepping-stone” habitat.

219 The model was coded in C++ and proceeds through the following stages (program and simulation

220 results are deposited on Dryad).

221 Initialization: The simulations begin by initializing the habitat with the left-most ncore demes

222 occupied by population A and the right most ncore populations occupied by population B. Indi-

223 viduals migrate among demes such that they move to the left (right) neighbouring deme with

10 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

224 probability m/2 (m = 0.01 is used throughout) with reflective population boundaries. These

225 demes will be referred to as the core of population A and B respectively. We initialize each

226 core deme with K individuals, drawing the initial allele frequencies at the selected loci using

227 Wright’s distribution (see Equation (2)). This initialization is an approximation to the steady

228 state, ensuring that the burn-in phase efficiently reaches equilibrium.

229 Burn-in: For the core populations to reach the true eco-evolutionary equilibrium, we begin

230 by simulating evolution for 10, 000 burn-in generations. To test the quality of the burn-in we

231 evaluate the dynamics of the mean viability V¯ and the population size of each deme (see Figure

232 S1).

233 Expansion: The third stage of the simulations is characterized by the expansion of the parental

234 populations A and B while remaining allopatric. Expansion proceeds for 2000 generations. Each

235 generation individuals are allowed to migrate to the nearest neighbouring demes in a stepping

236 stone manner. As described above the geographic extent (the absolute x-axis distance covered)

237 over the course of the 2000 generations of expansion varies stochastically across simulations.

238 Although artificial, constraining expansion to a fixed time rather than distance allows us to

239 interchange parental populations increasing the power of our analyses. To facilitate analysis of

240 geographical patterns after expansion we scale the x-axis such that the total range of populations

241 A and B upon secondary contact is 100 (e.g., see Figure 4A-C).

242 To quantify the impact of allele surfing on population mean fitness, every 50 generations we

243 measure expansion load, which is defined as the relative difference in mean fitness of range edge,

244 the right (left) most deme, and the range core:

V¯core − V¯edge 245 L = (3) V¯core

246 Similarly, to quantify the effect of masking of deleterious recessive mutations in hybrids,

247 every 50 generations we create an artificial population of 10 F1 hybrids between the range edge

248 of parental populations A and B. We then calculate heterosis by comparing mean F1 hybrid

11 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

249 fitness to the mean fitness of their parents:

250 H = V¯F1 − V¯Par (4)

251 Introgression: Following expansion we consider the dynamics of introgression between pop-

252 ulation A and B allowing all neighbouring populations to be connected by migration as shown

253 in Figure 1. Following secondary contact, introgression proceeds for five-thousand generations.

254 In this phase, we stop mutation (µ = 0). Although artificial, this ensures that all results ob-

255 served during introgression are a result of expansion history and not de novo mutation during

256 introgression itself.

257 Following secondary contact, we can use the allele frequencies at the nN neutral loci that are

258 not constrained to be fixed to define a measure of population ancestry based on the hybrid index.

259 Adapting the likelihood based measure developed by Buerkle (2005), let the allele frequency at

260 neutral locus i in the core of population A (left-most deme) be ri and the allele frequency in the

261 core of population B (right-most deme) be si. Then the probability of observing focal allele given

262 a hybrid index of h is given by:

263 p˜i(h) = hri + (1 − h)si (5)

264 Given the observed allele frequency in deme d, pd, we calculate the hybrid index of deme d by

265 maximizing the likelihood:

L  2N  d 2Nd pd,l 2Nd(1−pd,l ) 266 L(hd|~pd) = ∏ p˜(hd) (1 − p˜(hd)) (6) l=1 2Nd pd,l

267 where Nd is the population size of deme d. As with the individual hybrid index developed

268 by Buerkle (2005), hd accounts for both fixed and non-fixed differences at neutral loci between

269 core demes. As is common in empirical studies however, we restrict our inference to neutral

270 loci exhibiting a difference in allele frequency between the core populations of Abs[ri − si] ≥ 0.3

271 (Corbett-Detig and Nielsen, 2017; Matute et al., 2020; Wang et al., 2021).

12 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Parent A Initialization Parent B Geographical Barrier

Core 10,000

Edge

F1 Hybrid cross Burnin (Allopatry) 2,000 Expansion 5,000 Introgression

Recombination Rate Scenarios

0.5 0.015 Free Constrained 0.4 Chromosomal 0.010 Plants 0.3 Animals

0.2 0.005 0.1

Recombination Rate 0.0 0.000 0 50 100 150 200 0 50 100 150 200 Genomic Position Genomic Position

Figure 1: Schematic of Model Design. Top: Individual-based simulation model proceeds in three phases: 1) evolution in allopatry, 2) parental range expansion, and 3) secondary contact and introgression. Geography is modelled as a 1D stepping-stone habitat. Occupied demes (gray circles of which only a subset are shown) are connected via migration. Time is measured Figure 1: Schematic of Model Design.Top:Individual-basedsimulationmodelproceedsinthreephases:1)evolutionin in units of generations of allopatry (10,000), generations of expansion (2,000), or generations of allopatry, 2) parental range expansion, and 3) secondary contact and introgression. Geography is modelled as a 1D stepping- introgression (5,000) respectively. During range expansion parental edge populations are crossed stone habitat. Occupied demes (shown by gray circles) are connected via migration. Time is measured in units of generations every 50 generations to create F1 hybrids. Bottom left hand panel: The five recombination rate of allopatry (GA), generations of expansion (GE), or generations of introgression (GI) respectively. During range expansion scenarios shown on the same axis as a function of genomic position from locus 1 to locus 200. Bottomparental edge Right: populations Variability are crossed in recombination every 50 generations rate to create under F1 hybrids. the “plants” Bottom left and hand “animals” panel: The scenario five recombination com- paredrate scenarios to uniform shown on constrained the same axis asrecombination. a function of genomic position from locus 1 to locus 158. Bottom Right: Variability in recombination rate under the “plants” and “animals” scenario compared to uniform constrained recombination.

13

1 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Background Selection Background Dominance BDMI Recombination sD hD sB (hB) r¯ Neutral: 0.0 Recessive: 0 No BDMI: 0 (NA) Free: 0.5 Weak: 0.001 Partially Recessive: 0.25 Recessive Weak: 0.01 (0) Constrained: 0.005 Medium: 0.003 Dominant Weak: 0.01 (1) Chromosomal: 0.0075 Strong: 0.005 Recessive Strong: 0.05 (0) Plants: 0.005 Dominant Strong: 0.05 (1) Animals: 0.005

Table 1: Parameter conditions used in individual-based simulation. Individual based simula- tions were run in a full factorial design with 10 replicate simulations per treatment. The genomic location of neutral, background, and BDMI loci vary randomly across treatments but remained constant across the 10 replicates within each treatment. For all simulations the following param- eter values remained fixed: µ = 0.0001, m = 0.01, K = 200, ρ = 0.1, nD = 100, nN = 80, nFD = 12, nB = 4 for a total of 200 loci. Unless otherwise specified the parameters used are given in bold. Recombination rate scenarios are shown graphically in Figure 1.

272 Results

273 The core parental populations A and B reach eco-evolutionary equilibrium during the burn-in

274 period (Figure S1). As expected during the subsequent range expansion phase, allele surfing

275 greatly increased the frequency of deleterious mutations on the range edge and hence the num-

276 ber of fixed differences between the edges of population A and B (Figure 2A). The fixation of

277 deleterious mutations on the range edge results in substantial expansion load as measured by

278 Equation (3) (Figure S2). The selection and dominance coefficient of the deleterious alleles at

279 the selected loci are important determinants of the extent of expansion load. In contrast, the dy-

280 namics of expansion load are insensitive to the recombination rate scenario, with nearly identical

281 accumulation of load even in the case of complete linkage with chromosomal segregation.

282 As the number of fixed differences between the range edges of populations A and B increases

283 so too does the observed heterosis among F1 hybrids created in crosses of edge populations. The

284 temporal dynamics of F1 heterosis (Figure 2B) follow our expectation based on the previously

285 characterized dynamics of fitness on the edge of expanding population (Peischl and Excoffier,

286 2015). The extent of heterosis that results from the masking of (partially) recessive deleterious

287 mutations that have surfed to fixation or near fixation on the range edge of one parental popula-

14 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

288 tion but not the other depends both on the strength of selection on the deleterious alleles, sD, and

289 the dominance of these alleles, hD (see Figure 2B). As expected, dominant BDMIs decrease the

290 fitness of F1 hybrids, yet heterosis from masking deleterious recessive mutations can partially or

291 even completely compensate for the deleterious fitness effects of these incompatibilities (Figure

292 2C). Finally, heterosis from the masking of deleterious mutations is relatively insensitive to the

293 recombination scenario (Figure 2D).

15 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1.0 sD=0.0 0.08 A. B. sD=0.001 s =0.001 Mean h=0 D s =0.002 16.4 s =0.003 D Replicate h=0 0.8 ◆◆ D sD=0.005 Mean h=0.25 27.5 sD=0.005 0.06 ◆ ◆ 38.2 0.6 ◆ ◆ 38.2 ◆ ◆ 0.04 0.4

F1 heterosis 0.02 0.2

0.0 0.00 0 10 20 30 40 50 60 0 500 1000 1500 2000 # Fixed Di↵erences Generations of Expansion C. D. 0.06 No BDMI Free Recessive Weak 0.06 Constrained Dominant Weak Chromosomal 0.04 0.05 Plants Animals 0.04 0.02 0.03 0.00 0.02 F1 heterosis F1 heterosis -0.02 0.01 0.00 0 500 1000 1500 2000 0 500 1000 1500 2000 Generations of Expansion Generations of Expansion

Figure 2: Impact of allele surfing on F1 heterosis during range expansion. Panel A: Number of selected loci fixed for different alleles between parental core populations at the end of evolution in allopatry (mean: dashed vertical line) verses parental edge populations at the end of expansion (histogram, mean: solid vertical line) for different strengths of selections. Panels C-D: F1 Het- Figure 3: F1 heterosis over the course of range expansion.PanelA:numberofselectedlocifixedfordi↵erentalleles erosis as measured by Equation (4) in crosses between parental edge populations during range between parental core populations (blue dashed box in Figure XXX)attheendofevolutioninallopatry(mean:dashedvertical expansion. Dark solid curves give mean dynamics averaged across individual replicates (solid line) verses parental edge populations (red dashed box in Figure XXX)endofexpansion(histogram,mean:solidvertical light curves) when deleterious loci are recessive hD = 0. Dark dashed lines give mean dynamics line) as a function of the strength of selection. Panels C-D: F1 Heterosis as measured by equation XXX in crosses between for partially recessive case hD = 0.25. Panel B: Effect of the strength of selection (case of sD = 0 notparental shown) edge populations, exhibits seeno square heterosis populations as expected. in model schematic. Panel C: Dark Effect solid curves of BDMI give mean presence/absence dynamics averaged across and individual replicates (solid opaque curves) when deleterious loci are recessive hD =0.Darkdashedlinesgivemeandynamics dominance shown here for the case of a weak BDMI sB = 0.01. Panel D: Effect of recombination scenario.for codominant Unless case h specifiedD =0.25. Panel otherwise B: E↵ect parameters of the strength are of selection given by case boldface of sD =0(notshown)exhibitsnoheterosis values in Table 1. as expected. Panel C: e↵ect of BDMI presence/absence and dominance shown here for the case of a weak BDMI sB =0.01. Panel D: e↵ect of recombination scenario. Unless specified otherwise parameters are given by boldface values in table XXX. 294 The dynamics of population mean fitness during range expansion and upon secondary con-

295 tact are shown in Figure 3. As with F1 heterosis, upon secondary contact there is a substantial

296 increase in fitness in the hybrid zone, which expands from the former range edge toward the

297 core as wild-type alleles masking accumulated deleterious alleles introgress. While the presence

16

4 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

298 of BDMIs does not prevent the introgression of wild-type alleles, these incompatibilities result in

299 a persistent reduction in population mean fitness at the point of secondary contact. Relative to

300 dominant BDMIs, recessive incompatibilities result in a shallower and more diffuse reduction in

301 fitness (Figure 3B and C).

Recessive BDMI B.

A. 0 0 V¯ empty space 1.01 1000 2500

2000 1.00 Expansion 1000 5000 0.98 0 Dominant25 50 BDMI75 100 2000 C. Generations ¯

Generations VB 3000 0.97 0 0.999 4000 Introgression 0.95 0.995 5000 2500 0 25 50 75 100 Space 0.990 5000 0 25 50 75 100 Space

Figure 3: Dynamics of mean fitness and the effect of the BDMIs upon secondary contact. Panel A: exemplary dynamics of mean population viability V¯ during range expansion and introgres- sion in the absence of a BDMI, see boldface parameters in Table 1. Fitness effect of BDMI as measured by V¯B in Equation (1) for a weak recessive (panel B) and weak dominant (panel C) BDMI.Figure 5: Dynamics of mean fitness during range expansion and introgression. Panel A: exemplary dynamics of mean population viability V¯ during range expansion (GE) and introgression (GI) in the absence of a BDMI, see boldface parameters

in Table XXX.Fitnesse↵ectofBDMIasmeasuredbyV¯B in Equation XXX for a weak recessive (panel B) and weak dominant 302 Deleterious mutations fixed on the edge of one parental population and not the other will (panel C) BDMI. White area indicates populations where V¯B =1indicatingnoincompatibilitiespresent. 303 ultimately be removed by selection following secondary contact ( recall that the simulation de-

304 sign assumes no mutations occur following secondary contact). While the deleterious alleles

305 persist, however, these loci will exhibit a cline in allele frequency across the hybrid zone. Unlike

306 the sigmoidal clines observed in a tension zone (Barton, 1979), however, the shape of the allele

307 frequency cline is affected both by introgression and historical allele surfing. If we consider all

A 308 loci for which the deleterious allele has a frequency pedge < 0.5 at the range edge of parental

17

6 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

B 309 population A and pedge > 0.5 at the edge of parental population B, then the cline in average allele

310 frequency across these loci peaks just to the left of the contact zone (see Figure 4A).

311 To quantify the transient dynamics of these allele frequency clines at each type of locus (e.g.,

312 neutral, selected, FD, BDMI) and their variation across parameter space we fit them phenomeno-

313 logically with one of the following functions as determined by AIC based model selection.

  d Sech(a (x − c))Tanh(b (x − c)) + e i f x < c  bA A A p¯1(x) =   d Sech(a (x − c))Tanh(b (x − c)) + e i f x > c 314 bB B B (7) a a p¯2(x) = h i + e − . −4d 2 1 + exp a (x − c)

315 These model fits are defined such that the geographic center of the cline is given by c and

316 height e, with the slope (derivative) of the cline at this point by d. Following the definition of

1 317 Barton and Hewitt (1989), the width of the cline is given by W = | d |. For p¯1(x) the value of

318 aA (aB) determines the slope in allele frequency from the range core to its maximum (minimum)

319 near the point of secondary contact as determined by the extent of allele surfing. The remaining

320 parameters (bA,bB) determine the height and symmetry of the cline. By contrast, p¯2(x) describes

321 a sigmoidal function, as observed when parental population have fixed differences. The height

322 of this symmetric sigmoid is given by a. The functions p¯1 versus p¯2 were developed to match

323 the shape of the allele frequency clines in the case of the neutral and selected versus the fixed

324 difference and BDMI clines respectively (see Figure 4A) although neither model is preferred

325 exclusively in either of these cases across all parameter conditions.

18 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

A. Cline Shapes Across Loci B. Selected Loci Clines 1.0 Fixed Diff. 0.8 Generations BDMI 10 0.8 1000 Neutral 0.6 2000 0.6 Selected 3000 0.4 4000 0.4 5000 Frequency Frequency 0.2 0.2

0.0 0.0 0 20 40 60 80 100 0 20 40 60 80 100 C. Neutral Loci Clines D. Population Ancestry Clines 1.0 1.0 Generations 10 0.8 1000 0.8 2000 0.6 3000 0.6 4000 0.4 5000 0.4 Ancestry Frequency 0.2 0.2

0.0 0.0 0 20 40 60 80 100 0 10 20 30 40 50 60 Location Location

Figure 4: Clinal dynamics and population ancestry. Clines in average allele frequency are cal- culated by averaging across loci of a specific type (e.g., neutral, selected, fixed difference, BDMI) A B for which pedge < 0.5 and pedge > 0.5. Panel A: Shape of the allele frequency cline for each locus type 1000 generations after secondary contact. Brown versus blue clines exemplify difference in Figure 6: Clinal dynamics and population ancestry. Clines in average allele frequency are calculated by averaging across fit between Sech (p¯1) and sigmoidal (p¯2) curves given by Equation (7). Panel B: Temporal dynam- loci of a specific type (e.g., neutral, selected, fixed di↵erence, BDMI) for which pA < 0.5andpB > 0.5. Panel A: shape of ics of clines at the selected loci throughout introgression. Paneledge C: Temporaledge dynamics of allele the allele frequency cline for each locus type at generation XXX.Brownversusblueclinesexemplifydi↵erenceinfitbetween frequency clines at neutral loci following secondary contact. Panel D: Temporal dynamics of the sigmoidalp ¯ and sechp ¯ curves given by Equation XXX.PanelB:Temporaldynamicsclinesattheselectedlocithroughout cline in population2 1 ancestry, as defined by hybrid index in Equation (6). See Panel C for colour legend.introgression Parameters (GI). Parameters are given are given by by boldface boldface values values in inTable TableXXX 1..PanelC:Temporaldynamicsofallelefrequency clines at neutral loci. Panel D: Temporal dynamics of cline in population ancestry as defined by Equation XXX.

326 We next focus on the dynamics of cline width at the selected and BDMI loci. As introgression

327 proceeds, the width of the cline increases at both the selected (Figure 5A) and BDMI loci (Figure

328 5B), although less so for the the BDMI loci, as expected. Due to selection favouring the introgres-

329 sion of wild-type mutations, the cline width at the selected loci increases with the strength of

330 selection sD for the parameters explored. Interestingly, the cline width at the selected loci is less

331 than that of the corresponding neutral loci. This is a result of the low frequency of the deleteri-

332 ous alleles in the parental range core and the subsequent spread of alleles from the range core

19

7 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

333 toward the hybrid zone following secondary contact. Essentially, selective recovery from delete-

334 rious allele surfing occurs from two directions, from secondary contact with the other range edge

335 population and from the core of the same population. For further details see Appendix A.

336 The cline width at the BDMI loci decreases with the strength of selection against the BDMI

337 incompatibilities sB, with the width of BDMI clines less than than the clines at the neutral fixed

338 difference loci. Cline width at the BDMI loci is also impacted by the strength of linked selection

339 against deleterious alleles (sD see Figure 5C), with strong linked selection facilitating introgres-

340 sion, at least over short time scales (0-1000 generations of introgression). Finally, the cline width

341 at the BDMI loci is impacted by the distribution of recombination across the chromosome(s).

342 The more uniform recombination is the more BDMI loci are able to introgress. In the case of

343 chromosomal segregation, introgression at BDMI loci and hence cline widths are limited.

20 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

A. Selected Loci

◆◆ 15 sD=0.0 ● sD=0.001 ■ ■◆ sD=0.003 ■◆◆ ◆◆ ◆◆ ■ ■ ● ● sD=0.005 ◆◆ 10 ■● ◆ ● ◆ ◆ ◆ ◆◆ ◆◆ ■◆◆ ■ ■ ◆◆ ● ● ● ◆ ◆◆ ■◆ ◆◆ ■ ■●◆◆ ◆ ●◆ ◆◆ ◆ 5 ◆◆ ■ ■ ● ● ◆◆ ◆ ◆◆◆ ■ ◆◆ ■ ◆ Cline Width ◆◆ ◆◆ ■● ●● ◆◆ ◆◆ ◆◆ ■ ◆◆ ■●◆ ● ■ ◆◆ ■● ◆◆◆ ◆ ● ● ● ◆◆ ■●◆◆ ◆ ●◆◆ ●● ◆◆ ◆◆ ◆◆ ■ ■◆◆ ◆ ■◆◆ ■■◆◆◆ ◆ ◆ ◆◆ ◆◆ ◆◆◆ ■●◆ ■●◆ ● ■●◆ ●● ■ ■◆◆ ◆ ■◆ ◆◆ ●◆ ◆ ●● ◆ ■◆◆ ■●◆◆ ■● ■●◆◆ ■●◆◆ ■◆◆ ■◆◆ ■◆ ◆◆ ◆ ■◆◆ ◆ ◆◆◆ ●◆ ●◆ ●◆ ●◆ ■●◆ ■●◆ ■◆◆◆ ●◆ ● ◆ ●◆ ◆◆◆ ◆ ◆◆ ● ◆ ■●◆ ●◆ ◆ 0 50 100 500 1000 B. BDMI Loci

15 No BDMI Dominant Weak Dominant Strong ◆◆ 10 ◆◆ ● ◆◆ ● ■ ● ■◆ ◆ ◆ ■ ■ ■◆◆ ◆◆ ◆◆ ● ■ ■ ● ■◆ ● ◆◆◆ ◆ ● 5 ■ ●◆◆ ◆◆ ◆ ◆ ●◆ Cline Width ◆◆ ■◆◆ ◆◆ ■●● ◆◆ ● ◆◆ ■●◆ ◆ ●◆ ◆ ■◆◆ ◆◆ ◆◆ ■●◆◆◆ ◆◆ ◆◆ ◆◆ ■●◆◆ ●◆ ■●● ◆ ●◆ ■●◆◆◆ ●◆ ■●◆ ■■◆◆ ◆◆ ■●◆◆◆ ●◆ ■ ◆◆ ■●◆◆◆ ●◆ ■ ■ ■◆◆ ■◆◆ ●● ◆ ◆◆ ■●◆◆◆ ●◆◆◆ ■◆◆ ■◆◆ ◆◆ ●◆ ●◆ ◆◆ ●◆◆◆ ■●◆◆◆ ■●◆ ■●◆ ●◆ ●◆ ■●◆◆◆ ■●◆ ■● 0 50 100 500 1000 Time Since Contact (log) C. BDMI Loci: sD D. BDMI Loci: r

◆◆ ◆◆ ◆◆ ◆◆ ● ● ● 5 ◆ 5 sD=0.0 ◆ ◆◆ ◆◆◆ ◆ Constrained ● ● ◆ sD=0.001 ◆◆ ◆ ◆◆ 4 ◆◆ 4 ●◆◆ ◆◆ Chromosomal ●◆◆ ● sD=0.003 ● ◆◆ ◆ ◆ ● ◆◆ ◆ ● ● ◆ Animals ◆◆ 3 ◆ 3 ◆◆ ◆ sD=0.005 ●◆◆ ◆◆ ◆◆ ● ● ● ● ● ◆ ◆ ◆ ◆ ◆ ◆◆ ◆◆ ◆◆ 2 ◆◆ ● ● 2 ◆◆ ●● ◆ ◆ ◆◆ ●● ◆◆ ● ◆◆ ◆◆ ●● ◆ ◆◆ ◆◆ ◆ ◆◆ ◆ ●● ◆ ◆◆ ◆◆ ◆◆ ●● ◆◆ ●◆ ◆◆ ◆◆ ◆ ● ◆◆ ◆◆ ◆◆ ◆◆ ◆◆ ◆◆◆ ●◆◆◆ ●● ◆◆ ◆◆ ◆◆ ◆◆◆ ● ● ●◆◆ ●◆ ◆◆ ●◆ ●● ◆ ◆◆ ●◆◆◆ ●◆◆◆ ◆◆ ◆ ◆ ◆ ●◆◆ ◆◆ ●◆◆ ●◆◆◆ ●◆◆◆ ◆◆◆ ●◆ ●◆ ● ●◆◆◆ ●◆ ●◆◆◆ ● ●◆◆ ●◆◆◆ ◆◆ ●● ◆◆◆ ◆◆◆ ◆◆ 1 ◆◆ ●◆ ●◆◆◆ ●◆ ●◆ ●◆ 1 ◆◆ ◆◆◆ ●◆◆◆ ◆◆ ●◆◆◆ ●◆◆◆ ●◆ ●◆ ●◆◆◆ ●● ◆ ● ●◆ ● ◆◆ ●◆◆◆ ◆◆◆ ◆◆◆ ●◆ ●◆ ●◆ ●◆ ●◆ ●◆ Cline Width ◆◆◆ ●◆ ●◆ ● ●◆ ◆◆◆ ●◆ ●◆◆◆ ● 0 0 50 100 500 1000 50 100 500 1000 Time Since Contact (log)

Figure 5: Early temporal dynamics and determinants of cline width at selected and BDMI loci. Panel A: temporal dynamics of cline width at selected loci. Effect of the strength of selection sD on cline width of selected loci. Dark points give mean width across replicates with ±1 SE whiskers. Opaque squares show mean cline width at neutral loci in the same genome. Panel B: Cline width at BDMI loci dark points show mean width with ±1 SE whiskers. Opaque squares show mean for fixed difference loci in the same genome. Panel C: effect of the strength of Figure 7: Early temporalselection on dynamics linked deleterious and determinants loci on cline width of cline of BDMI width loci. at Panel selected D: effect and of BDMI recombination loci.PanelA:temporal scenario on cline width at BDMI loci (note plants case not shown). Unless specified otherwise dynamics of cline widthparameters at selected are given loci. by E boldface↵ect of values the strength in Table of 1. selection sD on cline width of selected loci. Dark points give mean width across replicates with 1SEwhiskers.Opaquesquaresshowmeanclinewidthatneutrallociinthesamegenome. ± 21 Panel B: Cline width at BDMI loci dark points show mean width with 1SEwhiskers.Opaquesquaresshowmeanforfixed ± di↵erence loci in the same genome. Panel C: e↵ect of the strength of selection on linked deleterious loci on cline width of BDMI loci. Panel D: e↵ect of recombination scenario on cline width at BDMI loci (note plants case not shown). Unless specified otherwise parameters are given by boldface values in Table XXX.

8 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

344 Finally, we characterize the population ancestry across the geographic range following sec-

345 ondary contact using the population-wide hybrid index at the nN neutral loci that were not

346 constrained to be fixed (Equation 6). As with the allele frequency clines at the neutral loci, the

347 cline in population ancestry is not sigmoidal as would be expected in the tension-zone model.

348 Affected by allele surfing during range expansion, the cline in population ancestry is gradual,

349 showing a less marked shift in ancestry at the time of secondary contact (black line in Figure 4D).

350 This gradual change in ancestry across the range persists over the course of introgression obscur-

351 ing the location of initial contact, making the hybrid zone appear older than it is. These results

352 suggest that incorporating historical range expansion is important for the accurate reconstruction

353 of the of hybrid zones.

354 Discussion

355 Here we investigated the effect of parental population range expansion and allele surfing prior

356 to secondary contact on hybridization and introgression. As found by previous models of range

357 expansion (e.g., Peischl et al. (2015), Peischl and Excoffier (2015), and Peischl et al. (2013)), we

358 find that the edge of two expanding parental populations can accumulate a significant number of

359 deleterious mutations and hence expansion load. If these deleterious alleles are at least partially

360 recessive, hybridization of range edge populations results in extensive heterosis, as wild-type

361 alleles present in one population mask the effects of deleterious mutations accumulated in the

362 other. This masking effect can compensate for the reduction in hybrid fitness due to genetic

363 incompatibilities (e.g. BDMI) between parental populations facilitating introgression over gener-

364 ations after the initial contact. We further uncovered selective interference between the loci with

365 surfed alleles and BDMI alleles in shaping introgression across species boundaries. The strength

366 of the interference is dependent on the selection against surfed alleles and the distribution of the

367 recombination rate across the genome.

368 In this model we have considered only a small subset of the possible selection regimes acting

22 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

369 in post-expansion secondary-contact hybrid zones. Namely, we have not explored the evolu-

370 tion of loci experiencing environment-dependent or underdominant selection. In addition, we

371 have assumed that each parental population is fixed for a particular combination of compatible

372 BDMI genotypes. In reality the genotypes at these loci are also subject to allele surfing during

373 expansion, which could alter the prevalence of incompatible genotypes among hybrids and the

374 permeability of these loci during intorgression.

375 Hybrid zones

376 Our results may help explain previously reported patterns seen in hybrids and hybrid zones. One

377 commonly observed pattern is variation among cross combinations (both within and between

378 species) in the level of heterosis among hybrid offspring (Lowry et al., 2008). This variation has

379 been exploited by plant breeders to develop heterotic groups for hybrid crop production (Crow,

380 1998). Despite its importance, heterosis remains difficult to predict in natural populations (Pickup

381 et al., 2013). Results from the present study imply that the history of the parental populations

382 may be key. If divergence occurs in large adjacent populations, little heterosis is expected upon

383 contact, whereas divergence in distant locations followed by range expansion with extreme drift,

384 such as explored here, creates genomic conditions that facilitate heterosis and introgession via

385 the masking of deleterious recessive alleles that counteract genetic incompatibilities.

386 Another partially unexplained phenomenon is the relatively high frequency of introgressed

387 regions that appear to be positively selected in hybrid populations (Corbi et al., 2018; Hamilton

388 et al., 2013; Rieseberg et al., 1999). This could be a byproduct of in finite populations,

389 such that some favorable alleles were previously in only one of the parental populations (Barton,

390 2001) or the environment may be different in the hybrid zone centre, favoring a different combi-

391 nation of alleles. Alternatively, it seems likely that a substantial fraction of favorable introgressed

392 regions are due to the masking of deleterious mutations, as shown here.

393 The spatial distribution of population ancestry is usually modeled as a sigmoidal clines, the

394 expected shape assuming selection-dispersal balance in the tension zone model (Barton, 1979).

23 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

395 However, here we found distorted shapes of genomic clines (Figure 4D) as a result of allele surf-

396 ing. This pattern might be undisclosed in many empirical system if the spatial genetic sampling

397 within parental ranges was limited. For future investigations in hybrid zones, our results empha-

398 size the importance of broad spatial sampling centered around the core of admixture in detecting

399 signatures and effects of allele surfing in hybrid zone ancestry dynamics.

400 Genomic architecture of speciation

401 The effects of recombination rate heterogeneity in this expansion-hybridization context alters the

402 conditions under which speciation can be impeded by introgression. Here we find elevated intro-

403 gression at the BDMI loci due to hitchhiking with flanking loci, where surfed deleterious alleles

404 were masked by heterospecific alleles. Interestingly, such interference-facilitated BDMI intro-

405 gression is modulated by the distribution of recombination rates, holding the mean magnitude

406 of recombination rate constant.

407 Our observation extends the existing understanding of the effect of recombination on intro-

408 gression by accounting for the distribution (in addition to the magnitude) of recombination and

409 the interference among neutral loci, BDMI loci, and loci with surfed deleterious alleles during

410 parental range expansion. Both allele surfing and introgression are expected to be shaped by the

411 magnitude and distribution of recombination events across the genome. In the expansion context,

412 constrained recombination can reduce absolute fitness at the range edge suppressing population

413 sizes and even resulting in substantial reductions in expansion rate (Peischl et al., 2015). In the

414 introgression context, local recombination suppression due to chromosomal rearrangements can

415 prevent introgression and facilitate differentiation (Navarro and Barton, 2003; Noor et al., 2001;

416 Rieseberg, 2001).

417 In contrast to previous results (Peischl and Excoffier, 2015), we find that recombination has

418 relatively little impact on the accumulation of expansion load (Figure S2) and thus F1 heterosis

419 (Figure 2D), even in the case of chromosomal segregation. The contrast between these results may

420 indicate that only small amounts of recombination are necessary to dramatically reduce selective

24 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

421 interference during range expansion. However for introgression, we find that genetic linkage

422 and the distribution of recombination have large impacts on introgression, effects that can persist

423 for hundreds of generations after the onset of hybridization (Figure 5). Specifically, introgression

424 at neutral loci is facilitated by selection at linked selected loci. Even the introgression at BDMI

425 loci is elevated under stronger selection at those deleterious loci (Figure 5). On the other hand,

426 heterogeneity in the recombination rate suppresses introgression at BDMI loci helping maintain

427 barriers to hybridization. Therefore, the expansion-introgression dynamics could be variable in

428 organismal systems with different level of recombination heterogeneity in the genomes. These re-

429 sults highlight how the recombination landscape and range expansion interact to shape genomic

430 ancestry at species boundaries.

431 Hybrid speciation

432 Our results also have implications for long term outcomes of natural hybridization. Most ob-

433 viously, our simulations indicate that levels and heterogeneity of introgression may be greater

434 than that predicted by standard hybrid zone models (Barton and Hewitt, 1985). Strong heterosis

435 could contribute to the weakening of reproductive barriers, sustained hybrid swarms/species,

436 or even the fusion of previously isolated populations. High level of heterosis would reduce the

437 strength of reinforcement. This leads to the interesting prediction that reinforcement might be

438 more likely when populations have diverged in adjacent areas rather than following extensive

439 range expansion. This differs from the classic scenario that ignores allele surfing, in which re-

440 inforcement “completes speciation” following range expansion and contact between previously

441 allopatric populations (Dobzhansky, 1937).

442 On the other hand, expansion-contact induced heterosis might sustain hybrid populations

443 for extended period of time, allowing hybrid population to diverge from parental lineages under

444 divergent selection (Buerkle et al., 2000). One compelling example resides in a North American

445 wood warblers , where hybrid populations existed long after post-expansion-

446 hybridization between the parental species (Wang et al., 2021). Heterosis could sustained the

25 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

447 ancient hybrid population despite genetic incompatibility between the parental populations. The

448 hybrid populations partitioned distinct climatic niche and diverged from parental lineages at

449 genomic regions related to climatic adaptation (Wang et al., 2021). The expansion-hybridization

450 resulted fusion, hybrid swarms, or hybrid species could be prevalent yet hidden in many extent

451 populations with histories of expansion-admixture.

452 Genomic invasion

453 The heterotic effect of post-expansion hybridization can also facilitate introgression of native

454 species genetic material between native and invasive species. In this case, if invasive species have

455 undergone the most recent range expansion, deleterious alleles may have surfed to higher fre-

456 quencies, facilitating introgression from stable native populations. Here we did not model inva-

457 sion explicitly (asymmetry of expansion, in which the invasive species expanded its range while

458 the native species did not). However, based on the expansion-hybridization induced heterosis

459 and introgression our results predict, we expect masking of deleterious alleles in the invasive

460 species following hybridization with its native congener. Such hybridization-induced heterosis

461 could be the genomic mechanism underlying hybridization-accelerated anthropogenic invasions,

462 including the process in which modern replaced (Mallet, 2005; Mesgaran

463 et al., 2016). Ironically, native species could potentially rescue invaders that are ecologically detri-

464 mental to them. Our result raises questions about the conservation of native populations in the

465 face of anthropogenic invasion of closely related and rapidly-expanding invasive populations.

466 Acknowledgments

467 The Authors would like to thank Darren Irwin and Brett Payseur for their helpful feedback in

468 the development of this project and to Matt Pennell for his helpful comments on the manuscript.

469 AM was supported by a fellowship from the University of British Columbia and by the EEB

470 department Postdoctoral Fellowship from the University of Toronto and support from the Natu-

26 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

471 ral Sciences and Engineering Research Council of Canada (PGS D 331015731 to SW and NSERC

472 RGPIN-2016-03711 to SPO).

473 Literature Cited

474 Aeschbacher, S., J. P. Selby, J. H. Willis, and G. Coop. 2017. Population-genomic inference of the

475 strength and timing of selection against gene flow. PNAS 114:7061–7066.

476 April, J., R. H. Hanner, A.-M. Dion-Cotˆ e,´ and L. Bernatchez. 2013. Glacial cycles as an allopatric

477 speciation pump in north-eastern American freshwater fishes. Mol. Ecol. 22:409–422.

478 Arntzen, J. W., W. de Vries, D. Canestrelli, and I. Mart´ınez-Solano. 2017. Hybrid zone formation

479 and contrasting outcomes of secondary contact over transects in common toads. Mol. Ecol.

480 26:5663–5675.

481 Avise, J. C., D. Walker, and G. C. Johns. 1998. Speciation durations and Pleistocene effects on

482 vertebrate phylogeography. Proc. R. Soc. Lond. B 265:1707–1712.

483 Barton, N. H. 1979. The dynamics of hybrid zones. Heredity 43:341–359.

484 ———. 2001. The role of hybridization in evolution. Mol. Ecol. 10:551–568.

485 Barton, N. H., and K. Gale. 1993. Genetic analysis of hybrid zones. Pages 13–45 in R. Harrison,

486 ed. Hybrid Zones and the Evolutionary Process. Oxford Univ. Press, Oxford, UK.

487 Barton, N. H., and G. M. Hewitt. 1985. Analysis of Hybrid Zones. Annu. Rev. Ecol. Evol. Syst

488 16:113–148.

489 ———. 1989. Adaptation, speciation and hybrid zones. Nature 341:497–503.

490 Buerkle, C. A. 2005. Maximum-likelihood estimation of a hybrid index based on molecular

491 markers. Molecular Ecology Notes 5:684–687.

27 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

492 Buerkle, C. A., R. J. Morris, M. A. Asmussen, and L. H. Rieseberg. 2000. The likelihood of

493 homoploid hybrid speciation. Heredity 84:441–451.

494 Corbett-Detig, R., and R. Nielsen. 2017. A Hidden Markov Model Approach for Simultaneously

495 Estimating Local Ancestry and Admixture Time Using Next Generation Sequence Data in

496 Samples of Arbitrary Ploidy. PLOS Genetics 13:e1006529.

497 Corbi, J., E. J. Baack, J. M. Dechaine, G. Seiler, and J. M. Burke. 2018. Genome-wide analysis

498 of allele frequency change in sunflower crop–wild hybrid populations evolving under natural

499 conditions. Mol. Ecol. 27:233–247.

500 Crow, J. F. 1998. 90 years ago: The beginning of hybrid maize. Genetics 148:923–928.

501 Dobzhansky, T. 1937. Genetics and the Origin of Species. Columbia University Press, New York.

502 Duvernell, D. D., E. Westhafer, and J. F. Schaefer. 2019. Late Pleistocene range expansion of North

503 American topminnows accompanied by admixture and introgression. J. Biogeogr. 46:2126–

504 2140.

505 Edmonds, C. A., A. S. Lillie, and L. L. Cavalli-Sforza. 2004. Mutations arising in the wave front

506 of an expanding population. PNAS 101:975–979.

507 Endler, J. A. 1977. Geographic Variation, Speciation, and Clines. Princeton University Press,

508 Princeton, New Jersey.

509 Excoffier, L., M. Foll, and R. J. Petit. 2009. Genetic Consequences of Range Expansions. Annu.

510 Rev. Ecol. Evol. Syst 40:481–501.

511 Gao, Y.-D., Y. Zhang, X.-F. Gao, and Z.-M. Zhu. 2015. Pleistocene glaciations, demographic ex-

512 pansion and subsequent isolation promoted morphological heterogeneity: A phylogeographic

513 study of the alpine Rosa sericea complex (Rosaceae). Scientific Reports 5:11698.

28 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

514 Haenel, Q., T. G. Laurentino, M. Roesti, and D. Berner. 2018. Meta-analysis of chromosome-scale

515 crossover rate variation in and its significance to evolutionary genomics. Mol. Ecol.

516 27:2477–2497.

517 Haffer, J. 1969. Speciation in Amazonian Forest Birds. Science 165:131–137.

518 Hamilton, J. A., C. Lexer, and S. N. Aitken. 2013. Differential introgression reveals candidate

519 genes for selection across a spruce (Picea sitchensis x P. glauca) hybrid zone. New Phytol.

520 197:927–938.

521 Hewitt, G. M. 1999. Post-glacial re-colonization of European biota. Biol. J. Linn. Soc. 68:87–112.

522 Janousek,ˇ V., P. Munclinger, L. Wang, K. C. Teeter, and P. K. Tucker. 2015. Functional organization

523 of the genome may shape the species boundary in the house mouse. Mol. Biol. Evol 32:1208–

524 1220.

525 Juric, I., S. Aeschbacher, and G. Coop. 2016. The Strength of Selection against

526 Introgression. PLOS Genetics 12:e1006340.

527 Kirkpatrick, M. 2017. The Evolution of Genome Structure by Natural and . The

528 Journal of Heredity 108:3–11.

529 Klopfstein, S., M. Currat, and L. Excoffier. 2006. The Fate of Mutations Surfing on the Wave of a

530 Range Expansion. Mol. Biol. Evol. 23:482–490.

531 Licona-Vera, Y., J. F. Ornelas, S. Wethington, and K. B. Bryan. 2018. Pleistocene range expan-

532 sions promote divergence with gene flow between migratory and sedentary populations of

533 Calothorax hummingbirds. Biol. J. Linn. Soc. 124:645–667.

534 Lowry, D. B., J. L. Modliszewski, K. M. Wright, C. A. Wu, and J. H. Willis. 2008. The strength

535 and genetic basis of reproductive isolating barriers in flowering plants. Philos. Trans. R. Soc. B

536 363:3009–3021.

29 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

537 Mallet, J. 2005. Hybridization as an invasion of the genome. Trends Ecol. Evol. 20:229–237.

538 Martin, S. H., J. W. Davey, C. Salazar, and C. D. Jiggins. 2019. Recombination rate variation shapes

539 barriers to introgression across butterfly genomes. PLOS Biology 17:e2006288.

540 Matute, D. R., A. A. Comeault, E. Earley, A. Serrato-Capuchina, D. Peede, A. Monroy-Eklund,

541 W. Huang, C. D. Jones, T. F. C. Mackay, and J. A. Coyne. 2020. Rapid and Predictable Evolution

542 of Admixed Populations Between Two Drosophila Species Pairs. Genetics 214:211–230.

543 May, R. M., J. A. Endler, and R. E. McMurtrie. 1975. Gene Frequency Clines in the Presence of

544 Selection Opposed by . Am. Nat. 109:659–676.

545 Mesgaran, M. B., M. A. Lewis, P. K. Ades, K. Donohue, S. Ohadi, C. Li, and R. D. Cousens. 2016.

546 Hybridization can facilitate species invasions, even without enhancing local adaptation. PNAS

547 113:10210–10214.

548 Navarro, A., and N. H. Barton. 2003. Accumulating Postzygotic Isolation Genes in Parapatry: A

549 New Twist on Chromosomal Speciation. Evolution 57:447–459.

550 Noor, M. a. F., and S. M. Bennett. 2009. Islands of speciation or mirages in the desert? Examining

551 the role of restricted recombination in maintaining species. Heredity 103:439–444.

552 Noor, M. A. F., K. L. Grams, L. A. Bertucci, and J. Reiland. 2001. Chromosomal inversions and

553 the reproductive isolation of species. PNAS 98:12084–12088.

554 Peischl, S., I. Dupanloup, M. Kirkpatrick, and L. Excoffier. 2013. On the accumulation of delete-

555 rious mutations during range expansions. Mol. Ecol. 22:5972–5982.

556 Peischl, S., and L. Excoffier. 2015. Expansion load: Recessive mutations and the role of standing

557 . Mol. Ecol. 24:2084–2094.

558 Peischl, S., M. Kirkpatrick, and L. Excoffier. 2015. Expansion Load and the Evolutionary Dynam-

559 ics of a Species Range. Am. Nat. 185:E81–E93.

30 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

560 Pickup, M., D. L. Field, D. M. Rowell, and A. G. Young. 2013. Source population characteristics

561 affect heterosis following genetic rescue of fragmented plant populations. Proc. R. Soc. Lond.

562 B 280:20122058.

563 Rieseberg, L. H. 2001. Chromosomal rearrangements and speciation. Trends Ecol. Evol. 16:351–

564 358.

565 Rieseberg, L. H., J. Whitton, and K. Gardner. 1999. Hybrid Zones and the Genetic Architecture

566 of a Barrier to Gene Flow Between Two Sunflower Species. Genetics 152:713–727.

567 Roux, C., C. Fra¨ısse, J. Romiguier, Y. Anciaux, N. Galtier, and N. Bierne. 2016. Shedding Light

568 on the Grey Zone of Speciation along a Continuum of Genomic Divergence. PLOS Biology

569 14:e2000234.

570 Schumer, M., C. Xu, D. L. Powell, A. Durvasula, L. Skov, C. Holland, S. Sankararaman, P. An-

571 dolfatto, G. G. Rosenthal, and M. Przeworski. 2017. interacts with the local

572 recombination rate to shape the evolution of hybrid genomes. bioRxiv page 212407.

573 Servedio, M. R., and M. A. Noor. 2003. The Role of Reinforcement in Speciation: Theory and

574 Data. Annu. Rev. Ecol. Evol. Syst. 34:339–364.

575 Slatkin, M. 1973. Gene Flow and Selection in a Cline. Genetics 75:733–756.

576 Stapley, J., P. G. D. Feulner, S. E. Johnston, A. W. Santure, and C. M. Smadja. 2017. Variation in

577 recombination frequency and distribution across eukaryotes: Patterns and processes. Philo-

578 sophical Transactions of the Royal Society B: Biological Sciences 372:20160455.

579 Szymura, J. M. 1976. Hybridization between Discoglossid toads Bombina bombina and Bombina

580 variegata in southern Poland as revealed by the electrophoretic technique. J. Zool. Syst. Evol.

581 Res 14:227–236.

582 Taberlet, P., L. Fumagalli, A.-G. Wust-Saucy, and J.-F. Cosson. 1998. Comparative phylogeography

583 and postglacial colonization routes in Europe. Mol. Ecol. 7:453–464.

31 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

584 Turelli, M., and H. A. Orr. 2000. Dominance, Epistasis and the Genetics of Postzygotic Isolation.

585 Genetics 154:1663–1679.

586 Wang, S., M. J. Ore, E. K. Mikkelsen, J. Lee-Yaw, D. P. L. Toews, S. Rohwer, and D. Irwin.

587 2021. Signatures of mitonuclear in a warbler species complex. bioRxiv page

588 2020.04.06.028506.

589 Weir, J. T., and D. Schluter. 2004. Ice sheets promote speciation in boreal birds. Proc. R. Soc.

590 Lond. B 271:1881–1887.

32 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

591 Appendix A: Ancestry of Selected Loci

592 The allele frequency clines at the selected loci are characterized by non-sigmoidal clines (see

593 Figure 4B). Namely these clines are determined primarily by loci for which the allele frequency

594 in both core populations, A and B, are low as expected at mutation-selection balance. This is

595 in contrast to the allele frequency clines at neutral loci, as shown in Figure 4C, for which the

596 allele frequency differences between the core A and B tend to be largest. The temporal dynamics

597 of these two cline types differ as well. The dynamics of the neutral clines are characterized

598 predominantly by changes in the cline width anchored around a constant center ({c,e}). In

599 contrast, the selected loci clines exhibit dramatic changes in their width, center, and height over

600 time (Figure 4).

601 Following secondary contact, the temporal dynamics at the selected loci are determined by

602 both the introgression of beneficial alleles from the point of secondary contact and the spread of

603 these same beneficial alleles from the range core. As with the neutral loci, initially the introgres-

604 sion of alleles from the point of secondary contact increases the cline width. As beneficial alleles

605 from secondary contact and from the range core meet, the cline width stabilizes and the height

606 and the center of the cline begins to fall. The net result is shown in Figure 5, with the selected

607 clines exhibiting relatively narrow clines compared to the corresponding neutral loci.

608 To confirm this explanation of the clinal dynamics we ran a modified version of the simu-

609 lations where each selected locus is replaced with a two-locus haplotype with complete linkage

610 (r = 0). The first locus of these haplotypes is the original selected locus and the second locus is a

611 neutral locus fixed for allele 1 across the range of population A and for allele 0 across the range

612 of population B, as with the fixed difference loci above. In this way the ancestry of beneficial alle-

613 les, whether they are from the hybrid zone or spreading from the range core, can be determined

614 while not modifying genomic linkage. Initially the sigmoidal shape and cline width in selected

615 locus ancestry resembles that of the neutral fixed difference loci (Figure S3). On longer time

616 scales (1000-5000 generations post after introgession) the width of the clines in selected locus

33 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

617 ancestry slightly exceeds that of the fixed difference loci due to the beneficial effects of the intro-

618 gressing alleles. This effect is small and only visible on longer time scales due to the relatively

619 weak selection at these loci (sD = 0.03).

620 Appendix B: Supplementary Figures

A. Ecological Dynamics B. Evolutionary Dynamics

1.015 220 1.010 200 s =0.0 1.005 D 180 sD=0.001 1.000 sD=0.003 160 0.995 sD=0.005 Viability 0.990 140

Population Size 0.985 0 2500 5000 7500 10 000 0 2500 5000 7500 10 000 C. D. 1.01

200 1.00

0.99 Free Constrained 150 0.98 Chromosomal 0.97 Plants Animals 100 Viability 0.96

0.95 Population Size

0 2500 5000 7500 10 000 0 2500 5000 7500 10 000 Time in Allopatry

Figure S1: Eco-evolutionary dynamics in Allopatry. The population size (panels A and C) and mean population viability (panels B and D) of parental populations evolving in allopatry. Dashed lines in panels B and D give maximum possible viability. Top Row: Effect of selection. Bottom Row:Figure 2: EffectEco-evolutionary of recombination. dynamics Chromosomalin Allopatry. The scenario population results size (panels in A significantly and C) and mean reduced population viability viability and(panels subsequently B and D) of parental population populations size evolving due in to allopatry. the accumulation Dashed lines in of Bmutation and D give maximum via Muller’s possible ratchet. viability. UnlessTop Row: specified E↵ect of selection. otherwise, Bottom parameters Row: E↵ect are of recombination. given by boldface Chromosomal values scenario in Table results 1 in significantly reduced viability and subsequently population size due to the accumulation of mutation via Muller’s ratchet. Unless specified otherwise parameters are given by boldface values in table XXX.

34

3 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

A. Selection B. Recombination

0.10 Mean h=0 0.10 sD=0.0 Free Replicate h=0 Constrained sD=0.001 0.08 0.08 Chromosomal sD=0.003 Mean h=0.25 Plants s =0.005 Animals 0.06 D 0.06

0.04 0.04

0.02 0.02

Expansion Load 0.00 0.00

0 500 1000 1500 2000 0 500 1000 1500 2000 Generations of Expansion

Figure 4: Accumulation of load during range expansion. Load as measured by equation XXX over the course of range Figure S2: Accumulation of load during range expansion. Load as measured by Equation (3) expansion. Panel A: e↵ect of the strength (color) and dominance (solid: hD =0,dashed: hD =0.25). Dark curves give mean overdynamics the whereas course opaque of range curves expansion. give dynamics Panelfor individual A: Effect replicates. of the Panel strength B: shows (color) the e↵ect and of recombination dominance scenario (solid: on h = 0, dashed: h = 0.25). Dark curves give mean dynamics whereas light curves give dynam- theD accumulation of loadD during range expansion. Unless specified otherwise parameters are given by boldface values in table ics for individual replicates for (shown for h = 0 only). Panel B: Effect of recombination scenario XXX. on the accumulation of load during range expansion. Unless specified otherwise parameters are given by boldface values in Table 1.

35

5 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.28.066308; this version posted July 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

A. Cline Shapes Across Loci B. Cline Width

1.0 ◆◆ ● 20 ■◆ Fixed Diff. ◆◆ ● ◆ ◆◆ 0.8 BDMI ● ■ ◆ Neutral 15 ◆◆ ■ ● ◆ 0.6 Selected ■

Selected Ancestry ◆◆ ●◆ 10 ●◆◆ ■ ■◆ 0.4 ◆◆

Width ● ■◆ ◆◆ Frequency ■●◆ 5 0.2 ◆◆ ◆◆ ◆◆ ■●● ◆ ◆◆ ●■●◆ ◆◆ ■●◆ ■◆ ●◆◆ ■●◆ ◆◆ ●◆◆ ■◆ ◆◆ ■●◆ ■◆ ■●◆◆◆ ■●◆ 0.0 0 0 20 40 60 80 100 50 100 500 1000 5000 C. Neutral Fixed Di↵erence Cline D. Selected Haplotypes 1.0 1.0 GI 100 100 0.8 1000 1000 0.8 2000 2000 3000 0.6 3000 0.6 4000 4000 5000 0.4 5000 0.4 Frequency 0.2 0.2

0.0 0.0 0 20 40 60 80 100 0 20 40 60 80 100 Location Location

Figure S3: Dynamics of selected loci clines with known ancestry. Panel A: The cline shape for each locus after 1000 generations of introgression including the cline in the ancestry at the selected loci as determined by the linked neutral fixed difference (grey). Panel B: The width of the cline at the neutral fixed difference loci (light squares) versus the clines in selected locus ancestry Figure 8: Clinal dynamics and population ancestry. Clines in average allele/haplotype frequency are calculated by (dark points). Panel C: Temporal dynamics in fixed difference clines fixed for the 1 allele across averaging across loci/genotypes of a specific type (e.g., neutral, selected, fixed di↵erence, BDMI,selected haplotypes) for which the range of parental population A and for the 0 allele across the range of parental population pA < 0.5andpB > 0.5. Panel A: shape of the allele frequency cline for each locus type at generation 1000. Gray curve B.edge Panel D: Temporaledge dynamics in selected cline ancestry. Parameters are given by the bold face gives cline in haplotype ancestry at the selected locus with 1 being population A and 2 being population B. Panel B: Temporal values in Table 1 dynamics cline width in selected haplotypes (dark points) with 1SE whiskers and cline width of neutral fixed di↵erence loci. . ± Panel C: Temporal dynamics of allele frequency clines at neutral fixed di↵erence loci. Panel D: Temporal dynamics of cline in selected locus ancestry as determined by the selected haplotype. Parameters are given by boldface values in Table XXX.

36

9