Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. fpplto itr icl.Freape ohpplto xasosadaatto edt nexcess an to lead adaptation inference and making expansions other population each both confound can example, For signals the to these difficult. (Charlesworth However, key history genes location 2013). population in Petrov new of adaptation alleles and this local rare Messer in and of 2007; evolution population excess McVean (Tajima adaptive an the new of population by the of signatures the signaled of and success both in capacity variation, environment, alleles carrying genetic new the standing rare the fill or of in to challenges loss grow unique will the the (Excoffier population in to the small result adapt (Charlesworth bottleneck, a and bottlenecks only the location) niche (as These After new bottleneck 2004). the population 2011). Gillespie a in Durbin 1989; through establish and goes will Li first it population 2009; location the new of a to proportion migrates population a When oteeeet r fe eti atrso oeua aito ihnadbtenseis(Charlesworth species between and within variation molecular of patterns in left often al. are events between these migration to of rates (Smith fitness, al. organism ranges influence population habitats, and reorganize Holmgren locations, fundamentally 2000; human- can (Hewitt both shifts shifts) to mental tectonic due and Rosenzweig changes (glaciation 1978; environmental growing humans (Cloudsley-Thompson substantial the and crisis) undergone weather climate extreme has expansion, anthropogenic earth desert destruction, the environment (anthropogenic years, events mediated 25,000 past the positive In recurrent of evidence find we Introduction contrast, In peptides. support antimicrobial little adaptation. Toll-regulated of find the local we signatures and recent However, and system suggesting immunity. Toll-signaling migration, antifungal the genes this and in these following development selection contraction in cuticle population selection in a recurrent sweeps though of selective structure, for signatures population and of find adaptation evidence and we local KYA), little migration, recent (30-100 shows system recent period innubila spatially-distributed glaciation D. a previous Surprisingly, this with KYA). the for (0.2-2 during consistent its northwards model further spread even to genomic innubila data expanded D. a adapt these recently genomic find establish in We has and population history to ecology. inhabit Using population adaptation understood their to well local determine desert. a came to of with of populations innubila signatures expanses distinct for four large examine looking how by we mountain-forests, characterize individuals, separated wild-caught we crisis USA 300 climate southwestern than Here, local current more in the to from forests as species. adapt mountain especially many can interest, that range: for great current subpopulations of shifts creating is range changes isolated, these caused become following has and adapt contract, species how expand, Understanding can conditions. species of populations time, Over Abstract 2020 17, June 1 Hill Tom Island’ ‘Sky a in variation species genomic shape demography and Selection nvriyo Kansas of University 09 Cini 2009; 03 Wright 2003; tal. et 1 n o Unckless Rob and tal. et 09 White 2009; tal. et 02 Porretta 2012; 03 Excoffier 2003; tal. et 1 tal. et 03.Ti dpaincnivleslciesep rmnwmutations new from sweeps selective involve can adaptation This 2013). tal. et 02 Antunes 2012; tal. et 2009). 95 Astanei 1995; 1 tal. et 05.Sgaue ftewyognssadapt organisms way the of Signatures 2015). tal. et tal. et tal. et 05 Rosenzweig 2005; 03 emso n enns2005; Pennings and Hermisson 2003; tal. et 03 uvy20) hs environ- These 2005). Survey 2003; 08 n vnsurltdto unrelated events and 2008) tal. et 03 Excoffier 2003; tal. et 08 Searle 2008; tal. et et et Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. fteMdenacieaoi rsot(yradJeie20) eesl ape rmta location, that from sampled easily We collect to 2005). unable Jaenike also and were and flies) we (Dyer (though 84 invasion collect Prescott (PR, recent to possible Prescott in failed a mostly flies), archipelago suggesting surveys 48 previous Madrean Interestingly, (HU, the 1). Huachucas (Figure of flies), flies) 81 67 (SR, between (CH, Ritas Chiricahuas flow Santa gene Arizona: across of locations levels high how characterize shows To and range geographic its populations expanded isolated recently of geographically efficacy has reduced innubila for Drosophila compensating of metabolic means in a as changes Results evi- even to strong possibly adaptation with and potential to genome, location, suggesting due nuclear between selection translocations, the host between these into the toxins development of of translocations and cuticle adaptation process mitochondrial pathogens for fungal of local as in evidence of such difference find dence adaptation, a also local We potentially of mito- suggesting signatures locations. the resistance, some in to pathogen 2L find limited with fungal we syntenic structure and However, is with which 2006). location, B O’Grady element by (Muller and genomes structure chromosome whole single population resequenced a of We and evidence chondria population. little local of and find to populations we (Dyer specific four Surprisingly, and pathogens evo- from recent individuals with how or of wild-caught recurrent understand coevolution bouts of is to the adaptation replicate wanted this regarding observe also if We interesting to and 2011a). particularly opportunity Unckless is rare 2005; a this Jaenike provide and of migration change history limited lutionary migratory with and demographic populations the Isolated reconstruct workhorse to genetic model sought cosmopolitan, ecological We commensal, mushroom-feeding human endemic, ‘island’ the the to ways counterpoint many a in fact, Dyer In 2005; 2012). Jaenike and Dyer (Dyer maximum 2005; more glacial become last have the model they during arrived years, have 100 to past the (McCormack innubila in adaptation Drosophila climate and migra- migration changing drive limiting the may presumably to McCormack which desert, 2005; due arid, of (Survey However, miles species diversity. of most hundreds ecological numerous for (McCormack by locations desert contains separated between of habitat tion Mexico, expanses forest leaving northwestern large retreated, by and then separated USA islands’, southwestern ‘Sky human Coe as our in how known identify located mountains to forested or archipelago, identify location, Madrean new to a 2011). populations The in Durbin human population, humans and in a a (Li utilized of population of globally frequently establishment invading spread the expansion the also ancestors between upon are and divergence fixed signatures in establishment have These increase results which the adaptation an traits population. Following and local diversity original and variants pairwise the 1989). rare population and of in (Tajima excess decrease D an a is variants, Tajima’s there rare statistic, of deficit the a in by detected be can This alak fivsv pce ouaingnmc nld intrso oteek iil ntest fre- site the in visible bottlenecks (Charlesworth of populations signatures between include differentiation genomics and spectrum, population quency species invasive of hallmarks intrso eorpi hneaefeunl eetdi pce hthv eetyudroerange undergone recently Guindon have 2003; that Yohe and species (Astanei (Wright Parmesan in signal introduction 2000; detected the human (Hewitt frequently of to cause are due true change expansion the demographic identify of to Signatures required is analysis thorough al. more et meaning alleles, rare of tal. et 2003). .melanogaster D. 02.Tee‘sad’wr once yls oet uigtepeiu lca aiu which maximum glacial previous the during forests lush by connected were ‘islands’ These 2012). tal. et 05 ank n yr20;Ucls 01;Ucls n ank 01 Coe 2011; Jaenike and Unckless 2011a; Unckless 2008; Dyer and Jaenike 2005; samycophageous a is .innubila D. , .innubila D. neto JeieadDe 2008). Dyer and (Jaenike infection aet nai t urn ag,w olce isfo orSyisland Sky four from flies collected we range, current its inhabit to came a elsuideooy(ahieadSlan20;De n Jaenike and Dyer 2004; Silvain and (Lachaise ecology well-studied a has Drosophila tal. et .innubila D. tal. et pce on hogotteeSyilnsadthought and islands Sky these throughout found species 2 05 Excoffier 2005; 00 Walsh 2010; tal. et nfu ieetSyiln onanranges. mountain island Sky different four in tal. et .innubila D. .innubila D. 09 Coe 2009; tal. et tal. et 09.Teilnsaehtesof hotbeds are islands The 2009). tal. et .innubila D. .melanogaster D. 01 Cini 2011; 09 rtecagn climate changing the or 2009) tal. et tal. et dp oterlclclimate local their to adapt 03 iadDri 2011). Durbin and Li 2003; naiigteSyislands. Sky the inhabiting .melanogaster) D. 2012). 05.Ulk h lab the Unlike 2005). .innubila D. nteeatlocations exact the in tal. et .innubila D. . 02.Other 2012). tal. et represent (Markow north tal. et 2009; Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. cnl ihri R(iue2,GMtvle=93.728, = t-value F GLM PR, 2A, into (Figure migration and PR upon (Weir consistent However, contraction in populations results. population higher between Structure/StairwayPlot recent icantly the flow more with F gene consistent a higher nuclear are with is results with mitochondrial there consistent and contrast, clear 0.00567), In = 1984). the median Cockerham as total populations, F 1B, other all (Figure 1984). and Cockerham population and each (Weir between F differentiation index, of amount fixation the calculated also We grey Light flow. gene complete indicating 0 border. and Arizona populations the isolated show completely lines and indicating (PR) 1 in Prescott arrows with (HU), category, of Huachucas Thickness (CH), Chiracahua’s polymorphism. (AZ), mitochondrial (SR). Arizona in Ritas locations Santa flow sample gene four consistent the still across A. is previous there 1: that with suggest Figure consistent these males. 1D), Together some via Figure 2005). potentially but (Supplementary Jaenike populations 1C) and genome between Figure Dyer mitochondrial (Supplementary 2004; the (Dyer genome for findings nuclear location the popu- of by for signature structure locations (Falush corresponding year between Structure a differentiation 200-2000 Using find to the population populations. expected than the we among ancient differentiation populations, or between lation recent isolation more geographic the be Given could invasion the meaning PR, (200-2000 estimate. for recently windows more of much error absence settled large the to was of of appears population combination expansion population PR HU in northern the the from This, ago), while settling years Specifically, ago). populations B). thousand years & with (10-30 1A through ago, first Figure go years settled Supplementary to have thousand 1A, appears thirty (Figure then north and population to one surveyed south Each between expansions 2005). (Survey population Arizona separate in occurring period glaciation (N N size an of population N effective history estimated ancestral demographic current an a in have changes (Falush populations and sampled Structure all structure in sampled population polymorphism our the silent from examined using then DNA We the variation. sequenced to and accommodating isolated more associated when conditions be 1). determine could to (Figure To it leading isolation event, geographic area colonization of recent the kilometers a of 300 was climate over this changing If the 2005). Jaenike with and (Dyer sampled) previously e f1,0-000(iue1 upeetr iue1 ) hsbtlnc onie ihaknown a with coincides bottleneck This B). & 1A Figure Supplementary 1, (Figure 10,000-20,000 of e ceai fterneepninof expansion range the of Schematic f˜,0,0 niiul,tog l xeineabtlnc bu 0,0 er g to ago years 100,000 about bottleneck a experience all though individuals, ˜4,000,000 of ST B sn h oa oyopimars h ulreeet ( elements Muller the across polymorphism total the using , .innubila D. .innubila D. and C. umr fSrcueFtrslsfor results Structure/Fst of Summary ST salse ahpplto n ae fmgainbtenlctos we locations, between migration of rates and population each established Fgr ) lont htSaraPo LuadF 05 a estimated has 2015) Fu and (Liu StairwayPlot that note Also 1). (Figure per ob eeal o costegnsi the in genes the across low generally be to appears tal. et ST .innubila D. ewe iohnra eoe Fgr C.Bt nu- Both 1C). (Figure genomes mitochondrial between B 3 .innubila D. .innubila D. 03 n tiwylt(i n u21) efind We 2015). Fu and (Liu StairwayPlot and 2003) and p C vle=27e12.Tog RF PR Though 2.73e-102). = -value niae h eino F of median the indicates nP ni 21 apigsget recent suggests sampling ˜2016 until PR in n iVbsdo tiwyltresults StairwayPlot on based DiNV and ouain n hrceie genomic characterized and populations tal. et B. ST uooa oyopimand polymorphism autosomal ftencergnm ssignif- is genome nuclear the of 03,w n upiigylittle surprisingly find we 2003), e f˜,0,0 niiul and individuals ˜1,000,000 of ) Drosophila .innubila D. .innubila D. ST o ee neach in genes for chromosomes) .innubila D. ST despite , genome sstill is C. Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. eatmtdt dniyi hseeae tutr sdet hoooa nesos oprn F comparing inversions, chromosomal to 10.402, due = is t-value structure (GLM elevated B this element some if suggesting Muller identify 3), to to Figure (Supplementary unique attempted elements population We Muller other structured all of to compared form D Tajima’s elevated has B 30.02, F = elevated t-value of GLM regions 2, Figure inversions (Supplementary segregating populations with all associated F is previously, mentioned genome As innubila D. the in structure Population genome the whole from the differences significant mark F, diamond & white E and D, + In a with visualization. marked marked of are are F CH ease from population of for differences average each removed significant in cases are all background outliers C genome and & B * A, an In with genes). antifungal (excluding genes population. immune each A. for population. D each population. Tajima’s populations. each of other Distribution for all statistics versus D, Tajima’s Summary population negative 3). wide in 2: Figure genome (Figure general (Supplementary a Figure population in expansion show the demographic polymorphism populations from recent of other alleles a 4.39, deficit rare the with Conversely, a removing = consistent bottleneck, t-value also 3). population Figure GLM is recent Supplementary there 2C, more 2C, that (Figure a suggests populations with is This consistent other diversity PR, all population 2). pairwise Table than recent location), Supplementary higher a -19.728, new 1.15e-05, significantly with = a is t-value expected in D As GLM establishment Tajima’s 2B, and 1989). (Figure migration (Tajima lower recent polymorphism and significantly (suggesting popu- diversity total other pairwise PR using like statistics in gene B genetic contraction element each population Muller the for on calculated D outliers also We Tajima’s some with 2). 0.0105), Figure (Supplementary = lations median (PR genome-wide low extremely ST o ahpopulation. each for E. ST F ST r iia nec ouain(upeetr iue2.Adtoal,Mle element Muller Additionally, 2). Figure (Supplementary population each in similar are itiuinfrctclrpoen o ahpopulation. each for proteins cuticular for distribution ST ssgicnl ihro ulreeetBcmae oalohreeet in elements other all to compared B element Muller on higher significantly is B. itiuino ariedvriyfrec population. each for diversity pairwise of Distribution D. 4 F ST p vle=23e8,SplmnayTbe2 and 2) Table Supplementary 2.33e-86, = -value itiuinfrAtfna soitdgnsfor genes associated Antifungal for distribution p itiuino F of Distribution vle=357-6.O ulreeetB, element Muller On 3.567e-56). = -value F. ST F p ST cosgnsfreach for genes across vle=2.579e-25). = -value itiuinfrall for distribution p vle= -value ST fa of C. Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. neso,otieo nivrin erteivrinbekons(ihn1kp ro ieetMuller different a on or 10kbp) (within breakpoints inversion the near inversion, an element. of outside inversion, vdnefrlclaatto nec population each in adaptation local for Evidence than greater and frequency 1% than higher at inversions segregating of population total 100kbp. the in element the frequency Muller on on inversions breakpoints 3: of and presence Figure inversions the between actual in link differentiation the a elevated suggest identified and do correctly results B have our this, not Despite may chromosome. we 3A), (Figure inversions (Chakraborty supported well 7.379, not F = higher t-value the drives GLM force chromosomes: other all F -0.178, vs = higher t-value have GLM B 3C, element (Figure outside from difference F (F F populations higher specific with p presence in associated 740510, The is frequencies 3A). = B different (Figure W across element A at evenly test element Muller (spread Muller Sum of of B Rank region end element Wilcoxon a telomeric Delly Muller over the on both inversion at found by found an are are of called 37 22 inversions which, and of chromosome) only entire frequency), (using the 1% windows above total across (89 inversions frequencies of absence (Ye or Pindel and presence the to region 7.702, ST -value hnotieteivre ein ossetwt nig nohrseis(iue3,GMtvle= t-value GLM 3C, (Figure species other in findings with consistent regions inverted the outside than p > vle=13e1)(Machado 1.36e-14) = -value B. .6 o l nesos.Gnswti 0b fa neso rapithv infiatyhigher significantly have breakpoint inversion an of 10kbp within Genes inversions). all for 0.361 umr fteivrin eetdi the in detected inversions the of Summary aiasDand D Tajima’s tal. et 09 Rausch 2009; ST ST n aiasD ie htclsfrlreivrin nsotra aaaeoften are data read short in inversions large for calls that Given D. Tajima’s and hnteohrMle lmns(iue3,otieivrin ulreeetB element Muller inversions outside 3C, (Figure elements Muller other the than C. F ST .innubila D. tal. et tal. et o ee cosMle lmn ,gopdb hi rsneudran under presence their by grouped B, element Muller across genes for tal. et p 07 n h paetycmlxntr fteMle lmn B element Muller the of nature complex apparently the and 2017) 02) efidsvrlivrin costegnm tappreciable at genome the across inversions several find We 2012)). vle=002) huhteeivrin r o nqeo even or unique not are inversions these though 0.0129), = -value 07 Noor 2007; n htti a eascae ihlcladaptation. local with associated be may this that and ST p vle=164-3,sgetn oechromosome-wide some suggesting 1.614e-13), = -value < 5 0.22, rspiainnubila Drosophila tal. et p χ vle=089.Hwvr l ein fMuller of regions all However, 0.859). = -value 2 07,wieisd netdrgosso no show regions inverted inside while 2007), etfrercmn naseicpopulation specific a in enrichment for test ST populations. nteergos(iue3A, (Figure regions these in A. oainand Location Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. nteegns(atg n iels20) hsspot h osblt htuacutdfrduplications for unaccounted that possibility the F supports detected elevated This duplications the with 2005). causing consistent Liberles is be which and may genes (Rastogi (Supplementary these genes translocations in these mitochondrial variation in the for We on unaccounted exclusively 5). suggesting Table diversity 6), (Supplementary pairwise Figure PR elevated involving of comparisons peaks all find for also proteins antifungal and comparisons pairwise F elevated F Though ossetwt oa dpain h pe 97.5 upper The adaptation. local with consistent gene any whether identify to attempted We 2). F 2.5 Figure higher upper Supplementary significantly 4, have (Figure groups E ontology and D elements Muller te ouainlvlpoess(e 97 rikhn n an21) nordt,wnoswt peaks with windows data, our In F 2014). and Hahn and F Cruickshank Cruickshank elevated 2013; 1987; of Payseur (Nei D processes and differentiation, pro- population-level (Cutter demographic population other of adaptation of influence local measure the Hoban of relative about inference 2014; years on several Hahn last selection and the background over and discussion cesses considerable been paralogs. has mtDNA-linked efficiently There can still that their genes unlike mitochondrial selection of copies to functional respond contain translocations male-killing mitochondrial the the given possible Alternatively, conditions. local humidity, exposure, F (toxin 5.03, locations F = for between enrichment higher environment except significantly GO populations with 3, among category Table consistent Supplementary ontology are 2F, variable. gene pressures (Figure more Another pathogen be populations most may between which that divergence pressure suggest no pathogen might showing fungal This system immune 4). the Figure F elevated of of most peak with one under all not 16.414, so and genome the across ugssaptnildvrec ntemtblcneso ahpplto,adta eea mitochondrial several that and population, the each in of function new needs a 97.5 metabolic found upper the have the in may genes divergence in potential are 4.53, a categories = of suggests enrichment metabolism insertions 97.5 GO in energy nuclear 3, found the other The Table also several in (Supplementary are transpositions. genes enriched mtDNA ancient autosomal also of only suggesting are insertions of genome, These rest genes mitochondrial the mitochondrial the paralogs. to 9.245, from collapsed compared = diverged to number t-value are copy due GLM elevated 1, be have 5, Table may regions Figure divergence (Supplementary these (Supplementary number find genome do copy as the we mitochondrial However, (such ND5) and activity reads. and 0.065, mis-mapping regions system element = III t-value these nervous (Muller oxidase correlation of with peak cytochrome Pearsons’ coverage second (including associated between The genes genes correlation mitochondrial II). mitochondrial as translocated oxidase other well appear these cytochrome four of there as (including contains Interestingly, one genes of 23.60-23.62Mb) genes. exclusively mitochondrial E, related composed 3 is composed with 11.35-11.4Mb) local also regions E, differing are element the to E (Muller of adapting element peak regions be Muller three may on be they peaks to that clear suggesting other 5), Two (Figure PR F as populations. elevated well those with in as Lcp65Ad) conditions populations Cpr65Av, HU and Cpr65Au, SR (e.g. proteins development ST nMle lmn Fgr ,Mle lmn ,1.61.8b scmoe fecuieycuticle exclusively of composed is 11.56-11.58Mb) D, element Muller 5, (Figure D element Muller on ST p vle=16e1) neetnl,ti steol muectgr iheeae F elevated with category immune only the is this Interestingly, 1.61e-10). = -value th n D and ST ST ecniefrF for percentile nadto oteetrt fMle lmn ,teeaenro hmeso ihF high of chimneys narrow are there B, element Muller of entirety the to addition In . ST slwars oto h eoei ahpplto,teeaesvrlgnmcrgoswith regions genomic several are there population, each in genome the of most across low is XY lohv ek fD of peaks have also r infiatycreae GMR (GLM correlated significantly are tal. et .innubila D. ST r nihdfratfna ee nalppltos hs ee r distributed are genes these populations, all in genes antifungal for enriched are ST 06 ate-oe n htok21) ncnrs oF to contrast In 2018). Whitlock and Matthey-Doret 2016; D , p vle=86e0) hc ol eascae ihdffrne nthe in differences with associated be could which 8.68e-08), = -value XY p XY eoewt rnlctdmtcodilgns(iue5.Tefirst The 5). (Figure genes mitochondrial translocated with genome vle=081,s hseeae F elevated this so 0.861), = -value n ariedvriy(upeetr iue5&6.W n no find We 6). & 5 Figure (Supplementary diversity pairwise and nalpiws oprsn Fgr ,SplmnayFgr 6), Figure Supplementary 4, (Figure comparisons pairwise all in ST .innubila D. hnters ftegnm.W n httegnsi the in genes the that find We genome. the of rest the than th XY th ecniefrD for percentile 6 sa bouemaueta a els estv to sensitive less be may that measure absolute an is Wolbachia ecniefrF for percentile 2 eoeadmyb iegn u odffrne in differences to due diverging be may and genome .2,tvle=11.371, = t-value 0.823, = etc ST ST p ) ossetwt hsrsl,tepa of peak the result, this with Consistent .). sctcegns(iue2,Supplementary 2E, (Figure genes cuticle is SplmnayTbe3 Oercmn = enrichment GO 3, Table (Supplementary parasitizing vle=301-0,ads hselevated this so and 3.081e-20), = -value th XY ecniei H vrl hs results these Overall CH. in percentile Obp93a ST serce o hro rtisi all in proteins chorion for enriched is p nH n R hnloigat looking when PR, and HU in vle=36e0) Additionally, 3.67e-04). = -value ST ST .innubila D. spoal o natfc of artefact an not probably is and nteegnsi ohthe both in genes these in Obp99c p vle=6.33e-30), = -value De 04,i is it 2004), (Dyer ST .W n no find We ). ST .falleni D. Fgr 2F), (Figure hc sa is which ST and on Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. ctcedvlpeto iohnra rnlctos r hw ypitsae oeteYae r on are Y-axes the Note shape. point by plot. shown each are for translocations) scales different mitochondrial or development (cuticle F shows Plot 5: Figure gene. per D Tajima’s average (D population within gene and per Sweepfinder2 divergence using pairwise calculated average the (F genes follows: as F several are Muller Values for non-recombining the evidence in 4: expect seen Figure also would is we suppression as variants, the chromosomes, 7). selected Given of Figure of positively telomere (Supplementary weakly portions genes. the even uncharacterized in heterochromatic for sweeps several sweeps the selective selective among several in 39.5-40.5Mb), of recombination A, evidence of (Muller find 45637000, also chromosome chorion We = X four variants. W the specific of Sum population upstream D Rank of elevated also (Wilcoxon selection of 50kbp is chimney peak within sweep a this genes under This of other PR. center to in The compared signal 18.75-19Mb). scale. strongest ( D, same proteins the the Muller on with 7B, not ( Figure populations, proteins though protein (Supplementary all populations cuticle D in other the Muller cuticle of all on upstream in PR just found in is also Sweepfinder2 peak was using extreme peak site population one This each the was in near there sweeps polymorphism but selective reduced identify with to sweep F attempted selective We a variant. of (Huber selected signature the a 5). leaves of Figure Supplementary often F 4, elevated adaptation the (Figure Recent suggesting adaptation proteins, local chorion to or due cuticle likely antifungal, is the in duplications for evidence ST SplmnayFgr 7A, Figure (Supplementary tal. et ST p5 p6 p8 Cp19 Cp18, Cp16, Cp15, ap6 e7 ar,Pm Fhos Prm, hairy, Pex7, Zasp66, ,wti ouainpiws iest e ee,CmoiieLklho ai CR e SNP per (CLR) Ratio Likelihood Compositive genes, per diversity pairwise population within ), oprsno siae ttsisars the across statistics estimated of Comparison eews F Gene-wise ST 06.Teewsn vdneo eetv wesoelpigwt ee ihelevated with genes with overlapping sweeps selective of evidence no was There 2016). o ahgn nteergost dniytecua ee.Gnswt oe functions noted with Genes genes. causal the identify to regions these in gene each for ST hwn ein feeae iegnebtenppltosfrec population. each for populations between divergence elevated of regions showing XY χ 2 nalcmaios(upeetr iue6&7,cnitn ihrecent with consistent 7), & 6 Figure (Supplementary comparisons all in etfroelpo 97.5 of overlap for test n oes(ihn1kpo h we etr eea elorganization cell several center) sweep the of 10kbp (within covers and ) Cpr66D .Teecoinpoen lohv infiatyeeae D elevated significantly have also proteins chorion These ). nkeigwt h ugsino oa dpaino the of adaptation local of suggestion the with keeping in , 7 .innubila D. th ecniewindows percentile eoefrtePect P)population. (PR) Prescott the for genome XY ,tepplto xto ne per index fixation population the ), χ p 2 vle=005)adare and 0.0158) = -value 1.33 = p vle=0.249) 0 = -value ST n D and XY XY Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. eas oprdteall rqece ewe 07ml ape ess21 eaesmls gi,F Again, samples. female F 2017 versus (median samples wide F male genome 2017 low between Figure extremely other frequencies to (Supplementary is allele compared the 2017 B compared Muller and also on We 2001 change allele frequency between minor allele The B influencing 7A). element be Figure may Muller (Supplementary autosomes. there else Interestingly, X on something of change. suggests average 2001 telomere little This on show 9A). the between to in decreased increases sweeps appear has frequency selective chromosomes recent other frequency allele of most minor evidence while the also telomere X is points, the time F at between median 2017 and differences 35-40.5Mb, A, frequency element allele Muller actual 8A, Figure (Supplementary (median chromosome timepoints of two collection the 97.5 CH between a differentiation upper from little F elevated sequenced find to we We (due those F populations). populations to between between 2017 time frequencies over in allele changes males identify in CH to 35 2001 sexes in the between males from and 38 samples time the compared over next chromosome We X the in divergence for Evidence iue4 a u olclaatto loatn vrlne iepros ewudepc osesignatures Supplementary see to 3, expect would & we Hill 2 periods, Furthermore, population time (Figure the longer categories. over if proteins those acting that development also in reasoned adaptation cuticle adaptation We local of and to due evolution, adaptation. genes was adaptive local 4) antifungal of Figure recent signatures in to strong seen showing opposed pressures categories pathogen differentiation as strong functional evolution and to recurrent genes due identify suggesting likely to innubila sought D. next in We recurrently evolving are genes the Finally, multiple immune Toll-related 2006). with in with Trivers overrepresentation chromosome associated and sex-ratio its (Burt X skewed in males) the the euchromatin to resulting in (Kageyama the adapting of variants driver, be in alternate region female may sexes the chromosome a the the of X ascertainment Alternatively, to overrepresentation of an an one due by (and in female-biased heterochromatin. caused females calls be is the SNP could signal in accurate inversions this more seen overlapping samples. that in not possible resulting male is the females, is in in in It which overrepresented biases calling populations. specific are SNP every sex the for SNPs in suggesting Strangely, bias separately, telomere found population are X each variants. frequencies chromosome examining the higher synonymous X when at while consistent found of are are 9B), change results chromosome Figure significant X These frequency the no (Supplementary on allele find females SNPs raw euchromatic we in of Again, the frequency allele compared comparison. minor also population female We versus male annotation. 97.5 population total the a in enrichments in and sexes populations ST ST ttetlmr fteXcrmsm SplmnayFgr B,wihw n hncmaigall comparing when find we which 8B), Figure (Supplementary chromosome X the of telomere the at .04 99 0.0004, = th tal. et ecnieo F of percentile 09 nkes2011b). Unckless 2009; th ecnie=004) n n osgicn nihet (GO enrichments significant no find and 0.0143), = percentile th ecniefrF for percentile ST oee,w ofiddvrec ntegnsa h eoeeo h X the of telomere the at genes the in divergence find do we However, . ST ST .04 99 0.0004, = sms ftedvretgnscretyhv ofunctional no have currently genes divergent the of most as , tal. et 8 th ecnie=000) u eaanfidapa of peak a find again we but 0.0501), = percentile sdd/Sbsdsaitc oso htgenes that show to statistics dN/dS-based used .innubila D. ST smale-killing ’s ST p .09.Loigat Looking 0.0029). = -value eas fdifferences of because < .5 nthe in 0.05) Wolbachia ST Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. h ih w lt hwtedffrnei vrg ttsi Drcino eeto n eeto ffc)for Effect) Selection and Selection of (Direction statistic average in difference the show plots two right The (Hill DiNV ment recurrent with Long-term (likely 6: genes. interactions metrics. Figure specific these host-pathogen two using or by adaptation. signal one local driven detect in to be to occurring opposed recent adaptively, to only evolving instead too appears category, are be adaptation functional development may whole cuticle adaptation the and the across resistance Alternatively, recurrent antifungal genes not signaling in Toll is involved are it adaptation genes background while the for than that signals higher suggest Antifungal significantly find categories test we only test permutation while the (Permutation Again, 10, before, 0.137). Figure As = (Supplementary -value asymptotic (asymptotic background 2017). the proteins the calculated Messer cuticle we than asymptotic and end and calculated (Haller this antifungal also To genome in we 2017). the bias, Messer across (Messer this and tests categories (Haller for based genome account McDonald-Kreitman the To using across selection 2012). of Petrov inference bias and can mutations resamples deleterious resamples 61% slightly 100% 10, 6, Figure resamples (Supplementary (Figure immune 54% genes each 6, antifungal for (Figure background in genes the selection from positive difference recurrent average the (Chapman identify category confound to can approach which genome, resampling the control-gene across (Charlesworth vary can selection structure for population t-value scans and selection (GLM signature. of that populations efficacy rates, homogenizes between Mutation populations statistics and statistics, between proteins these flow MK-based genes cuticle of either and as distribution for ubiquitous well the populations as in between p-value categories, differences differences 0.211, immune significant significant all For no no find and 2013). we Rij (Dostert proteins, van infection antifungal viral and upon Merkling four expression 2009; its genome): Elftherianos to the due in infection highest viral virus the the among also of are Listericin relative in (which close evolving values A fast effect and are selection 2020). AMPs and Unckless and DoS pathway and positive Toll Hill the 2011a; why (Unckless explain wild in the AMPs Toll-regulated in suppresses individuals of (Hill divergence adaptive. 40-50% acid indeed amino is of divergence rates Interestingly, this elevated that significantly suggest have results to These groups functional 2019). only the (GLM also proteins were signaling Toll evolution background, 3.66 adaptive the = than of higher signatures significantly 2.581 groups = some t-value 1.128, functional have a two = did found t-value fit only (GLM genes then we We background antifungal expected. the and selection. than from effect genes of differences selection excess cuticle or an DoS find identify higher we (DoS significantly to survey with 2012) groups this ontology al., Eyre-Walker, In gene et and identify (Stoletzki (Eilertson to (DoS) effect model selection linear selection of SnIPRE direction and statistic 2011) based McDonald-Kreitman the calculated the in genes innubila evolving D. fastest the among were pathways (Hill defense immune some in involved Listericin . tal. et > h ettoposso siae ttsis(ieto fSlcinadSlcinEet o ahgene. each for Effect) Selection and Selection of (Direction statistics estimated show plots two left The p n eeto effect selection and 0 vle=0005 upeetr al ) napeiu uvyw on htteecategories these that found we survey previous a In 3). Table Supplementary 0.00025, = -value Hffan20;Tkd n kr 2005). Akira and Takeda 2003; (Hoffmann ) 09.W losuh oietf htgnsaeeovn u orcretpstv eeto in selection positive recurrent to due evolving are genes what identify to sought also We 2019). coadKeta ae ttsisfrimn aeoisin categories immune for statistics based McDonald-Kreitman noeo l ouain,adi hsi soitdwt niomna atr.T hsedwe end this To factors. environmental with associated is this if and populations, all or one in .innubila D. > p l r Msrgltdb olsgaig(n diinlyJKSA ntecs of case the in JAK-STAT additionally (and signaling Toll by regulated AMPs are All . .4fralppltos upeetr al ) hs ehp eeto tteelc is loci these at selection perhaps Thus, 3). Table Supplementary populations, all for 0.34 vle=0096 upeetr al )adatmcoilppie AP,GMt-value GLM (AMPs, peptides antimicrobial and 3) Table Supplementary 0.00986, = -value p vle=003 n Ms(emtto test (Permutation AMPs and 0.033) = -value tal. et sbree yDoohl nuiandvrs(iV,aNdvrsta infects that Nudivirus a (DiNV), nudivirus innubila Drosophila by burdened is 09.Cnitn ihorrslspeiu eut,w n osgaue of signatures no find we results, previous results our with Consistent 2019). > > )adTl inln ee Fgr ,9.%resamples 99.1% 6, (Figure genes signaling Toll and 0) o 0 fgnsi hs aeois u sagopsoe osignificant no showed group a as but categories) these in genes of 80% for 0 > tal. et .melanogaster D. )btd gi n xrml ihlvl fpstv eeto nAMPs in selection positive of levels high extremely find again do but 0) 03 tjc n an20)T okaon hs eepoe a employed we this, around work 2005).To Hahn and Stajich 2003; α > (Palmer ) efidn vdneo ihrrtso adaptation of rates higher of evidence no find we 0), 9 p Listericin vle=029 upeetr al ) nfact, In 4). Table Supplementary 0.259, = -value tal. et .innubila. D. 08 iladUcls 00,wihmight which 2020), Unckless and Hill 2018; tal. et p a enipiae ntersos to response the in implicated been has vle=005.Tgte hs results these Together 0.035). = -value 05 Zambon 2005; .innubila D. ieAP hwdconsistently showed AMPs Five α o l ucinlcategories functional all for p vle=023 Cuticle 0.243, = -value tal. et n uil develop- cuticle and .innubila D. > α o l functional all for tal. et ) Segregating 0). 05 me and Imler 2005; > )o cuticle or 0) Bomanins 09)as 2019)) genome tal. et < p Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. socrig tcudb rmrl ae irtn,a sse nohrnon- other in seen is as migrating, males Searle 2), in primarily structure 1992; & be population Burchsted 1 could of Figures little it evidence Supplementary very occurring, some in 2, is is but previously & 11), found there 1 as Figure Interestingly, (Figures (Supplementary mitochondria, development content genome the 2). cuticle repetitive nuclear the Figure the the in Supplementary in across primarily including 2, structure adaptation, (Figure population local for genes individuals. of support immune wild-caught evidence antifungal of some and resequencing find genes We genome whole 1). using Figure Supplementary islands of Sky innubila Arizonan populations the D. four to of of endemic history species species phylogeographic a the in characterized import sought place We We desert. also take populations. of processes can local expanses these in migration large variants that by Strong fit separated extent most to the the variants) depression. of examine fixation beneficial inbreeding both the to potentially preventing prevent can and (including Migration and variants variation nonadaptive environments. populations locally new local between differing allowing their spread adaptation, to be such adapt hinder to likely and are facilitate populations Charlesworth divided 1992; or Burchsted isolated and Excoffier (Rankin 2004; adaptation Gillespie drive 2003; can change environmental and Migration Discussion gene. nearby sampled randomly a and gene each ol lob asdb te atr,sc setniedpiainaddvrec nMle lmn B element Muller on divergence and duplication extensive as such inversions factors, detecting of other inversions limitations by common (Marzo the caused and to data be due read large also characterized short are could been with F all have elevated regions that the not repetitive driving may in in How- not inversions hypothesis, are assertion. causal inversions this this actual the segregating supports support the suggesting putative 3A) inversions populations, several (Figure all putative of chromosomes in detection the other characterized abnormalities Our all of the to few 2-5). explain relative Figures ever, B could (Supplementary element and Muller here structure on B inversions population element locations with Muller between on associated seen often migration are through inversions that transmitted Segregating suggests instead this history, are population estimated and our maintained (Charlesworth given and ancestrally deep, aren’t not variants are times coalescent available, data xaddit t urn ag uigo olwn h rvosgailmxmm(iue1, (Figure maximum glacial previous the following or during range current its into expanded tal. et 2003). tal. et 09 Ma 2009; tal. et 09 Porretta 2009; tal. et .innubila D. 03 va n rxl 04.Bsdo h polymorphism the on Based 2014). Fryxell and Avgar 2013; tal. et De n ank 05.Ti ugssta fgn flow gene if that suggests This 2005). Jaenike and (Dyer tal. et 10 08 Chakraborty 2008; 02 White 2012; rspiainnubila Drosophila Drosophila tal. et Drosophila tal. et 03.Seiswt somewhat with Species 2013). 07.Teeeae F elevated The 2017). on cosfu forests four across found pce Rni and (Rankin species ST essetthat suspect We . mycophagous a , tal. et ST Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. aito sipratfrpplto oeln n vnfrftr osraineot (Excoffier efforts conservation future for even (Excoffier and expansions modelling Coe or population changes for range important is undergone variation inversions recently a segregating have by of species driven al. history many possibly et and structure Because population dispersal elements. and the selfish 2005) into factors (Survey and climate glimpsed what changing have and the we behave, to Here due and disperse mycophageous time. over species of demography non-commensal species population how different drive of of to understanding seem dispersal our limits and This phylogeography 2016). the concerning the al. work to et genomic limited been the has of species most date, To (Dobzhansky humans Watanabe on 1968; work recent Spassky as and well the MackayDobzhansky as 2004; to populations (Gillespie wild-caught in related mutations alleles individual lethal selection deleterious sequences recessive of purge that signature would Pool studies a that inbreeding few be of of could generations X one the is of Ours telomere the Thus with 2008). associated Dyer Wolbachia However, and signals male-killing Jaenike 3A). strong 2005; the (Figure the Jaenike both inversions and by elements several Dyer parasitized selfish accumulated Trivers 2004; simultaneously these have (Dyer and sexes females Often to (Burt the of components in appears 30-35% 2006). effects genetic telomere Trivers differential synergistic A and of with of Muller (Burt populations factor breakdown the telomere selfish the the the and a Given prevent that near 2006), to time. suggest chromosome inversions over samples, could X accumulate male and this the also considering sexes region, on only between this located survey divergence in 2001-2017 is Supplementary driving SNPs the factors A, of with the bias element even between sex (Muller 1.2e-16), association chromosome = an X p-value supporting 82.411, the = of of no F t Some telomere with fact, correlation genes the In populations. between few at 8). genes a primarily Figure diverging to sexes, the limited between with F is overlap genes elevated change little with frequency and genes other, allele genome- the frequencies each Excessive no allele see to in we selection. association but fluctuations known force directional seasonal the selective Alternatively, out strong on a 2012). swamp (Coe impact been (Arechederra-Romero may area have such little plausibly same of could the been which signature in have 2011 wide species in may fire mammal’s forest there and 8). extensive bird decades, an Figure most few (Supplementary unlike 2017 past time, and the of 2001 in from changed of interesting samples habitat has between an environment divergence provide of the may signatures Though and few adaptation very local find We for area well-studied plausible described the populations a to the is contrast as populations beyond) Dobzhansky other (and among 1937; populations or polymorphism Sturtevant island inversions Sky and if same (Dobzhansky the disentangle which here in to autosomes, that population necessary by other noting segregate worth is to these is study compared it Further strains we However, F all duplicated, elevated ation. 1). was this in Table B causing B (Supplementary Muller are element are case factors of Muller inversions the proportion detect of large not to coverage a is used mean If (Ye pairs elevated duplications occurred. read see have detect split would may to and misidentification broken used some the signal suggesting fact, the In to similar divergence. very just as misanalysed being tal. et tal. et 05 Lack 2015; 09 Porretta 2009; .innubila D. neto Ucls 2011b). (Unckless infection 2012). 02.Teecs fpttvl eeeiu lee akn akt al tde fsegregating of studies early to back harkens alleles deleterious putatively of excess The 2012). .innubila D. tal. et ST nteCiiaus eutn nfwcagsi eeto rsue nti hr period short this in pressures selection in changes few in resulting Chiricahuas, the in tal. et 05 Machado 2015; ssgicnl orltdo ulreeetAbtentetosres(Pearson’s surveys two the between A element Muller on correlated significantly is Drosophila ST 02 White 2012; r led eaebae u otemale-killing the to due female-biased already are .pseudoobscura D. addffrn nall rqec)btentm onsoelpwt divergent with overlap points time between frequency) allele in differing (and melanogaster ST n on vdneo hne npplto itiuin potentially distributions population in changes of evidence found and n h eetv n/rdmgahcpesrsdiigti differenti- this driving pressures demographic and/or selective the and tal. et .pseudoobscura D. tal. et tal. et uegop(Pool supergroup 05,wt oewr nother in work some with 2015), inversions. 03,w eiv xmnn o hshsaetdgenomic affected has this how examining believe we 2013), 94 Gao 1974; Wolbachia 11 tal. et ergtsfrivrin nMle lmn and C element Muller on inversions for segregates tal. et tal. et 93 Fuller 1963; tal. et 09 Rausch 2009; n efihXcrmsm.Alternatively, chromosome. X selfish a and 2015). Drosophila 02 oladLnly21;Behrman 2013; Langley and Pool 2012; tal. et tal. et Sophophora 02.Itrsigy hr was there Interestingly, 2012). tal. et Wolbachia tal. et n hrfr visseveral avoids therefore and 06.Tu,teinversion the Thus, 2016). 02 Chen 2012; 93 aikvc1967; Marinkovic 1963; .innubila D. pce (Fuller species neto on in found infection tal. et tal. et Drosophila tal. et ol be could 2016), 2009; 2012; tal. et Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. sn h eeae oa C l ihSPffanttos ecetdascn C otiigonly containing VCF second a created we annotations, SNPeff with file VCF total generated the Using structure and statistics in summary differences genetic fixed Population and polymorphisms derived identify to the pipeline species to outgroup SRR8651760) from (SRA: information genomic the SNPs. to mapped 3,240,198 Simultaneous of with also annotation. genome us another leaving 168Mbp or errors), possible non-coding the This of (as non-synonymous, across individuals. population annotation of single indels site) the 5% a small per over used in from We singleton samples and coverage) a multiple 0 SNPs as for with found 4,522,699 or score SNPs quality, removed with PHRED low of us composite sites (a left (e.g. score absent filtering sites quality and GATK 950 a than with lower sites (McKenna GATK remove least using to at SNPs 2016) called across we coverage coverage, reasonable 5x with DePristo least samples sequenced at 318 with the For 2001 samples from population flies the across wild 1). polymorphisms 38 Table Nucleotide (Supplementary and genome 2017 euchromatic misidentified from the being populations) of of 80% suspected per we flies individuals 84 and - genome), non-repetitive the of as 80% McKenna indels for ; coverage around (Http://broadinstitute.github.io/picard 5x realigned GATK than and and duplicates, Picard (Li optical using DePristo SAMTools and file 2010; sequencing using BAM removed indexed mapped and each and marked in sorted -l groups, Hubley and read 20 and added 2009), (Smit -q we Durbin customLibrary) sanger and -lib (Li -t -gff removeMEM (parameters: -gccalc to -s Sickle cutadapt Hill the (parameters: using 2013-2015; RepeatMasker masked using sequences We and data quality previously 2011). all Fass generated low trimmed and removed 2018), (Joshi and (Buffalo 50) 2011) Scythe (Martin using barcodes sequences (˜ SRA). adapter kit NCBI removed the prep HiSeq in We library Illumina deposited alignment an DNA be and of Nextera lanes to mapping the four Data filtering, of on 1, Sample libraries Table version the (Supplementary modified sequenced end) We a paired reagents. using (150bp conserve 4000 to samples Germantown, meant DNA Inc., size) Qiagen 383 insert (USA 350bp these 40 kit of of Tissue DNA library Puregene the DNA early Gentra and prepared matched Qiagen also the phenotypically sex We the which by either USA). using males) MD, flies in DNA 171 sorted baits extracted and -80 We and females the at (172 genized over set. frozen flies flies was flash 343 collect sorted and bait KS. We to AZ Lawrence the Tucson, in net after in Kansas sweep days of Arizona a three University of and used the University We one the between at location. afternoon species per late baits or (Coe 5 morning flies) least (SR, 53 at ( mountains feet longitude, mountains mushrooms Huachuca with Rita -110.340 ˜7,900 the button Santa in latitude white (PR, Peak the store-bought 31.632 Forest Miller of elevation, in and National flies) Canyon feet 96 Prescott Madera ˜5,900 longitude, -110.881 in (HU, flies), latitude 96 flies), 31.729 elevation, longitude, 96 feet ˜4,900 -112.559 longitude, latitude -109.237 34.586 latitude elevation, 31.871 elevation, feet 11 the and wild collected We sequencing and isolation DNA collection, Methods .innubila D. tal. et th fSpebr21:teSuhetrsac tto nteCiiau onan C,˜5,400 (CH, mountains Chiricahua the in station research Southwest the 2017: September of olwn olcindet nmlu apn.Ti etu ih280 with us left This mapping. anomalous to due collection following 01 hc eeae utpesri C l.W hnue Ctos(Narasimhan BCFtools used then We file. VCF strain multiple a generated which 2011) tal. et tal. et Drosophila 01.W hnrmvdidvdaswt o oeaeo the of coverage low with individuals removed then We 2011). 09.W hnmpe h hr ed otemasked the to reads short the mapped then We 2019). .innubila D. .innubila D. ttefu onanu oain cosAioabtente22 the between Arizona across locations mountainous four the at n Ne (Cingolani SNPeff and eoeadcle iegneuigteGT aito calling variation GATK the using divergence called and genome grcsbisporus Agaricus .innubila D. .innubila D. 12 eeec eoe using genome, reference olce n20 rmC.W rprdagenomic a prepared We CH. from 2001 in collected .falleni D. lcdi ag ie bu 0mi diameter, in 30cm about piles large in placed ) tal. et .innubila D. 02 oietf Nsa synonymous, as SNPs identify to 2012) SA R8571 and SRR8651761) (SRA: .innubila D. ° tal. et eoesipn ndyieto ice dry on shipping before C .innubila D. tal. et .innubila. D. . 09.Floigmapping, Following 2009). .innubila D. .innubila D. .innubila D. ouainsmls we samples, population 02.Bisconsisted Baits 2012). .innubila D. eoeuigBWA using genome ete homo- then We Esequences TE nd idfle (48 flies wild eoe(less genome .phalerata D. tal. et ethen We . fAugust of 2010; tal. et tal. et Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. rsnnmu sln)SP.W hncmae hs oyopim otedffrne dnie to identified differences (replacement) the non-synonymous to only polymorphisms retained these compared and then SNPeff We by annotations falleni SNPs. with (silent) VCF synonymous total or the each filtered of We frequency the estimated selection we recurrent inversions of merged Signatures and filtered population. remaining total strains the the across Using within inversions inversion tools. these two of the presence/absence of the one within in breakpoints overlapping with significantly inversions a and merged (using and ends samples both score filtered quality at manually VCF 1000bp next GATK We a calls. Then with (Ye inversion and Pindel genome. quality individuals using reference low inversions of the called 1% also to than We compared fewer 200. inverted than in lower found or inversions deleted removed duplicated, and potentially filtered we are which genome for the looked (Rausch in Delly and used gene we sample, per each For the dS from and popu- dS Dxy pairwise and average all diversity the for Inversions pairwise found 97.5 SNP population then upper We within the per script. in as Dxy enrichments python well calculated gene custom as next a 1990), we using Miller information, outgroups, and 97.5 outgroup (Nei the with comparisons in VCF lation windows total 1kbp the above identifying Using peaks by chromosome. as peaks per shown confidence, ratio for high likelihood surveyed sweeps a composite We selective with base. detect background. occurred each to have genomic for sweeps population the counts selective each where allele in regions showing polymorphism (Huber with file, overlapping called windows frequency (Huber total allele Sweepfinder2 1kbp the folded using on in a also population Sweepfinder2 to we each used file cases, in then VCF We some sweeps polarized selective in the for chromosomes reformatted look other We 97.5 to versus upper attempted B the next 97.5 taking and We chromosome, A the by Muller chromosome in chromosomes analysis genes the this among repeated on categories differences to gene Due particular for enrichments 2.5 identify (Gramates Flybase to from analysis groups ontology populations gene innubila downloaded D. We across divergence adaptive local of Signatures for Following sampling). polymorphism, iterations silent 400000 burn-in, only innubila iterations (Falush 100000 using Structure 1-10, separately = using (k chromosome (Frichot samples populations each ten across and for structure in one assignment between backwards population population size of the population extent effective repeating the the estimated estimate also to We 2015), of Fu rates location. and mutation each (Liu estimated for StairwayPlot previously time in with inputs SFS as silent 2013), population the used We al. (Korneliussen et VCF the polymorphism using synonymous (Danecek population, the VCFtools each parse using to population ANGSD al. each Using et variants. in all gene containing each F VCF for and the genome 1989) the (Tajima across D populations) Tajima’s theta, Watterson’s (Narasimhan base, BCFtools using polymorphism synonymous th ecniefrF for percentile 2019). 04,w eeae yoyosufle iefeunysetafrthe for spectra frequency site unfolded synonymous generated we 2014), and hoooeadDN omnmz entropy. minimize to DiNV and chromosome χ tal. et 2 test, .phalerata D. 04,w aulyasse hc ubro uppltosbs t h aafreach for data the fits best subpopulations of number which assessed manually we 2014), p -value ST aiasDadPiws iest essalohrgns(Subramanian genes other all versus Diversity Pairwise and D Tajima’s , oplrz hne oseicbace.Seicly esuh odtriesites determine to sought we Specifically, branches. specific to changes polarize to < tal. et .falleni D. .5.W lofitrdadrmvdlreivrin hc eeol on with found only were which inversions large removed and filtered also We 0.05). 06.W eomte h eut n okdfrgnsnihoigor neighboring genes for looked and results the reformatted We 2016). th and ecnie essalohrgenes. other all versus percentile, tal. et .phalerata D. 02 ognrt utpesml C l dniyn regions identifying file VCF sample multiple a generate to 2012) 13 eoe sotrust the to outgroups as genomes tal. et tal. et ST 09 nteesm ape n gi removed again and samples same these in 2009) Wi n okra 94 vru l other all (versus 1984) Cockerham and (Weir tal. et 06.W acltdpiws iest per diversity pairwise calculated We 2016). 07.W hnue eeenrichment gene a used then We 2017). th ecnieo ahchromosome. each of percentile Drosophila .innubila D. .innubila D. th th tal. et Shie tal. et (Schrider ecnieand percentile uooe for autosomes ecniefor percentile eoe(Hill genome tal. et tal. et tal. et 01 and 2011) 2005). 2016). 2003), D. D. Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. aiisadcmae h rprino h eoecmoe fec uefml cossrisi the in strains across superfamily average each super- the of known get by composed to sequences genome randomly times grouped the of 2 we number of strain identification, the each proportion TE with resampling Following the 168Mbp, genome, compared content. of the and TE size of families strains coverage genome each 1-fold a of to assumed estimate equal approach reads Our sampled (SRA: populations. SRA the across NCBI (Goubert dnaPipeTE the used on We GCA available (SRA: is database genome information Methods NCBI Supplementary sequencing the on All available are used request. GCA genome upon all available while SRP187240), is data All TH AI139154 to R01 grant and Availability GM114714 postdoctoral Data R00 K-INBRE Grants NIH a by and by funded Smith also isolation, supported was genome Brittny was work RLU. in thank This work to assistance We GM103418). for This P20 GM103638) Grant Station. P20 sequencing. (NIH Grant Research and (NIH Southwest completed the Core preparation D were the Sequencing of calculate library Collections Genome and sections to manuscript. CMADP Schlenke the early scripts KU on Todd providing on the comments for from feedback and Kelly, inference providing assistance John genetic and and with population by Dyer manuscript on provided Kelly advice the help thank as of appreciate well to idea like greatly We Glor, especially the Richard would proposing manuscript. We Chapman, for Joanne Wessinger. Blumensteil, Caroyln Guinsberg Justin and Paul from Orive discussion Maria helpful MacDonald, with Stuart completed significant was a work showed This interest of categories functional for if Acknowledgements assess spectrum to asymptotic frequency test calculate site asymptotic to permutation synonymous asymptotic in AsymptoticMK a difference calculate and in used to used non-synonymous then We 2017) then the we Messer interval. generated which and We category, (Haller gene AsymptoticMK category. each used gene also each we category for immune results, each these sampled for confirm randomly differences To a these (Chapman and of on genes average based immune the replicates, focal finding 10000 our gene, over between background statistic 100kbp) each (within in nearby difference the calculated then We Statistic fit We multiple considering categories. immune adaptation, multiple of in estimates classed high be excessively focal to with the gene categories of a functional chromosome allowing covariates: identify gene, the to background on a GLM mutations as a of it number classed (Eilertson (Gramates total or ontologies mutations the gene effect gene using of selection FlyBase gene a Using number in each calculate gene. occurring expected for and substitutions effect mutations and these selection synonymous observed of SnIPRE and these number the non-synonymous used expected to of the also predict relative number We per to total categories sites 2011). the defined Eyre-Walker replacement account user and and into Stoletzki silent taking (McDonald 2002; model, polymorphic (DoS) (Eilertson Eyre-Walker selection and SnIPRE and of in fixed Smith direction values of 1991; specifically counts Kreitman statistics, the and McDonald-Kreitman-based used estimate We to phylogeny. gene the of our branch in polymorphic are which 005876895.1). ∼ opulation P + α eegroup Gene rmters fcategories. of rest the from tal. et .innubila D. tal. et 02,wihrfae coadKeta ae ttsisa linear a as statistics based McDonald-Kreitman reframes which 2012), +( 05 oqatf h xetta eeiieeeetcnetdiffered content element repetitive that extent the quantify to 2015) eegroup Gene ouain raesbtttoswihfie ln the along fixed which substitutions are or populations tal. et tal. et 2019). 14 ∗ 07,w otdec eeit aeoyo immune of category a into gene each sorted we 2017), opulation P )+ Chromosome tal. et 02.W acltdthe calculated We 2012). + α Chromosome n 5 confidence 95% a and 004354385.1, .innubila D. : osition P XY as α Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. oee,tefeunyo Eisrin a infiatyhge neoi ein oprdt nrn and introns to compared regions exonic in assumption. higher deleterious significantly the was with insertions consistent TE 6.23e-11), Petrov of = 1989; p-value frequency Langley -6.538, the and = However, (Charlesworth t-value deleterious (GLM mildly regions least non-coding at be Muller al. to on assumed suggests et content usually This repetitive are of 0.01488). insertions accumulation = TE the p-value allowing A 19763, is element W= region Muller Test inverted of Sum A. the regions Rank element inverted in (Wilcoxon recombination the genome of in the lack densities of the insertion regions higher TE the significantly particular at at in that are than chromosome evidence TEs any one find find any not do on do though We abundant more populations. significantly other are find to orders all we compared to 1.662e-24) population = compared PR p-value to the when 11.472, related In population closely CH families 2.555e-17). TE the = 32), a in p-value of GTGT 9.204, abundances = and CH of higher t-value CTC the expansion CASAT 9.52e-05). (GLM CAACAA, = in an populations satellite sequences p-value other and repeats repeat 3.914, the for = simple 7.31e-05) simple of t-value selection the = GLM of expansions relaxed and 11D, p-value expansion see Figure with 3.978, (Supplementary an we activity population = by effective Specifically, PR repeat t-value driven reducing the of in GLM species primarily elements bursts these 11D, are TIR 2.856, for Figure in = changes bottleneck t-value (Supplementary resulting These recent D, population selection, & more insertions. 11C a of Figure of with efficacy (Supplementary removal keeping SR and in and size is HU This population to compared 4.291e-03). populations = PR p-value and consistent CH 3.745e-04), = the (Hill p-value in 3.555 purifying genome = reference t-value via 11C, the Figure removed (Supplementary with orders and repeat This deleterious other to 0.733). mildly compared (Goubert = average p-value dnaPipeTE on -0.341, Using are = t-value insertions Population, between TE spectra * frequency order population, insertion selection. SNP TE the every * in * in Frequency difference Frequency ˜ no implies ˜ Count with Count GLM 1.889e-60), = (GLM 11A, p-value Figure populations -16.401, (Supplementary = variants t-value synonymous TE, of or SFS the to compared number the by primarily Kofler little disperse TE find to 7479 we to seem to polymorphism, Similar do nuclear 1913 11B). strains Like from Figure though individuals varied genome. (Supplementary insertions, resequenced the Strains insertions TE of with of shared sequences. portion by repeats, repetitive non-repetitive structure simples matching the population in reads and strain of satellites per 38.4% varying insertions to with 4.4% along from families varying TE different (Kofler PopoolationTE2 154 (Goubert using dnaPipeTE line using per samples insertions our TE in across content or families repetitive TE the non-coding characterized individual We intronic, considering exonic, When was end). Results insertion or Supplementary the start of if downstream and or population up innubila (500bp order, gene TE a by flanking grouping insertions, of (Kofler PopoolationTE2 annotation using genome 2008) the Hubley and (Smit al. RepeatModeler et previous the between the shared of families TE confirmed (e-valueWe blast strain. reciprocal each in a present used families also We populations. Helitron-2N1 01.In 2011). 90.Atrcnrigta hyddntdffri otn,w aldT netosi ahsri across strain each in insertions TE called we content, in differ not did they that confirming After 1990). .innubila D. tal. et eue h eBs Enmsadietfiain (Bao identifications and names TE RepBase the used we , .melanogaster D. 02 Kofler 2012; .innubila D. DVir eeec eoeadtedaieEantto sn ls (e-value blast using annotation dnaPipeTE the and genome reference GMtvle=1.8,pvle=279-8 and 2.789e-28) = p-value 12.381, = t-value (GLM tal. et Calsot n age 99 Charlesworth 1989; Langley and (Charlesworth Edniyi oe nrgosflniggnso ihngnscmae to compared genes within or genes flanking regions in lower is density TE , tal. et 2015), tal. et 05,w n infiatyhge est fR I elements TIR & RC of density higher significantly a find we 2015), .innubila D. 09.Tedniyo eeiiecneti lohge eoewide genome higher also is content repetitive of density The 2019). tal. et 06,te egdteotu n acltdtefrequency the calculated and output the merged then 2016), abr infiatecs flwfeunyT insertions TE frequency low of excess significant a harbors tal. et 15 Tetris .innubila D. < D(L -au .5,pvle=8.832e-08) = p-value 5.554, = t-value (GLM HD .0001 (Altschul 0.00000001) 06.Tereference The 2016). Dvir GMtvle=1.4,pvle=2.889e- = p-value 13.641, = t-value (GLM GMtvle=184 -au 0.633), = p-value 1.854, = t-value (GLM tal. et 2015). Chapaev3-1 tal. et .innubila D. tal. et 97 Petrov 1997; tal. et 90 oietf TE identify to 1990) PM < 0-8 (Altschul 10e-08) eoecontains genome GMtvle= t-value (GLM 05 n called and 2015) tal. et 2011; D. Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. p nteMle element. Muller the on bp) 2: Figure Supplementary value). chromosome K X optimal the (estimated and K=3 B) chromosomes. a Muller all with op- (excluding for (estimated autosomes locations K=3 al. all between a structure summarizes et with little plot very autosomes, this to that all due Note from value). polymorphisms K synonymous timal sampled 100,000 for locations population. A. population. 1: Figure (Garrido-Ramos we Supplementary conflict deleterious, genetic mildly is or expansion even adaptation satellite genome, in is that involved Lower the expansion is either possibility 2017; satellite dynamics in third evolutionary and A content local case machinery. repetitive with regulation the associated repeat of was of this rescue regulation of (Charlesworth migratory if the However, expect repeats majority would limits expansion. simple a which its or where variation to satellites bottleneck, leading recessive particular population segregating of the fixed proportion have following higher effect a founder had al. a chance stochastic to by the founders due to CH due occurred possibly have population, may each This of in content content insertions repeat bottlenecks. repetitive the deleterious population in the recessive of differences Overall more effect major have samples. some may are inbred so there in this and seen flies are which caught than populations, wild innubila population all are across the these 5.34e-05), in = as segregating p-value observed 4.040, have = may t-value we GLM 11A, Figure (Supplementary UTRs 03,btti sulkl ie h eeflwbtenppltos lentvl,tebtlnc could bottleneck the Alternatively, populations. between flow gene the given unlikely is this but 2003), 03 o siaigpplto tutr ewe oain o 6plmrhsso h autosomes, the on polymorphisms 16 for locations between structure population estimating for 2003) per ob idydltros ihT netossae ewe oain ymgain Despite migration. by locations between shared insertions TE with deleterious, mildly be to appears C. B. tal. et eut fSrcuesfwr (Falush software Structure of Results ouainsz itr nteLg0saeof scale Log10 the on history size Population 2018). s ygn cosalMle lmnsfrec ouain oae ylc (in loci by located population, each for elements Muller all across gene by Fst ouainsz itr of history size Population tal. et 16 03 o siaigpplto tutr between structure population estimating for 2003) rspiainnubila Drosophila rspiainnubila Drosophila D. eut fSrcuesfwr (Falush software Structure of Results akad ntm o each for time in backwards akad ntm o each for time in backwards Drosophila et Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. ttsisfrec ouain h aiasDpo otisadse iet hw0. show background to genomic line the dashed show a to contains line plot dotted D a Tajima’s has The plot Each population. each population. for each statistics for category, immune each in 4: Figure Supplementary 0. of D Tajima’s a shows line dashed grey The element. Muller the on bp) (in loci 3: Figure Supplementary ouaingntcsaitc Piws iest,Tjm’ n s)frgenes for Fst) and D Tajima’s diversity, (Pairwise statistics genetic Population aiasDb eears l ulreeet o ahpplto,lctdby located population, each for elements Muller all across gene by D Tajima’s 17 Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. hsncmaio o ahpplto,det osgicn ieecsbtencomparisons. between differences significant no to due population, each for comparison chosen B. A. 6: Figure Supplementary 5: Figure Supplementary D XY e eefrec ouain nta fsoigalpiws oprsn,w hwoerandomly one show we comparisons, pairwise all showing of Instead population. each for gene per encp ubrprpplto o ee fitrs nF in interest of genes for population per number copy Mean ihnpplto ariedvriyprgn costhe across gene per diversity pairwise population Within 18 .innubila D. ST peaks. genome. Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. oiin lososFtbtenalmlsadfmlsfo 07 ycrmsm n position. and chromosome by 2017, from females and males all between Fst shows Also position. 8: Figure Supplementary population. and population. PR chromosome the by Separated score. likelihood Sweepfinder2. using estimated 7: Figure Supplementary B. ou n1.-9b fMle lmn ,t hwsrnetslciesepi each in sweep selective strongest show to D, element Muller of 18.5-19Mbp on Focus opst ieiodsoefraslciesepi kpwnoso h genome the of windows 1kbp in sweep selective a for score likelihood Composite s fgnsbtenC ape rm20 n 07 ycrmsm and chromosome by 2017, and 2001 from samples CH between genes of Fst 19 A. eoewd composite wide Genome Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. h einad9%cndneitra scluae cosalfntoa aeois hl ntespecific the in 2017). while Messer Asymptotic categories, and functional for Haller intervals all 2012; Petrov across confidence and calculated 95% (Messer than is the higher interval significantly categories confidence are functional 95% * a and ( with median test marked permutation the Categories a categories. following background for intervals the confidence 95% with teins, 10: Figure Supplementary 2017 total on 2017 and (based males frequencies 2017 allele all between minor and the Chiricahua, in 2017 and difference samples. Chiricahua female average 2001 Shows between Comparisons SNPs). sample). 1000 sliding SNPs, 9: Figure Supplementary io leefeunydffrnecrears h eoe(vrgdoe 2000 over (averaged genome the across curve difference frequency allele Minor Asymptotic α o muectgre,ctcedvlpetpoen n l pro- all and proteins development cuticle categories, immune for < .5.0i akdwt ahdln.I h Al category, ‘All’ the In line. dashed a with marked is 0 0.05). 20 α r acltduigAsymptoticMK using calculated are Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. aeu frpttv otn o ahsri,a on ihdaieE tan r ree yttlTE total by ordered are Strains order. dnaPipeTE. TE with by colored found bars as with strain, least, each to for most content from content repetitive of up made dnie sn PopoolationTE2 using identified color). and innubila D. intronic). in non-coding, TEs region, for spectra quency 11: Figure Supplementary C. tan hw itepplto tutr.Srisaelble yterpplto bt shape (both population their by labelled are Strains structure. population little shows strains enT neto est e M idw(ldn M)frec ouainof population each for 1Mb) (sliding window 1Mb per density insertion TE Mean .innubila D. B. h rnpsbeeeetcnetof content element transposable The rnil opnn nlss(hwn C )o Eisrin across insertions TE of 2) & 1 PCs (showing analysis component Principle . Eisrin r ooe yterorder. their by colored are insertions TE eaae yT re n h oaino neto eg coding (e.g. insertion of location the and order TE by separated 21 rspiainnubila Drosophila D. rprino h genome the of Proportion . A. neto fre- Insertion .innubila, D. Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. upeetr al 4: Table Supplementary functions. and components processes, by 3: Table Supplementary population. each for eachbackground for accessions 2: Table SRA contains Supplementary Also autosomes. chromosome, X for strain. coverage of summary including study, 1: Table Supplementary L o ouaingntcsaitc nimn eectgre eaiet the to relative categories gene immune in statistics genetic population for GLM umr fGMfreeae coadKita ttsisG aeoisin categories GO statistics McDonald-Krietman elevated for GLM of Summary umr fgn nooyercmnsfrF for enrichments ontology gene of Summary umr of Summary rspiainnubila Drosophila 22 ysDAcletdadsqecdfrthis for sequenced and collected DNA fly’s ST nec ouain separated population, each in Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. n h ffcso igencetd oyopim,Spff Nsi h eoeo rspiamelanogaster Drosophila of genome the in 6 SNPs Fly SnpEff: iso-3. polymorphisms, iso-2; nucleotide w1118; single Nguyen strain T. of Coon, effects M. the Wang, L. ing 32 L. Bioinformatics Platts, A. Schlesinger applications. P., sequencing F. Cingolani, cancer Barnes, and germline B. for indels Shaw, and R. variants Schulz-Trieglaff, structural O. X., in Chen, distributions element Transposable 1997 147 Sniegowski, D. Genetics P. of and . genetics Langley H. population C. B., The Charlesworth, 1989 23 Langley, genetics of H. review 34 C. Structure Annual Systematics Geographic and and and Evolution, B., Genetic Ecology, of Charlesworth, of Effects Review The 2003 Annual Barton, Variation. H. Neutral N. variation on and genetic Charlesworth of D. maintenance drives B., selection Charlesworth, Balancing 2019 Unckless, L. variation R. genetic and in hidden Hill Extensive T. 2017 R., J. Emerson, in Chapman, J. elements J. functional and of Kalsow structure S. the Zhang, shapes X. Zhao, Conflict. R. in M., Genes Chakraborty, in 2006 Trivers, variation R. Seasonal and 2015 A., Schmidt, Burt, S. P. and Heschel 2018 M. V., S. Buffalo, O’Brien, two R. in K. traits Watson, history S. eukaryoticlife S. in elements Canadian L., repetitive E. of migration. database Behrman, a mammal Update, 6 of Repbase 2015 DNA benefits Kohany, Mobile O. adaptive and genomes. the Kojima K. On K. W., 2014 Bao, Fryxell, M. 92 J. Zoology and of invasive Street, Journal the of G. phylogeography T., and variability Avgar, Genetic 2005 in Powell, E. and pp. mussel, Wilson Fire, zebra J. 2 Gosling, National E. Horseshoe Chiricahua I., 2011 Astanei, the the to of Trip Magazine. Impacts Field Geology the Consortium Arizona of Science 6 Discussion Fire Microbiology Southwest in Monument: Frontiers 2012 species. L., invasive global Arechederra-Romero, 2015 tool. Vasconcelos, a search M. of alignment V. ecophysiology local and and Basic phylogeography, Leao 1990 bution, N. Lipman, P. J. T., D. J. and Antunes, Myers 215 W. Biology E. Molecular Miller, of W. Gish, Journal W. F., S. Altschul, Bibliography population. each for SnIPRE 3: Data Supplementary population. each for VCFtools 2: Data Supplementary GWAS. in and statistics 1: Data Supplementary 5: Table Supplementary innubila D. Drosophila . risn oyopa(Pallas) polymorpha Dreissena niirba etds eoeBooyadEouin11 Evolution and Biology Genome peptides. antimicrobial : 1993-1995. Scythe : 481-490. Drosophila : . 4-9. umr fgn nooyercmnsfrD for enrichments ontology gene of Summary : C l o Nsin SNPs for file VCF coadKeta ttsiscluae o ahgn in gene each for calculated statistics McDonald-Kreitman ouaingntcsaitc acltdfrec eein gene each for calculated statistics genetic Population 251-287. : : 80-92. 403-410. pce.Junlo vltoayBooy28 Biology Evolutionary of Journal species. o cl14 Ecol Mol . Drosophila .innubila D. 23 yidoprossraciborskii: Cylindrospermopsis tal. et o.r 50 Doi.Org . : 1655-1666. 02Apormfranttn n predict- and annotating for program A 2012 , sdi siaino ouaingenetic population of estimation in used , tal. et : : 114967. 2691-2701. Drosophila 06Mna ai eeto of detection Rapid Manta: 2016 , XY nec population. each in rzn elg Magazine Geology Arizona : 1691-1704. : 99-125. rnpsbeelements. transposable eiwo h distri- the of review .innubila D. .innubila D. : Drosophila 1220-1222. : 473. using using , Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. ulr .L,G .Hye,S ihrsadS .Shee,21 eoiso aua ouain:How Populations: Natural of Genomics Genetics. 2016 in. Schaeffer, Inversions W. Chromosomal of S. Evolution and the Richards of Shape S. estimation Genes Haynes, efficient Expressed D. and Differentially G. Fast Fran¸cois, L., 2014 O. Z. Fuller, and Bouchard 196 Genetics G. coefficients. Trouillon, ancestry T. individual 164 Mathieu, Genetics F. genotype frequencies. E., multilocus Frichot, allele using of structure correlated population Review and of Annual Inference loci 2003 Linked Pritchard, Expansions. K. data: Range J. and Poisson of Stephens Consequences a M. D., Genetic Using Falush, 2009 Inference 40 Petit, Selection Systematics J. and SnIPRE: R. Evolution, and 2012 Ecology, Foll Bustamante, M. D. L., 8. C. Biology Excoffier, Computational and PLoS in Booth male-killing Model. G. embryonic Effects J. Random of E., modulation K. and Eilertson, Expression 2005 Jaenike, J. 59 and evolution Minhas S. innubila M. Drosophila A., K. association: host-parasite Dyer, structured spatially a of dynamics 1518-1528. Evolutionary 2005 Jaenike, innubila J. Drosophila 168 and Genetics in a., Genomes. Endosymbiont K. Parasite Male-Killing Dyer, and a Host by the Infection From Stable Evidence of Evolutionarily Molecular response 2004 antiviral A., Galiana-Arnoux the natural K. for D. Dyer, of sufficient Troxler, not Genetics L. but Irving, 1963 required P. Wallace, is Jouanguy, pathway B. E. and C., Spassky Dostert, of B. population Pavlovsky, marginal O. isolated an Hunter, 91-103. of Genetics S. XXXI. A. populations. H., of T. Chromosomes Dobzhansky, In Inversions 1937 Sturtevant, Deleterious 23 H. and netics A. Heterotic XL: and Populations T., of Natural Dobzhansky, Populations of in Genetics Lethals The Recessive 1968 43 of Spassky, genetics Effects B. Nature and Maguire data. T., R. sequencing Dobzhansky, J. DNA Garimella, next-generation V. using K. genotyping Poplin, Banks R. and discovery Banks, E. E. Albers, A., M. A. DePristo, C. Abecasis, 27 disparity G. Bioinformatics the Auton, unifying VCFtools. A. sites: linked P., at selection 14 Danecek, of due Genet signatures are Rev Genomic Nat speciation 2013 Payseur, of species. A. islands among 23 B. genomic and Ecol D., that Mol Agriculture. A. suggests flow. of Cutter, Vulnerability Reanalysis gene Department the 2014 reduced States and Hahn, not United Change diversity, W. 1-208. Climate reduced M. of to pp. and Assessment Southwest, E., An the T. 2012 of Friggens, Cruickshank, Islands M. Sky M. the and in Finch Wildlife M. 144 of D. Journal J., Geographical S. The Coe, Expansion. Desert and Activities Human 65 416-423. 1978 Insectology L., of of invasion J. Bulletin the Cloudsley-Thompson, of management. review pest A integrated 2012 for Anfora, agenda G. research and Ioriatti C. A., Cini, : 28-64. : 838-848. n male-killing and potnte o utlvlslcin vlto;itrainljunlo organic of journal international Evolution; selection. multilevel for opportunities : : 2156-2158. : 262-274. Wolbachia : 481-501. : rspiapseudoobscura Drosophila 973-983. vlto;itrainljunlo rai vlto 59 evolution organic of journal international Evolution; . 24 : 3133-3157. : tal. et 1567-1587. Drosophila. rspiasuzukii Drosophila rspiapseudoobscura Drosophila tal. et tal. et 01Tevratcl omtand format call variant The 2011 , : eeis59 Genetics . 1443-1455. 01Afaeokfrvariation for framework A 2011 , 05TeJkSA signaling Jak-STAT The 2005 , : rspiapseudoobscura. Drosophila 149-160. a muo 6 Immunol Nat : nErp n draft a and Europe in rspiainnubila Drosophila 411-425. : 491-498. eeis48 Genetics . : 946-953. Ge- : : : : Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. fsiolsaidcdml iln nDoohl eaoatr eeiy(dn)102 (Edinb) expression the Heredity against melanogaster. genotype Drosophila host of in files. Effects killing fastQ 2009 Fukatsu, male for T. -induced and tool of Shimada trimming M. Anbutsu, quality-based H. adaptive, D., window, Kageyama, sliding A Sickle: 2011 Fass, 1.33. J. and N., Joshi, 21 male-killing to Biology resistance Evolutionary No of 2008 Dyer, Journal while A. K. sweeps and selective J., SE) Jaenike, Reyolds recent J., Detecting Rolff 2009 2016 (eds Elftherianos, immunity Nielsen, and I. 25 R. and Ecol Mol J., and selection. Imler, Hellmann background and I. rate DeGiorgio, mutation for M. controlling D., C. Huber, Partridge C. T. Lundblad, Http://broadinstitute.github.io/picard, K. Reviews Science Cooper, Quaternary Africa. J. Southern in R. years 25,000 22 G. past the over Lee-Thorp, variability climatic A. millennial-scale J. of K., response 188 Holmgren, immune Nat The Am 2003 Directions. A., Future Bradburd J. and G. virus. Hoffmann, Solutions, Antolin, DNA F. Practical M. Pitfalls, Lotterhos, an Adaptation: E. Local in K. of Kelley, haplotypes L. competing J. two S., of Hoban, evolution Recurrent of 2020 genome Unckless, The Biorxiv Evolution R. and and 2019 Biology T., Unckless, Molecular Hill, genes. L. immune R. in 405 selection Nature and from of ages. Koseva adaptation patterns ice B. Quaternary of the T., genetics of Hill, legacy population genetic molecular The sweeps: 2000 G., Soft Hewitt, 2005 McDonald– 169 Pennings, Genetics Asymptotic variation. S. the genetic P. for standing Tool and Web-Based J., A 7 Hermisson, asymptoticMK: Genetics 2017 Genomes, Messer, Genes, W. G3: Test. P. Kreitman and biology C., Systematic B. 3.0. Haller, PhyML of performance the assessing Hordijk 59 phylogenies: W. Anisimova, maximum-likelihood Antonazzo M. estimate G. Lefort, to V. Urbano, Dufayard, M. J.-F. 45 J. S., Research Guindon, Santos, Acids Dos Nucleic G. future. the Marygold, to J. Looking S. S., L. Gramates, ( mosquito 1205. fever yellow ( Mavingui the P. with mosquito Moro, analysis tiger comparative V. Asian C. Vieira, the C. of Modolo, 232. L. Guide. C., Concise Goubert, 8. A (Basel) Genetics: Genes number Population Topic. average 2004 Evolving the J., An of Gillespie, DNA: estimate Satellite An 2017 2015 199 A., Przeworski, Genetics M. humans. M. Garrido-Ramos, by and carried Ober mutations C. lethal Stephens, recessive M. of Waggoner, D. Z., Gao, : : 2311-2326. 307-321. : 1-45. ee albopictus Aedes : Drosophila 1570-1577. : Picard : 2335-2352. 49-68. . eetm ihdaieEfo a eoi ed and reads genomic raw from dnaPipeTE with repeatome ) samdlfrsuyn niia eecs netinfection Insect defences. antiviral studying for model a as Drosophila : : 1569-1575. D663-D671. 25 ee aegypti Aedes aue426 Nature . Wolbachia tal. et : 1243-1254. rspiainnubila Drosophila tal. et .Gnm ilg n vlto 7 Evolution and Biology Genome ). 05D ooasml n annotation and assembly novo De 2015 , : tal. et 142-156. 00Nwagrtm n methods and algorithms New 2010 , fe huad fyaso infection. of years of thousands after : : 1-36. 33-38. 06FnigteGnmcBasis Genomic the Finding 2016 , : 907-913. tal. et tal. et eel lineage-specific reveals 07Fyaea 25: at FlyBase 2017 , : : 379-397. 475-482. 03Persistent 2003 , : 1192- Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. az,M,M ugadA uz 08TeFlbc-ieeeetGlloblnst h uefml of of superfamily Academy P National the the to of Proceedings belongs genus. Galileo Drosophila element Foldback-like the The within widespread 2008 is Ruiz, and A. transposons and Technical DNA Puig reads. M. sequencing M., high-throughput Marzo, from sequences adapter removes Cutadapt Notes 2011 M., Martin, akw .A,adP ’rd,2006 O’Grady, P. and A., T. Markow, of Populations Natural in Fecundity Genetics Affecting . Loads Ayroles Genetic F. 1967 J. D., Marinkovic, Barbadilla, A. Stone, a. E. Richards, gaster S. C., F. T. Mackay, n/a. ahd,H . .O egad .R ’re,E .Bhmn .S Schmidt S. P. Behrman, L. E. O’Brien, in R. variation K. latitudinal of Bergland, fixed genomics O. of population A. effects of E., extent H. in genomic Machado, flow the of gene Evaluation interspecific 2007 and Noor, persimilis variation F. Drosophila a. intraspecific M. on and differences Haselkorn inversion S. T. ( a., C. Orangutans Machado, of History new Degenhardt Demographic 8. D. ideas, and ONE J. old PLoS Speciation Musharoff, evolution: S. Rich DNA Eilertson, Satellite a K. 2018 Kelley, Reveals Barbash, L. A. J. X., D. Ma, and 49 Clark Dev G. Genet A. Opin McGurk, Curr P. genetics approaches. M. Nature S., spectra. frequency S. SNP Lower, using changes size population Exploring 2015 47 Fu, Y.-X. and Ruan X., J. Liu, 25 Fennell, England) T. (Oxford, Wysoker, Bioinformatics A. SAMtools. and Handsaker, sequences. B. whole-genome transform. individual H., Burrows-Wheeler from Li, history with population alignment human of read Inference 475 2011 short Nature Durbin, R. accurate and and H., Li, Fast 2009 25 Durbin, England) (Oxford, R. Bioinformatics and H., Corbett-Detig 199 B. Li, Genetics R. population. Taylor, 623 range of W. ancestral resource Crepeau, single genomic W. a population M. com- A Cardeno, human nexus: M. cosmopolitan genome C. two B., made J. endemics Lack, Afrotropical two How 2004 Sequencing the Silvain, Generation mensals: J.-F. Next and of D., Analysis Lachaise, ANGSD: 2014 Nielsen, 15 R. Bioinformatics and BMC Albrechtsen Data. A. S., T. Korneliussen, 11 Genet Schl PLoS C. . and Nolte V. Evolution R., and Kofler, Biology Molecular Pool-Seq. using Schl elements C. and posable Daniel G. R., Kofler, in insertions Schl element C. transposable 8 and of dynamics Betancourt complex J. uncovers A. R., Kofler, : : 1-16. 555-559. : eei eeec ae.Ntr 482 Nature panel. reference genetic 1-12. : 61-71. : 493-496. rspiamelanogaster Drosophila : e1005406. eeis175 Genetics . : tee,21 ep n oeo rnpsbeeeetatvt in activity element transposable of mode and Tempo 2015 ¨ otterer, 356. tee,21 ooltoT2:cmaaiepplto eoiso trans- of genomics population comparative : PoPoolationTE2 2016 ¨ otterer, : 1289-1306. : : 1754-1760. - Drosophila 70-78. .simulans D. : tee,21 eunigo oldDASmls(Po-e ) Pool-Seq ( Samples DNA pooled of Sequencing 2012 ¨ otterer, 173-178. .simulans D. : ud oseisidentification. species to guide a : 1229-1241. aaoegahcrdl.Gntc 120 Genetica riddle. palaeogeographic : 26 2078-2079. rspiamelanogaster Drosophila tal. et and 09Tesqec lgmn/a format alignment/map sequence The 2009 , .melanogaster D. tal. et rspiamelanogaster Drosophila : 1-12. og pygmaeus Pongo tal. et 03Pplto eoi Analysis Genomic Population 2013 , 02The 2012 , rspiapseudoobscura Drosophila eoe,icuig17from 197 including genomes, tal. et tal. et oeua Ecology Molecular . rspiapseudoobscura Drosophila 05The 2015 , 05Comparative 2015 , and rspiamelano- Drosophila : 17-39. lSGenetics PloS . og abelii Pongo Drosophila Drosophila : n/a- and ). Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. akn .A,adJ .A ucse,19 h oto irto nIscs nulRve fEntymology of Review Annual . in Migration of Cost The 1992 Burchsted, modern albopictus. A. a C. Aedes 37 J. of mosquito and history A., tiger Glacial M. Asian Rankin, 2012 the Urbanelli, of S. 7 modelling and One distribution Somboon PLoS species P. and Bellini, phylogeography R. invader: Mastrantonio, V. D., Porretta, Cardeno M. C. Stevens, 1-24. A. K. Sugino, P. Sub-Saharan R. of Corbett-detig, B. of R. E., DPGP3. genomics J. 2013 Pool, Population Langley, Gonz´alez, 2011 H. J. C. and and J., Lenkov Pool, K. Lipatov, natural M. across in Fiston-Lavier, elements impacts A.-S. transposable change climate a., of D. fingerprint Petrov, coherent globally A 421 2003 Nature Yohe, systems. G. and C., Parmesan, Vermeulen M. Jansen, W. P. Overheul, NF- J. of G. the Joosten, between J. Divergence H., 2007 W. Palmer, Machado, a. C. and Schaeffer W. 1417-1428. S. Substitutions Garfield, Nucleotide a. of D. pseudoobscura Number F., Average 125 a. Estimating Genetics M. for Data. Noor, Method Restriction From Simple Populations A Between 1990 and Miller, Within J. and M., Nei, Bioinformatics data. 1987 M., sequencing Nei, next-generation Tyler-Smith from C. autozygosity sweeps. Xue, detecting selective Y. soft for 32 Scally, by approach A. adaptation model Danecek, rapid Markov P. of V., genomics Population Narasimhan, Frequent 28 2013 Evolution under Petrov, & Extensions A. 110 Ecology its Sciences D. in and Trends of and Test Academy W., National McDonald-Kreitman in P. the The Messer, strategies of 2012 Proceedings defense Petrov, Solutions. Antiviral A. and RNAi: D. Problems Adaptation: Beyond and W., 2013 P. Rij, Messer, 59 van Physiology P. Insect R. of 175 Journal and Genetics mosquito. sweep. H., selective a S. around disequilibrium Merkling, linkage of 20 structure Learning The 2007 Organizational G., & McVean, International Management the Knowledge of Capital, Proceedings Intellectual data. on sequencing DNA Conference next-generation Cibulskis analyzing K. Sivachenko, for A. framework Banks, MapReduce E. Hanna, in M. locus A., Adh McKenna, the in at 841-843 evolution protein pp. Adaptive Islands, 1991 Sky Kreitman, 351 2009 M. and Knowles, H., diffe- L. J. L. population McDonald, and of Huang statistics H. the E., and J. Biorxiv McCormack, selection adaptation. Background local 2018 detecting Whitlock, for C. consequences rentiation: M. and 105 R., America Matthey-Doret, of States United the of Sciences : : : 533-559. 1749-1751. 652-654. κ inligb N iu of virus DNA a by signalling B : e44515. oeua vltoaygenetics evolutionary Molecular and rspiamelanogaster: Drosophila .persimilis D. : 37-42. rspiamelanogaster. Drosophila : eoesqecsi eaint hoooa nesos eeis177 Genetics inversions. chromosomal to relation in sequences genome 659-669. Drosophila. fia iest n o-fia ditr.Po eeis8 Genetics PLoS Admixture. Non-African and Diversity African : 159-170. : 2957-2962. oubauiest press. university Columbia . oeua ilg n vlto 28 Evolution and Biology Molecular 27 tal. et : 1-5. tal. et 00TeGnm nlssTokt A Toolkit: Analysis Genome The 2010 , tal. et : 873-879. 06BFol/o:Ahidden A BCFtools/RoH: 2016 , tal. et 08Idcinadsuppression and Induction 2018 , 02Pplto Genomics Population 2012 , nylpdao islands of Encyclopedia : Drosophila 1633-1644. : : Drosophila 1297-1303. 8615-8620. : Drosophila 1395-1406. Nature . and : : . Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. ht,T . .E ekn,G ekladJ .Sal,21 dpieeouindrn nogigrange ongoing an during evolution Adaptive 2013 Searle, B. J. and ( Structure. Heckel vole Population G. bank invasive of Perkins, the Analysis E. expansion: the S. A., for T. F-Statistics White, Estimating a 1984 in Cockerham, Chromosomes C. Third 38 C. of Evolution and Variability S., Genetic B. The Weir, 1974 Mukai, T. of and Population Yamaguchi Local O. K., Lee T. J. Watanabe, 2 Dreves, Management J. Pest A. Integrated Goodhue, of E. Journal R. from Bolda, male-killing P. M. Male-Killing of B., a 66 emergence D. Evolution of Walsh, the Mechanisms. Maintenance Independent in 2011 Male-Killing Jaenike, chromosome and J. Dependent X Male-Killing and the L., of R. Unckless, role 291 potential Biol Theor The J endosymbionts. 2011b mutualistic L., of R. 17 Virus Immunology Unckless, DNA International immunity. A innate 2011a polymorphism. in L., DNA receptors R. Toll-like Unckless, by 2005 hypothesis Akira, mutation S. and neutral K., the Takeda, testing for method 123 Statistical Genetics 1989 35 F., Geology Arizona Tajima, Geology. Arizona 2005 G., 102 A. PNAS Survey, Ebert profiles. L. expression B. genome-wide Mukherjee, interpreting S. for 15550. approach Mootha, knowledge-based K. A V. analysis: Tamayo, P. Evolution and A., Biology Subramanian, Molecular index. neutrality the of Estimation 2011 history.28 Eyre-Walker, human A. in and selection N., and demography Stoletzki, of effects the 22 Disentangling Evol 2005 Hahn, Biol W. Past Mol M. in the and evolution Over E., protein Woodrat J. Adaptive the Stajich, 2002 Eyre-Walker, in A. Size and Body C., of G. Evolution N. 270 1995 Smith, Science Brown, H. Change. J. Climate and RepeatMasker. of Betancourt Years pp. L. 25,000 Open-4.0, J. RepeatMasker A., 2013-2015 F. Hubley, Smith, R. and Open-1.0. A., RepeatModeler F. 2008 A. Hubley, Smit, R. and Herman A., 276 S. F. Sci A. J. Markova, Biol Smit, S. Proc phylogeography. Rambau, mammal V. small spontaneous R. from of Kotlik, insights consequences P. genomic B., and J. Rates 194 2013 Searle, Genetics Hahn, W. melanogaster. M. Drosophila and in Lynch events M. mutational Houle, D. R., D. 453 Schrider, Nature Wu change. Q. climate Neofotis, anthropogenic P. to Vicarelli, impacts M. Karoly, D. 28 C., Bioinformatics neo- Rosenzweig, Benes analysis. V. to split-read Stutz, state and M. transition paired-end A. a integrated Schlattl, by A. as Zichner, genes T. duplicated T., of Rausch, Subfunctionalization 5 2005 biology Liberles, evolutionary BMC a. functionalization. D. and S., Rastogi, : 63-70. : : 585-595. 1358-1370. :Ivsv eto ieigSf ri xadn t egahcRneadDmg Potential. Damage and Range Geographic its Expanding Fruit Soft Ripening of Pest Invasive ): : 63-73. rspiamelanogaster Drosophila ydsglareolus Myodes Drosophila : 1-7. : : eeis82 Genetics . 99-104. 2012-2014. : 28. LSOE6 ONE PLoS . 28 nIead o cl22 Ecol Mol Ireland. in ) : 353-357. : tal. et : 1-6. : tal. et 63-82. 937-954. : 02DLY tutrlvratdiscovery variant structural DELLY: 2012 , 4287-4294. tal. et 08Atiuigpyia n biological and physical Attributing 2008 , : tal. et : e26564. i333-i339. Wolbachia Drosophila 2011 , 09TeCli rneo Britain: of fringe Celtic The 2009 , tal. et : rspiasuzukii Drosophila 2971-2985. 05Gn e enrichment set Gene 2005 , in : 678-689. aue415 Nature . rspiainnubila Drosophila : 1022-1024. (Diptera: : : 15545– 1-14. By Posted on Authorea 17 Jun 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159242077.72607009 — This a preprint and has not been peer reviewed. Data may be preliminary. abn .A,M adkmr .N ahraadL .W,20 h olptwyi motn o an for important is pathway Toll The 2005 Wu, P. L. and Vakharia detect N. in to V. 25 response approach Nandakumar, Bioinformatics antiviral growth M. reads. pattern A., short a R. paired-end : Zambon, from Pindel insertions 2009 sized Ning, medium Z. and and deletions Apweiler 2865-2871. large R. of Long, points Q. Schulz, break H. M. populations K., natural Ye, in structure haplotype and Subdivision 2003 Charlesworth, D. of and Lauga B. I., S. Wright, rbdpi lyrata Arabidopsis Drosophila oeua clg 12 Ecology Molecular . rceig fteNtoa cdm fSine 102 Sciences of Academy National the of Proceedings . : 1247–1263. 29 : 7257-7262. :