Speciation with Gene Flow Lecture Slides By
Total Page:16
File Type:pdf, Size:1020Kb
Genomic studies of speciation and gene flow Why study speciation genomics? Long-standing questions (role of geography/gene flow) How do genomes diverge? Find speciation genes Genomic divergence during speciation 1. Speciation as a bi-product of physical isolation 2. Speciation due to selection – without isolation evolution.berkeley.edu Genomic divergence during speciation 1. Speciation as a bi-product of physical isolation 0.8 0.4 frequency d S 0.0 80 100 120 140 160 180 Transect position (km) Cline theory - e.g. Barton and Gale 1993 2. Speciation due to selection – without isolation evolution.berkeley.edu Genomic divergence during speciation 1. Speciation as a bi-product of physical isolation T i m ? e 2. Speciation due to selection – without isolation evolution.berkeley.edu Wu 2001, JEB Stage 1 - one or few loci under disruptive selection Gene under selection Genome FST Feder, Egan and Nosil TiG Stage 2 - Divergence hitchhiking Genome FST Feder, Egan and Nosil TiG Stage 2b - Inversion Inversion links co-adapted alleles Genome FST Feder, Egan and Nosil TiG Stage 3 - Genome hitchhiking Genome FST Feder, Egan and Nosil TiG Stage 4 - Genome wide isolation Genome FST Feder, Egan and Nosil TiG Some sub-species clearly in stage 1 lWing pattern “races” of Heliconius melpomene Heliconius erato Heliconius melpomene 1986 1986 0.8 2011 0.8 2011 n n 1 1 0.4 0.4 frequency 10 frequency 10 b D 50 50 0.0 0.0 80 100 120 140 160 180 80 100 120 140 160 180 0.8 0.8 0.4 0.4 frequency frequency r D C 0.0 0.0 80 100 120 140 160 180 80 100 120 140 160 180 0.8 0.8 0.4 0.4 frequency frequency d N S 0.0 0.0 80 100 120 140 160 180 80 100 120 140 160 180 Transect position (km) 0.8 0.4 frequency b Y 0.0 80 100 120 140 160 180 Transect position (km) Some sub-species clearly in stage 1 lWing pattern “races” of Heliconius melpomene B (red/orange patterns) Yb (yellow/white patterns) S. H. Martin et al. Genome Res. 23, 1817–1828 (2013). O. Seehausen et al. Nat. Rev. Genet. 15, 176–92 (2014). Some sub-species clearly in stage 1 lCarrion and hooded Crows FST Poelstra, J. W. et al. Science 344, 1410–4 (2014). And an example with multiple islands? Malinsky et al., Science 350, 1493 (2015). Other species have islands…but are they real? Ellegren, et al. Nature 491, 756- (2012). Other species have islands…but are they real? Anopheles gambiae and A. coluzzi Formerly M and S forms of A. gambiae Clarkson et al. 2014 Nature Communications Other species have islands…but are they real? Seehausen et al., Nature Reviews Genetics, 2014 What do patterns of Fst really mean? • Fst measures relative divergence • Peaks indicate regions of higher than expected between population divergence, given the within population divergence • Peaks can therefore result from reduced diversity within species • This could be due to lower Ne within species (selective sweeps, background selection) • So peaks NOT NECESSARILY due to reduced gene flow Note that sometimes sweeps within species = speciation genes Sweeps across the species barrier can also lead to Fst peaks Double peaks?? Nicolas Bierne, Daniel Berner and others Anopheles M-S divergence Relative divergence higher in low recombination regions - not significant for absolute divergence see also: Charlesworth 1998 MBE Measures of divergence… More recently see papers by Reto Burri No evidence for higher Dxy in wing pattern loci lWing pattern “races” of Heliconius melpomene B (red/orange patterns) Yb (yellow/white patterns) S. H. Martin et al. Genome Res. 23, 1817–1828 (2013). O. Seehausen et al. Nat. Rev. Genet. 15, 176–92 (2014). Suggestion that we use absolute measures of divergence? Understanding genomic divergence No single statistic will capture the complex history of mutation, migration and selection Patterns need to be interpreted in the specific context of the study species Much better to use explicit tests for gene flow Need to design sampling so the expectations in the absence of gene flow are clear and testable The key is to identify ‘control’ populations that are not influenced by admixture Explicit tests for gene flow: Neanderthal genome •Isolated DNA from bones 38,000 yrs old in Croatia •We diverged from Neanderthals around 270-440,000yrs ago •Evidence for gene exchange with humans (1-4% of genome?) Green et al., 328:710 Science 2010 Explicit tests for gene flow: ABBA- BABA test EXPECT: OBSERVE: 50% ABBA 103612 ABBA Green et al. 2010 Science 328:710-722 50% BABA 94029 BABA Explicit tests for gene flow: ABBA- BABA test Explicit tests for gene flow: ABBA- BABA test Explicit tests for gene flow: Combining multiple signals 1) Derived alleles at high frequency shared with Neanderthal 2) High divergence to Africa but low to Neanderthal 3) Long haplotype blocks The genomic landscape of Neanderthal ancestry in present-day humans - Sankararaman et al. Nature 2014 Sampled 10 complete high coverage genomes per population Explicit tests for gene flow: Heliconius butterflies Many sources of reproductive isolation: Female hybrids are sterile Different host plant use Different habitat preference Strong assortative mating Using Simon Martin’s Twisst method to characterise relationships ABBA-BABA statistics Simon Martin melG melW cyd outgroup Martin et al., MBE 2015 Downloaded from genome.cshlp.org onDownloaded September from 18,Downloaded genome.cshlp.org2013 - Published from genome.cshlp.org by on Cold September Spring Harbor18, on 2013 September Laboratory - Published 18, Press 2013 by Cold - Published Spring Harbor by Cold Laboratory Spring Harbor Press Laboratory Press Downloaded from genome.cshlp.orgDownloaded from ongenome.cshlp.org September 18, Downloaded2013on September - Published from 18, by genome.cshlp.org2013 Cold - PublishedSpring Harbor by on Cold SeptemberLaboratory Spring Harbor18,Press 2013 Laboratory - Published Press by Cold Spring Harbor Laboratory Press 300+ offspring for each cross type John Davey x x x PstI RAD sequencing Linkage maps built (site every ~10kb) with Lep-MAP Davey et al., Evol Letters 2017 617 Figure 2. Four-taxon617 maximum-likelihoodFigure617 2. Four-taxonFigure trees 2. Four-taxon maximum-likelihood for 100 kb maximum-likelihood windows trees for 100 trees kb windowsfor 100 kb windows 618 (A) Trees were superimposed618 (A) using 618Trees Densitree were(A) superimposedTrees (Bouckaert were superimposed using 2010). Densitree There using were (Bouckaert Densitree 2848 trees 2010).(Bouckaert for theThere H. 2010). were There2848 treeswere for2848 the trees H. for the H. 617 Figure617 2. Four-taxonFigure 2. maximum-likelihood Four-taxon617 maximum-likelihoodFigure trees 2. Four-taxonfor 100 treeskb windows maximum-likelihood for 100 kb windows trees for 100 kb windows 619 cydno – H. melpomene619 datasetcydno 619(left) – H. and cydnomelpomene 2453 – forH. melpomenedatasetthe H. timareta(left) dataset and – 2453H. (left) melpomene for and the 2453 H. datasettimareta for the (right). H.– H.timareta melpomene – H. melpomenedataset (right). dataset (right). 618 (A) Trees618 were(A) superimposed Trees were superimposed using618 Densitree(A) usingTrees (Bouckaert Densitreewere superimposed 2010). (Bouckaert There using 2010).were Densitree 2848 There trees were (Bouckaert for 2848 the H.trees 2010). for Therethe H. were 2848 trees for the H. 620 Tree-lengths were equalized620 Tree-lengths so620 that allTree-lengths treeswere couldequalized bewere superimposed, so equalized that all trees so and that could then all betreesa randomsuperimposed, could jitter be superimposed, was and then a randomand then jitter a random was jitter was 619 cydno619 – H. melpomenecydno – H. dataset melpomene (left)619 anddatasetcydno 2453 (left) –for H. andthe melpomene H.2453 timareta for datasetthe – H. H. timareta (left)melpomene and – 2453 H. dataset melpomene for the(right). H. datasettimareta (right). – H. melpomene dataset (right). 621 added to all branch621 lengthsadded to show621 to alldensity.added branch Treesto lengths all supportingbranch to show lengths each density. to of show the Trees four density. supporting possible Trees topologies eachsupporting of the eachfour possibleof the four topologies possible topologies 620 Tree-lengths620 wereTree-lengths equalized were so thatequalized620 all treesTree-lengths so couldthat all be treeswere superimposed, couldequalized be superimposed, soand that then all a trees random and could then jitter be a randomwassuperimposed, jitter was and then a random jitter was 622 are coloured accordingly:622 blueare colouredfor622 the speciesare accordingly: coloured tree, red accordingly: blue for forthe the geography bluespecies for tree,tree,the species redgreen for for tree,the the geographyred control for the tree, geography green fortree, the green control for the control 621 added621 to all branchadded lengthsto all branch to show621 lengths density.added to show Trees to alldensity. supporting branch Trees lengths each supporting ofto theshow four each density. possible of the Trees fourtopologies supporting possible topologies each of the four possible topologies 623 tree and black for 623unresolvedtree 623trees.and black (B)tree Thefor and unresolvedfour black topologies for trees.unresolved scored, (B) The trees. along four (B) with topologies The the four number scored,topologies and along scored, with alongthe number with theand number and 622 are coloured622 accordingly:are coloured blueaccordingly: for622 the species blueare colouredfor tree, the red species accordingly: for the tree, geography red blue for for the tree, the geography greenspecies for tree,tree, the