
B D T E
Tracy Heath
Ecology, Evolution, & Organismal Biology
Iowa State University
@trayc7
http://phyloworks.org
2016 Workshop on Molecular Evolution
Woods Hole, MA USA

O
Overview of divergence time estimation
• Relaxed clock models: accounting for variation in substitution rates among lineages
• Tree models: lineage diversification and sampling
break
BEAST v2 Tutorial: Divergence-time estimation under birth-death processes
• Dating Bear Divergence Times with the Fossilized Birth-Death Process
dinner

AT-S E
Phylogenies with branch lengths proportional to time provide more information about evolutionary history than unrooted trees with branch lengths in units of substitutions/site.
[Figure: a phylogeny with a scale bar of 0.03 substitutions/site and a time-scaled phylogeny with axis Time (My) from 20 to 0 (Oligocene, Miocene, Plio, Pleis); silhouette images from http://phylopic.org]

AT-S E
Phylogenetic divergence-time estimation
• Historical biogeography: What was the spatial and climatic environment of ancient angiosperms? (Antonelli & Sanmartin. Syst. Biol. 2011)
• Molecular evolution: How has the rate of molecular evolution changed across the Tree of Life? (Nabholz, Glemin, Galtier. MBE 2008)
• Trait evolution: How has mammalian body-size changed over time? (Lartillot & Delsuc. Evolution 2012)
• Diversification: Is diversification in Caribbean anoles correlated with ecological opportunity? (Mahler, Revell, Glor, & Losos. Evolution 2010)
• Epidemiology: How has the infection rate of HCV in Egypt changed over time? (Stadler et al. PNAS 2013)
[Figure: Anolis fowleri (image by L. Mahler)]
Understanding Evolutionary Processes

D T E
Goal: Estimate the ages of interior nodes to understand the timing and rates of evolutionary processes
Model how rates are distributed across the tree
Describe the distribution of speciation events over time
External calibration information for estimates of absolute node times
[Figure: time-scaled phylogeny of mammals (Equus, Rhinoceros, Bos, Hippopotamus, Balaenoptera, Physeter, Ursus, Canis, Felis, Homo, Pan, Gorilla, Pongo, Macaca, Callithrix, lemurs, and Microcebus species) with the labels "Simiiformes" and "calibrated node"; axis: Time (Millions of years), 100 to 0, spanning Cretaceous, Paleogene, Neogene, Q]

T G M C
Assume that the rate of evolutionary change is constant over time (branch lengths equal percent sequence divergence)
[Figure: tree of A, B, C with branch lengths 10%, 10%, 10%, and 20% and node ages 200 My and 400 My]
(Based on slides by Jeff Thorne; http://statgen.ncsu.edu/thorne/compmolevo.html)

T G M C
We can date the tree if we know the rate of change is 1% divergence per 10 My
[Figure: tree of A, B, C with branch lengths 10%, 10%, 10%, and 20% and node ages 200 My and 400 My]

T G M C
[Figure: tree of A, B, C with branch lengths 10%, 10%, 10%, and 20%]
If we found a fossil of the MRCA of B and C, we can use it to calculate the rate of change & date the root
of the tree
[Figure: node ages 200 My and 400 My]

R G M C
Rates of evolution vary across lineages and over time
Mutation rate: variation in
• metabolic rate
• generation time
• DNA repair
Fixation rate: variation in
• strength and targets of selection
• population sizes
[Figure: tree of A, B, C with branch lengths 10%, 10%, 10%, and 20% and node ages 200 My and 400 My]

U A
Sequence data provide information about branch lengths, in units of the expected # of substitutions per site
branch length = rate × time
[Figure: Sequence Data and Phylogenetic Relationships; scale bar: 0.2 expected substitutions/site]

R T
The sequence data provide information about branch length: for any possible rate, there's a time that fits the branch length perfectly
[Figure: Branch Rate plotted against Branch Time (both axes 0 to 5) for branch length = 0.5, with the example point time = 0.8, rate = 0.625]
Methods for dating species divergences estimate the substitution rate and time separately
(based on Thorne & Kishino, 2005)

B D T E
[Figure: two copies of the tree, one labeled "length = rate" and one labeled "length = time"]
$\mathcal{R} = (r_1, r_2, r_3, \ldots, r_{2N-2})$
$\mathcal{A} = (a_1, a_2, a_3, \ldots, a_{N-1})$
$N$ = number of tips

B D T E
Posterior probability
$f(\mathcal{R}, \mathcal{A}, \theta_{\mathcal{R}}, \theta_{\mathcal{A}}, \theta_s \mid D, \text{tree topology})$
$\mathcal{R}$: Vector of rates on branches
$\mathcal{A}$: Vector of internal node ages
$\theta_{\mathcal{R}}, \theta_{\mathcal{A}}, \theta_s$: Model parameters
$D$: Sequence data
Tree topology

B D T E
$f(\mathcal{R}, \mathcal{A}, \theta_{\mathcal{R}}, \theta_{\mathcal{A}}, \theta_s \mid D) = \dfrac{f(D \mid \mathcal{R}, \mathcal{A}, \theta_s)\, f(\mathcal{R} \mid \theta_{\mathcal{R}})\, f(\mathcal{A} \mid \theta_{\mathcal{A}})\, f(\theta_s)}{f(D)}$
$f(D \mid \mathcal{R}, \mathcal{A}, \theta_{\mathcal{R}}, \theta_{\mathcal{A}}, \theta_s)$: Likelihood (given $\mathcal{R}$ and $\mathcal{A}$ it does not depend on $\theta_{\mathcal{R}}$ or $\theta_{\mathcal{A}}$, so it appears in the numerator as $f(D \mid \mathcal{R}, \mathcal{A}, \theta_s)$)
$f(\mathcal{R} \mid \theta_{\mathcal{R}})$: Prior on rates
$f(\mathcal{A} \mid \theta_{\mathcal{A}})$: Prior on node ages
$f(\theta_s)$: Prior on substitution parameters
$f(D)$: Marginal probability of the data

B D T E
Estimating divergence times relies on 2 main elements:
• Branch-specific rates: $f(\mathcal{R} \mid \theta_{\mathcal{R}})$
• Node ages & Topology: $f(\mathcal{A} \mid \theta_{\mathcal{A}})$
[Figure: model components: Tree, Rate Matrix, Site Rates, Branch Rates, DNA Data]

M R V
Some models
describing lineage-specific substitution rate variation:
• Global molecular clock (Zuckerkandl & Pauling, 1962)
• Local molecular clocks (Hasegawa, Kishino & Yano 1989; Kishino & Hasegawa 1990; Yoder & Yang 2000; Yang & Yoder 2003, Drummond and Suchard 2010)
• Punctuated rate change model (Huelsenbeck, Larget and Swofford 2000)
• Log-normally distributed autocorrelated rates (Thorne, Kishino & Painter 1998; Kishino, Thorne & Bruno 2001; Thorne & Kishino 2002)
• Uncorrelated/independent rates models (Drummond et al. 2006; Rannala & Yang 2007; Lepage et al. 2007)
• Mixture models on branch rates (Heath, Holder, Huelsenbeck 2012)
Models of Lineage-specific Rate Variation

Global Molecular Clock
The substitution rate is constant over time
All lineages share the same rate
[Figure: tree with branch length = substitution rate; legend: low to high]
(Zuckerkandl & Pauling, 1962)

Relaxed-Clock Models
To accommodate variation in substitution rates, 'relaxed-clock' models estimate lineage-specific substitution rates
• Local molecular clocks
• Punctuated rate change model
• Log-normally distributed autocorrelated rates
• Uncorrelated/independent rates models
• Mixture models on branch rates

Local Molecular Clocks
Rate shifts occur infrequently over the tree
Closely related lineages have equivalent rates (clustered by sub-clades)
[Figure: tree with branch length = substitution rate; legend: low to high]
(Yang & Yoder 2003, Drummond and Suchard 2010)

Local Molecular Clocks
Most methods for estimating local clocks required specifying the number and locations of rate changes a priori
Drummond and Suchard (2010) introduced a Bayesian method that samples over a broad range of possible random local clocks
[Figure: tree with branch length = substitution rate; legend: low to high]
(Yang & Yoder 2003, Drummond and Suchard 2010)

Autocorrelated Rates
Substitution rates evolve gradually over time: closely related lineages have similar rates
The rate at a node is drawn from a lognormal distribution with a mean equal to the parent rate
(geometric Brownian motion)
[Figure: tree with branch length = substitution rate; legend: low to high]
(Thorne, Kishino & Painter 1998; Kishino, Thorne & Bruno 2001)

Punctuated Rate Change
Rate changes occur along lineages according to a point process
At rate-change events, the new rate is a product of the parent's rate and a Γ-distributed multiplier
[Figure: tree with branch length = substitution rate; legend: low to high]
(Huelsenbeck, Larget and Swofford 2000)

Independent/Uncorrelated Rates
Lineage-specific rates are uncorrelated when the rate assigned to each branch is independently drawn from an underlying distribution
[Figure: tree with branch length = substitution rate; legend: low to high]
(Drummond et al. 2006; Rannala & Yang 2007; Lepage et al. 2007)

I M M
Dirichlet process prior:
Branches are partitioned into distinct rate categories
The number of rate categories and assignment of branches to categories are random variables under the model
[Figure: tree with branch length = substitution rate; substitution rate classes c1, c2, c3, c4, c5]
(Heath, Holder, Huelsenbeck. 2012 MBE)

M R V
These are only a subset of the available models for branch-rate variation
• Global molecular clock
• Local molecular clocks
• Punctuated rate change model
• Log-normally distributed autocorrelated rates
• Uncorrelated/independent rates models
• Dirichlet process prior

M R V
Are our models appropriate across all data sets?
[Figures: example dated phylogenies. One shows ray-finned fish orders (Polypteriformes through Acanthomorpha, with Teleostei and Ostariophysi labeled) with a legend for number of taxa (1 to 20000) and the annotations "Global expansion of C4 biomass" and "Major temperature drop and increasing seasonality". The other shows bears (sloth bear, brown bear, polar bear, cave bear, Asian black bear, American black bear, sun bear, American giant short-faced bear, spectacled bear, giant panda) with harbor seal, with nodes t1 to t10 annotated with mean age (Ma), 95% CI, and support values (MP•MLu•MLp•Bayesian).]
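As a rough illustration of how the branch-rate models above differ, the following Python sketch draws branch rates under a global clock, an uncorrelated lognormal model, and a simplified autocorrelated model, then converts them to branch lengths with branch length = rate × time. Everything specific here is an assumption made for illustration and is not taken from the slides or from any software package: the three-tip tree ((B, C), A) with node ages of 200 My and 400 My (borrowed from the global-clock example), the base rate of 0.0005 substitutions/site/My, the parameters `sigma` and `nu`, and the shortcut of assigning autocorrelated rates directly to branches rather than to nodes.

```python
import math
import random

# Hypothetical tree ((B, C), A): branch -> (parent branch, duration in My).
# Parents are listed before their children (dicts keep insertion order).
TREE = {
    "A": ("root", 400.0),
    "BC": ("root", 200.0),
    "B": ("BC", 200.0),
    "C": ("BC", 200.0),
}
BASE_RATE = 0.0005  # assumed rate in substitutions/site/My


def lognormal_with_mean(rng, mean, sigma):
    """Draw from a lognormal whose mean (not median) equals `mean`."""
    mu = math.log(mean) - 0.5 * sigma ** 2
    return rng.lognormvariate(mu, sigma)


def strict_clock():
    """Global clock: every branch shares the same rate."""
    return {branch: BASE_RATE for branch in TREE}


def uncorrelated_rates(rng, sigma=0.5):
    """Each branch rate is an independent draw from one lognormal."""
    return {b: lognormal_with_mean(rng, BASE_RATE, sigma) for b in TREE}


def autocorrelated_rates(rng, nu=0.002):
    """Each branch rate is lognormal with mean equal to the parent's rate.

    Simplification (assumption): rates live on branches, and the log-scale
    variance grows linearly with branch duration (nu * duration).
    """
    rates = {"root": BASE_RATE}
    for branch, (parent, duration) in TREE.items():
        sigma = math.sqrt(nu * duration)
        rates[branch] = lognormal_with_mean(rng, rates[parent], sigma)
    del rates["root"]
    return rates


def branch_lengths(rates):
    """Branch length (substitutions/site) = rate x time."""
    return {b: rates[b] * TREE[b][1] for b in TREE}


if __name__ == "__main__":
    rng = random.Random(2016)
    models = {
        "strict": strict_clock(),
        "uncorrelated": uncorrelated_rates(rng),
        "autocorrelated": autocorrelated_rates(rng),
    }
    for name, rates in models.items():
        lengths = branch_lengths(rates)
        print(name, {b: round(v, 3) for b, v in lengths.items()})
```

Under the strict clock the sketch reproduces branch lengths of 10%, 10%, 10%, and 20% for this tree. The two relaxed models keep the same node ages but give different branch lengths on each run, which is why sequence data alone cannot separate rate from time without priors on rates and node ages.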