Clinical and Epidemiological Virology, Rega Institute, Department of Microbiology and Immunology KU Leuven, Belgium.

Estimating evolutionary rates and divergence times… and a bit of model testing

Philippe Lemey1 and Marc Suchard2

1. Rega Institute, Department of Microbiology and Immunology, K.U. Leuven, Belgium. 2. Departments of Biomathematics and Human Genetics, David Geffen School of Medicine at UCLA; Department of Biostatistics, UCLA School of Public Health.

SISMID, July 19-21, 2017

[Flow diagram: MOLECULAR SEQUENCES → (Alignment Methods) → ALIGNMENT → (Sequence Evolution Models, Phylogenetic Methods; PHYLOGENETICS) → EVOLUTIONARY TREE (time scale = genetic distance) → (Molecular Clock Models; PHYLOGENETICS) → EVOLUTIONARY TREE (time scale = years) → (Coalescent Models; POPULATION GENETICS) → EPIDEMIOLOGY]

Molecular phylogenies

๏ most molecular phylogenies
‣ are unrooted (or the rooting is due to prior information)
‣ have branch lengths representing genetic change

Molecular phylogenies

๏ the ideal molecular phylogeny

‣ is rooted (implies a branching order)
‣ has branch lengths in units of time (an evolutionary history)
๏ how do we construct one of these trees?

[Figure: rooted, time-scaled tree with node times t1 to t4 and a genetic-differences scale marked at 5% and 10%]

A constant evolutionary rate through time

• to obtain a timed phylogeny, the evolutionary model must assume a relationship between the accumulation of genetic diversity and time

[Figure: no. substitutions (0 to 100) against time to common ancestor (0 to 500 myr); points labelled shark, carp, platypus, chicken, (to human) and (to cow)]

• Zuckerkandl and Pauling (1962): the rate of amino acid replacements in animal haemoglobins was roughly proportional to real time, as judged against the fossil record

A constant evolutionary rate through time

• the molecular clock is particularly striking when compared to the obvious differences in rates of morphological evolution...

[Figure: no. substitutions (0 to 100) against time to common ancestor (0 to 500 myr); points labelled shark, carp, platypus, chicken, (to human) and (to cow)]

The molecular clock is not a metronome

• if mutations occur on average once every MY, with Poisson-distributed counts

‣ 95% of the lineages 15MY old have 8-22 substitutions

‣ 8 substitutions also could be < 5 MY old

‣ Molecular Systematics, p532.

And there is no global molecular clock

[Figure: evolutionary rates in nucleotide substitutions per site per year on a log scale from 10^-9 to 10^-1. Labels: HBV, Picornaviridae, Caliciviridae, Flaviviridae, RNA viruses, Togaviridae, Coronaviridae, Rhabdoviridae, Paramyxoviridae, Orthomyxoviridae, Salmonella enterica, plant chloroplast DNA, Reoviridae, Drosophila nuclear DNA, Birnaviridae, mammalian nuclear DNA, Retroviridae, E. coli, human T-cell lymphotropic virus]

And there is no global molecular clock

• different genes, different profiles
• variation in mutation rate?
• variation in selection: genes coding for some molecules are under very strong stabilizing selection

[Figure: % genetic divergence (25% to 100%) against time since divergence (300 to 1500 Myr) for Fibrinopeptides, Hemoglobin, Cytochrome c and Histone IV]

calibrating the molecular clock

From substitution units to time units

[Figure: contemporary sample (all tips at "now") with calibrated nodes. Time point calibrations: 22 Mya and 7 Mya. Probabilistic calibrations: 95% C.I. 20-30 Mya and 95% C.I. 5-15 Mya]

Node Calibrations

Fossils
biogeography
host-pathogen co-divergence

[Figure: Main Hawaiian Islands with K-Ar ages: Kauai (5.1 My), Oahu (3.7 My), Maui-Nui (W. Molokai) (1.9/1.6 My), Hawaii (0.43 My); map scale 0 to 120 km]

[Figure: host and papillomavirus trees with node support values (83 to 100) and scale bar 0.01: Felis catus / FdPV1, Puma concolor / PcPV1, Lynx rufus / LrPV1, Panthera leo / PlpPV1, Panthera uncia / UuPV1; further labels PlPV1 and COPV]

Calibration using times

[Figure: divergence against time for a contemporary sample (no time structure) and a serial sample (with time structure)]

Tip calibration: two major applications

RNA viruses evolve quickly: 10^-3 - 10^-5 substitutions per site per year.
๏ Substitutions accumulate between the times of sampling
๏ Serially sampled sequences or heterochronous sequences

ancient DNA: population sets of radiocarbon-dated specimens

Measurably evolving

LETTER https://doi.org/10.1038/s41586-018-0097-z

Ancient hepatitis B viruses from the Bronze Age to the Medieval period Barbara Mühlemann1, Terry C. Jones1,2, Peter de Barros Damgaard3, Morten E. Allentoft3, Irina Shevnina4, Andrey Logvin4, Emma Usmanova5, Irina P. Panyushkina6, Bazartseren Boldgiv7, Tsevel Bazartseren8, Kadicha Tashbaeva9, Victor Merz10, Nina Lau11, Václav Smrčka12, Dmitry Voyakin13, Egor Kitov14, Andrey Epimakhov15, Dalia Pokutta16, Magdolna Vicze17, T. Douglas Price18, Vyacheslav Moiseyev19, Anders J. Hansen3, Ludovic Orlando3,20, Simon Rasmussen21, Martin Sikora3, Lasse Vinner3, Albert D. M. E. Osterhaus22, Derek J. Smith1, Dieter Glebe23,24, Ron A. M. Fouchier25, Christian Drosten2,26, Karl-Göran Sjögren18, Kristian Kristiansen18 & Eske Willerslev3,27,28*

Hepatitis B virus (HBV) is a major cause of human hepatitis. There is considerable uncertainty about the timescale of its evolution and its association with humans. Here we present 12 full or partial ancient HBV genomes that are between approximately 0.8 and 4.5 thousand years old. The ancient sequences group either within or in a sister relationship with extant human or other ape HBV clades. Generally, the genome properties follow those of modern HBV. The root of the HBV tree is projected to between 8.6 and 20.9 thousand years ago, and we estimate a substitution rate of 8.04 × 10^−6–1.51 × 10^−5 nucleotide substitutions per site per year. In several cases, the geographical locations of the ancient genotypes do not match present-day distributions. Genotypes that today are typical of Africa and Asia, and a subgenotype from India, are shown to have an early Eurasian presence. The geographical and temporal patterns that we observe in ancient and modern HBV genotypes are compatible with well-documented human migrations during the Bronze and Iron Ages [1,2]. We provide evidence for the creation of HBV genotype A via recombination, and for a long-term association of modern HBV genotypes with humans, including the discovery of a human genotype that is now extinct. These data expose a complexity of HBV evolution that is not evident when considering modern sequences alone.

HBV is transmitted perinatally or horizontally via blood or genital fluids [3]. The estimated global prevalence is 3.6%, ranging from 0.01% (UK) to 22.38% (South Sudan) [4]. In high endemicity areas, in which prevalence is over 8%, 70–90% of the adult population show evidence of past or present infection [5] (http://www.who.int/mediacentre/factsheets/fs204/en/). The young and the immunocompromised are most likely to develop chronic HBV infection, which can result in high viraemia over years to decades [3]. Approximately 257 million people are chronically infected and around 887,000 people died in 2015 owing to associated complications (http://www.who.int/mediacentre/factsheets/fs204/en/). Despite the prevalence and public health impact of HBV, its origin and evolution remain unclear [6,7]. Inference of HBV nucleotide substitution rates is complicated by the fact that the virus genome consists of four overlapping open reading frames [8], and that mutation rates differ between phases of chronic infection [9]. Studies based on heterochronous sequences, sampled over a relatively short time period, find higher substitution rates, whereas rates estimated using external calibrations tend to be lower, leading to a wide range of estimated HBV substitution rates (7.72 × 10^−4–3.7 × 10^−6 substitutions per site per year) [10–12]. Human HBV is classified into at least nine genotypes (A–I) based on sequence similarity of at least 92.5% within genotypes [13], with a heterogeneous global distribution [7,8] (Fig. 1a). Attempts to explain the origin of genotypes using human migrations have been inconclusive. The hypothesis that HBV co-evolved with modern humans as they left Africa 60–100 thousand years ago (ka) has been contested owing to the basal phylogenetic position of genotypes F and H, which are found exclusively in the Americas [6]. HBV also infects non-human primates, and the human and other great ape HBVs are interspersed in the phylogenetic tree, possibly owing to cross-species transmission [14]. Given the variability of estimated substitution rates, the incongruence of the tree topology with some human migrations and the mixed topology of the non-human primate and human HBV sequences in the phylogenetic tree, there remains considerable uncertainty about the evolutionary history of HBV.

Recent advances in the sequencing of ancient DNA (aDNA) have yielded important insights into human evolution, past population dynamics [15] and diseases [16,17]. However, ancient sequences have been recovered for only a handful of exogenous human viruses, including influenza virus (sample approximately 100 years old) [18], variola virus (sample approximately 350 years old) [19] and HBV (samples approximately 340 and 450 years old) [20,21]. The knowledge gained from these cases emphasizes the general importance of ancient sequences for the direct study of long-term viral evolution. HBV has several characteristics that make it a good candidate for detection in an aDNA virus study: its extended high viraemia during chronicity [3], the relative stability of its virion [22], and its small, circular and partially double-stranded DNA genome [8].

Shotgun sequence data were previously generated from 167 Bronze Age [1] and 137 predominantly Iron Age [2] individuals from central to western Eurasia with a sample age range of approximately 7.1–0.2 thousand years (kyr) old. We identified reads that matched the HBV genome in

1Center for Pathogen Evolution, Department of Zoology, University of Cambridge, Cambridge, UK. 2Institute of Virology, Charité, Universitätsmedizin Berlin, Berlin, Germany. 3Centre for GeoGenetics, Natural History Museum, University of Copenhagen, Copenhagen, Denmark. 4Archaeological Laboratory, Faculty of History and Law, A. A. Baitursynov Kostanay State University, Kostanay, Kazakhstan. 5Saryarka Archaeological Institute, Karaganda State University, Karaganda, Kazakhstan. 6Laboratory of Tree-Ring Research, University of Arizona, Tucson, AZ, USA. 7Department of Biology, School of Arts and Sciences, National University of Mongolia, Ulaanbaatar, Mongolia. 8Laboratory of Virology, Institute of Veterinary Medicine, Mongolian University of Life Sciences, Ulaanbaatar, Mongolia. 9National Academy of Sciences, Bishkek, Kyrgyzstan. 10Pavlodar State University, Pavlodar, Kazakhstan. 11Centre for Baltic and Scandinavian Archaeology, Schleswig, Germany. 12Institute for History of Medicine and Foreign Languages of the First Faculty of Medicine, Charles University, Prague, Czech Republic. 13Margulan Institute of Archaeology, Almaty, Kazakhstan. 14N. N. Miklouho-Maklay Institute of Ethnology and Anthropology, Russian Academy of Sciences, Moscow, Russia. 15South Ural Department, Institute of History and Archaeology UBRAS, South Ural State University, Chelyabinsk, Russia. 16Department of Archaeology and Classical Studies, Stockholm University, Stockholm, Sweden. 17Matrica Museum, Százhalombatta, Hungary. 18Department of Historical Studies, University of Gothenburg, Gothenburg, Sweden. 19Department of Physical Anthropology, Peter the Great Museum of Anthropology and Ethnography, Saint-Petersburg, Russia. 20Laboratoire d’Anthropobiologie Moléculaire et d’Imagerie de Synthèse, CNRS UMR 5288, Université de Toulouse, Université Paul Sabatier, Toulouse, France. 21Department of Bio and Health Informatics, Technical University of Denmark, Kongens Lyngby, Denmark. 
22Research Center for Emerging Infections and Zoonoses, University of Veterinary Medicine Hannover, Hannover, Germany. 23Institute of Medical Virology, Justus Liebig University of Giessen, Giessen, Germany. 24National Reference Centre for Hepatitis B and D Viruses, German Center for Infection Research (DZIF), Giessen, Germany. 25Department of Viroscience, Erasmus Medical Centre, Rotterdam, The Netherlands. 26German Center for Infection Research (DZIF), Braunschweig, Germany. 27Cambridge GeoGenetics Group, Department of Zoology, University of Cambridge, Cambridge, UK. 28Wellcome Trust Sanger Institute, Hinxton, UK. 29These authors contributed equally: Barbara Mühlemann, Terry C. Jones, Peter de Barros Damgaard, Morten E. Allentoft. *e-mail: [email protected]

NATURE | www.nature.com/nature © 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.

incorporating sampling time: naive method

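observed number of substitutions or genetic divergence: d
sampling time 1: t1; sampling time 2: t2
substitution rate: $\mu = d / |t_1 - t_2|$

incorporating sampling time: naive method

[Figure: two sequences sampled at t1 and t2 share ancestral diversity that dates back to troot]

incorporating sampling time: naive method

[Figure: root-to-tip divergences d1 and d2 for tips sampled at t1 and t2, with the root at troot]

$\mu = (d_1 - d_2) / (t_1 - t_2)$

• can be rearranged:

$d_i = \mu (t_i - t_{root})$
$E[d_i] = \mu \cdot t_i - \mu \cdot t_{root}$
$\mu = d_i / (t_i - t_{root})$

gradient is: µ
y-intercept is: - µ . troot
x-intercept is: troot

[Figure: regression line through divergences d1, d2 and d3 at sampling times t1, t2 and t3, crossing the time axis at troot]

Estimating the time-scale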

• H1N1/09 ‘Swine Flu’
• Rate: 3.14E-3 mutations/genomic site/year
• tMRCA: 2009.041 (15-Jan-2009)
• Correlation: 0.83
• R2: 0.69

Evolutionary Rates of dsDNA Viruses · doi:10.1093/molbev/msq088 MBE

A DNA virus (smallpox)

Variola, Poxviridae, 190kb genome
Sampling 1946-1977

Rate estimate: 8.2 × 10^-6 subs/site/year

FIG. 3. Genetic distance versus sampling year for the dsDNA viruses (clockwise from top left): HPV-16, HSV-1, BK virus (BK), VARV, HAdV-B, HAdV-C, and VZ virus (VZ). The regression coefficient (R2) estimates the fit of the data to a strict molecular clock by testing the degree of influence sampling time has over the amount of pairwise diversity in the data. This analysis supports the presence of temporal structure in the data for VARV and HAdV-B, while suggesting the presence of temporal structure for HSV-1 and VZ. No evidence for temporal structure within the sampled period was found for the HPV-16, BK, and HAdV-C data sets using this method.
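The regression behind rate estimates of this kind follows directly from the naive method above: divergence is regressed on sampling time, the gradient estimates µ and the x-intercept estimates troot. The sketch below is a minimal illustration with made-up, noise-free data; the function name and the numbers are assumptions, and it is not the procedure implemented in any particular tool.

```python
from statistics import mean


def root_to_tip_regression(times, divergences):
    """Least-squares fit of d_i = mu * t_i - mu * t_root."""
    t_bar, d_bar = mean(times), mean(divergences)
    s_tt = sum((t - t_bar) ** 2 for t in times)
    s_dd = sum((d - d_bar) ** 2 for d in divergences)
    s_td = sum((t - t_bar) * (d - d_bar) for t, d in zip(times, divergences))
    rate = s_td / s_tt                   # gradient: mu
    intercept = d_bar - rate * t_bar     # y-intercept: -mu * t_root
    t_root = -intercept / rate           # x-intercept: t_root
    r_squared = s_td ** 2 / (s_tt * s_dd)
    return rate, t_root, r_squared


# Hypothetical tips: divergence grows by 0.002 subs/site per year from a root in 1995.
sampling_years = [2000.0, 2002.0, 2004.0, 2006.0, 2008.0]
divergence = [0.010, 0.014, 0.018, 0.022, 0.026]

rate, t_root, r_squared = root_to_tip_regression(sampling_years, divergence)
print(f"rate = {rate:.4g} subs/site/year, t_root = {t_root:.1f}, R2 = {r_squared:.2f}")
```

Because the tips of a tree share ancestry, the points are not independent, so the regression is best read as a diagnostic of temporal signal rather than as a formal estimator.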

toward zero (fig. 4). Similarly, when the set rate was 10^-6 subs/site/year or lower, our tools were unable to recover a rate that was close to the true value, and the 95% HPDs ranged from the highest possible rate supported by the data to a value approaching zero (fig. 4). We consider these widely skewed posterior distributions of the rate to signify a lack of significant temporal structure in the data, an effect not seen in estimates based on our real data where values close to zero were not observed.

To determine if the high substitution rates recovered for the dsDNA viruses analyzed in the first part of this study could be a result of deviations from the molecular clock model coupled with low temporal signal in the data, we added branch-rate heterogeneity (i.e., relaxed clock behavior) to the sets when the mean rate was set to 10^-6, 10^-7, and 10^-8 subs/site/year and reestimated the rate of substitution. The posterior mean and 95% HPDs estimated from these synthetic data sets were similar to those returned from the data simulated under a strict clock (fig. 5). When the true rate of the VARV-like data was set at 10^-6 subs/site/year, the resultant estimates were close to the true values; however, substantial deviations from the known values occurred when the rates were set to 10^-7 or 10^-8 subs/site/year, with correspondingly larger 95% HPDs (fig. 5). The rates estimated from the HSV-1-like data simulated with branch-rate heterogeneity also closely mirrored those from the data simulated under a strict clock. The mean substitution rates estimated from all HSV-1-like data were higher than the true values and again associated with wide, long-tail posterior distributions that tended toward zero (fig. 5). As before, we consider these distributions to indicate a lack of temporal structure in the data at these low evolutionary rates.

Discussion

Based on the distribution of rates from our synthetic data sets, we are able to make a number of general conclusions about the use of heterochronous data to estimate the substitution rates and divergence times of potentially slowly evolving dsDNA viruses. In particular, given a data set containing a large enough number of variable sites (such as the

Salmonella Typhimurium

6 | Virus Evolution, 2016, Vol. 2, No. 1

Diagnostic tool

TempEst
- divergence accumulation
- outliers

[Figure: root-to-tip regression plot with an outlier labelled "recombinant"]

Figure 3. Heterochronous data quality control. (A) Root-to-tip regression plot, with ancestor traces shown, of 355 HA gene sequences belonging to the Classical swine lineage of influenza A/H1 virus. The outlier (blue circle) is strain A/Swine/North Carolina/98225/01(AF455676) which is thought to be a recombinant sequence (Lam et al. 2013). The group of outliers below the regression line (red circles) represents a vaccine lineage that has spent time in laboratory storage before resuming onward transmission (e.g., EU502884, DQ058215, HQ541680). Hence, this lineage has undergone less divergence from the tree root than expected. (B) Root-to-tip regression plot, with ancestor traces shown, of 614 HA gene sequences of human influenza A/H3N2 virus. The outlier (green circle) is strain A/Victoria/1968 (CY015508). This sequence has been retracted from GenBank, possibly the strain information does not match the sequence given, which appears to be of more recent provenance. Phylogenies for each regression plot are shown below, with outliers highlighted (scale bar represents substitutions per site).

‣ Rambaut A. et al. (2016) Virus Evolution, 2(1), vew007.

uncertainty, multiple trees can be sampled from a phylogenetic posterior distribution or bootstrap distribution, each of which is then analysed separately in TempEst (see Gray et al. 2013 for an example). However, this approach will be automated in future versions of the software.

Time structure via tip calibration

[Figure: contemporary sample (no time structure) and serial sample (with time structure) drawn against a time axis marked 1980, 1990 and 2000]

‣ Rambaut A. (2000) Bioinformatics, 16, 395-399.

Relaxing the molecular clock

Clock versus non-clock

• unconstrained (unrooted) Felsenstein model: Felsenstein (1981) JME, 17: 368 - 376
‣ each branch has its own rate independent of all others
‣ time and rate are confounded and can only be estimated as a compound parameter (branch lengths)
• strict molecular clock: Zuckerkandl & Pauling (1962) in Horizons in Biochemistry, pp. 189–225
‣ all lineages evolve at the same rate
‣ allows the estimation of the root of the tree and dates of individual nodes

Need for a relaxed molecular clock

• the unrooted model of phylogeny and the strict molecular clock model are two extremes of a continuum.
• these two models dominate phylogenetic inference
• but both are biologically unrealistic:
‣ the real evolutionary process lies between these two extremes
‣ model misspecification can produce positively misleading results

‣ Pybus (2006) PLoS Biol. 4, e151

‘strict’ molecular clock

[Figure: time-scaled tree of virus1 to virus10, time axis 2012 to 2014]
1 parameter (rate of evolution)

‘local’ molecular clock

[Figure: tree of virus1 to virus10, time axis 2012 to 2014, with branches marked high rate or low rate]

host specific local clock

[Figure: tree with tips labelled human, pig and bird and host jumps bird‣pig and pig‣human, time axis 2012 to 2014, with branches marked high rate or low rate]

autocorrelated relaxed clock

[Figure: tree of virus1 to virus10, time axis 2012 to 2014, with branches marked high rate or low rate]

lognormal uncorrelated relaxed clock

[Figure: tree of virus1 to virus10, time axis 2012 to 2014, with high-rate and low-rate branches marked in several places]
2 parameters (mean rate and variance in rate among branches)


Relaxed clocks: (1) local molecular clocks

‣ specify H0 beforehand
‣ problem of identifiability
‣ Yoder and Yang (2000) Mol Biol & Evol 17: 1081-1090.

[Figure 10.5, from David Posada, p. 268. Clocklike phylogenetic tree, n taxa = 5: rooted tree with node times t1 to t4 and n − 1 independent branches. Only b1, b3, b4, and b6, for example, need to be estimated, because under the molecular clock: b2 = b1, b5 = b1 + b3 − b6, b7 = b6, b8 = b4 − b5 − b6. Nonclocklike phylogenetic tree, n taxa = 5: unrooted tree with 2n − 3 independent branches. All b1, b2, b3, b4, b5, b6, and b7 need to be estimated.]

Figure 10.5 Number of free parameters in clock and nonclock trees. Under the free rates model (= nonclock), all the branches need to be estimated (2n − 3). Under the molecular clock, only n − 1 branches have to be estimated. The difference in the number of parameters among a nonclock and a clock model is n − 2.

Maximum-likelihood methods can estimate the branch lengths of a tree by enforcing or not enforcing a molecular clock. In the absence of a molecular clock (the free-rates model), 2n − 3 branch lengths must be inferred for a strictly bifurcating unrooted phylogenetic tree with n taxa (Figure 10.5B). If the molecular clock is enforced, the tree is rooted, and just n − 1 branch lengths need to be estimated (see Figure 10.4 and Chapter 1). This should appear obvious considering that under a molecular clock, for any two taxa sharing a common ancestor, only the length of the branch from the ancestor to one of the taxa needs to be estimated, the other one being the same. Statistically speaking, the molecular clock is the null hypothesis (i.e., the rate of evolution is equal for all branches of the tree) and represents a special case of the more general hypothesis that assumes a specific rate for each branch (i.e., free-rates model). Thus, given a tree relating n taxa, the LRT can be used to evaluate whether the taxa have been evolving at the same rate (Felsenstein, 1988). In practice, a model of nucleotide (or amino-acid) substitution is chosen and the branch lengths of the tree with and without enforcing the molecular clock are estimated. To assess the significance of this test, the LRT can be compared with a χ² distribution with (2n − 3) − (n − 1) = n − 2 degrees of freedom, because the only difference in parameter estimates is in the number of branch lengths that needs to be estimated.

true (model) tree

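[Figure: true (model) tree compared with trees estimated under a strict clock, a relaxed clock and a host-specific local clock. Values shown: 1.0 (strict clock), 0.61 (relaxed clock), 1.0 (host-specific local clock). Time axis: 50.0 to 0.0 years before present. Rate scale: 0.0 to 0.015 substitutions/site/year]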
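The likelihood-ratio test of the strict clock described in the excerpt above can be sketched as follows. The log-likelihood values are hypothetical, and `chi2_sf` is a small helper written for this example (a closed-form χ² tail for integer degrees of freedom) so that the block needs only the standard library.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series of correction terms.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return tail + total


def clock_lrt(lnl_clock: float, lnl_free: float, n_taxa: int):
    """LRT of the clock (null) against the free-rates model."""
    statistic = 2.0 * (lnl_free - lnl_clock)
    df = (2 * n_taxa - 3) - (n_taxa - 1)  # = n_taxa - 2
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical maximised log-likelihoods for a 5-taxon tree.
statistic, df, p_value = clock_lrt(lnl_clock=-2505.0, lnl_free=-2500.0, n_taxa=5)
print(f"LRT = {statistic:.1f}, df = {df}, p = {p_value:.4f}")
```

With these made-up numbers the clock would be rejected at the 5% level; with serially sampled tips the degrees of freedom differ, so the n − 2 count applies to the contemporaneous case described in the excerpt.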

Bayesian local clocks

Worobey et al., Nature, 2014; 508(7495): 254–257

Autocorrelated relaxed clocks

• rates for each branch are drawn from a distribution centered on the rate of the ancestor

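‣ but what is the rate at the root?
‣ A prior degree of autocorrelation?
‣ not currently possible to do phylogenetic inference

$r_i \sim \mathrm{LogNormal}(r_{A(i)}, \sigma^2 \Delta t_i)$

[Figure: four-taxon tree with tips AA, GA, AC and GC, node heights h1 to h3 and branch rates r1 to r7; question marks flag the unknown quantities]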

‣ e.g., Thorne JL, Kishino H, Painter IS (1998) Mol Biol & Evol 15: 1647-1657.
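A minimal simulation of the autocorrelated model above, on a four-taxon tree. It assumes one particular reading of the slide's formula: the log of each branch rate is normal with mean equal to the log of the ancestral rate and variance σ²Δt. The tree encoding, function name and parameter values are illustrative only.

```python
import math
import random


def simulate_autocorrelated_rates(parent, durations, root_rate, sigma2, seed=1):
    """Draw one rate per branch, each centred on the rate of its ancestral branch.

    parent[i] is the index of the branch above branch i, or None when branch i
    hangs directly off the root. Branches must be listed parents first.
    """
    rng = random.Random(seed)
    rates = []
    for p, dt in zip(parent, durations):
        ancestral = root_rate if p is None else rates[p]
        log_rate = rng.gauss(math.log(ancestral), math.sqrt(sigma2 * dt))
        rates.append(math.exp(log_rate))
    return rates


# Tree ((AA,GA),(AC,GC)): two internal branches, then four tip branches.
parent = [None, None, 0, 0, 1, 1]
durations = [1.0, 1.0, 2.0, 2.0, 2.0, 2.0]  # branch durations, arbitrary time units

rates = simulate_autocorrelated_rates(parent, durations, root_rate=1e-3, sigma2=0.1)
clocklike = simulate_autocorrelated_rates(parent, durations, root_rate=1e-3, sigma2=0.0)
print([f"{r:.2e}" for r in rates])
```

Setting σ² to zero collapses the model to a strict clock, which is one way to see that the root rate and σ² are the quantities that still need priors.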

Uncorrelated relaxed clocks

• rates for each branch are drawn independently from an identical distribution:

$r \sim \mathrm{Exp}(\lambda)$ [7]
$r \sim \mathrm{LogNormal}(\mu, \sigma^2)$ [8]
$r \sim \mathrm{Gamma}(\alpha, \beta)$ [9]

[Figure: density plot dlnorm(x, 0, 1) for x from 0 to 5, beside a four-taxon tree with tips AA, GA, AC and GC, node heights h1 to h3 and branch rates r1 to r6]

‣ Drummond et al. (2006) PLoS Biology 4: e88.

Bayesian evolutionary analysis sampling trees

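• Given sequence data that is temporally spaced, estimate true values of:
‣ substitution parameters (µ and Q)
‣ ancestral genealogy (g = (E_g, t_Y)): tree topology, dates of divergence
‣ population history (θ): Ne

[Figure: substitution matrix Q over A, C, G, T with rate µ, a genealogy, and a population-size (Ne) trajectory on a time axis]

$P(g, \mu, \theta, Q \mid D) = \frac{1}{Z} \Pr\{D \mid g, \mu, Q\}\, f_g(g \mid \theta)\, f_\mu(\mu)\, f_\theta(\theta)\, f_Q(Q)$

“relaxed phylogenetics and dating with confidence”

$t = \{t_1, t_2, \ldots, t_{2n-1}\}$
$R = \{r_1, r_2, \ldots, r_{2n-1}\}$
$f(R \mid g) = f(R) = \prod_{r_i \in R} \lambda e^{-\lambda r_i}$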

Uncorrelated relaxed clocks: example

‣ Phylogenetic inference
‣ measuring autocorrelation
‣ measuring clock-likeness

[Figure: density plot dlnorm(x, 0, 1) for x from 0 to 5]

Evaluating clock-like behaviour?

[Figure: density plot dlnorm(x, 0, 1) for x from 0 to 5, annotated with mean and stdev]
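To make the mean and stdev summary concrete, the sketch below draws independent lognormal branch rates and reports their coefficient of variation (stdev divided by mean). Treating the coefficient of variation as the clock-likeness summary is an assumption made for this example, as are the function names and parameter values.

```python
import random
from statistics import mean, stdev


def draw_uncorrelated_rates(n_branches, mu, sigma, seed=1):
    """Independent branch rates with log(rate) ~ Normal(mu, sigma^2)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]


def coefficient_of_variation(rates):
    """stdev / mean of the branch rates; values near 0 indicate clock-like data."""
    return stdev(rates) / mean(rates)


N_BRANCHES = 200  # illustrative number of branches

nearly_clocklike = draw_uncorrelated_rates(N_BRANCHES, mu=0.0, sigma=0.05)
strongly_relaxed = draw_uncorrelated_rates(N_BRANCHES, mu=0.0, sigma=1.0)

cv_clocklike = coefficient_of_variation(nearly_clocklike)
cv_relaxed = coefficient_of_variation(strongly_relaxed)
print(f"CoV (sigma = 0.05): {cv_clocklike:.2f}")
print(f"CoV (sigma = 1.00): {cv_relaxed:.2f}")
```

In a Bayesian analysis the same summary would be computed for each posterior sample, giving a posterior distribution whose mass near zero indicates how clock-like the data are.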

Bayesian model testing

• Goal: finding the most appropriate model for your data

• Over-fitting: too many parameters, the model is too complex • Under-fitting: too few parameters, the model is too simple

• Don’t compare all possible model combinations (evolutionary model, clock models, coalescent tree prior, …) to one another!

• Test/compare those models if that is part of the hypothesis you're testing, or if your hypothesis test is sensitive to the model choice

Model testing using Bayes factors

• A Bayesian alternative to classical hypothesis testing: the Bayes factor (a summary of the evidence provided by the data in favor of one scientific theory, represented by a statistical model, as opposed to another; Kass & Raftery, 1995).

• Bayes factor: $B_{01} = \frac{p(Y \mid M_1)}{p(Y \mid M_0)}$

• When two models M0 and M1 are being compared, one defines the Bayes factor in favor of M1 over M0 as the ratio of their respective marginal likelihoods

• When there are unknown parameters, the Bayes factor B01 has in a sense the form of a likelihood ratio
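In practice marginal likelihoods are reported on the log scale, so the Bayes factor is computed as a difference. The sketch below uses hypothetical log marginal likelihoods and attaches the verbal categories of Kass & Raftery (1995) for 2 ln B; the function names are illustrative.

```python
def log_bayes_factor(log_ml_m1: float, log_ml_m0: float) -> float:
    """ln B01 = ln p(Y|M1) - ln p(Y|M0), evidence for M1 over M0."""
    return log_ml_m1 - log_ml_m0


def kass_raftery_label(log_bf: float) -> str:
    """Verbal category for 2 * ln(B), following Kass & Raftery (1995)."""
    two_ln_bf = 2.0 * log_bf
    if two_ln_bf < 0:
        return "favors M0"
    if two_ln_bf < 2:
        return "not worth more than a bare mention"
    if two_ln_bf < 6:
        return "positive"
    if two_ln_bf < 10:
        return "strong"
    return "very strong"


# Hypothetical log marginal likelihoods: M0 = strict clock, M1 = relaxed clock.
log_ml_strict = -5230.4
log_ml_relaxed = -5221.1

log_bf = log_bayes_factor(log_ml_relaxed, log_ml_strict)
label = kass_raftery_label(log_bf)
print(f"ln B01 = {log_bf:.1f} ({label} evidence for M1)")
```

Working with differences of logs avoids the underflow that would occur if marginal likelihoods of this magnitude were exponentiated first.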

Model testing using Bayes factors

• However, the densities are obtained by integrating over parameter space:

$p(Y \mid M) = \int_\theta p(Y \mid \theta, M)\, p(\theta \mid M)\, d\theta$

• Posterior:

$p(\theta \mid Y, M) = \frac{p(Y \mid \theta, M)\, p(\theta \mid M)}{p(Y \mid M)}$

• So for model fit, the marginal likelihood p(Y|M) or integrated likelihood, i.e. the normalizing constant (which cancels out in the calculation of the MH acceptance ratio), is of primary importance, but awfully hard to calculate.

Reminder: MHG MCMC Sampling

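The algorithm starts from a random state (θ) and ‘proposes’ a new state (θ*).
The new state is accepted with probability:

[Figure labels: current state Ψ, proposed state Ψ*]

$R = \min\left(1, \frac{p(\theta^* \mid D)}{p(\theta \mid D)} \times \frac{p(\theta \mid \theta^*)}{p(\theta^* \mid \theta)}\right)$

$= \min\left(1, \frac{p(D \mid \theta^*)\, p(\theta^*) / p(D)}{p(D \mid \theta)\, p(\theta) / p(D)} \times \frac{f(\theta \mid \theta^*)}{f(\theta^* \mid \theta)}\right)$

the two marginal likelihoods cancel out and don’t have to be computed!

$= \min\left(1, \frac{f(D \mid \theta^*)}{f(D \mid \theta)} \times \frac{f(\theta^*)}{f(\theta)} \times \frac{f(\theta \mid \theta^*)}{f(\theta^* \mid \theta)}\right)$

The three factors are the likelihood ratio, the prior ratio and the proposal ratio.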
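The acceptance rule above can be sketched for a toy problem: estimating the mean θ of normally distributed data with a normal prior. The data, prior, proposal and tuning values are all assumptions chosen for illustration; the proposal is symmetric, so the proposal ratio equals 1.

```python
import math
import random

DATA = [1.2, 0.7, 1.9, 1.4, 0.8]  # hypothetical observations, unit variance assumed
PRIOR_SD = 10.0                   # assumed prior: theta ~ Normal(0, PRIOR_SD^2)


def log_likelihood(theta: float) -> float:
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (y - theta) ** 2 for y in DATA)


def log_prior(theta: float) -> float:
    return -0.5 * math.log(2 * math.pi * PRIOR_SD ** 2) - 0.5 * (theta / PRIOR_SD) ** 2


def metropolis_hastings(n_steps: int, step_size: float = 1.0, seed: int = 1):
    rng = random.Random(seed)
    theta = rng.gauss(0.0, PRIOR_SD)  # random starting state
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.uniform(-step_size, step_size)  # symmetric proposal
        log_ratio = (
            log_likelihood(proposal) - log_likelihood(theta)  # likelihood ratio
            + log_prior(proposal) - log_prior(theta)          # prior ratio
            + 0.0                                             # proposal ratio (log 1)
        )
        # Accept with probability min(1, R); p(D) never appears.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            theta = proposal
        samples.append(theta)
    return samples


chain = metropolis_hastings(20000)
burn_in = 2000
estimate = sum(chain[burn_in:]) / len(chain[burn_in:])
exact = sum(DATA) / (len(DATA) + 1.0 / PRIOR_SD ** 2)  # conjugate posterior mean
print(f"MCMC posterior mean = {estimate:.3f}, exact = {exact:.3f}")
```

Because only ratios enter the acceptance step, the chain samples the posterior without the marginal likelihood ever being evaluated, which is why a separate estimator is needed for model comparison.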

Calculating marginal likelihoods

Methods of general applicability:

No additional analysis required
• the posterior arithmetic mean estimator (pAME; Aitkin, 1991)
• the arithmetic mean estimator (AME/ILP; but a misnomer)
• the importance sampling estimators, and particularly the harmonic mean estimator (HME) (Newton and Raftery, 1994)
• the stabilized harmonic mean estimator (sHME) (Redelings and Suchard, 2005)

Additional analysis required
• path sampling (Gelman, 1998; Ogata, 1989), applied in phylogenetics (Lartillot and Philippe, 2006)
• stepping-stone sampling (Xie et al., 2011)
• generalised stepping-stone sampling (Fan et al., 2011; Baele et al., 2016)

path sampling and stepping-stone sampling

• requires samples from a series of power posteriors, along a path between prior and posterior:

$q_\beta(\theta) = p(Y \mid \theta, M)^\beta\, p(\theta \mid M)$
‣ reduces to the posterior when β = 1
‣ reduces to the prior when β = 0
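A stepping-stone estimate built from these power posteriors can be sketched on a toy conjugate model, where each power posterior is a normal distribution that can be sampled exactly and the true marginal likelihood is known in closed form. The data, the prior, the number of stones and the β schedule (evenly spaced quantiles of a Beta(0.3, 1) distribution) are assumptions for illustration; a phylogenetic analysis would draw the samples by MCMC instead.

```python
import math
import random

DATA = [1.2, 0.7, 1.9, 1.4, 0.8]  # hypothetical observations, unit variance assumed
PRIOR_SD = 10.0                   # assumed prior: theta ~ Normal(0, PRIOR_SD^2)


def log_likelihood(theta: float) -> float:
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (y - theta) ** 2 for y in DATA)


def sample_power_posterior(beta: float, n: int, rng: random.Random):
    """Exact draws from q_beta, which is normal for this conjugate model."""
    precision = beta * len(DATA) + 1.0 / PRIOR_SD ** 2
    centre = beta * sum(DATA) / precision
    return [rng.gauss(centre, 1.0 / math.sqrt(precision)) for _ in range(n)]


def log_mean_exp(values):
    """log of the mean of exp(values), computed stably."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


def stepping_stone_log_ml(n_stones: int = 32, n_samples: int = 2000, seed: int = 1):
    rng = random.Random(seed)
    betas = [(k / n_stones) ** (1.0 / 0.3) for k in range(n_stones + 1)]
    log_ml = 0.0
    for b_prev, b_next in zip(betas, betas[1:]):
        thetas = sample_power_posterior(b_prev, n_samples, rng)
        # Each stone estimates the ratio of normalising constants Z(b_next) / Z(b_prev).
        log_ml += log_mean_exp([(b_next - b_prev) * log_likelihood(t) for t in thetas])
    return log_ml


def exact_log_ml() -> float:
    n, s, ss = len(DATA), sum(DATA), sum(y * y for y in DATA)
    tau2 = PRIOR_SD ** 2
    return (-0.5 * n * math.log(2 * math.pi) - 0.5 * math.log(1 + n * tau2)
            - 0.5 * (ss - tau2 * s * s / (1 + n * tau2)))


ss_estimate = stepping_stone_log_ml()
print(f"stepping-stone log ML = {ss_estimate:.3f}, exact = {exact_log_ml():.3f}")
```

Most β values in this schedule sit close to 0, where the power posterior changes fastest, which is the usual motivation for skewing the path towards the prior.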

path sampling and stepping-stone sampling

Generalised stepping-stone sampling

requires samples from a series of power posteriors, along a path between reference/working prior and posterior:

$q_\beta(\theta) = [p(Y \mid \theta, M)\, p(\theta \mid M)]^\beta\, p_0(\theta \mid M)^{1-\beta}$

• reduces to the original SS method if the reference/working distribution is equal to the actual prior
• in practice, samples from the posterior distribution (β = 1) are used to parameterize the joint reference/working distribution p0(θ|M)
• we will use kernel density estimation (KDE) to construct reference/working priors for each of the parameters being estimated

GSS: decreased run time

• GSS does not need to explore the prior, so it avoids computing the likelihood for highly unlikely parameter values, which may lead to numerical instabilities
• combined with a “shorter” path to be traversed, this leads to a drastic performance increase (dependent on the actual reference/working prior)

Bayesian model selection vs model averaging

• Test/compare those models if that is part of the hypothesis you're testing, or if your hypothesis test is sensitive to the model choice

Model selection refers to the problem of using the data to select one model from the list of candidate models

Model averaging refers to the process of estimating some quantity under each model and then averaging the estimates according to how likely each model is.
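The difference between the two definitions can be made concrete with a small sketch. The log marginal likelihoods and per-model rate estimates below are hypothetical numbers, and the function names are assumptions for illustration. Posterior model probabilities are obtained from the log marginal likelihoods and the prior model probabilities (equal by default); selection keeps the most probable model, averaging weights each estimate by its model's probability.

```python
import math


def posterior_model_probabilities(log_marginal_likelihoods, prior_probs=None):
    """p(M_k | Y) is proportional to p(Y | M_k) p(M_k); computed on the log scale."""
    k = len(log_marginal_likelihoods)
    if prior_probs is None:
        prior_probs = [1.0 / k] * k  # equal prior probability for each model
    log_w = [lml + math.log(p) for lml, p in zip(log_marginal_likelihoods, prior_probs)]
    m = max(log_w)  # subtract the maximum before exponentiating
    w = [math.exp(x - m) for x in log_w]
    total = sum(w)
    return [x / total for x in w]


def model_average(estimates, probs):
    return sum(e * p for e, p in zip(estimates, probs))


log_ml = [-1000.0, -1002.0]        # hypothetical log marginal likelihoods
rate_estimates = [1.0e-3, 2.0e-3]  # hypothetical rate estimate under each model

probs = posterior_model_probabilities(log_ml)
best_model = max(range(len(probs)), key=probs.__getitem__)  # model selection
averaged_rate = model_average(rate_estimates, probs)        # model averaging
```

With a difference of 2 log units, the better model receives about 88% of the posterior probability, so the averaged estimate sits between the two per-model estimates but closer to the first.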

Random local clocks

➡ local clocks
- specify H0 a priori
- problem of identifiability

➡ uncorrelated relaxed clocks
- Rate changes do not necessarily occur regularly or on every branch
- Small number of significant changes

So, can we handle the uncertainty in the number and locations of a small number of local clocks?

[Figure: example tree with a rate-change indicator δi (0 or 1) on each branch, with fast rodents and slow apes]

• three local clocks
• two rate changes

➡ How to explore $2^{2n-2}$ clock models?

Random local clocks

➡ Using Bayesian stochastic search variable selection: formulate a prior such that many rate changes (indicators) are 0, but allow the data to determine which ones are required to explain (most of the) rate variation, using MCMC

Local Clock Comparison with Douzery (2003)

➡ Three nuclear genes from 42 mammals (GTR+Γ) (Douzery, 2003)
➡ 5-12 local clocks

[Figure: prior and posterior probability of the number of rate changes (0 to 12)]

Consistent results (5-12 local clocks). Drummond and Suchard, 2010.

RLC model provides an automated approach to discover local clocks and their uncertainty.

Random local clocks

➡ Testing whether a branch accommodates a rate change using Bayes factors

๏ Data D is assumed to have arisen under one of two models, or one of two hypotheses H1 and H2.

๏ Prior probabilities pr(H1) and pr(H2) = 1 - pr(H1). Posterior probabilities pr(H1|D) and pr(H2|D) = 1 - pr(H1|D),

so that

$$\underbrace{\frac{pr(H_1 \mid D)}{pr(H_2 \mid D)}}_{\text{posterior odds}} = \underbrace{\frac{pr(D \mid H_1)}{pr(D \mid H_2)}}_{\text{Bayes factor}} \times \underbrace{\frac{pr(H_1)}{pr(H_2)}}_{\text{prior odds}}$$

[Figure: prior and posterior probability of the number of rate changes (0 to 12)]

Extensions for testing evolutionary rate hypotheses

© Jennifer Gardy

[Figure: INTRA-HOST EVOLUTION (time in months since seroconversion, -2 to 13) and INTER-HOST EVOLUTION (time in years, 1960 to 1995); root-to-tip divergence against time, with R² = 0.67 and R² = 0.89. Pybus and Rambaut, NGR, 2009; Lemey et al., 2006, AIDS Rev]

Independent parameter estimation

[Diagram: Prior → µ1, µ2, µ3, ..., µn → Posterior, with each parameter estimated independently]

Hierarchical phylogenetic models

[Diagram: hyperparameters µ, σ → Prior → µ1, µ2, µ3, ..., µn → Posterior]
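The effect of sharing a population-level distribution (µ, σ) across parameters can be sketched with the simplest normal-normal case. This is a deliberately reduced illustration under stated assumptions, not the HPM used in BEAST: the population mean and variance are treated as known, whereas a hierarchical phylogenetic model estimates them jointly with the individual parameters. The function name and all numbers are hypothetical.

```python
def shrunken_estimate(estimate, sampling_var, pop_mean, pop_var):
    """Posterior mean of one parameter under a normal-normal hierarchical model.

    The individual estimate and the population mean are combined with
    precision weights, so noisier estimates are pulled harder to the mean.
    """
    weight = (1.0 / sampling_var) / (1.0 / sampling_var + 1.0 / pop_var)
    return weight * estimate + (1.0 - weight) * pop_mean


# hypothetical log evolutionary rates: population mean -7, population variance 1
well_sampled = shrunken_estimate(-6.0, sampling_var=1.0, pop_mean=-7.0, pop_var=1.0)
poorly_sampled = shrunken_estimate(-6.0, sampling_var=4.0, pop_mean=-7.0, pop_var=1.0)
```

Both individual estimates start at -6, but the one with the larger sampling variance ends up closer to the population mean. Independent estimation corresponds to letting the population variance grow without bound, in which case the weight tends to 1 and no shrinkage occurs.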

Edo-Matas et al., MBE, 2011

Hierarchical model with fixed effects

HIV-1 Hierarchical Phylogenetic Hypothesis Testing · doi:10.1093/molbev/msq326 (Edo-Matas et al., MBE, 2011)

[Figure, panels A and B: evolutionary rate (subs./site/mth) for progressors, LTNP and WT]

FIG. 2. Evolutionary rate estimates using an HPM applied separately to four patient groups (progressors, LTNP, WT, and D32). Evolutionary rate estimated under strict clock model (A). Mean evolutionary rate estimated under relaxed-clock model (B). LTNP (Long-term non-progressors); WT (CCR5 wt/wt); D32 (CCR5 wt/D32).

Hierarchical model with fixed effects

Excerpt from Edo-Matas et al., MBE, 2011 (doi:10.1093/molbev/msq326):

For nucleotide analyses, we apply this hierarchical setup to the strict clock evolutionary rate (on the log scale), the mean evolutionary rate parameter of the log-normal relaxed clock (log), the constant population size (log) of the demographic prior, the GTR substitution parameters (log), and the shape parameter (log) of the discrete gamma distribution modeling rate variation among sites. For codon model analyses, a hierarchical transition/transversion rate parameter and a hierarchical dN/dS rate ratio (Goldman and Yang 1994) replace the GTR model parameters.

Hierarchical Estimation with Population-Specific Fixed Effects

For hypothesis-testing purposes, we extend the HPM to include across-population fixed effects. Each patient belongs to one of four fixed population groups that we can designate using two indicator factors: LTNP_i = 0(1) for short (long)-term progressors and D32_i = 0(1) for deletion 32 absent (present) patients. Our HPM assumes

$$\log \eta_i = \beta_0 + \delta_{LTNP}\,\beta_{LTNP}\,LTNP_i + \delta_{D32}\,\beta_{D32}\,D32_i + \epsilon_i,$$

where β0 is an unknown grand mean, δLTNP and δD32 are binary indicator variables, βLTNP and βD32 are conditional effect sizes, and εi are independent and normally distributed random variables with mean 0 and an estimable variance. The inclusion of the indicator variables follows from a Bayesian stochastic search variable selection approach (Kuo and Mallick 1998; Chipman et al. 2001) that simultaneously estimates the posterior probabilities of all possible linear models that may or may not include LTNP or D32 status effects. When an indicator equals 1, this effect is included in the model, demonstrating that the evolutionary process parameter differs with high probability between patient population groups. Lemey et al. (2009) discuss Bayesian stochastic search variable selection in further detail.

We complete this HPM model with variable selection through assigning independent Bernoulli prior probability distributions on δLTNP and δD32. These distributions place equal probability on each factor's inclusion and exclusion. We further assume diffuse priors on the unknown grand mean and error variance and specify that a priori βLTNP and βD32 are normally distributed with mean 0 and a variance of 1/2. We choose 1/2, as, before seeing the data, we believe that if a factor does result in different evolutionary parameters across population groups, process parameters should differ by at most an order of magnitude on their original scale. The introduction of HPMs into BEAST necessitates the development of MCMC transition kernels to efficiently explore that space of the grand mean and effect size, model indicator, and random-effects variance parameters. Given our judicious prior choices, the full conditional distributions of these parameters are in standard form: multivariate normal, binomial, and inverse gamma, respectively. This enables us to build highly effective Gibbs samplers (Casella and George 1992; Suchard et al. 2003) over the joint space of these parameters. Suchard et al. (2003) provide detailed derivations of the full condition distributions and their Gibbs samplers (Suchard et al. 2003). We implement these Gibbs samplers as regular BEAST "operators" that are now accessible to interested readers through BEAST's XML model specification language. Supplementary Material online to this paper reports the transition kernels' XML syntax and gives examples on their use to implement HPMs.

To assign statistical significance to differences between population groups, we employ Bayes factors (BFs) (Jeffreys 1998; Suchard et al. 2001) that report how much the data change our prior opinion (here, 1:1 odds) about the inclusion of each factor. These BFs are straightforward to estimate through the variable selection procedure, as the BF equals the posterior odds that a factor indicator equals 1 divided by the corresponding prior odds. The posterior odds follow immediately from the marginal posterior probability that a factor indicator equals 1 that we estimate through the posterior expectation of the factor indicator. In cases where an estimate of this expectation approaches very closely to 0 or 1, an estimator based on a Rao-Blackwellization procedure is available (Casella and Robert 1996).

Table 1. Estimates of the LTNP and D32 Effects on Nucleotide Substitution Rates, Codon Substitution Rates, and dN/dS.

Evolutionary Parameter        Effect Support/Size                   LTNP Effect               D32 Effect
Nucleotide substitution rate  Posterior probability δeffect = 1     0.72                      0.27
                              BFeffect                              2.6                       0.4
                              βeffect | δeffect = 1 (a)             -0.275 (-0.524, -0.016)   -0.007 (-0.940, 0.920)
Codon substitution rate       Posterior probability δeffect = 1     0.726                     0.324
                              BFeffect                              2.6                       0.5
                              βeffect | δeffect = 1 (a)             -0.265 (-0.523, 0.019)    -0.012 (-0.700, 0.692)
dN/dS                         Posterior probability δeffect = 1     0.502                     0.393
                              BFeffect                              1.0                       0.6
                              βeffect | δeffect = 1 (a)             0.083 (-0.101, 0.25)      -0.005 (-0.228, 0.242)

(a) These are effective sizes conditional on the effect being included (the binary indicator δeffect being 1). For the rates, these effective sizes are in log space. LTNP (Long-term non-progressor); D32 (CCR5 wt/D32).

[...] indicate a better fit of the relaxed-clock model, with ln BFs of 7.8, 6.1, 4.4, and 4.2 in favor of the relaxed clock for progressors, LTNP, WT, and D32, respectively. The fact that a strict clock could often not be rejected for individual patient analysis also indicates the HPM draws on increased statistical power of HPMs to reject simpler models. Because of the increased model fit, we employ relaxed clocks in further codon model analyses and hypothesis tests incorporating fixed effects.

Analyses using a codon model revealed comparable codon substitution rate differences between progressors/LTNP and between WT and D32 compared with the nucleotide analyses (fig. 2B vs. supplementary fig. 1A, Supplementary Material online). Hierarchical dN/dS estimates, however, were comparable for the four patient groups (supplementary fig. 1B, Supplementary Material online).

Hypothesis Testing Using HPMs Incorporating Across-Population Fixed Effects

The four different groups considered previously are not comprised of independent patient sets; some patients fall in more than one group. Hence, direct comparison of the marginal parameter estimates fit to each group independently does not generate independent estimates. For more appropriate hypothesis testing of difference, the HPM for the evolutionary rate was extended to accommodate fixed effects (see Materials and Methods), enabling estimation of hierarchical parameters across all patients. Successfully, hierarchical estimation with fixed effects across all patients resulted in even further shrinkage of individual patient estimates compared with hierarchical models applied to separate groups (fig. 3B). BF comparison of the fixed-effects HPM model with a model that assumes either completely linked or unlinked parameters (ln BF of 51.7 and 57.2, respectively) provides strong evidence that the shrinkage is accompanied by improved goodness of fit. The main results of the fixed-effect HPM analyses are listed in table 1. For the nucleotide analysis, the LTNP versus progressor and WT versus D32 effects were employed to model the evolutionary rates. Through examining the posterior distribution of the rate indicators (δeffect), we estimate the posterior probability for including the LTNP versus progressor effect at 0.72 resulting in a moderate BF support of 2.6 in agreement with the group-by-group hierarchical rate estimates obtained above. Importantly, the rate decrease attributable to this fixed effect returns a CI that does not include 0. This approach appropriately controls for the nonindependence missed in the group-by-group analyses and rejects the null hypothesis of no difference between LTNP and progressor patients. There was no support in favor of a D32 effect. Even after conditioning on the effect indicator equaling 1 to estimate the potential effect size, the posterior D32 effect-size parameter distribution remained centered close to 0 with symmetric CIs. In the codon analysis, the same effects were tested on both the substitution rate and dN/dS. A very similar LTNP effect was observed for the codon substitution rate, although the CIs now included 0. Interestingly, the conditional effect size of LTNP versus progressor on codon substitution rate remains very similar to the effect size on nucleotide substitution rate. Furthermore, there was more support against than in favor of a D32 effect. Finally, no support for an LTNP effect or D32 effect was observed on the hierarchical dN/dS estimates.

Discussion

In this study, we adopted an HPM approach to estimate within-host HIV evolutionary parameters and test evolutionary hypotheses regarding host susceptibility and disease progression. We sought to investigate whether the CCR5 wt/D32 genotype, which is associated with a lower viral load set point and a slower HIV-1 disease progression (de Roda Husman et al. 1997; Ioannidis et al. 2001), also impacts the evolutionary rate of the virus by limiting target cell or CCR5 availability. Furthermore, we wanted to evaluate the contribution of CCR5 availability and CCR5 use on the selection pressure directed against the viral envelope protein by estimating dN/dS. HPMs have been used for HIV evolutionary enquiry before, but this is the first study that develops HPMs to estimate evolutionary rate, dN/dS, and demographic parameters. In an HPM framework, we assume that the patient-specific HIV-1 evolutionary parameters can be drawn from a population distribution. Estimations of the evolutionary process based on a limited sample from each patient are riddled with noise, and the improvement of an HPM follows from the reduced uncertainty on individual patient estimates. BF comparison further confirms a considerable improvement in goodness of fit of the HPM with respect to a completely linked and unlinked model. This can be explained by the fact that the completely linked [...]
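The Bayes factor calculation described in the excerpt (posterior odds that an indicator equals 1, divided by the prior odds) is short enough to sketch directly. The function names are assumptions for illustration, and the indicator trace at the end is hypothetical; the posterior probabilities 0.72 and 0.27 are the nucleotide substitution rate values reported in Table 1, with the 1:1 prior odds stated in the text.

```python
def posterior_inclusion_probability(indicator_samples):
    """Posterior expectation of a 0/1 indicator, estimated from MCMC samples."""
    return sum(indicator_samples) / len(indicator_samples)


def bayes_factor(posterior_prob, prior_prob=0.5):
    """Posterior odds of inclusion divided by prior odds of inclusion."""
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    prior_odds = prior_prob / (1.0 - prior_prob)
    return posterior_odds / prior_odds


bf_ltnp = bayes_factor(0.72)  # LTNP effect, nucleotide substitution rate
bf_d32 = bayes_factor(0.27)   # D32 effect, nucleotide substitution rate

# hypothetical indicator trace: included in 3 of 4 sampled states
trace = [1, 1, 1, 0]
bf_trace = bayes_factor(posterior_inclusion_probability(trace))
```

Rounded to one decimal, the first two values reproduce the BFeffect entries of 2.6 and 0.4 in Table 1. The formula breaks down when the estimated posterior probability is exactly 0 or 1, which is the situation where the excerpt points to a Rao-Blackwellized estimator instead.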

[Figure: INTRA-HOST EVOLUTION and INTER-HOST EVOLUTION; root-to-tip divergence against time (months since seroconversion; year), with R² = 0.67 and R² = 0.89. Pybus and Rambaut, NGR, 2009; Lemey et al., 2006, AIDS Rev]

Local clocks with random effects

Mixed effects model:

$$\log \mu_i = \theta_i + \beta X_i$$

(i is red or black)

[Figure: tree with tips A to L and branch rates r1 to r6; density plot of dlnorm(x, 0, 1) for x from 0 to 5]

Vrancken et al., PLoS Comp Bio, 2014, 10(4): e1003505

Rates of HIV evolution within and between hosts

Mixed effects model:

$$\log \mu_i = \theta_i + \beta X_i$$

[Figure: tree with tips A to L]

Rate (10^-3 subst./site/yr)

                                                   pol                 env
Xi=0 (within host)                                 5.70 (4.02-6.21)    10.37 (8.06-12.76)
Xi=1 (transmitted lineage)                         2.21 (1.57-2.99)    3.80 (2.32-5.20)
ln Bayes factor (rate_transmitted < rate_within)   >7.50               >6.29

Vrancken et al., PLoS Comp Bio, 2014

What drives the tempo of pathogen evolution?

Pathogen factors

Mutation rate Life cycle/ dynamics

Host factors Life history Seasonality Metabolic rate etc.

Historical factors

Pathogen phylogeny 50 years

Courtesy of D. Streicker

Bat rabies virus evolutionary rates

Streicker et al., 2012. PLoS Pathogens

Fixed-effect hierarchical phylogenetic models

Predictors: Climate, Basal metabolic rate, Torpid metabolic rate, Coloniality, Seasonal activity, Long-distance migration

$$\log \mu = \beta_0 + \delta_{P_1}\beta_{P_1}P_1 + \delta_{P_2}\beta_{P_2}P_2 + \dots + \delta_{P_N}\beta_{P_N}P_N$$

[Diagram: Prior → µ1, µ2, µ3, ..., µn → Posterior]

Edo-Matas et al., 2011. MBE
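A minimal sketch of how the log-linear rate model is evaluated for one lineage follows. The function name and every number are hypothetical, chosen only to show the role of each symbol: δ switches a predictor in or out of the model, β is its conditional effect size, and P is the predictor value.

```python
import math


def log_rate(beta0, indicators, effects, predictors):
    """log mu = beta0 + sum over predictors of delta * beta * P."""
    return beta0 + sum(d * b * p for d, b, p in zip(indicators, effects, predictors))


beta0 = math.log(1.0e-3)  # hypothetical grand mean on the log scale
effects = [-0.5, 2.0]     # hypothetical conditional effect sizes (beta)
predictors = [1.0, 3.0]   # hypothetical predictor values (P) for one lineage

# first predictor included, second excluded by its indicator
rate_one_effect = math.exp(log_rate(beta0, [1, 0], effects, predictors))

# all indicators 0: the rate falls back to the grand mean
rate_no_effects = math.exp(log_rate(beta0, [0, 0], effects, predictors))
```

Because an excluded predictor contributes nothing regardless of its effect size, sampling the indicators alongside the effect sizes lets the posterior frequency of δ = 1 measure the support for each predictor.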

Fixed-effect hierarchical phylogenetic models

BSSVS (Bayesian stochastic search variable selection) over the predictors: Climate, Basal metabolic rate, Torpid metabolic rate, Coloniality, Seasonal activity, Long-distance migration

$$\log \mu = \beta_0 + \delta_{P_1}\beta_{P_1}P_1 + \delta_{P_2}\beta_{P_2}P_2 + \dots + \delta_{P_N}\beta_{P_N}P_N$$

Edo-Matas et al., 2011. MBE

Bat rabies virus evolutionary rates

Streicker et al., 2012. PLoS Pathogens