Evolution and

02-715 Advanced Topics in Computational

[Figure: a genealogy, with coalescence at the most recent common ancestor]

Coalescence in Population Genetics

• Stochastic model for gene frequencies in finite populations

• Assumptions
  – The population size N is constant from generation to generation
  – Organisms are diploid (2N copies of each gene)
  – All members of each generation reproduce simultaneously: generations do not overlap
  – Random mating
  – Allele frequencies are not perturbed by migration or selection
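The assumptions above can be turned into a small forward simulation. The sketch below is illustrative only: it assumes a single gene with two alleles, and the function names (`wright_fisher_step`, `simulate_drift`) and the starting frequency `p0` are hypothetical, not part of the lecture. Each of the 2N gene copies in the next generation picks its parent copy uniformly at random, which is binomial sampling of the allele count.

```python
import random


def wright_fisher_step(count, n_copies, rng):
    """Resample one generation under the model's assumptions.

    Each of the n_copies gene copies independently inherits the tracked
    allele with probability equal to its current frequency.
    """
    p = count / n_copies
    return sum(1 for _ in range(n_copies) if rng.random() < p)


def simulate_drift(N, p0, generations, seed=0):
    """Return the allele frequency trajectory for a diploid population of size N."""
    rng = random.Random(seed)
    n_copies = 2 * N                      # diploid: 2N copies of each gene
    count = round(p0 * n_copies)
    freqs = [count / n_copies]
    for _ in range(generations):          # non-overlapping generations
        count = wright_fisher_step(count, n_copies, rng)
        freqs.append(count / n_copies)
    return freqs


if __name__ == "__main__":
    trajectory = simulate_drift(N=50, p0=0.5, generations=200, seed=1)
    print(trajectory[:5], "...", trajectory[-1])
```

With no mutation, migration, or selection, the frequency wanders by chance alone until the allele is lost (frequency 0) or fixed (frequency 1), after which it no longer changes.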

• The model traces the ancestry of each individual in the current generation backward in time

• An individual in generation t randomly samples two individuals in generation t-1 as parents

• In a finite number of generations, all individuals will coalesce into a single individual

Coalescence

• Assume an effective population size of N: 2N gene copies in diploid organisms

• The probability that two genes share a common parent in the previous generation: 1/(2N)

• The probability that the two genes do not share a common parent in the previous generation: 1 - 1/(2N)

• The probability of coalescence at time T:

$$P(T) = \left(1 - \frac{1}{2N}\right)^{T-1} \frac{1}{2N}$$

  – For large N, we have

$$P(T) \approx \frac{1}{2N} e^{-\frac{T-1}{2N}}$$
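The coalescence-time distribution can be checked numerically. The following is a minimal sketch, assuming two gene copies traced backward in a population with 2N copies per generation; the function names are hypothetical. It evaluates the exact geometric probability, the large-N exponential approximation, and a direct backward simulation in which each lineage picks a parent copy uniformly at random.

```python
import math
import random


def coalescence_pmf(T, N):
    """Exact probability that two lineages first share a parent T generations back."""
    q = 1.0 / (2 * N)
    return (1.0 - q) ** (T - 1) * q


def coalescence_pmf_approx(T, N):
    """Large-N exponential approximation of the same probability."""
    return math.exp(-(T - 1) / (2 * N)) / (2 * N)


def simulate_coalescence_time(N, rng):
    """Trace two gene copies backward until both pick the same parent copy."""
    n_copies = 2 * N
    T = 1
    while rng.randrange(n_copies) != rng.randrange(n_copies):
        T += 1
    return T


if __name__ == "__main__":
    N = 50
    rng = random.Random(0)
    times = [simulate_coalescence_time(N, rng) for _ in range(20000)]
    print("simulated mean:", sum(times) / len(times), "expected mean:", 2 * N)
    print("exact vs. approx at T=10:",
          coalescence_pmf(10, N), coalescence_pmf_approx(10, N))
```

Because T is geometrically distributed with success probability 1/(2N), its mean is 2N generations, which the simulated average should approach.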

Wright-Fisher Model

• Looking back in time, the stochastic model defines the probability F_t that two genes are identical by descent in generation t through F_{t-1}:

$$F_t = \frac{1}{2N}(1-\mu)^2 + \left(1 - \frac{1}{2N}\right)(1-\mu)^2 F_{t-1}$$
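The recursion for identity by descent can be iterated numerically. The sketch below is illustrative: it takes µ as the per-generation mutation rate and N as the population size, and the function names are hypothetical. It also solves the recursion for its fixed point, obtained by setting F_t = F_{t-1}, which for small µ is close to the familiar 1/(1 + 4Nµ).

```python
def ibd_next(F_prev, N, mu):
    """One step of the recursion: F_t as a function of F_{t-1}."""
    same_parent = 1.0 / (2 * N)           # both genes copy the same parental gene
    no_mutation = (1.0 - mu) ** 2         # neither copy is mutant
    return same_parent * no_mutation + (1.0 - same_parent) * no_mutation * F_prev


def ibd_equilibrium(N, mu):
    """Fixed point of the recursion, found by setting F_t = F_{t-1}."""
    no_mutation = (1.0 - mu) ** 2
    return no_mutation / (2 * N - (2 * N - 1) * no_mutation)


def iterate_ibd(N, mu, generations, F0=0.0):
    """Apply the recursion for a given number of generations, starting from F0."""
    F = F0
    for _ in range(generations):
        F = ibd_next(F, N, mu)
    return F


if __name__ == "__main__":
    N, mu = 1000, 1e-4                    # hypothetical values for illustration
    print("iterated:   ", iterate_ibd(N, mu, 50000))
    print("fixed point:", ibd_equilibrium(N, mu))
    print("1/(1+4Nmu): ", 1.0 / (1.0 + 4 * N * mu))
```

Convergence is slow when N is large, since each step shrinks the distance to the fixed point by a factor of only (1 - 1/(2N))(1 - µ)^2.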

  – First term: both genes are copies of a single gene from generation t-1 and neither copy is mutant
  – Second term: the two genes are nonmutant copies of different genes in the previous generation, and those two genes are identical by descent
  – µ: mutation rate
  – N: population size

Phylogenetics vs. Population Genetics

• Phylogenetics
  – Assumes a single correct species phylogeny that holds across the genome
  – Ignores variation among individuals of the same species, or assumes negligible variability within species
  – Reduces the entire population of a species to a single individual

• Population genetics
  – Usually concerned with within-species variation in genomes
  – Individuals within a species are related by genealogies

Siepel, A. Genome Res. 19(11):1929-41. 2009. Phylogenomics of primates and their ancestral populations.

Population-aware Phylogenetics

• Primate species
  – Divergence time is short relative to ancestral population sizes
  – Phylogenetics assumptions do not hold
  – Non-negligible population genetic effects

• Interspecies comparison, taking into account selective forces within species, ancestral populations, and modes of speciation

Phylogeny of Primates

Siepel, A. Genome Res. 19(11):1929-41. 2009. Phylogenomics of primates and their ancestral populations.

Darwin’s Phylogeny

Genealogies in Wright-Fisher Model

Population Genetic Interpretation of Speciation

• T: coalescent time

• τ: speciation time

• τ >> 2Ne:
  – τ+T is approximately τ
  – Divergence between individuals serves as an estimate of speciation time
  – The phylogenetics assumption holds

• τ<

• τ ~ Ne:
  – Both ancestral population dynamics and interspecies divergence must be considered
  – Population-aware phylogenetics

Three-Species Phylogeny

• Three species X, Y, and Z with speciation times and coalescent times
  – X: human
  – Y: chimpanzee
  – Z: gorilla

• Ancestral populations: XY and XYZ

• Black phylogeny: discordant with the phylogeny among the three species

• Gray phylogeny: concordant with the phylogeny among the three species

• ILS: incomplete lineage sorting with deep coalescence
  – Gene tree and species tree can differ

• Incomplete lineage sorting
  – The probability that X and Y will coalesce before the divergence from Z

where

Nxy: population size of XY
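As an illustration of how such a probability can be computed, the sketch below assumes the standard large-N exponential approximation of the coalescent, with times measured in generations and Nxy the diploid size of the XY ancestral population. The function names and dictionary keys are hypothetical. Looking backward, the X and Y lineages share the XY population for τxyz - τxy generations; if they fail to coalesce there, three lineages enter the XYZ population and each pair is equally likely to coalesce first.

```python
import math


def prob_coalesce_in_xy(tau_xy, tau_xyz, n_xy):
    """Probability that the X and Y lineages coalesce within the XY ancestral population."""
    return 1.0 - math.exp(-(tau_xyz - tau_xy) / (2.0 * n_xy))


def gene_tree_probabilities(tau_xy, tau_xyz, n_xy):
    """Probabilities of the four genealogy classes for one site.

    If X and Y do not coalesce in XY, the three lineages coalesce in XYZ
    and each of the three pairings is equally likely.
    """
    p_early = prob_coalesce_in_xy(tau_xy, tau_xyz, n_xy)
    p_deep = 1.0 - p_early
    return {
        "XY_in_XY": p_early,        # concordant, no ILS
        "XY_in_XYZ": p_deep / 3.0,  # concordant topology, deep coalescence
        "XZ_in_XYZ": p_deep / 3.0,  # discordant
        "YZ_in_XYZ": p_deep / 3.0,  # discordant
    }


def prob_discordance(tau_xy, tau_xyz, n_xy):
    """Probability that the gene tree topology differs from the species tree."""
    probs = gene_tree_probabilities(tau_xy, tau_xyz, n_xy)
    return probs["XZ_in_XYZ"] + probs["YZ_in_XYZ"]


if __name__ == "__main__":
    # Hypothetical values, in generations and individuals.
    print(gene_tree_probabilities(tau_xy=200_000, tau_xyz=300_000, n_xy=50_000))
```

Under these assumptions the discordance probability is (2/3)·exp(-(τxyz - τxy)/(2Nxy)), so it grows as the interval between the two speciation events shrinks or as Nxy grows.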

• When Nxy and Nxyz are small, τxy and τxyz approximate the divergence times well

• Otherwise, the coalescent times Txy and Txyz need to be taken into account

Ancestral Recombination Graph for Three Individuals

• Different genealogies for different regions of a chromosome due to recombination

• The ancestral recombination graph is a single structure that summarizes the marginal genealogies across the whole chromosome

• Most recent common ancestor (MRCA): all chromosomes eventually coalesce to the common ancestor

[Figure: Ancestral Recombination Graph]

• Phylogenetic ancestral recombination graph
  – Recombinations and coalescences are constrained by the phylogenetic tree

[Figure: Phylogenetic Ancestral Recombination Graph]

What if We Ignore Incomplete Lineage Sorting

• Aligned human (Hom), chimpanzee (Pan), gorilla (Gor), orangutan (Pon) sequences

• Two different estimated lineages

• Without consideration of ILS, substitution rates are overestimated

Coal-HMM (Hobolth et al., 2009)

• Four states corresponding to different phylogenies with ILS

• Transitions to other states correspond to recombinations
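To make the hidden-state idea concrete, here is a toy sketch of a four-state chain along an alignment. It is not the parameterization of Hobolth et al.: the switch probability, the state weights, and the rule that a recombination redraws the genealogy from those weights are all assumptions made for illustration, and the function names are hypothetical. Only the state labels follow the slides.

```python
import random

STATES = ("HC1", "HC2", "HG", "CG")


def sample_state_path(n_sites, switch_prob, weights, seed=0):
    """Sample a hidden genealogy state for each alignment site.

    Toy rule (an assumption): at each site a recombination occurs with
    probability switch_prob, and the genealogy is then redrawn from weights.
    """
    rng = random.Random(seed)
    state = rng.choices(STATES, weights=weights)[0]
    path = [state]
    for _ in range(n_sites - 1):
        if rng.random() < switch_prob:
            state = rng.choices(STATES, weights=weights)[0]
        path.append(state)
    return path


def state_fractions(path):
    """Fraction of sites assigned to each hidden state."""
    return {s: path.count(s) / len(path) for s in STATES}


if __name__ == "__main__":
    # Hypothetical weights: half of the sites without ILS, the rest split evenly.
    weights = (0.5, 1 / 6, 1 / 6, 1 / 6)
    path = sample_state_path(n_sites=100_000, switch_prob=0.01, weights=weights)
    print(state_fractions(path))
```

In a real Coal-HMM the states are hidden and must be inferred from the aligned sequences; this sketch only shows how recombination breaks the alignment into segments that follow different genealogies.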

• HC1 state (with no ILS) explains only ~50% of sites

• Remaining states explain the other 50%, proportioned roughly equally

Summary

• Population genetics considers differences in genome sequences within a species, whereas phylogenomics considers genome sequence differences across species

• For recently diverged species with relatively large ancestral population sizes, one needs to consider phylogenetics and population genetics ideas simultaneously