Introduction to Coalescent Theory Overview

Introduction to Coalescent Theory Overview

Overview • Intro to Coalescent Theory (Today) Introduction to Coalescent Theory • Genomic Imprinting, Mathematical Modeling, and Notions of Optimality in Jon Wilkins Evolution (Wednesday) Santa Fe Institute [email protected] • Statistical Inference in Complex Beijing CSSS 2007 Systems Ingredients of Natural Selection Population Genetics • Heritable variation • How is variation generated and maintained in a population? • Differential reproductive success • What can patterns of genetic diversity tell us • Causal connection between the two about the history of a population? – Demography (migration, reproduction, etc.) – Molecular events (mutation, recombination, etc.) – Natural selection (directional, purifying, etc.) 1 Why diversity? Neutral Theory • Muller - mutation drives deviations from the • Not just selective neutrality optimal phenotype • Dobzhansky - heterogeneous environments / • Constant population size frequency dependent effects • Well mixed population (panmictic) • Lewontin-Hubby experiments (mid 1960s) – Too much variation for either explanation • This is always true • Kimura - neutral theory Sampling with Replacement The Coalescent • Some alleles pass ACTT • Homologous genes on no copies to the share a common next generation, ancestor while some pass on T • DNA sequence more than one G diversity is shaped by t s a genealogical history P • All that we care about are the • Genealogies are ancestors of C G shaped by chance, sequences present demography, selection in our dataset ACGT ACGT ACTT ACTT AGTT Present 2 Balls in Boxes The Shapes of Genealogies • The coalescent • Time to the MRCA of a models genealogies pair of sequences is Probability = 1/2N * (1-1/2N)3 exponentially t s backwards in time a distributed with mean p E[T2] = 2N e h 2 t time of 2N generations Probability = 1/2N * (1-1/2N) n i s • Time to the next n • Follow ancestral o i t coalescent event for a a r Probability = 1/2N * (1-1/2N) lineages back until e n sample of n sequences e the most recent E[T3] = 2N/3 G is exponential with Probability = 1/2N common ancestor " n% E[T ] = 2N/6 mean 2N/ $ ' generations (MRCA) is reached 4 # 2& E[T ] = 2N/10 Present 5 ! Genealogies are highly variable The problem • The variance on • Want to infer the underlying processes that the length of each have shaped genetic diversity, but portion of the genealogy is large, on the order of N2 • The inherent stochasticity means that any • Variation in given genealogy is consistent with a wide topology as well range of demographic processes • Mutations are random on top of the genealogy • How do we estimate parameters, and how do we know how good our estimates are? 3 Estimating N The Structured Coalescent • Expected pairwise distance (π) – 2N times 2µ (= θ) • Expected number of polymorphisms (S) n#1 1 Location Location " $ • With geographically structured populations, all i 1 i = topologies are not equally likely ! The Structured Coalescent The Island Model N N N N N N Low Migration High Migration • Each migrant is equally likely to come from any deme • The relationship between genealogy and • Population structure, but no geography geography can be used to make inferences " # " 1 F = 0 = ST " 1+ 4Nm ! 4 A finite, linear habitat The Solution % % * * * 2 2 2 2 " f (x )cos(" x) #2$ " (t# 1 ) " j f j (x0 )sin(" j x) #2$ "* (t# 1 ) U (x,t) = i i 0 i e m i 2 + e m j 2 x &" + sin(" )cos(" ) &" * # sin(" * )cos(" * ) MRCA i=1 i i i j=1 j j j % % * * * 2 2 1 " f (y )sin(" y) 2 * 2 1 " i f i (y0)cos(" i y) #2$ m "i (t# 2 ) j j 0 j #2$ m " j (t# 2 ) Uy (y,t) = & e +& * * * e i=1 " i + sin(" i )cos(" i ) j=1 " j # sin(" j )cos(" j ) % 4N 2 1 ( 1)n (2 )2n#1 (n#1)! $ m " i + ' & # $ m" i (2n#1)! 1 1 2 2 2 2 cot(" ) = n=1 f (x ) = cos(" x)(e#(x#x0 ) / 4$ m + e#(2#x#x0 ) / 4$ m )dx i % n i 0 2 ) i 2 2 n 4' $ m #1 past 1+ &(#$ m " i ) (2m n=1 m=1 % 4N 2 * 1 ( 1)n (2 * )2n#1 (n#1)! $ m " j + ' & # $ m" j (2n#1)! 1 1 2 2 2 2 #tan(" * ) = n=1 f *(x ) = sin(" * x)(e#(x#x0 ) / 4$ m + e#(2#x#x0 ) / 4$ m )dx j % n j 0 2 ) j 2 2 * n 4' $ m #1 1+ &(#$ m " j ) (2m n=1 m=1 ! Not trivial to extend to > 1 dimension present Not trivial to extend to > 2 sequences Realistic Geography Coalescent Simulations • In most systems of interest, analytic solutions are too cumbersome • The coalescent provides an efficient framework in which to do simulations • Must understand how to relate the forward-time system to a corresponding backward-time process 5 Coalescent Simulations Ancestral Selection Graph • Dual Process • In the case of selective neutrality, the genealogical and mutational processes are separable – we can produce the genealogy, and then simply place • Lineages coalesce mutations on the tree afterwards " k% with probability $ ' /2N # 2& • But, if there is selection, then the reproductive success of an individual depends on its type • A lineage splits with ! probability sk/2 Extracting the genealogy Take-home messages • The coalescent provides a convenient approach to modeling evolutionary processes – Well suited to dealing with data • Analytic results are accessible only for very simple models • Move down the graph, allowing mutations to – In other cases, it produces efficient simulations occur • Choose the incoming branch with higher • Leaves the question of how to make inferences fitness – Come back on Friday 6.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    6 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us