NEUTRAL THEORY TOPIC 3: Rates and patterns of molecular evolution

Neutral theory predictions

A particularly valuable use of neutral theory is as a rigid null hypothesis. The neutral theory makes a wide variety of predictions, and one or more of these predictions may be tested in any given molecular dataset. Depending on which predictions (if any) are rejected, we gain considerable insight into the underlying process of evolution for the molecular data involved. The following four predictions are so widely applicable to the field of molecular evolution that they are often viewed as principles of molecular evolution.

1. The level of within-species genetic variation is determined by population size and mutation rate, and is correlated with the level of sequence divergence between species.
2. The rate of gene evolution (substitution) is inversely related to the level of functional constraint (purifying selection) acting on the gene.
3. The pattern of base composition (and codon usage in protein coding genes) at neutral sites reflects mutational equilibrium.
4. There is a constant rate of sequence evolution; i.e., a molecular clock.

Each of these predictions is examined in detail in the following four sections of these notes.

1. Variation within and among species

Neutral theory provides the bridge between microevolution, in populations, and macroevolution. The connection between the two is actually quite simple, which is one of the reasons why neutral theory has been so successful as a scientific theory. Neutral theory makes two clear predictions about genetic variation within and between species:

1. Equilibrium polymorphism (usually measured as heterozygosity) is controlled by only two parameters: population size (Ne) and mutation rate (µ).
2. Neutral population polymorphism is correlated with divergence between species.

1.1 Equilibrium polymorphism: We covered the first prediction in some detail in the last set of notes.
Not surprisingly, there was early interest (1970s) in comparing natural levels of heterozygosity inferred by protein gel electrophoresis with those predicted under neutral theory. The results were surprising, in that natural levels of population polymorphism were lower than expected. This finding led to two important developments in the field: (i) the recognition that the parameters Ne and µ are hard to estimate; and (ii) the recognition that there were some problems with the original theory, which were corrected in what is now called NEARLY NEUTRAL THEORY (we will return to this topic later). Note that protein gel electrophoresis can underestimate polymorphism, and that more recent studies have revealed a general association between heterozygosity and mutation rate. Comparing observed with predicted heterozygosity has low power as a test of neutral evolution, so most modern work in this area has focused on prediction 2.

1.2 Polymorphism and divergence are correlated: If genes are evolving neutrally, measures of polymorphism within a species should be proportional to the level of divergence between species. This is where the impact of the genetic code on the effect of a mutation becomes very important. Remember that all changes in protein coding sequences can be divided into two classes: (i) synonymous (S) and (ii) non-synonymous (NS). These two types of mutation will be affected differently by selection on the protein product of the gene. Under neutral theory, selection is not involved, so the ratio of S to NS polymorphism within a species is expected to equal the ratio of S to NS substitutions measured between species. If positive selection were acting on at least some mutations, their residence time in the population would be shorter than that of neutral mutations. Hence the ratios would not be the same, and NS mutants would represent a smaller proportion of the within-species polymorphism.
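The comparison described above is usually carried out as a test of independence on a 2x2 table of counts (polymorphic versus fixed, S versus NS). The following is a minimal Python sketch of a G-test applied to the hypothetical counts used in these notes (28 S and 9 NS polymorphic; 50 S and 17 NS fixed). The function name g_test_2x2 is illustrative and not part of the notes. The p-value assumes the chi-square approximation with one degree of freedom and no continuity correction.

```python
import math


def g_test_2x2(table):
    """G-test of independence for a 2x2 table of counts.

    table[0] = [S, NS] counts for polymorphic sites,
    table[1] = [S, NS] counts for fixed differences.
    Returns (G statistic, approximate p-value with 1 degree of freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if observed > 0:  # a zero count contributes nothing to G
                g += observed * math.log(observed / expected)
    g *= 2.0

    # Survival function of the chi-square distribution with 1 df:
    # P(X > g) = erfc(sqrt(g / 2)). Assumption: no continuity correction.
    p_value = math.erfc(math.sqrt(max(g, 0.0) / 2.0))
    return g, p_value


# Hypothetical counts from the notes: rows are polymorphic and fixed,
# columns are synonymous (S) and non-synonymous (NS).
counts = [[28, 9], [50, 17]]
g_stat, p_val = g_test_2x2(counts)
print(f"G = {g_stat:.4f}, p = {p_val:.3f}")
```

A small G (large p-value) means the S:NS ratio of polymorphism is statistically indistinguishable from the S:NS ratio of fixed differences, which is the pattern expected under strict neutrality.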
Comparison of the ratio of synonymous and nonsynonymous polymorphism within species to divergence between species. Neutral theory suggests that the fraction of variation that is nonsynonymous within species should be the same as between species.

Polymorphism within a species (S:NS): Species 1, 12:4; Species 2, 6:2; Species 3, 10:3
Substitutions between species (S:NS): 17:6, 14:5, 19:6

              Synonymous (S)   Non-synonymous (NS)   S:NS
Polymorphic   28               9                     3.1
Fixed         50               17                    2.9

Data are hypothetical. Ratios are tested by using a G-test on the counts of S and NS. These hypothetical data are not significant. If positive selection were acting, residence times for NS would be shorter within species and polymorphic S:NS > fixed S:NS.

Tests for heterogeneity in the pattern of polymorphism to divergence are called NEUTRALITY TESTS. Tests need not be based on S and NS; amino acid changes can be divided into physicochemically radical (r) and conservative (c), and the c:r ratio can be tested for heterogeneity. Neutrality tests are powerful and useful. However, there is an important caveat in interpreting a significant result. Rejection of strict neutrality does not distinguish between violation of the assumption of selective equivalence of alleles and violation of one of the model's other assumptions. For example, if the effect of selection changes over time due to changes in effective population size, as in nearly neutral theory, a significant result will be obtained from this test. We will return to this topic later.

2. The rate of gene evolution is inversely related to functional constraint

Under neutral theory the substitution rate is determined by the mutation rate and the probability of fixation. It is well known that rates vary among genes (e.g., histones versus MHC) and within genes (e.g., introns versus exons). Such rate variation is consistent with neutral theory, even when mutation rates are the same.
Remember that neutral theory only asserts that polymorphism is selectively equivalent; it does not require that the frequency of such polymorphism be the same among sites, genes, or species.

2.1 Variation within genes: We begin with rate variation within genes because mutation rates are unlikely to vary within a gene, which makes variation in substitution rates easier to interpret. Consider a protein coding gene. It stands to reason that, because of the genetic code, mutations at some sites will have little effect on the encoded protein (e.g., 3rd codon positions), whereas mutations at other sites (e.g., 1st and 2nd codon positions) are very likely to affect the encoded protein. Consequently, the frequency of selectively equivalent alleles occurring at 3rd codon positions is expected to be much higher than at 1st and 2nd codon positions. Hence, 3rd positions are expected to evolve more quickly than 1st and 2nd codon positions. The evolution of functional genes fits this pattern in the vast majority of known cases. For a real example see the plot below.

Mean number of substitutions per site at the three codon positions of the epsilon-globin gene of primates. Two measures are presented: (i) the average over all pairwise comparisons between genes; and (ii) the sum of the branch lengths of the epsilon-globin gene tree.

[Plot: mean number of substitutions/site against codon position (1, 2, 3), shown both as the mean pairwise substitution rate and as the substitution rate summed over the branch lengths of the tree.]

[Gene tree for primate epsilon globins (scale bar 0.01). Taxa: Cebus, Saimiri, Aotus, Callithrix, Lagothrix, Brachyteles, Alouatta, Ateles, Pan, Homo, Pongo, Macaca, Hylobates, Tarsius, Galago, Otolemur, Cheirogaleus, Eulemur.]

Under both measures of substitution rate, 3rd codon positions evolve faster than 1st and 2nd positions.

Note: the mean number of substitutions per site was computed in all cases by using the Jukes and Cantor (1969) correction.
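As a reminder of what the Jukes and Cantor (1969) correction does, the Python sketch below converts an observed proportion of differing sites p into an estimated number of substitutions per site, d = -(3/4) ln(1 - (4/3) p). The function names and the two short example sequences are hypothetical and serve only as an illustration; they are not the epsilon-globin data.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites that differ between two sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal length, non-empty")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def jc69_distance(p):
    """Jukes-Cantor corrected substitutions per site for observed proportion p."""
    if not 0.0 <= p < 0.75:
        # The correction is undefined once p reaches the saturation value 3/4.
        raise ValueError("p must satisfy 0 <= p < 0.75")
    # "+ 0.0" turns the -0.0 produced when p == 0 into 0.0.
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p) + 0.0


# Hypothetical aligned sequences (illustration only).
seq_1 = "ACGTACGTACGTACGTACGT"
seq_2 = "ACGTACGAACGTACCTACGT"

p_obs = p_distance(seq_1, seq_2)
d_jc = jc69_distance(p_obs)
print(f"observed p = {p_obs:.3f}, JC69 distance = {d_jc:.4f}")
```

The corrected distance is always at least as large as p, because the correction accounts for multiple substitutions at the same site that a raw count of differences cannot see.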
The previous logical argument, as well as the above plots of real data, demonstrates a well-known principle of molecular evolution: the greater the functional constraint, the slower the rate of molecular evolution.

Under neutral theory we can formulate this principle as a model (Kimura 1968). First we divide all mutations into three categories: (i) adaptive, (ii) deleterious, and (iii) neutral. Adaptive mutations are assumed to occur so rarely that their frequency is effectively zero. Hence the frequency of deleterious mutations is fD and the frequency of neutral mutations is f0 = 1 - fD. Let µT equal the total mutation rate per site per unit time. Then the neutral mutation rate per site is:

µ0 = µT f0

Under neutrality the substitution rate equals the neutral mutation rate. Hence, the rate of substitution per site per unit time is:

k = µT f0

The rate of evolution depends on the "size (f0) of the selective sieve".

[Figure: new mutations passing through a selective sieve, leading to fixation in a "slow gene" and fixation in a "fast gene".]

Kimura's f0 is the fraction of mutations that passes through the "sieve".

Within genes we will assume that µT is the same for all sites. Clearly the value of f0 is largest for the 3rd codon positions of protein coding genes. However, we know that not all mutations at 3rd codon positions are synonymous. Thus we might expect that f0 for synonymous positions is even larger than f0 for 3rd codon positions, and this turns out to be generally true.

[Histogram: number of proteins against substitutions / site / 2x80 million years, for synonymous sites and for 3rd codon positions. Mean at 3rd positions: 0.40. Mean at synonymous sites: 0.61.]

The average substitution rate between primates and rodents is higher for synonymous sites as compared with third codon positions. The results are based on a sample of 82 nuclear genes. This result is consistent with neutral theory, given that some changes at 3rd codon positions are nonsynonymous and therefore subject to more constraint.
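The sieve model above can be sketched numerically. Since k = µT f0, two classes of sites that share the same µT have substitution rates in the same ratio as their f0 values. The Python sketch below applies this to the two means reported above (0.40 and 0.61 substitutions per site over 2x80 million years). Treating these means as directly comparable under a shared µT is an assumption made for illustration, and the function and variable names are hypothetical.

```python
import math


def substitution_rate(mu_total, f_neutral):
    """Neutral substitution rate per site per unit time: k = mu_T * f0."""
    if not 0.0 <= f_neutral <= 1.0:
        raise ValueError("f_neutral must lie between 0 and 1")
    return mu_total * f_neutral


def relative_f0(k_a, k_b):
    """Ratio f0_a / f0_b implied by two rates, assuming both share one mu_T."""
    return k_a / k_b


# Means reported in the notes (substitutions per site over 2 x 80 million years).
MEAN_THIRD_POSITIONS = 0.40
MEAN_SYNONYMOUS = 0.61
ELAPSED_YEARS = 2 * 80e6  # two lineages, each about 80 million years long

# Convert the accumulated distances into rates per site per year.
k_third = MEAN_THIRD_POSITIONS / ELAPSED_YEARS
k_syn = MEAN_SYNONYMOUS / ELAPSED_YEARS

# Assumption: the same total mutation rate applies to both classes of sites.
ratio = relative_f0(k_third, k_syn)

print(f"k (3rd positions)   = {k_third:.2e} per site per year")
print(f"k (synonymous)      = {k_syn:.2e} per site per year")
print(f"f0(3rd) / f0(syn)   = {ratio:.2f}")
```

Under these assumptions the ratio is below 1, meaning that a smaller fraction of mutations at 3rd codon positions passes through the sieve than at strictly synonymous sites.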