
NEUTRAL THEORY TOPIC 3: Rates and patterns of molecular evolution

Neutral theory predictions

A particularly valuable use of neutral theory is as a rigid null hypothesis. The neutral theory makes a wide variety of predictions, and one or more of these predictions may be tested in any given molecular dataset. Depending on which predictions (if any) are rejected, we gain considerable insight into the underlying process of evolution for the molecular data involved. The following four predictions are so widely applicable to the field of molecular evolution that they are often viewed as principles of molecular evolution.

1. The level of within species genetic variation is determined by population size and mutation rate, and is correlated with the level of sequence divergence between species.

2. The rate of evolution (substitution) is inversely related to the level of functional constraint (purifying selection) acting on the gene.

3. The pattern of base composition (and codon usage in protein coding genes) at neutral sites reflects mutational equilibrium.

4. There is a constant rate of sequence evolution; i.e., a molecular clock.

Each of these predictions is examined in detail in the following four sections of these notes.

1. Variation within and among species

Neutral theory provides the bridge between microevolution, in populations, and macroevolution. The connection between the two is actually quite simple, which is one of the reasons why neutral theory has been so successful as a scientific theory.

Neutral theory makes two clear predictions about genetic variation within and between species:

1. Equilibrium polymorphism (usually measured as heterozygosity) is controlled by only two parameters: population size (Ne) and mutation rate (µ).

2. Neutral population polymorphism is correlated with divergence between species.
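As a reminder of how the two parameters in prediction 1 enter, the sketch below evaluates the standard infinite-alleles expectation for equilibrium heterozygosity, H = 4Neµ / (1 + 4Neµ). That this is the formula used in the earlier notes is an assumption; the function name and the parameter values are illustrative only.

```python
def expected_heterozygosity(ne: float, mu: float) -> float:
    """Equilibrium heterozygosity under the infinite-alleles model.

    Assumption: H = theta / (1 + theta), with theta = 4 * Ne * mu (diploids).
    """
    theta = 4.0 * ne * mu
    return theta / (1.0 + theta)


# Hypothetical values, chosen only to show how H rises with Ne.
for ne in (1e3, 1e4, 1e5, 1e6):
    print(f"Ne = {ne:.0e}  H = {expected_heterozygosity(ne, 1e-6):.4f}")
```

Because H depends only on the product Neµ, uncertainty in either parameter translates directly into uncertainty in the expected heterozygosity.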

1.1 Equilibrium polymorphism: We covered the first prediction in some detail in the last set of notes. Not surprisingly, there was early interest (1970s) in comparing natural levels of heterozygosity inferred by protein gel-electrophoresis with those predicted under neutral theory. The results were surprising, in that natural levels of population polymorphism were lower than expected. This finding led to two important developments in the field: (i) the recognition that the parameters Ne and µ are hard to estimate; and (ii) the recognition that there were some problems with the original theory, which were corrected in what is now called NEARLY NEUTRAL THEORY (we will return to this topic later).

Note that protein electrophoresis can underestimate polymorphism, and that more recent studies have revealed a general association between heterozygosity and mutation rate. This approach has low power as a means of testing the expectations of neutral evolution, so most modern work in this area has focused on prediction 2.

1.2 Polymorphism and divergence are correlated: If genes are evolving neutrally, measures of polymorphism within a species should be proportional to the level of divergence between species. This is where the impact of the genetic code on the effect of a mutation becomes very important. Remember that all changes in protein coding sequences can be divided into two classes: (i) synonymous (S) and (ii) non-synonymous (NS). These types of mutation will be impacted differently by the effect of selection on the protein product of the gene. Under neutral theory, selection is not involved, so the ratio of S to NS polymorphism within a species is expected to be equal to the ratio of S to NS substitutions measured between species. If positive selection were acting on at least some NS mutations, their residence time in the population would be lower than that of neutral mutations. Hence the ratios would not be the same, and NS mutants would represent a smaller proportion of the within species polymorphism.

Comparison of the ratio of synonymous and nonsynonymous polymorphism within species to divergence between species. Neutral theory suggests that the fraction of variation that is nonsynonymous within species should be the same as between species.

S:NS counts                      Species 1   Species 2   Species 3
Polymorphism within a species    12:4        6:2         10:3
Substitutions between species    17:6        14:5        19:6

Totals         Synonymous (S)   Non-synonymous (NS)   S:NS
Polymorphic    28               9                     3.1
Fixed          50               17                    2.9

Data are hypothetical. Ratios are tested by using a G-test on the counts of S and NS. These hypothetical data are not significant. If positive selection were acting, residence times for NS would be lower within species and polymorphic S:NS > fixed S:NS.
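The test described above can be sketched in a few lines. The code below runs a G-test of independence on the pooled hypothetical counts (polymorphic 28:9, fixed 50:17). It is a minimal illustration, not the procedure of any particular software package; the function name is made up, and no continuity correction is applied.

```python
import math


def g_test_2x2(table):
    """G-test of independence for a 2x2 table of counts.

    Returns (G, p), where p comes from the chi-square distribution with 1 df.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if observed > 0:  # a zero cell contributes nothing to G
                g += observed * math.log(observed / expected)
    g = max(2.0 * g, 0.0)  # guard against tiny negative rounding error
    # Chi-square survival function for 1 df: P(X > g) = erfc(sqrt(g / 2)).
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p


# Rows: polymorphic, fixed. Columns: synonymous, non-synonymous.
counts = [[28, 9], [50, 17]]
g_stat, p_value = g_test_2x2(counts)
print(f"G = {g_stat:.3f}, p = {p_value:.2f}")
```

For these counts G is far below the 5% critical value of 3.84 (1 df), in line with the statement that the hypothetical data are not significant.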

Tests for heterogeneity in the ratio of polymorphism to divergence are called NEUTRALITY TESTS. Tests need not be based on S and NS; amino acid changes can be divided into physicochemically radical (r) and conservative (c), and the c:r ratio can be tested for heterogeneity.

Neutrality tests are powerful and useful. However, there is an important caveat with the interpretation of a significant result. Rejection of strict neutrality does not distinguish between a violation of the assumption of selective equivalence of alleles and a violation of one of the other assumptions of the model. For example, if the effect of selection changes over time due to changes in effective population size, as in nearly neutral theory, a significant result will be obtained from this test. We will return to this topic later.

2. The rate of gene evolution is inversely related to functional constraint

Under neutral theory the substitution rate is determined by the mutation rate and the probability of fixation. It is well known that rates vary among genes (e.g., histones versus MHC) and within genes (e.g., among codon positions). Such rate variation is consistent with neutral theory, even when mutation rates are the same. Remember that neutral theory only asserts that polymorphism is selectively equivalent; it does not require that the frequency of such polymorphism be the same among sites, genes, or species.

2.1 Variation within genes: We begin with rate variation within genes because it is unlikely that mutation rates vary, and the interpretation of variation in substitution rates is easier. Consider a protein coding gene. It stands to reason that due to the genetic code mutations at some sites will have little effect on the encoded protein (e.g., 3rd codon positions) whereas mutations at other sites (e.g., 1st and 2nd codon positions) are very likely to affect the encoded protein. Consequently, the frequency of selectively equivalent alleles occurring at 3rd codon positions is expected to be much higher than 1st and 2nd codon positions. Hence, 3rd positions are expected to evolve more quickly than 1st and 2nd codon positions. The evolution of functional genes fits this pattern in the vast majority of known cases. For a real example see plot below.

Mean number of substitutions per site at the three codon positions of the epsilon-globin gene of primates. Two measures are presented: (i) the average over all pairwise comparisons between genes; and (ii) the sum of the branch lengths of the epsilon-globin gene tree.

[Figure: gene tree for primate epsilon globins (Cebus, Saimiri, Aotus, Callithrix, Lagothrix, Brachyteles, Alouatta, Ateles, Pan, Homo, Pongo, Macaca, Hylobates, Tarsius, Galago, Otolemur, Cheirogaleus, Eulemur; scale bar 0.01), and a plot of the mean number of substitutions/site at codon positions 1, 2 and 3 under two measures: the mean pairwise substitution rate and the substitution rate as a sum of branch lengths over the tree.]

Under both measures of substitution rate, 3rd codon positions evolve faster than 1st and 2nd positions.

Note: the mean number of substitutions per site was computed in all cases by using the Jukes and Cantor (1969) correction.
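For reference, the Jukes and Cantor (1969) correction converts the observed proportion of differing sites, p, into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). The sketch below applies it to a toy alignment; the sequences and function names are hypothetical and are not taken from the epsilon-globin data.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned, gap-free sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned and non-empty")
    diffs = sum(a != b for a, b in zip(seq_a, seq_b))
    return diffs / len(seq_a)


def jc69_distance(p: float) -> float:
    """Jukes-Cantor corrected number of substitutions per site."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy alignment (hypothetical): 2 differences in 20 sites, so p = 0.1.
p_obs = p_distance("ACGTACGTACGTACGTACGT", "ACGTACGAACGTACGTACCT")
d = jc69_distance(p_obs)
print(f"p = {p_obs:.3f}, JC69 d = {d:.4f}")
```

The corrected distance is always at least as large as p, because it accounts for multiple substitutions at the same site.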

The previous logical argument, as well as the above plots of real data, demonstrates a well known principle of molecular evolution:

The greater the functional constraint, the slower the rate of molecular evolution.

Under neutral theory we can formulate this principle as a model (Kimura 1968). First we divide all mutations into three categories: (i) adaptive, (ii) deleterious, and (iii) neutral. The first category is assumed to occur very rarely, so their frequency is expected to be effectively zero. Hence the frequency of deleterious mutations is fD and the frequency of neutral mutations f0 = 1 - fD. Let µT equal the total mutation rate per site per unit time. Then the neutral mutation rate per site is:

µ0 = µT f0

Because the substitution rate for neutral mutations equals the neutral mutation rate (k = µ0), the rate of substitution per site per unit time is:

k = µT f0

The rate of evolution depends on the “size (f0) of the selective sieve”

[Figure: two panels in which new mutations pass through a selective sieve: fixation in a “slow gene” and fixation in a “fast gene”.]

Kimura’s f0 is the fraction of mutations that passes through the “sieve”.
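The relation k = µT f0 is simple enough to tabulate directly. In the sketch below the total mutation rate and the three values of f0 are hypothetical, chosen only to show how the size of the sieve scales the substitution rate.

```python
def substitution_rate(mu_total: float, f_neutral: float) -> float:
    """k = mu_T * f0: substitution rate per site per unit time."""
    if not 0.0 <= f_neutral <= 1.0:
        raise ValueError("f0 must lie between 0 and 1")
    return mu_total * f_neutral


MU_TOTAL = 2.0e-9  # hypothetical total mutation rate per site per year

for label, f0 in [("strong constraint", 0.05),
                  ("weak constraint", 0.60),
                  ("no constraint", 1.00)]:
    k = substitution_rate(MU_TOTAL, f0)
    print(f"{label:>17}: f0 = {f0:.2f}  k = {k:.2e} per site per year")
```

With µT held fixed, the rate can never exceed µT, which is reached only when f0 = 1.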

So within genes we will assume that µT is the same for all sites. Clearly the value of f0 is largest for the 3rd codon positions of protein coding genes. However, we know that not all mutations at 3rd codon positions are synonymous. Thus we might expect that f0 for synonymous positions is even larger than f0 for 3rd codon positions, and this turns out to be generally true. The average substitution rate between primates and rodents is higher for synonymous sites than for third codon positions. The results are based on a sample of 82 nuclear genes.

[Figure: schematic of an ancestral gene diverging into a primate gene and a rodent gene along branches t0 and t1, and a histogram of the number of proteins against substitutions / site / 2×80 million years for 3rd codon positions and synonymous sites. Mean at 3rd positions: 0.40. Mean at synonymous sites: 0.61.]

The mean number of substitutions per site between primates and rodents is t = t0 + t1. The unit of time is 2 × 80 my, the time since primates and rodents shared a common ancestor. This result is consistent with neutral theory given that f0 is smaller for 3rd codon positions because some mutations at such sites will be nonsynonymous. Data from Bielawski, Dunn and Yang (2000) Genetics 156:1299-1308.

Under this model the highest rate should occur when f0 = 1 for an entire gene. Pseudogenes are expected to satisfy this expectation, and in fact they tend to exhibit the highest rates of evolution.

Comparison of mean substitution rates in different parts of genes and pseudogenes. Data are from Li et al. (1985). Substitution rate is the mean number of substitutions per site per 10^9 years. Rates are an average over 3000 mammalian genes.

[Figure: bar chart of substitutions per site per 10^9 years (y-axis from 0 to 5) for 5' flanking regions, 5' untranslated regions, nonsynonymous sites, synonymous sites, introns, 3' untranslated regions, 3' flanking regions, and pseudogenes.]

This notion can be extended to other classes of sites within genes. Synonymous sites should have f0 = 1 as long as selection is acting only with respect to the protein product of a gene. In fact, the above figure illustrates that synonymous sites have a substitution rate comparable with that of pseudogenes. Non-synonymous sites will have f0 < 1 that depends on the level of functional constraint affecting the protein product. It is likely that f0 for nonsynonymous sites comprises the full range between 1 and zero, where some sites are under no constraint (f0 = 1), others tolerate no change (f0 = 0), and others fall somewhere in between.

The same notion can be extended to functional domains. The most functionally critical domains will have f0 << 1, whereas others will have larger values of f0.

The figure below illustrates that different domains of the gene that encodes the insulin protein evolve at different rates. Remember that following cleavage of the initial chain, the polypeptide folds to a state of lowest free energy; this state allows the formation of disulfide bonds. Once this occurs, a section of amino acids called the C-chain is removed, producing the mature insulin protein (in a conformation that is not at the lowest free energy). The C-chain, then, is not involved in the function of the mature protein and consequently is expected to have a higher f0 than the A and B chains, which have a functional role in the mature insulin protein. As expected, the substitution rate is higher (about 5 fold) in the C-chain. Interestingly, the substitution rate in the C-chain (1.1 per site per 10^9 years) is less than half the average for synonymous sites, suggesting that it still has considerable functional importance, presumably in ensuring that an appropriate conformation for formation of the disulfide bonds is reached before it is removed.

Substitution rate differs in different polypeptide domains of preproinsulin.

C chain: 1.1 × 10^-9 / site / year
A & B chains: 0.2 × 10^-9 / site / year
(about 5 fold higher rate in the C chain)

Note that the C-chain rate is still lower than in many other proteins, and lower than at synonymous sites, so its amino acid sequence must still have considerable functional importance to the protein, probably in folding to lowest free energy so that disulfide bonds can be formed.

The principle that sequence conservation reflects functional constraint is fundamental to nearly all aspects of bioinformatics. Similarity searches of genetic databases rely on the assumption that functionally critical sites will be conserved, thereby leaving a sequence similarity signal that can be identified by the search algorithms. Moreover, methods that attempt to infer the function of newly discovered genes rely on matching regions of sequence in a new gene with conserved regions in other genes with a known function. The figure below illustrates regions of the beta-globin gene that have been conserved for over 450 million years.

Multiple sequence alignment of four vertebrate beta-globin genes representing 450 million years of evolution. Amino acids shaded in green represent sites that appear to have been conserved over those 450 million years.

Note: many of the shaded sites are located in the heme pocket or at the interfaces between globin subunits, consistent with the notion that sites most critical to protein function evolve at the slowest rates.

2.2 Differences among genes:

The notion of the size of the selective sieve, i.e., f0, is relevant to understanding differences among genes in the rate of substitution. Mutations in genes with a higher fraction of functionally important sites will have a higher probability of being deleterious; hence fD will be larger. Consequently such genes will have a small f0 and a relatively low mean rate of substitution per site.

As the fraction of functionally important sites decreases, fD decreases and f0 increases; the probability that a mutation will be fixed is higher because, on average, the probability that a new mutation will be neutral is larger.

The nonsynonymous and synonymous substitution rates allow us to estimate the size of f0. Hence, we can use these rates to compare the level of selection pressure acting on different genes. The box below provides an example of how the ratio of nonsynonymous to synonymous substitution rates is used to infer the level of selection acting on the neuroleukin gene of primates.

Under neutral theory:

1. The synonymous substitution rate (kS) is equal to the mutation rate.

2. The nonsynonymous substitution rate (kN) measures the substitution rate for neutral amino acid changes.

• Thus the ratio of these rates (kN / kS) represents the fraction of amino acid mutations that are neutral: this is f0 for amino acids

• The fraction of amino acid mutations that are deleterious (fD) must be 1 - (kN / kS).

Let’s take the Neuroleukin gene of primates as an example: kN = 0.016 kS = 0.300

• The fraction of amino acid changes that are neutral is 0.016/0.300 = 0.053, a small amount.

• Hence the fraction of amino acid changes that are deleterious is 1 - 0.053 = 0.95!
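The arithmetic in the box can be written as a small function. This is a sketch under the same assumptions as the box (kS equals the mutation rate and there is no positive selection, so kN ≤ kS); the function name is illustrative.

```python
def neutral_and_deleterious_fractions(k_n: float, k_s: float):
    """Return (f0, fD) for amino acid mutations, with f0 = kN / kS.

    Assumes kS estimates the mutation rate and that kN <= kS.
    """
    if k_s <= 0.0:
        raise ValueError("kS must be positive")
    f0 = k_n / k_s
    if f0 > 1.0:
        raise ValueError("kN > kS: this simple model does not apply")
    return f0, 1.0 - f0


# Neuroleukin gene of primates, values from the box above.
f0, f_d = neutral_and_deleterious_fractions(0.016, 0.300)
print(f"f0 = {f0:.3f}, fD = {f_d:.3f}")
```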

With this framework we can compare the level of selection pressure acting on different genes. The figure below compares a highly constrained gene with a gene evolving under relatively weak constraints for mammals. Note that both functional genes are much more constrained than a pseudogene.

Estimated level of functional constraint for three nuclear genes of primates

[Figure: three charts showing the fraction of deleterious mutations and the fraction of neutral mutations.]

A1 adenosine receptor: Most mutations in a gene evolving under strong functional constraints are deleterious.

Prolactin: A less functionally constrained gene has more neutral mutations.

Pseudogene: Pseudogenes are non-functional, so all mutations are expected to be neutral.

Estimates obtained from relative rates of synonymous and nonsynonymous substitution. Data is from Bielawski, Dunn, and Yang (2000) Genetics 156:1299-1308.

As expected, we observe considerable variation in nonsynonymous rates. This reflects different levels of purifying selection acting on proteins with different functional requirements. Synonymous rates are also variable, reflecting differences in mutation rates in different parts of mammalian genomes. Note that sampling errors contribute to some of the variance in both synonymous and nonsynonymous substitution rates.

Distribution of nonsynonymous and synonymous substitution rates for 82 nuclear genes of primates.

[Figure: histogram of the number of proteins against substitutions/site/80 million years for the nonsynonymous rate and the synonymous rate.]

Mean rate of nonsynonymous substitution: 0.045 / site / 80 million years
Mean rate of synonymous substitution: 0.201 / site / 80 million years

Data from Bielawski, Dunn, and Yang (2000) Genetics 156:1299-1308. Method: GY94 under ML.

3. Mutation patterns

Neutral theory predicts that sites in the genome that are free from purifying selection will have nucleotide or amino acid frequencies that reflect mutational equilibrium. For a review of mutational equilibrium see the notes for TOPIC 5.

Based on our knowledge of molecular biology, we expect that the nucleotide frequencies (A, C, G, and T) of pseudogenes, introns, and the 5’ and 3’ flanking regions of genes should be close to the equilibrium point. Our knowledge of the genetic code suggests that this is also likely to be true of the 3rd positions of codons, and even more so at the 3rd positions of 4-fold degenerate codons (i.e., four-fold degenerate sites of a gene). For simplicity, nucleotide frequencies are often summarized in terms of %G + %C, or simply “GC content”.
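As a small illustration of the GC content summary, the sketch below computes the overall GC content of a coding sequence and the GC content at its 3rd codon positions. The sequence is a made-up in-frame fragment, and the function names are illustrative; restricting the calculation to four-fold degenerate sites would additionally require a codon table, which is omitted here.

```python
def gc_content(seq: str) -> float:
    """Fraction of G + C among the unambiguous bases (A, C, G, T) in seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    return sum(b in "GC" for b in bases) / len(bases)


def gc3_content(cds: str) -> float:
    """GC content at 3rd codon positions of an in-frame coding sequence."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return gc_content(cds[2::3])


cds = "ATGGCCAAGCTGACC"  # hypothetical fragment: ATG GCC AAG CTG ACC
print(f"GC = {gc_content(cds):.2f}, GC3 = {gc3_content(cds):.2f}")
```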

Nucleotide frequencies in the human beta-globin gene differ among the three positions of the codon. Frequencies at positions 1 and 2 reflect selection acting on the protein product of the gene. Frequencies at position 3 reflect a strong influence of mutation pressure.

[Figure: three bar charts of the frequencies of A, C, G and T (y-axis from 0 to 0.45) at the 1st, 2nd and 3rd codon positions.]

We have seen an example of this in the mitochondrial genome of vertebrates, where the GC content of third codon positions is correlated with distance from the origin of replication. As the distance is also correlated with the expected amount of mutation by spontaneous decay that accumulates during the process of replication, this observation is believed to reflect different mutation equilibrium points in different parts of the genome.

GC content is also highly variable in the nuclear genomes of mammals, Drosophila and yeast, and in the nucleoid chromosome of prokaryotes. The effect of this bias on protein coding genes is that some codons will be used more frequently than other codons in a given gene. In some cases this bias appears to reflect a mutational equilibrium, whereas in others natural selection is thought to play a role.

• Prokaryotes: The usage of a synonymous codon in prokaryotes appears to be positively correlated with the frequency of its cognate tRNA. Such a relationship could result from natural selection for increased translational efficiency and accuracy. This relationship seems to depend on the expression level of the gene. In genes with low levels of expression, the correlation is weak, suggesting that selection is too weak to offset the effect of mutation pressure. There is general consensus that this is a type of WEAK SELECTION, as the selective difference between favourable and unfavourable codons must be very small.

• Mammals: Mammalian genomes exhibit a highly organized genomic structure where a small fraction of the genome encodes the majority of genes, and these gene rich regions are GC rich. Such regions are called ISOCHORES. There has been a long-running debate about the role of natural selection in the origin and maintenance of isochores. The consensus opinion today is that the GC content of isochores reflects mutation pressure arising from spontaneous mutations and the mutagenic consequences of recombination. Here differences in GC content reflect differences in equilibrium points rather than selection pressures.

• Drosophila: The case of Drosophila is reminiscent of the case of prokaryotes, in that biased use of synonymous codons seems to reflect selection pressure for increased translational accuracy. Note that codon bias is reduced in Drosophila pseudogenes and in species with small effective population sizes.

The observation that codon usage might be determined, at least in part, by the effects of natural selection is NOT evidence against the neutral theory. In such cases synonymous mutations to and from favoured and un-favoured codons are less frequent due to selection; hence the principle that selection results in lowered rates of substitution holds.

4. Molecular clock

It is clear that evolutionary rates vary among regions within genes and among genes. If we take the rate of substitution as an average over all sites in a given gene, the neutral theory predicts that the rate of evolution of that gene should be approximately constant over time, where time is measured in generations.

Interestingly, the hypothesis of a clock-like tempo of evolution (the MOLECULAR CLOCK HYPOTHESIS) predates neutral theory. Zuckerkandl and Pauling (1965) noticed an approximately uniform rate of amino acid substitutions over time measured in years. The plot below illustrates clock-like evolution in the mitochondrial protein coding genes of teleost fishes.

Linear relation between mitochondrial substitution rate and time since common ancestor in teleost fishes

[Figure: scatter plot of the mean number of substitutions / site (y-axis, 0 to 0.2) against time in millions of years (x-axis, 0 to 200).]

Linear relationship is expected under a uniform rate of substitution. Substitutions are the mean number of changes at first codon positions of all mitochondrial protein coding genes. Data were kindly provided by K. Dunn.
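One simple way to summarize such a plot is to fit a straight line through the origin, since zero time implies zero divergence; the slope then estimates the rate. The sketch below does this by least squares. The data points are invented for illustration and are not the teleost values, and whether the slope should be halved (for pairwise distances that span two lineages) depends on how the distances were computed.

```python
def clock_rate_through_origin(times, distances):
    """Least-squares slope of distance on time for a line through (0, 0)."""
    if len(times) != len(distances) or not times:
        raise ValueError("need equal-length, non-empty inputs")
    sxx = sum(t * t for t in times)
    if sxx == 0.0:
        raise ValueError("all times are zero")
    sxy = sum(t * d for t, d in zip(times, distances))
    return sxy / sxx


# Hypothetical data: time (millions of years), substitutions per site.
times = [25.0, 50.0, 100.0, 150.0, 200.0]
dists = [0.02, 0.04, 0.08, 0.12, 0.16]

rate = clock_rate_through_origin(times, dists)
residuals = [d - rate * t for t, d in zip(times, dists)]
print(f"rate = {rate:.2e} substitutions/site/million years")
print("largest residual:", max(abs(r) for r in residuals))
```

Large or systematically patterned residuals would be a first hint that a single uniform rate does not describe the data.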

The notion of a molecular clock has always been controversial. Neutral theory suggests that generation time is the appropriate unit of time, and some believe that variation in generation times among different lineages of organisms has a significant effect on the pace of molecular evolution. The expectation is that molecular clocks should run faster in lineages with shorter generation times. This is known as the GENERATION-TIME EFFECT HYPOTHESIS. Interestingly, many studies of protein evolution since Zuckerkandl and Pauling indicate an approximately constant rate in terms of absolute time in years. Conflict between these opposing views still exists.

Note that some researchers deny that any such pattern of general rate constancy exists at all! Some argue that the observation of a clock is an artefact of averaging rates over very long periods of time. Gillespie (1986) argued that substitutions might occur in an episodic fashion, occurring in clusters rather than at regular intervals. This model suggests that the observation of a molecular clock is only superficially consistent with neutral theory, and that the actual process of evolution is more generally non-neutral.

There are two reasons for such considerable interest and controversy over the molecular clock hypothesis.

1. Macromolecules that evolve at constant rates can be used to date evolutionary events that are not visible in the fossil record. This includes both species-level divergences where the fossil record is incomplete and major genomic events, such as gene duplications, that cannot be dated from any type of fossil data.

2. Rate variation among lineages, when it exists, should provide insights into the mechanisms of molecular evolution. Such variation could indicate a change in mutation rate, or a change in the substitution rate due to a relaxation of selection pressure or a period of positive Darwinian selection. So, even if the hypothesis is incorrect, it provides a valuable null model (i.e., nothing interesting happens) against which we can test for interesting patterns of evolution.

Neutral theory is the basis for indirect methods of estimating the mutation rate

Remember from our earlier subject of mutation (FOUNDATIONS TOPIC 5) that there are two general approaches to the measurement of mutation rates: (i) DIRECT METHODS, and (ii) INDIRECT METHODS. The indirect methods, which estimate the number of substitutions in lineages that have diverged from a common ancestor, are grounded in neutral theory.

Remember that the substitution rate (k) is equal to the product of the mutation rate (µ) and the fraction of such mutations that are neutral (f0). Hence, all indirect methods must select a dataset (or subset of a dataset) where f0 is assumed to equal one. Reasonable candidates for indirect measurement of the mutation rate are pseudogenes, third codon positions of four-fold degenerate codons, or introns.
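The indirect approach can be sketched as follows. If two lineages split T time units ago, substitutions have accumulated along both branches, so the divergence per site is K = 2Tk; with k = µ f0 this gives µ = K / (2 T f0). The factor of 2 and the numbers below are assumptions of this illustration (they are not taken from the notes), and the function name is made up.

```python
def indirect_mutation_rate(divergence_per_site: float,
                           split_time: float,
                           f_neutral: float = 1.0) -> float:
    """Estimate mu from divergence: mu = K / (2 * T * f0).

    The factor 2 counts both lineages since the split; f0 = 1 is assumed
    for pseudogenes, four-fold degenerate sites, or introns.
    """
    if split_time <= 0.0 or not 0.0 < f_neutral <= 1.0:
        raise ValueError("need T > 0 and 0 < f0 <= 1")
    return divergence_per_site / (2.0 * split_time * f_neutral)


# Hypothetical pseudogene data: K = 0.2 substitutions/site, T = 80 million years.
mu = indirect_mutation_rate(0.2, 80e6)
print(f"mu = {mu:.2e} per site per year")
```

If the chosen sites are in fact under some constraint (f0 < 1), assuming f0 = 1 will underestimate the mutation rate.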

Nearly neutral theory

Tomoko Ohta worked with Motoo Kimura during the development of the neutral theory. Ohta realized almost immediately that some fraction of mutations would probably be only mildly deleterious, and consequently their fate could be influenced both by drift and by natural selection. Ohta recognized that observed levels of heterozygosity in natural populations were much higher than expected if all mutations were subject to selection, but lower than predicted by the strict neutral model.

Ohta realized that by simply allowing for a fraction of mutations with slightly deleterious effects, the neutral model could be extended so as to accommodate those observations. This modification was first called the SLIGHTLY DELETERIOUS MODEL (Ohta and Kimura 1971). Later, with the addition of slightly beneficial mutations, it became known as the NEARLY NEUTRAL THEORY.

SLIGHTLY DELETERIOUS MUTATIONS: Mutations with small selection coefficients such that both drift and natural selection can influence their probability of fixation. When population sizes are very large, drift effects are so small that natural selection causes such alleles to be lost from the population. [Remember that when we modelled selection with infinite population sizes, selection always determined the fate of an allele.] However, when population sizes are small, the fate of slightly deleterious alleles will be determined by drift, because the magnitude of change in allele frequency from generation to generation is so much larger than any changes that arise due to negative selection pressure. Thus slightly deleterious alleles can be fixed due to drift.

SLIGHTLY BENEFICIAL MUTATIONS: The same logic applies as above, except that when population sizes are very large selection acts to fix the allele in the population, and when population sizes are small the same allele can be lost due to drift.

The strictly neutral model was extended to accommodate nearly neutral mutations

[Figure: classes of mutation under three models. Strictly neutral model: deleterious, neutral and beneficial. Slightly deleterious model: adds a slightly deleterious class. Nearly neutral model: adds both slightly deleterious and slightly beneficial classes.]

Given this model, the substitution rate is NOT independent of population size. Rather, the fraction of neutral mutations (f0), and hence the substitution rate, changes over time depending on any changes in the population size and the value of the selection coefficient.

[Figure: two panels, each showing neutral mutations and slightly deleterious mutations and which of them are FIXED. Large population size: selection very effective. Small population size: selection a little less effective.]

Nearly neutral theory provides a possible reconciliation of rate constancy in years with expected rate constancy in generations: Organisms with short generation times tend to be small in size and maintain large population sizes (case 1; e.g., rodents). Organisms with long generation times tend to be large and maintain small population sizes (case 2; e.g., elephants). Under nearly neutral theory the effects of generation time (tg) and population size (Ne) act in opposite directions and could cancel each other out:

• Case 1: Shorter tg means more generations per year and higher yearly rate. This is offset by a reduced rate due to the greater effectiveness of selection; i.e., more mildly deleterious mutations are effectively prevented from going to fixation.

• Case 2: Longer tg means fewer generations per year and lower yearly rate. This is offset by an increased rate due to reduced effectiveness of selection in small populations; i.e., more mildly deleterious mutations fixed per generation due to drift.
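The two cases can be made concrete with a small calculation. The sketch below uses Kimura's diffusion approximation for the fixation probability of a new mutation with selection coefficient s, u = (1 - e^(-2s)) / (1 - e^(-4Ne s)), taking census size equal to Ne. This formula is not given in these notes and is brought in here as an assumption. All parameter values are invented, and every mutation is given the same s, which is a strong simplification.

```python
import math


def relative_fixation_rate(ne: float, s: float) -> float:
    """Fixation probability relative to a neutral mutation (1 / (2 * Ne)).

    Assumption: Kimura's diffusion approximation with census size = Ne.
    """
    if s == 0.0:
        return 1.0
    x = -4.0 * ne * s
    if x > 700.0:  # exp(x) would overflow; fixation is effectively impossible
        return 0.0
    u = math.expm1(-2.0 * s) / math.expm1(x)
    return 2.0 * ne * u


def yearly_rate(mu_per_generation, ne, s, generation_time_years):
    """Substitution rate per year when all mutations share coefficient s."""
    return mu_per_generation * relative_fixation_rate(ne, s) / generation_time_years


S = -1e-5   # hypothetical slightly deleterious selection coefficient
MU = 1e-8   # hypothetical mutation rate per site per generation

case1 = yearly_rate(MU, 1e5, S, 0.5)    # short generations, large Ne
case2 = yearly_rate(MU, 1e4, S, 20.0)   # long generations, small Ne

print("ratio expected from generation time alone:", 20.0 / 0.5)
print(f"ratio with selection included: {case1 / case2:.2f}")
```

Under these made-up values the difference in yearly rate between the two cases is much smaller than generation time alone would suggest, although the offset is partial rather than exact.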

Nearly neutral theory also predicts a narrower range of heterozygosity than strict neutral theory, and this is consistent with what is observed in natural populations.

Neutrality can depend on genetic and environmental conditions

Neutral theory makes no predictions about the stability (or instability) of the environment. Environmental conditions change dramatically among seasons and among years. If an environmental change leads to a change in the fraction of neutral mutations (f0), then neutral theory predicts a corresponding change in the substitution rate.

Changes in the genetic background can also affect f0 over evolutionary time. Consider the evolution of an enzyme’s activity over time. If an amino acid is fixed by positive selection because it increases the activity of an enzyme (let’s call this site X), then the amino acids at other sites at that time could experience a change in selection pressure, even if they were not the target of positive selection themselves. Sites interacting with site X could have been evolving under a neutral model before the action of natural selection on site X, but then evolve under strong purifying selection following the action of natural selection on site X.

Classic examples of this come from viruses such as HIV that cross a species barrier. Following a cross-species transmission event the environment changes, leading to a change in selection pressure acting on many sites of viral genes and overall substitution rate. Moreover, genetic recombination events are common among such viruses, and the altered genetic environment that arises from the recombination event can also influence the nature of selection, including f0.

Success of the neutral theory

Since its inception, weaknesses in the neutral theory have been identified. It is to the credit of this theory that it has survived them and has been enlarged to accommodate phenomena such as slightly deleterious mutations and non-clocklike patterns of evolution. Most molecular biologists, including “selectionists”, accept the notion of neutral evolution for characters such as pseudogenes and synonymous mutations. Even among amino acid replacements, particularly physicochemically conservative ones, the consensus opinion allows for a large component of neutral evolution. There seems to be very little to debate in terms of the structure of the theory. Differences of opinion remain as to the relative proportions of neutral, slightly deleterious, and adaptive substitutions; however, controversies have shifted to more important issues in molecular evolution. As a theory, the neutral theory appears to be here to stay.

James F. Crow (1985) provided four reasons for its success:

1. The theory provides the best explanation for the dramatic differences in the rates and patterns of evolution in molecules as compared with morphology.

2. The neutral theory provides a common framework for understanding the dramatic differences among genes, codon positions, introns, and pseudogenes.

3. The neutral theory correctly predicts the differences in rates among molecular datasets as well as the similarity of substitution rates between the so-called “living fossil” organisms and the most rapidly changing species.

4. The neutral theory has stimulated theoretical studies as well as studies of natural variation in a framework based on a rigid null hypothesis.

In fact, prior to the 1970s evolutionary biology was generally regarded as speculative and undisciplined. Roger Stanier commented in a speech before members of the Society for General Microbiology in 1970 that evolutionary studies are “a relatively harmless habit, like eating peanuts”. The application of rigid mathematical models, beginning in the 1930s and working up to neutral theory as a general null model for molecular evolution, played a large part in bringing about a dramatic change in the way evolutionary biology was pursued as a science.