
Topic 3: Mutation (mostly) and recombination

Introduction

MUTATIONS are changes to hereditary material (DNA or RNA) that arise either by an error in replication or by an error in a repair process. In molecular evolution (and population genetics) we are interested in the mechanisms of mutation because it is the ultimate source of novel genetic variation. The scale of the change in DNA or RNA is hugely variable: it can be at an individual nucleotide, at the level of an entire gene, part of a chromosome, or a doubling of the entire genome; we call all of these events mutations.

If you are looking at a population and a migrant from another population brings with it a new genetic variant, then the source of the new variation can be said to be migration. However, its ultimate origin was mutation.

We make a strict separation between the processes that generate mutations and the processes that influence the evolutionary fate of a mutation. Darwin made the same logical distinction in the first edition of the Origin of Species:

“Whatever the cause may be of each slight difference in the offspring from their parents ⎯and a cause for each must exist⎯ it is the steady accumulation, through natural selection, of such differences, when beneficial to the individual, which gives rise to all the more important modifications of structure …”

This distinction has become the central principle of modern evolutionary theory. The processes of mutation and recombination operate at random with respect to the fitness of the molecule or organism. Adaptations arise only as a consequence of the sorting of this random variation by natural selection.

The ultimate fate of a mutation is either (i) FIXATION in a population, or (ii) complete loss from a population. During the period of time when a new genetic variant has arisen by mutation but has not yet been fixed or lost in a population, we refer to it as a POLYMORPHISM. We may use the term polymorphism at the level of the species as well, and, more rarely, above the species level. We use the term MUTATION only to refer to the result of a mutagenic event, and its use makes no claim about the ultimate fate of that mutation. We use the term SUBSTITUTION to refer to a mutation that has completed the process of being fixed in a population. For example, we use the term NUCLEOTIDE SUBSTITUTION to refer to a single nucleotide mutation that grew in frequency in a population until it replaced all alternative nucleotide variants at that particular site.

Mutations may be divided into two very broad categories:

1. Point mutations (nucleotide mutations).

2. Insertion, deletion, or rearrangement of nucleotides.

The objective of this lecture is to review the causes and types of mutations. We also introduce the process of recombination, which functions to “shuffle” the pre-existing variants that arose via mutation. The processes that determine the evolutionary fate of such genetic variation will be covered in future lectures.

Point mutation

1. Transitions and transversions: Nucleotides can be divided into two classes: (i) purines (A and G) and (ii) pyrimidines (C and T). Point mutations are often more likely to occur within these classes than between them, so the two rates are often estimated separately. When a mutation is within a class, e.g., from A to G, we say that a TRANSITION (TS) has occurred (blue arrows in figure to the right). When a mutation is between classes, e.g., from A to T, we say a TRANSVERSION (TV) has occurred (red arrows in figure to the right). If such mutations go to fixation, we can say that either a transitional or transversional substitution has occurred. This classification of point mutations applies to all types of nucleotide sequences, coding and non-coding.

[Figure: the four nucleotides, with transitions (TS) A ↔ G and C ↔ T, and transversions (TV) between a purine and a pyrimidine.]

2. Synonymous and nonsynonymous: If a mutation occurs in a protein-coding region then it is classified according to its effect on the protein product of the gene. Due to the redundancy of the genetic code, we can classify such mutations into two categories. SYNONYMOUS (SILENT) mutations result in a change among codons that code for the same amino acid; hence, such mutations have no effect on the protein product of the gene. NONSYNONYMOUS (also called REPLACEMENT or MISSENSE) mutations result in a change among codons that code for different amino acids, thus changing the polypeptide encoded by the gene.

3. Nonsense: Another type of mutation in a coding region is one that changes a codon from one that encodes an amino acid to one that encodes a termination signal. This can be a dramatic change if it occurs anywhere other than near the natural end of the polypeptide, as it causes a premature end to translation and a truncated polypeptide.

(1) Synonymous mutation, (A) transition: AGT (Ser) → AGC (Ser)
Seq 1 ATG CTG GTC AAG TTG AGA AGT TAA
Seq 2 ATG CTG GTC AAG TTG AGA AGC TAA

(2) Nonsynonymous mutation, (B) transversion: CTG (Leu) → GTG (Val)
Seq 1 ATG CTG GTC AAG TTG AGA AGT TAA
Seq 2 ATG GTG GTC AAG TTG AGA AGT TAA

(3) Nonsense mutation: AAG (Lys) → TAG (Ter)
Seq 1 ATG CTG GTC AAG TTG AGA AGT TAA
Seq 2 ATG CTG GTC TAG

Two sequences showing the different types of mutations: (1) synonymous mutation, (2) nonsynonymous mutation, (3) nonsense mutation, (A) transition, (B) transversion.
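The two classifications above (transition versus transversion, and synonymous versus nonsynonymous versus nonsense) can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: it assumes the standard genetic code, and the function names are hypothetical. The three calls at the end use the mutated codons from panels (1) to (3).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def substitution_class(old, new):
    """Transition if both bases are purines or both are pyrimidines."""
    same_class = (old in PURINES) == (new in PURINES)
    return "transition" if same_class else "transversion"


def coding_effect(old_codon, new_codon):
    """Classify a codon change by its effect on the encoded amino acid."""
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    if new_aa == "*" and old_aa != "*":
        return "nonsense"
    return "synonymous" if old_aa == new_aa else "nonsynonymous"


def classify_point_mutation(old_codon, new_codon):
    """Return (TS/TV class, coding effect) for codons differing at one site."""
    diffs = [(a, b) for a, b in zip(old_codon, new_codon) if a != b]
    if len(diffs) != 1:
        raise ValueError("expected codons differing at exactly one position")
    old, new = diffs[0]
    return substitution_class(old, new), coding_effect(old_codon, new_codon)


print(classify_point_mutation("AGT", "AGC"))  # ('transition', 'synonymous')
print(classify_point_mutation("CTG", "GTG"))  # ('transversion', 'nonsynonymous')
print(classify_point_mutation("AAG", "TAG"))  # ('transversion', 'nonsense')
```

Note that the two classifications are independent: a transition can be synonymous or nonsynonymous, and so can a transversion.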

We can compute the expected proportions of mutation types in coding regions if we are willing to assume that all nucleotide mutations occur at the same rate (ts = tv) and that all sense codons are used equally frequently (1/61). Although these are very unrealistic assumptions for most genes, the expected proportions are useful in giving us a rough idea of the relative “mutational opportunities” for these different types of change. There are 61 sense codons in the standard code, and 9 ways for each to change to another codon (3 alternative nucleotides × 3 codon positions); so there are 61 × 9 = 549 possible mutational pathways to consider. The table below shows the expected proportions of each type of change.

Relative proportion of different types of mutations in a hypothetical protein-coding sequence. Entries are the expected number of changes, with the percentage in parentheses.

Type               All 3 positions   1st positions   2nd positions   3rd positions
Total mutations    549 (100)         183 (100)       183 (100)       183 (100)
Synonymous         134 (25)          8 (4)           0 (0)           126 (69)
Nonsynonymous      392 (71)          166 (91)        176 (96)        50 (27)
Nonsense           23 (4)            9 (5)           7 (4)           7 (4)

Modified from Li and Graur (1991). Note that we assume a hypothetical model where all codons are used equally and all types of point mutations are equally likely.
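The counts in this table can be checked by brute force. The sketch below is an illustration added here, not the procedure used by Li and Graur: it enumerates all 61 × 9 = 549 single-nucleotide changes from sense codons under the standard genetic code and tallies them by codon position and type. The function name is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_mutational_pathways():
    """Tally every one-step change from a sense codon by (position, type)."""
    counts = Counter()
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # only the 61 sense codons are starting points
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == "*":
                    kind = "nonsense"
                elif CODON_TABLE[mutant] == aa:
                    kind = "synonymous"
                else:
                    kind = "nonsynonymous"
                counts[(pos + 1, kind)] += 1
    return counts


counts = count_mutational_pathways()
for kind in ("synonymous", "nonsynonymous", "nonsense"):
    by_position = [counts[(pos, kind)] for pos in (1, 2, 3)]
    print(f"{kind:14s} all={sum(by_position):3d} by position={by_position}")
```

Relaxing either assumption (for example, weighting transitions more heavily than transversions, or weighting codons by observed usage) only requires replacing the `+= 1` with a weight.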

The molecular basis of point mutation is varied. Replication errors can give rise to mismatches between strands of DNA, which can be incorrectly repaired, or proofreading can simply fail to recognize an error. UV light or a variety of chemicals can directly damage DNA. The process of transcription can leave the non-transcribed strand vulnerable to damage by, for example, spontaneous decay: single-stranded DNA has much higher rates of spontaneous decay, by processes such as deamination, than double-stranded DNA.

Insertion, deletion, or rearrangement

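1. Nucleotide indels: The term “INDEL” is an informal way of referring to an insertion or deletion mutation. When nucleotide indels occur within a coding region they cause FRAME-SHIFT mutations, unless the number of nucleotides inserted or deleted is a multiple of three. A frame-shift mutation usually has a very dramatic effect on the encoded polypeptide because every codon downstream of the indel is read in a new frame: the encoded amino acids are altered, and the original stop codon is no longer read in frame, so translation ends at whichever stop codon is next encountered in the new frame.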

Insertion of a T after the first position of the third codon (AAG):

Seq 1 ATG CTG AAG TTG AGA AGT TAA …
      (TTG AGA AGT TAA = Leu Arg Ser Ter)
Seq 2 ATG CTG ATA GTT GAG AAG TTA AGA
      (GTT GAG AAG TTA AGA = Val Glu Lys Leu Arg)

The insertion of a T changes the amino acids encoded beyond the insertion. Because the original stop codon is no longer in frame, translation continues until a new stop codon is reached, resulting in a longer polypeptide.
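The frame-shift in this example can be reproduced with a short translation routine. This is a minimal sketch assuming the standard genetic code; the helper `translate` is hypothetical and simply reads codons from the first base until a stop codon or the end of the sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate from the first base; return (peptide, reached_stop)."""
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            return "".join(peptide), True
        peptide.append(aa)
    return "".join(peptide), False  # ran off the end without a stop codon


seq1 = "ATGCTGAAGTTGAGAAGTTAA"
# Insert a T after the first base of the third codon (index 7 in the string).
seq2 = seq1[:7] + "T" + seq1[7:]

print(translate(seq1))  # ('MLKLRS', True)
print(translate(seq2))  # ('MLIVEKL', False)
```

In the second call the original TAA is read as part of TTA (Leu), so no stop codon is found within the sequence shown.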

2. Genic indels: Entire genes can be inserted or deleted. A gene originating from another genome might be inserted by a process called LATERAL GENE TRANSFER (LGT). A gene may originate from within an organism’s own genome via the process of gene duplication. When gene indels are compared among different lineages the variation is often referred to as gene PRESENCE-ABSENCE POLYMORPHISM.

Presence-absence polymorphism is an important area of research; e.g., it is critical to understanding the evolution of bacterial pathogenicity. Specific sets of genes are often involved in conferring the capabilities for pathogenicity. When such sets of genes are located together in a genome they are called PATHOGENICITY ISLANDS, and they are often present in a genome as the result of an LGT event.

3. Chromosomal rearrangements: CHROMOSOMAL MUTATIONS include any change from the normal number and condition of the chromosomes. Entire segments of chromosomes can be DELETED, DUPLICATED, INVERTED, or TRANSLOCATED. The figure to the right illustrates some broad classes of chromosomal rearrangement. Although the examples in this figure are based on eukaryotic chromosomes, chromosomal mutations occur in non-eukaryotic genomes as well.

As long as the breaks in the DNA do not fall within a gene, chromosomal mutations do not directly alter the gene. However, there is a possible contextual effect on the phenotype of a gene known as the POSITION EFFECT. In such cases the chromosomal context (i.e., what genes are found near it) defines the expression of a gene, and a chromosomal mutation that affects the location of such a gene results in a change of its regulation and expression, and ultimately the phenotype of the organism. A classic example of position effect is an eye colour gene in Drosophila. An inversion moved this particular gene from its standard position at the end of the X chromosome to an internal position near the centromere. This relocation leads to males with mottled red and white eye colour rather than the normal red colour.

Figure courtesy of the National Health Museum.

4. Genomic scale mutations: Finally, complete chromosomes, or complete sets of chromosomes, can be lost or duplicated. POLYPLOIDY is used to refer to the condition of having more than two haploid (n) sets of chromosomes; i.e., the chromosome number is some multiple of n larger than the 2n content of a diploid. Polyploidy is very common in plants, especially in angiosperms. Some estimates suggest as many as 70% of today’s angiosperms are polyploids (the range of estimates is 30 – 70%).

When the extra sets of chromosomes originate from within the species, the condition is AUTOPOLYPLOIDY. When the extra sets of chromosomes originate from a different species (usually very closely related), the condition is ALLOPOLYPLOIDY.

ANEUPLOIDY is a change in the number of copies of a whole chromosome. A familiar form of aneuploidy is trisomy 21 in humans, a condition brought on by having 3 copies of chromosome 21. While aneuploidy is rare in vertebrates, as it is very often lethal, it is tolerated by plants, especially those that are polyploids.

The phenotypic effect of mutation

The process of mutation is random; there is no direction with respect to the beneficial or deleterious effects of change. Remember, only the action of natural selection can impose a direction on the course of evolution, and the process which gives rise to the raw material of natural selection (mutation and recombination) is generally undirected.

Logically, because mutation is a random process it is thought to be deleterious in the overwhelming majority of cases when it occurs within a gene. A nice analogy can be made between the information content of a gene and the information contained in some part of a blueprint for a building. Random changes of the lines in even a small part of a blueprint for a complex structure like a building are expected to be highly detrimental, perhaps threatening the structural integrity of the entire building. Similarly, random changes to the blueprint for a complex structure such as an enzyme are likely to be destructive. In 1930, R.A. Fisher was the first to make such logical arguments about the fitness effects of random changes to hereditary material.

Although exceedingly rare, beneficial mutations must occur from time to time. Fisher argued further that mutations with very slight effects were more likely to be beneficial, as they would contribute to a process of “fine tuning”. He reasoned that only such fine tuning would be possible in systems of such complexity as the three dimensional structure of an enzyme. This is a view highly compatible with the Neo-Darwinian view that adaptive evolution proceeds slowly, through a long series of mutations with small but beneficial effects. Not surprisingly, proponents of this view tend to focus on the process of point mutation.

Alternative views of adaptive evolution suggest that the process is episodic rather than gradual, via bursts of change or changes with large effect. Proponents of this view tend to focus on larger scale mutational processes such as LGT events, chromosomal rearrangements, or polyploidization events. Despite the different viewpoints, all agree that mutation is a random process and the vast majority of mutations that occur within genes are functionally deleterious.

Some mutations can have no detectable effects on phenotype. Examples include nucleotide mutations in non-coding and pseudogene sequences; and inversions where the chromosomal breaks occur in non-transcribed regions.

Note that not all mutations will have an impact on phenotype; consider synonymous substitutions, and substitutions within pseudogenes. [PSEUDOGENES are DNA sequences derived from functional genes that have been rendered non-functional, and consequently evolve free from natural selection.]

Somatic versus germline mutation:

In organisms with a strict separation of germline cells and somatic cells, somatic mutations are never transmitted to the offspring. This does not mean that somatic mutations do not have an effect on the organism’s fitness; the impact of somatic mutation on fitness is illustrated by cancer-causing mutations that result in uncontrolled cell division (Don’t forget your sunscreen!). However, since we are interested in the evolution of molecules, we are concerned with the process and rate of germline mutation. In organisms such as multicellular protists, fungi, and plants, germ cells are not strictly separated from somatic cell lines. In those organisms, germ cells develop from somatic cells; hence, somatic mutations can be transmitted to offspring.

Some questions important to the evolution of genes and genomes:

FIVE KEY QUESTIONS ABOUT MUTATION:

What is the natural rate of mutation?

What are the effects of mutation on fitness?

Is mutation rate itself under genetic control?

Is the mutation rate subject to natural selection?

Is evolution ever limited by the availability of new mutations?

What is the natural rate of mutation?

The first thing to note is that there are different ways to measure rates of mutation, and you should identify the method when you read the literature on this topic and attempt to compare rates. Different measures of mutation rate include:

1. The rate of nucleotide substitution at sites believed to be free from natural selection pressure (neutral).

2. The rate at which new mutations occur at a gene (or per genome) per generation.

3. The rate of accumulation of lethal or deleterious mutations on a chromosome.

4. The rate at which new phenotypic variance is generated by mutation.

All methods of measuring mutation rates fall into two categories: (i) direct measurement, and (ii) indirect measurements.

DIRECT APPROACHES: These methods attempt to quantify the number of new mutations that occur within the time frame of the study. A new mutation can be identified by a visible effect on phenotype in a laboratory population, or in an individual in a pedigree. Because mutation is a very slow process, a very large number of genes or genomes must be observed; this is difficult for organisms other than microorganisms, but it has been done for fruitflies and mice. Note that such rates are per locus (or per set of loci) per generation. Mutations can also be surveyed by studying genealogies and looking for the appearance of a dominant mutation. Some examples are presented below:

• Microorganisms: A large number of colonies can be screened for mutational events that can be easily detected, such as changes in colony morphology or metabolic function. Estimates per base-pair are quite variable (10⁻¹ to 10⁻¹⁰).

• Mice: The method in mice is to use multiple loci. Females that are homozygous for seven recessive mutations are crossed with homozygous wild-type males. If one is willing to assume that the F1 generation should be wild-type, then those progeny that are not wild-type must carry a new mutation from the wild-type to the recessive allele. These are large scale experiments with upward of 500,000 progeny examined. The rate estimated from such studies in mice is 8 × 10⁻⁴.

• Fruitflies: Here molecular techniques are sometimes used to detect mutations [more later], with estimates of ~4 × 10⁻⁶.

• Human genealogies: An example is the rate of spontaneous appearance of autosomal dominant diseases such as achondroplastic dwarfism. The rate estimated from such studies in humans is ~10⁻⁵.

Although estimates vary greatly, a typical figure is about 10⁻⁶.

INDIRECT APPROACHES: These methods measure the number of substitutions in lineages that have diverged from a common ancestor. Thus these methods measure rates since the time of the common ancestor. If the date of the common ancestor is known then the time, in years, must be converted to time in generations; often this involves assuming that current generation times can be precisely measured, and that such estimates reflect the historical average. The number of substitutions between lineages will be influenced by the action of natural selection, and the effective population size. We will treat these issues in detail later in this course.

Substitution rate is not the same as mutation rate when the mutations are subject to natural selection pressure.

[Figure: New mutations → natural selection and genetic drift act as a “sieve” → Fixation in a population.]

The substitution rate can be used as an indirect measure of the mutation rate if the mutations are selectively neutral.

These methods involve (i) pairwise comparison between lineages or reconstructing substitutions on a phylogeny; (ii) defining which sites are neutral [i.e., pass through the selective sieve unaffected]; and (iii) applying a model-based correction for multiple substitutions at a site [see box]. Although this might seem like more work than direct measurement, it is actually much easier to do. The problem is that the impact of any errors linked to your assumptions (generation times, model of evolution, and neutrality) is difficult to quantify.
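Steps (i) and (iii) can be sketched for the simplest case of a pairwise comparison. The notes do not name a particular model here; the example below assumes the Jukes-Cantor (JC69) correction, d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites, purely as an illustration. The divergence time, generation time, and value of p are hypothetical inputs, and the function names are made up for this sketch.

```python
import math


def jc69_distance(p):
    """Expected substitutions per site given an observed proportion p of
    differing sites, under the Jukes-Cantor model (an assumed model)."""
    if not 0 <= p < 0.75:
        raise ValueError("JC69 is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


def rate_per_site_per_generation(d, divergence_years, generation_years):
    """Convert a pairwise distance into a per-generation rate.

    The two lineages have each evolved for the full time since the common
    ancestor, so the total elapsed time is twice the divergence time.
    """
    generations = divergence_years / generation_years
    return d / (2 * generations)


# Hypothetical inputs: 5% of neutral sites differ between two lineages that
# split 6 million years ago, with an assumed 20-year generation time.
p_observed = 0.05
d = jc69_distance(p_observed)
rate = rate_per_site_per_generation(d, divergence_years=6e6, generation_years=20)

print(f"observed p = {p_observed}, corrected d = {d:.4f}")
print(f"rate = {rate:.2e} substitutions per site per generation")
```

The correction is small when p is small and grows rapidly as p approaches saturation, which is why uncorrected counts underestimate the rate most severely for highly diverged lineages. The result is only as good as the assumed generation time and the assumption that the chosen sites are neutral.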

Rooted phylogenies and “ancestral character-state reconstruction” can be used to indirectly infer the number and direction of substitutions.

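[Figure: Ancestral character states in two cases. Both cases show rooted phylogenies with the same tip sequences (ACG TAC TAA and ACG TAT TAA), which differ at one site (C versus T). Case 1: short branches. Case 2: long branches, along which multiple substitutions may have occurred at the same site (e.g., C → A → G → C, or T → A → G → T), so that the ancestral states, marked “?”, are uncertain.]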

• In case 1, simply counting the number of changes as inferred under parsimony might work, but the divergence (branch lengths) must be low.

• In case 2, the divergence is so large (branch lengths so long) that there are at least two sources of error: (i) the uncertainty of the ancestral reconstruction; and (ii) how many substitutions actually occurred along a branch.

• In case 2, model based methods that provide a correction for multiple substitutions at one site along a branch will provide better estimates of the rate.

DISCREPANCIES BETWEEN METHODS: Discrepancies between methods can be large and can be in either direction. The problems with the assumptions involved in the indirect methods were listed above. The power to detect mutations in the lab might be low in some cases. Also, the particular environment in the lab could either facilitate or reduce the process of mutation as compared to natural populations (e.g., mutational spectra differ under aerobic and anaerobic conditions).

A very rough example from humans: Single gene estimates of mutation rates in human pseudogenes provide us with an estimated mutation rate of 10⁻⁸ per base pair per generation (Kimura 1983). With 6.4 × 10⁹ base pairs in the diploid human genome we obtain an estimate of 64 new mutations per generation per zygote. This number seems far too high if all mutations are strongly deleterious. If this result is correct, a large fraction of such mutations must be neutral. The question remains: what is the effect of mutation on fitness?
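The arithmetic behind this estimate is a single multiplication. The sketch below also adds one step that is not in the notes: if the number of new mutations per zygote is assumed to follow a Poisson distribution with this mean, the chance that a zygote carries no new mutations at all is exp(-64), which is effectively zero.

```python
import math

# Values quoted in the text above.
rate_per_bp_per_generation = 1e-8
diploid_genome_bp = 6.4e9

expected_new_mutations = rate_per_bp_per_generation * diploid_genome_bp

# Assumption (not from the notes): mutation counts per zygote are Poisson.
prob_no_new_mutations = math.exp(-expected_new_mutations)

print(f"expected new mutations per zygote: {expected_new_mutations:.0f}")
print(f"P(no new mutations) under a Poisson assumption: {prob_no_new_mutations:.1e}")
```

Under these assumptions essentially every individual carries dozens of new mutations, which is why most of them cannot be strongly deleterious.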

PARSIMONY: a principle that suggests the simplest hypothesis is preferred when all other factors are weighted equally. In the case above, parsimony suggests that the “best” reconstruction is the one that requires the fewest nucleotide substitutions over the branches of the phylogenetic tree. Other methods, such as maximum likelihood [more on this topic in later lectures], can be used to reconstruct the ancestral character states and infer the number of changes over the branches of the phylogenetic tree.

What are the effects of mutation on the fitness of an organism?

We can now move beyond logical arguments about the fitness effects of mutation, and conduct experiments in which we can estimate the rate of deleterious mutation and measure the fitness consequences of such mutations. Such experiments are called MUTATION ACCUMULATION (MA) experiments.

The procedure was developed by Terumi Mukai (1964). The idea of an MA experiment is to accumulate mutations in regions of a chromosome that are experimentally manipulated so that they are “shielded” from natural selection by being kept heterozygous. Once the experiment has run long enough to “collect” many mutations, the effect of those mutations on fitness is revealed by making them homozygous.

This sort of experiment was possible because of BALANCER CHROMOSOMES in Drosophila. The relevant features of a balancer chromosome are: (i) it contains multiple inversions, thereby suppressing recombination; (ii) it contains recessive mutations that are lethal in the homozygous state; and (iii) it carries a dominant phenotypic marker allele that makes it easy to identify balancer heterozygotes.

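[Figure: Breeding with a balancer to shelter a chromosome from recombination. A mutation arises on the sheltered chromosome of a balancer heterozygote (balancer homozygotes are lethal); in each generation a heterozygous offspring carrying the balancer marker is selected.]

Modified from St Johnston (2002) Nature Reviews Genetics 3: 176-188.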

By using a balancer chromosome, mutations can accumulate on a wild type (+) chromosome in the context that they occurred, as they cannot be shuffled via recombination. With the proper breeding system such mutations are (i) “trapped” on the wild-type chromosome; and (ii) sheltered from natural selection if the breeding system always maintains the sheltered chromosome in a heterozygous configuration.

MUKAI’S EXPERIMENT:

1. Establish a set of 101 genetically identical lines of flies (starting with a wild-type chromosome 2).

2. Cross each generation to a reference stock bearing a balancer for chromosome 2.
   a. A single male is used as the parent in each generation.
   b. Select the balancer heterozygotes based on the marker phenotype.
   c. Deleterious recessives on chromosome 2 are sheltered from selection because (i) they are only allowed to exist as heterozygotes AND (ii) the flies are grown in optimal conditions.

3. Lines are maintained independently for 60 generations, and mutations are allowed to accumulate.
   a. Independence allows divergence of the lines in numbers and effects of mutations.
   b. Allows measurement of the accumulation of variance among lines in a fitness trait over time.

4. Measure fitness at regular intervals in the study.
   a. Breed the sheltered chromosome to yield the homozygous configuration (het × het).
   b. Viability index: deviation from the 2:1 ratio [see figure below].

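[Figure: The viability index measures depression in fitness as a deviation from the 2:1 ratio. A het × het cross yields offspring in a 1 : 2 : 1 ratio: 1 balancer homozygote (dies as a larva) : 2 viable heterozygotes : 1 homozygote for the sheltered chromosome (recessives could influence fitness). No fitness effects: heterozygotes and homozygotes survive in a 2:1 ratio. Fitness effects: fewer homozygotes than the 2:1 expectation.]

Modified from St Johnston (2002) Nature Reviews Genetics 3: 176-188.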
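The logic of the viability index can be sketched as a simple ratio. The definition below is an illustrative assumption, not necessarily the exact index used in the published analyses: it scales the count of surviving homozygotes by the count of surviving balancer heterozygotes so that the 2:1 expectation gives a value of 1. The offspring counts are hypothetical.

```python
def viability_index(n_homozygotes, n_heterozygotes):
    """Relative viability of homozygotes for the sheltered chromosome.

    Equals 1.0 when heterozygotes and homozygotes survive in the expected
    2:1 ratio, and drops below 1.0 when accumulated recessive mutations
    reduce homozygote survival. (Illustrative definition only.)
    """
    if n_heterozygotes <= 0:
        raise ValueError("need at least one surviving heterozygote")
    return 2 * n_homozygotes / n_heterozygotes


# Hypothetical counts of surviving offspring from het x het crosses.
print(viability_index(100, 200))  # 1.0  (no fitness effects: exactly 2:1)
print(viability_index(85, 200))   # 0.85 (fewer homozygotes than expected)
```

Because the balancer heterozygotes act as an internal control raised in the same vial, the index measures the effect of the sheltered chromosome rather than that of the rearing conditions.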

RESULTS OF MUKAI’S EXPERIMENTS (MUKAI 1964; MUKAI ET AL. 1972): Mukai recorded an astonishing rate of decline in viability: about a 15% reduction in just 40 generations (see figure below). Extrapolated to the whole Drosophila genome, and excluding dominant deleterious and lethal alleles, the rate of fitness erosion is about 1% per haploid genome per generation. This is huge (what are the implications for the “mendelians”?). Clearly mutation is a force for destruction of a genome, and natural selection represents an important mechanism of defence against this mutational meltdown.

The variance in fitness among the independent lines increased gradually over the course of the experiment, while the mean fitness declined steadily. This result is consistent with the number of mutations being a random variable, with different lines accumulating different numbers of mutations.

Mean and variance of the viability index over the course of the Mukai et al. (1972) experiment.

NOTE: Indices were standardized to a value of 1 in generation 0

Adapted from Mukai et al. (1972) Genetics 72:335-355.
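The pattern described above (declining mean, increasing variance among lines) is what a very simple model predicts. The sketch below assumes, purely for illustration, that each line receives a Poisson number of mutations per generation with mean U and that every mutation reduces the viability index by the same amount s. Under that assumption the mean declines by U·s per generation and the among-line variance grows by U·s² per generation, and these two slopes can be inverted in the manner of the Bateman-Mukai approach, which the notes do not describe. The parameter values are hypothetical, chosen only so that the mean falls by 15% in 40 generations.

```python
def expected_mean_and_variance(U, s, generations):
    """Expected mean and among-line variance of a fitness index that starts
    at 1.0, with Poisson(U) mutations per generation, each of effect s."""
    mean = 1.0 - U * s * generations
    variance = U * s ** 2 * generations
    return mean, variance


def bateman_mukai(delta_mean, delta_variance):
    """Invert per-generation slopes into (U, s).

    With unequal mutational effects these become a lower bound on U and an
    upper bound on the average effect s.
    """
    U_min = delta_mean ** 2 / delta_variance
    s_max = delta_variance / delta_mean
    return U_min, s_max


U, s = 0.15, 0.025  # hypothetical values for illustration
mean40, var40 = expected_mean_and_variance(U, s, generations=40)
print(f"after 40 generations: mean = {mean40:.3f}, variance = {var40:.5f}")

# Recover the parameters from the per-generation slopes.
print(bateman_mukai(delta_mean=U * s, delta_variance=U * s ** 2))
```

The key point is that the mean alone cannot separate many mutations of small effect from few mutations of large effect; the variance among independent lines supplies the second equation.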

These experiments lead us to the important concept of MUTATIONAL LOAD. Mutational load is the reduction in fitness (via death, failure to reproduce, or reduced reproductive success) incurred by the presence of harmful mutations. Mutational load is just one part of GENETIC LOAD, the reduction of fitness that is the sum of the effects of (i) mutational load; (ii) segregational load; and (iii) substitutional load.

Is mutation rate itself under genetic control?

MUTATOR GENES: There are genes whose function is related to the process of DNA repair, and their function influences the accuracy of replication. If such genes are mutated, a Mutator phenotype is conferred upon the cell and the repair genes are then known as Mutator genes. Mutator alleles elevate the genomic mutation rate; bacteria carrying them have increased mutation rates. Because genetic variation in mutator genes can alter the mutation rate, the mutation rate could be a phenotype of a genome subject to the action of natural selection! A classic example of a mutator gene is mutT of E. coli. The mutT gene affects the rate of conversion of A:T base pairs to C:G base pairs. The non-mutant form is normally responsible for lowering the rate of A⇒C and T⇒G transversions. MutT functions by hydrolyzing 8-oxo-dGTP, thereby preventing incorporation of 8-oxodG into DNA during replication; incorporation of 8-oxodG introduces mutations via incorrect base pairing at that site [see slides for a diagram]. Note that all the variants of mutator genes that increase the mutation rate are thought to be defective versions of the gene involved; i.e., no variants have been discovered that decrease the mutation rate.

TRANSPOSABLE ELEMENTS are fragments of DNA that are capable of replicating themselves and moving among locations within their “host” genome. They range in size from 1000 to 10,000 base pairs and encode one or more proteins that function to copy the element to a new location. The activity of the transposable elements is not under the control of the host genome, and during the process of replication they can cause duplications, deletions and rearrangements in the host genome. The rate of mutation resulting from transposon activity is under genetic control, but not the control of the host genome. Transposons come in a variety of types; the transposition process for Tn5 is shown below.

[Figure: Diagram of “simple transposition” of a segment of DNA catalyzed by the Tn5 transposase. Labels: Tn5 transposon DNA; donor DNA; transposase enzyme; cleavage; target DNA; capture of target; strand transferred to target DNA.]

Note: in “replicative transposition” the sequence element replicates itself from place to place, thus leaving behind a copy at the original site.

Is mutation rate subject to natural selection?

There is now considerable evidence that the general DNA repair and replication processes that affect mutation contain standing genetic variation. As with any other system with a heritable basis, such variation is subject to natural selection. Because mutation is the ultimate source of the variation required for adaptation of other cellular systems to altered environmental conditions, it is interesting to consider the possibility that the genomic rate of mutation is itself subject to natural selection. Hypotheses about the role of natural selection in setting genomic mutation rates fall into the two broad categories presented below.

H1: MUTATION RATES ARE ADJUSTED TO LEVELS THAT PROMOTE ADAPTATION. This hypothesis has some intuitive appeal. The argument is that because mutations are required for the long term capability for adaptive evolution, the rate of beneficial mutation will be of primary importance in setting the mutation rate by natural selection. There are a number of problems with this viewpoint:

1. The vast majority of new mutations are likely to be detrimental to fitness.

2. The input of deleterious mutations might cause selection to favour lower genomic mutation rates.

3. Short term gains in fitness from new mutations would be offset by long term accumulation of deleterious mutations.

4. Natural selection cannot maintain mutation rates in a population for the sake of future adaptive value; it can only act to maintain mutation rates as a consequence of their adaptive value in the present conditions.

Mathematical modelling suggests that there are conditions where the rate of adaptation might set the genomic mutation rate. The models rely on physical linkage of beneficial mutations with mutator genes. Based on a process called “hitchhiking”, selection for the beneficial allele indirectly selects for the linked mutator allele. The process of genetic recombination will disrupt this physical linkage, so these models are only expected to apply to clonally reproducing lineages with low, or no, recombination such as bacteria.

[Figure: “Hitchhiking” of a mutator gene with and without recombination. Panels: no recombination; recombination. Symbols: mutator allele that increases the mutation rate; beneficial allele subject to strong positive selection.]

Adapted from Sniegowski et al. (2000) BioEssays 22:1057-1066.
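The effect of recombination on hitchhiking can be illustrated with a toy deterministic model. The sketch below is not one of the published models referred to above: it assumes a haploid population with two loci, a mutator locus (M or m) that is itself neutral and a selected locus (B beneficial, b ancestral), with a beneficial allele that first appears on a mutator background. All parameter values are hypothetical, and the deleterious load carried by the mutator is deliberately ignored.

```python
def mutator_frequency(r, s=0.1, generations=500,
                      start=(0.001, 0.009, 0.0, 0.99)):
    """Final mutator-allele frequency in a haploid two-locus toy model.

    start gives the initial frequencies of haplotypes (MB, Mb, mB, mb);
    r is the recombination fraction between the two loci; s is the
    selective advantage of the beneficial allele B.
    """
    MB, Mb, mB, mb = start
    for _ in range(generations):
        # Selection: haplotypes carrying B have relative fitness 1 + s.
        mean_fitness = (MB + mB) * (1 + s) + (Mb + mb)
        MB, mB = MB * (1 + s) / mean_fitness, mB * (1 + s) / mean_fitness
        Mb, mb = Mb / mean_fitness, mb / mean_fitness
        # Recombination erodes the association (linkage disequilibrium D).
        D = MB * mb - Mb * mB
        MB, mb = MB - r * D, mb - r * D
        Mb, mB = Mb + r * D, mB + r * D
    return MB + Mb


print(f"no recombination (r = 0.0):   {mutator_frequency(0.0):.3f}")
print(f"free recombination (r = 0.5): {mutator_frequency(0.5):.3f}")
```

Without recombination the mutator allele is carried to high frequency along with the beneficial allele it is linked to; with free recombination the beneficial allele quickly spreads onto non-mutator backgrounds and the mutator barely changes from its starting frequency of 1%.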

How do such models fit real data? Mutator phenotypes seem to be the exception rather than the rule in natural populations, suggesting that this type of natural selection is not operating. Note that experimental populations seeded with cells bearing mutator alleles do evolve higher mutation rates. However, such experimental populations of bacteria have so far revealed that mutator phenotypes, when they evolve, do not increase the rate of adaptation.

Why aren’t mutator phenotypes selected? The best answer seems to be that the negative effects of accumulating deleterious mutations over a long period of time far outweigh the short period of time where having a high rate of beneficial mutation is advantageous.

Note that in mammals the germline mutation rate is lower than the somatic cell mutation rate. Using mice as a model, the mutation rate is estimated to be about 1.7 × 10⁻⁵ in a variety of tissue types, whereas the rate is about 0.6 × 10⁻⁵ in sperm-line cells; about a 3 fold difference. For a variety of reasons, this difference is expected to be an underestimate of the actual rate difference. This suggests that evolutionary pressures act to lower the mutation rate rather than elevate it.

H2: MUTATION RATES ARE ADJUSTED TO MINIMIZE THE COST OF FIDELITY. If one is not willing to allow natural selection for an optimal rate of beneficial mutations to set the genomic rate, then one encounters another problem. This problem was first pointed out by A. H. Sturtevant (1937) when he asked the question “Why does the mutation rate not evolve to zero?” Motoo Kimura took up this question in 1967 and presented two possible answers: (1) reductions beyond observed rates are impossible because of physicochemical constraints (no role for natural selection here); and (2) the physiological costs of further reductions are so high that they impose prohibitively high fitness costs on individuals.

There are two interesting observations that shed light on the “cost of fidelity” hypothesis.

1. Remarkable uniformity of the per-genome mutation rate. Although the per-site mutation rates are highly variable among organisms, the rate normalized for genome size is strikingly uniform (see Figure below). This observation hints at the operation of a similar selective force in a wide variety of taxa.

Figure: Remarkable conservation of genomic mutation rate among DNA-based microorganisms. Labels in the figure: RNA viruses; higher eukaryotes; genomic mutation rate; base-pair mutation rate; physiological limit (?). Axis: mutation rate. Adapted from Sniegowski et al. (2000) BioEssays 22:1057-1066.

Variation among genomic mutation rates is less than one order of magnitude among a wide variety of microorganisms; yet, the per base-pair mutation rates vary among the same lineages by almost four orders of magnitude! There are two explanations for this apparent lower boundary. This so-called minimal genomic mutation rate could reflect a point beyond which further reductions entail individual fitness costs because of excessive energetic requirements of DNA replication, proofreading and repair. Alternatively, the lower bound could represent the optimal trade-off between a reduced rate of deleterious mutations and the failure to produce an adequate amount of beneficial mutations as the environment changes over long-term evolutionary time scales. I have argued earlier that this second explanation is unlikely. It is also unlikely that natural selection could set such similar rates in widely different and independent evolutionary lineages.

2. An experimental study in Drosophila exposed populations to different levels of X-ray irradiation for up to 600 generations. Over time there was a decrease in the rate of X-ray-induced mutation, suggesting that repair systems had evolved to compensate for the additional influx of mutations. The remarkable finding was that when the irradiation was stopped, the per-base mutation rates returned to the wild-type rates. This finding strongly suggests that the level of physiological investment in mutational repair is set by a trade-off between the fitness cost of accumulating deleterious mutations and the fitness cost of repairing them. In other words, the cost of maintaining the extra repair systems was too great once the source of the extra mutation had been removed from the system.
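Returning to the first observation, the arithmetic that links the two kinds of rate is simply: genomic rate = per-base-pair rate × genome size. The sketch below illustrates how per-base-pair rates that differ by almost four orders of magnitude can still give genomic rates within one order of magnitude. The three (rate, genome size) pairs and all function names are hypothetical, chosen only to show the calculation; they are not values read from the figure.

```python
import math

# HYPOTHETICAL (per-base-pair rate, genome size in bp) pairs for illustration only.
ILLUSTRATIVE_GENOMES = {
    "small genome": (7.0e-7, 6.0e3),
    "medium genome": (5.0e-10, 5.0e6),
    "large genome": (9.0e-11, 4.0e7),
}


def genomic_rate(per_bp_rate, genome_size_bp):
    """Expected number of mutations per genome per replication."""
    return per_bp_rate * genome_size_bp


def orders_of_magnitude(values):
    """log10 of the ratio between the largest and the smallest value."""
    return math.log10(max(values) / min(values))


per_bp_rates = [rate for rate, _ in ILLUSTRATIVE_GENOMES.values()]
per_genome_rates = [genomic_rate(rate, size)
                    for rate, size in ILLUSTRATIVE_GENOMES.values()]

print("spread of per-base-pair rates (orders of magnitude):",
      round(orders_of_magnitude(per_bp_rates), 2))
print("spread of per-genome rates (orders of magnitude):",
      round(orders_of_magnitude(per_genome_rates), 2))
```

Because small genomes pair with high per-site rates and large genomes with low ones, the product varies far less than either factor does.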

NATURAL SELECTION FOR HIGH MUTATION RATES WITHOUT INCURRING AN EXCESSIVE INCREASE IN MUTATIONAL LOAD. The consensus opinion disfavours the notion that genomic mutation rates are set to maximize the rate of beneficial mutations. However, there are at least two scenarios where natural selection is thought to possibly play a role in setting mutation rates according to the rate of beneficial mutation.

1. Elevating mutation rates at certain loci. These loci are called CONTINGENCY LOCI, and are most closely associated with pathogens. The idea is that the existence of hypermutable loci is valuable in cases where the most important fitness effect is the ability to evade a host immune system. In such cases, simple variability will be more valuable than conservation of function. In this way the accumulation of deleterious mutations is confined to a relatively small portion of the genome, and there is no need for a genome-wide adjustment to the physicochemically costly mechanisms that control the rate of mutation.

2. Restricting elevation of genomic rates to times when further growth and is unlikely without an increase in new mutations. This notion has acquired the name “”. The idea is that if a mechanism could be evolved that allowed an increase in mutation rates when they were most needed, say during times of stress, then the long-term costs of excess mutational load would be avoided. The observation that some bacterial cultures exhibited elevated mutation rates under starvation conditions appears to support this hypothesis. Unfortunately, there has been little in the way of rigorous modelling of this as an evolutionary process, and the direct effects of physiological stress on mutation rates (as opposed to specific cellular mechanisms evolved under natural selection) have not been ruled out for these examples. This notion remains controversial.

Is evolution ever limited by the availability of new mutations?

We can observe rates of organismal evolution in populations and in the fossil record, and we can infer rates of mutation from molecular sequences. An important question is whether the rate of evolution is ever limited by the mutation rate, as opposed to limits imposed by the rate of environmental change. Phenotypic changes observed in the fossil record are so gradual that they can easily be explained in terms of the standing variation within a population. However, experimental work suggests that the answer to the above question could be yes.

Work by Trudy Mackay on transposable elements in fruitflies (called P elements) directly addresses this question. Mackay knew that the activity of transposable elements has a mutagenic effect on the genome. She used this to her advantage and designed an experiment that compared populations derived from dysgenic crosses with populations derived from non-dysgenic crosses. First see the figure below for a summary of HYBRID DYSGENESIS.

Figure: Summary of hybrid dysgenesis (M = no P elements; P = P elements in genome). Matings within M and P strains produce normal progeny. Certain crosses between M and P strains produce hybrid dysgenesis (abnormalities, reduced fertility). Dysgenic effects are asymmetric.

P strains have between 30 and 50 copies of a transposable element called a P ELEMENT in their genomes. M strains have no P elements.

The P elements have a highly mutagenic effect on their host genome when active. They are only active in germ-line cells.

Crosses among P strains produce normal offspring because the parents carry a REPRESSOR PROTEIN that suppresses the activity of the P elements.

The M (male) x P (female) crosses produce normal offspring because the repressor is present in the cytoplasm of the egg.

In the case of the other cross, the M (female) does not have any P elements; hence there is no repressor in the cytoplasm. The P elements carried in the sperm of P males are free to wreak havoc on the genome.

The repressor protein is encoded in the fourth exon of the P element.

The rationale for Mackay’s experimental procedure was simple. Movement of P elements causes mutations within the Drosophila genome. By chance some of these will be at loci that control quantitative characters. Dysgenic crosses will have higher P element activity than non-dysgenic crosses; hence the dysgenic crosses will have higher levels of standing genetic variation. If selection is limited by the amount of standing genetic variation, an accelerated response to ARTIFICIAL SELECTION (selecting individuals with extreme phenotypes for breeding) would be evident in the dysgenic hybrids as compared with non-dysgenic hybrids.

MACKAY’S EXPERIMENT:

1. Establish populations derived from dysgenic and non-dysgenic crosses

2. Apply artificial selection for both high and low bristle numbers (divergent selection)

3. Carry out artificial selection for 16 generations by choosing the 10 most extreme phenotypes in a generation to be the parents of the next generation.

4. Conduct assays of the effect of this variation on fitness (we will not cover this result in this lecture)
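The logic of steps 1 to 3 can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not a reconstruction of the actual experiment: the trait is treated as purely additive with normally distributed effects, each generation has 100 scored individuals (an assumed number), the sexes are not distinguished, and P-element activity is represented crudely as extra mutational variance added to offspring during the first 10 generations. All function and parameter names are hypothetical.

```python
import random
import statistics


def select_line(direction, extra_mut_sd, rng, n_generations=16,
                pop_size=100, n_parents=10, active_generations=10):
    """Return the mean phenotype per generation for one selected line.

    direction: +1 selects for high scores, -1 for low scores.
    extra_mut_sd: SD of the extra mutational effect added to each offspring
        while transposition is assumed active (0.0 for a non-dysgenic line).
    """
    genetic = [rng.gauss(0.0, 1.0) for _ in range(pop_size)]
    means = []
    for generation in range(n_generations + 1):
        # Phenotype = genetic value + environmental noise (assumed SD of 1).
        phenotypes = [g + rng.gauss(0.0, 1.0) for g in genetic]
        means.append(statistics.fmean(phenotypes))
        if generation == n_generations:
            break
        # Truncation selection: the n_parents most extreme individuals breed.
        ranked = sorted(range(pop_size),
                        key=lambda i: direction * phenotypes[i], reverse=True)
        parents = [genetic[i] for i in ranked[:n_parents]]
        mut_sd = extra_mut_sd if generation < active_generations else 0.0
        genetic = []
        for _ in range(pop_size):
            mother, father = rng.sample(parents, 2)
            # Offspring = midparent value + segregation noise (assumed SD 0.5).
            offspring = (mother + father) / 2.0 + rng.gauss(0.0, 0.5)
            if mut_sd > 0.0:
                offspring += rng.gauss(0.0, mut_sd)
            genetic.append(offspring)
    return means


def divergence(extra_mut_sd, rng):
    """Final difference between the high and the low line (divergent selection)."""
    high = select_line(+1, extra_mut_sd, rng)
    low = select_line(-1, extra_mut_sd, rng)
    return high[-1] - low[-1]


rng = random.Random(1)
non_dysgenic_divergence = divergence(0.0, rng)
dysgenic_divergence = divergence(1.0, rng)
print("non-dysgenic divergence:", round(non_dysgenic_divergence, 2))
print("dysgenic divergence:", round(dysgenic_divergence, 2))
```

In this toy model the lines that receive extra mutational variance diverge further, which is the qualitative prediction being tested. The size of the difference depends entirely on the assumed parameter values.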

Figure: Generational means for abdominal bristle score in dysgenic and non-dysgenic lines of Drosophila subjected to artificial selection. Panels: non-dysgenic fruitflies; dysgenic fruitflies. Adapted from Mackay (1985) Genetics 111:351-374.

There was a clear response to this selection pressure, with the average bristle score increasing or decreasing in the population under artificial selection. The response to artificial selection in the dysgenic lines was twice that in the non-dysgenic lines.

The initial crosses instigate transposition activity that is expected to persist for about 10 generations. It is believed that the cytotype required for stability will have been established after 10 generations (see dysgenic plot above). It seems that if more standing genetic variation is available at the outset, and through the beginning of the selection period, we see a faster response to selection.

The general inference has been that evolution can be constrained by the availability of new mutations.

A little bit about recombination

GENETIC RECOMBINATION is the process by which homologous segments of DNA with different genetic “characters” are “shuffled” or combined to produce new combinations of these genetic characters. This process takes place during meiosis and ensures that the genetic combinations of the offspring differ from those of their parents.

We see a similarity between recombination and migration: they both put existing genetic variation into novel combinations. Remember: without mutation there would be no variation to work with.

1. Crossing-over (reciprocal recombination): This term was introduced by Morgan and Cattell in 1912 to describe the reciprocal exchange of DNA between chromosomes. This is the process by which recombinant offspring arise.

Parental contribution: A B and a b

Resultant offspring: a B and A b

2. Gene conversion (non-reciprocal recombination): This term describes the process whereby there is non-reciprocal recombination between genes. The process results in one gene sequence becoming identical with another. In such a case we say that gene B has been “converted” by gene A; i.e., a one-way gene conversion event has resulted in the sequence of gene B being made identical to the sequence of gene A. Gene conversion is an evolutionary force for homogenization.

Figure: Gene conversion between the same gene: chromosomes A B C D and a b c d become A B C D and a B C d. Gene conversion between different genes: Gene A and Gene B become Gene A and Gene A.

Gene conversion proceeds by the mismatch repair system. Strand exchange during meiosis generates heteroduplex, or mismatched, segments of DNA. The cell does not “know” which is the original, but “knows” it must repair the heteroduplex region, so it “takes a chance”. If a segment of DNA is repaired by using sequences that differ from the original, then the repaired gene will acquire the DNA sequence of the template.

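Figure: Mismatch repair of a heteroduplex region (mismatches marked x), which yields one of two alternative repaired sequences.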
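The difference between the two kinds of recombination can be sketched by treating each chromosome as a string with one letter per locus. This is a minimal illustration, not a model of the molecular mechanism: the function names are hypothetical, and the 50:50 choice of repair template is an assumption made only for the example.

```python
import random


def cross_over(chromosome_1, chromosome_2, crossover_point):
    """Reciprocal recombination: the two products swap all loci after the crossover point."""
    if len(chromosome_1) != len(chromosome_2):
        raise ValueError("homologous chromosomes must carry the same number of loci")
    return (chromosome_1[:crossover_point] + chromosome_2[crossover_point:],
            chromosome_2[:crossover_point] + chromosome_1[crossover_point:])


def gene_conversion(donor, recipient, start, end):
    """Non-reciprocal recombination: the recipient copies donor[start:end]; the donor is unchanged."""
    if len(donor) != len(recipient):
        raise ValueError("donor and recipient must carry the same number of loci")
    return donor, recipient[:start] + donor[start:end] + recipient[end:]


def repair_heteroduplex(strand_1, strand_2, rng):
    """Choose the repair template at random (ASSUMED 50:50; the cell 'takes a chance')."""
    return rng.choice([strand_1, strand_2])


print(cross_over("AB", "ab", crossover_point=1))   # ('Ab', 'aB')
print(gene_conversion("ABCD", "abcd", 1, 3))       # ('ABCD', 'aBCd')
print(repair_heteroduplex("b", "B", random.Random(0)))
```

Crossing-over changes both products, whereas gene conversion changes only the recipient, which is why repeated conversion tends to homogenize sequences.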

Mutation and recombination

Mutation and recombination are not always easily distinguishable. Some of the cellular machinery involved in genetic recombination is used in the process of DNA repair, thereby contributing to mutation rates. Large-scale rearrangements of DNA segments, such as duplications of genes or of exons, contribute to mutational load and are the direct result of recombination. For our purposes we will make a distinction between the origin of genetic variation (which we call mutation) and its rearrangement into different configurations (which we will call recombination).