<<

Summary of Introduction to Evolutionary v0.7 Gleb Ebert

November 8, 2017

Preface

This document aims to summarize the lecture Introduction to Evolutionary Biology as it was taught in the autumn semester of 2017. The focus lies on concepts and examples from the lecture are only included when deemed necessary. Unfortunately I can’t guarantee that it is complete and free of errors. You can contact me under [email protected] if you have any suggestions for improvement. The newest version of this summary can always be found here: https://n.ethz.ch/~glebert/

1 1 Introduction By the time they meet up again the differences between • the universal genetic code: bases and codons the are too big to interbreed. One example • the small-subunit (SSU) ribisomal RNA genes Definition: means biological change over time for this is the Siberian Greenish Warbler (the hatched Technical basis: Phenotypes of individuals that are en- area is where no interbreeding occurs with two popula- coded by heritable vary in a and tions present). their frequencies change 2 1.1 History Aristotle: Ladder of / perfection Natural selection is the process underlying adaptive evo- Carl von Linn´e: Systematic classification of lution James Hutton & Charles Lyell: Gradual long-term processes shaped () Jean-Baptiste de Lamarck: Inheritance of acquired characteristics (Lamarckian evolution) 2.1 Darwins postulates of evolutionary change Charles : Evolution is descent with modifica- tion and results in survival of the fittest Evolutionary change over time is a deductive implication of four postulates. 1.4 1.2 1) All populations contain variable individuals indirect observation: long time-scales → long-term 2) Variation among individuals is, at least in part, her- → direct observation: small time-scales short-term changes itable changes 3) Some individuals are more successful at surviving Evidence of Macroevolution: and reproducing than others Evidence of Microevolution: 1) Successions & 4) Survival and of individuals are not 1) Observation from natural populations Law of succession: pattern of correspondence be- random; but individuals with the most favorable • Bacterial to antibiotic stress tween fossile and recent forms from the same locale variation given the environment are those better at • Soapberry bug adaptation to fruit Comparative : Georges Cuvier argued surviving. 2) Observation from living anatomy that certain species are extinct. Recent macrofauna These postulates can be tested in real world populations Vestigal and rudimentary traits is only a fraction of all that ever existed (e.g. Darwin’s finches from the Galapagos islands). One • Kiwi wings 2) Transitional forms has to be careful to not misinterpret biasing factors. For • Human coccyx (Steissbein) Darwinian evolution predicts intermediate forms example measures can be skewed by misiden- • Human arrector pili muscle (Haaraufrichter- between a species and its ancestor (e.g. Microrap- tified paternity, misidentified maternity, food quality or Muskel) tor gui and Archaeopteryx between and maternal effects such as egg quality. • ( might be safe house for good gut modern birds bacteria) 3) Homologies (Owen: “the same in different animals under every variety of form and ”) 1.3 can be found through and comparative . The similarity is due to 2.2 Definition Speciation is the process that results in one species split- inheritance from a common ancestor. They are ting into two or more. One example for this are ring phenotypically and genetically defined and enable Natural selection acts on individuals (more specifically, species. They occur when one species spreads slowly the use of model . phenotypes), its consequences occur in populations as al- around a geographical area to which they don’t spread. Some molecular homologies are lele frequency changes. 2.3 Darwinian vs. Lamarckian evolution split into three or more subbranches. This phenomenon is Classical taxonomic groups are not necessarily mono- called polytomy. Branches can have transitions where a phyletic (e.g. prokaryotes, dicots and fish). Different processes proposed for the same pattern. specific characteristic evolved. The first group or species 1. No initial variation Initial variation to split of is called basal. 2. Individuals adapt Selection acts on 3.3 Tree reconstruction Trees come in many forms depending on the author and during their lifetime individuals on the kind of data they aim to display. They can be 3. Inheritance of aquired Inheritance of Pitfalls of trees inferred form phenotypes alone: left-right, top-down or even circular. changes/characters surviving alleles if • Phenotypes are influenced by genotypes and envi- One should not forget that evolutionary trees are merely (IAC) environment leads to ronment hypotheses. adaptation • Only genotypes are heritable ⇒ individuals and ⇒ populations evolve • Phenotypic similarity due to convergence (analogy populations evolve / homoplasy; e.g. camera eye in mollusks and Epigenetic modifications are attached to the genetic code teleosts) and can be passed on to the offspring (up to two gener- Combination of phenotypic and genotypic data (molecu- ations). Thus their evolutionary relevance is short-term. lar markers) leads to better results. When using molec- ular markers each nucleotide is seen as an independent 2.4 Limits character. • Natural selection acts on existing traits • Natural adaptation does not lead to perfection If multiple possible trees exist, one can assume that the • Natural selection is non-random, but it is not pro- most parsimonious (least evolutionary steps) is the cor- gressive rect one. If multiple equally parsimonious trees turn up • Natural selection is blind to the future, but tells us one estimates uncertainty. This is done by bootstrapping tales from the past which is the generation of data sets made up from the original data set. Data points can be repeated multiple times or be absent altogether. When analyzing these 2.5 “Perfection” in nature data sets one gets a for each. The William Paley argued, that the vertebrate eye is too per- most likely tree will be the one that comes out the most fect and complicated to have resulted from natural pro- after the analysis. The certainty of a certain is cesses. Thus it must be a creation of a conscious designer. placed at its node and given in percent. This number is By looking closely at eyes from various chordates one can the percentage of the replicates in which that particular 3.2 Phylogenetic inference see that there is a lot of variation in the complexity of clade appeared and is also called the bootstrap support the eye. Thus Paleys argument is wrong. • Plesiomorphy: A characteristic that is shared be- of the clade. High bootstrap support means that the tween a species and its ancestor clade is a winner across our artificial data set. • Apomorphy: A characteristic that is different from 3 an ancestor Reversals or “back-” can remove synapomor- • phies. 3.1 Evolutionary trees Synapomorphy: Apomorphic and shared between multiple sister taxa. Also called Evolutionary trees aim to display the course of specia- • Autapomorphy: Apomorphic and different from tion over time and the relation between species. They sister taxa 3.4 Answering questions with phylogeny start with one common ancestor (the root) and end with • Monophyletic group: All descendants of one ances- multiple species (tips or terminal nodes). Dichotomies tor (at least two taxa). Also called clade. The following examples illustrate some of the problems (splits of branches into two) in between are called nodes. • Paraphyletic group: A subset of descendants of one that phylogeny can help to solve. If phylogeny can’t be resolved to dichotomies branches ancestor • By looking at when body lice evolved from lice one can estimate when humans started wear- 2) Formal cause baby. Also a replacement generation of the human pop- ing clothes (around 107,000 years ago). 3) Efficient cause: like modern understanding of ulation would average 83 mutations per bp site. • Forensic were able to conclude which pa- cause-effect relation The base substitution rate in various organisms varies by tients got HIV from their dentist and which got it 4) Final cause: the purpose for which a thing several orders of magnitude. else where. exists of an action is done. The future end • In his E. coli long-term something is supposed to serve project (LTEE) was able to show that the rate in his bacteria evolved. 5 Co-option 4 Science history Co-option (or ) is the process by which a struc- ture or system with an original function adds or changes 4.1 Ancient Greece to a new function. • Hippocrates: Importance of good notes and cate- 5.1 Myxococcus xanthus gorization • Socrates: Dialectic inquiry (finding an answer by Myxococcus xanthus is a social soil bacteria. It can se- initiating a group discussion with questions) crete antibiotics and lytic to externally digest • Plato: Ideal spiritual forms of real world things other microbes. These bacteria show social behavior in • Aristotle: Emphasis on empirical things (Scala the form of “wolf pack hunting” and sporulating through Mutation rates can be calculated from Mutation Accu- Natura) fruiting bodies (only a minority of the colony becomes a mulation Lines: spore). 1) maintain 25 replicate lineages of one common an- 4.2 Deduction and Induction Myxococcus shows two types of motility: S-motility cestor drives swarming on soft agar surfaces, while A-motility 2) grow for 500 generations while choosing randomly Induction infers generalizations base on individual in- contributes to swarming on relatively firm agar surfaces. which individual is transferred (no selection); in stances. Reasoning in which the premises of an argument S-motility requires pili and fibrils. If one knocks out pilin control lineages transfer many individuals per gen- are considered to support its conclusion but do not guar- production the bacteria can regain the ability to swarm eration (selection) ⇒ mutations accumulate and re- antee its truth. on firm agar. This is called evolved cooperative motility duce fitness Deduction is the process of deriving consequences of what (ECM). Through various test it was shown, that instead 3) sequence one from ancestor and from is assumed. Given the truth of the assumptions, a valid of redeveloping pili, Myxococcus adapted A-motility to individual taken from each replicate lineage deduction guarantees the truth of the conclusion. regain the ability to make fibrils. 4) average mutation rate per bp per generation

4.3 Aristotle 6 Sources of phenotypic variation total # of mutations • pupil of Plato = ∗ ∗ • big influence on future course of western science 6.1 Genetic mutation # of bps # of generations # of lineages through founding of fields of logic, biology and psy- genetic differences between individuals cause phenotypic chology variation • generated one of the most impressive systems of thought 6.1.1 Mutation rates 6.1.2 Distribution of mutation fitness effects • heavy emphasis on rules of deductive logic • lack of rigorous experimental methods Base-pair matching during synthesis leaves only one un- • repaired mismatch every 100 million times. This results Most mutations are neutral or negative in their effect on 1) Material cause in around 38 new mutations per human gamete or 76 per fitness. A frequency-fitness diagram might look like this: 6.5 New genes By contrasting the number of duplicate genes vs. age of several species Michael Lynch arrived at an estimated New gene copies can arise either by retro-transposition, duplication rate of 0.01 per gene per million years. Con- which results in genes without introns or regulatory re- sidering that most organisms have a lot of genes a big gions in , or by un-equal crossing over during amount of new genes arise every 1 million years. These , which results in new copies with introns and reg- new gene copies can have different fates: ulatory regions. • lost • deactivated and regained as pseudogene • diverge to serve function distinct from parent gene

6.6 Inversions

An inversion is an event when a piece of a chromosome 6.2 Environmental variation is flipped around and refuse the other way around. They can enforce allele linkage due to lack of crossing over. environmental variation in space/time cause an individ- Thus inversions that lock in advantageous allele combi- ual ’s phenotype to vary nations can be maintained by selection.

6.3 x environment variation 6.7 Whole genome duplication Whole genome duplications are most commonly found genetic differences cause distinct individuals to respond in self-fertilizers. They result in a massive new input of to environmental variation differently genetic material that can evolve.

6.4 vocabulary to know 6.8 More impactful mutations

complementary base neutral alleles pairs transition transversion pseudogenes replacement paralogs, paralogous (non-synonymous) substitutions silent site (synonymous) orthologs, orthologous substitutions loss-of-function linkage mutations indels selection coefficient frequency chromosome inversion genome duplication 2 7 Population 7.2 Calculating Genotype and Allele Frequencies More general: (SUM(pn)) = 1 AA Aa aa integrate Darwin’s theory of evolu- 36 48 16 tion by natural selection and the Mendelian laws of inher- f(Genotyp) 100 100 100 36 + 36 + 48 48 + 16 + 16 itance. The field seeks to document and explain changes f(A)= = 0.6; f(a)= = 0.4 in allele, genotype and phenotype frequencies. It offers a 200 200 quantifiable definition of evolution: It happens by allele f(A)+ f(a) = 1 frequency changes in the population. 7.3 Evolution of Populations 7.1 Heritability The forces causing evolution in real populations: 2 Variance (Genotype) 2 σG Heritability = ; h = 2 Variance (Phenotype) σP

7.3.2 Violations of the HWE 1) Allele frequencies change over time 2) Genotype frequencies do not meet expected values

7.3.3 Testing of the HWE 1) Calculate the observed allele frequencies 7.1.1 Selection Experiments 2) Calculate the expected number of each genotype 7.3.1 Hardy-Weinberg Population and Principle under HWE 3) Compare expected and observed numbers The ideal population: 4) Optional but more exact: use a statistical chi- 1) infinite population size square test for significance 2) random mating among individuals 3) no evolution: 7.4 Selection 1) no selection 2) no mutation 3) no migration 4) no chance of evolution () → Does our population evolve? If not: the population is in Hardy-Weinberg Equilibrium. (The HWP is a null model against which we test if a population is evolving or not. Simples case with two alleles (A and a): p2 +2pq +q2 = 1 Binomial equation: (p + q)2 = 1 3) Calculate allele frequencies among gametes after se- 7.6.2 Selection against dominant alleles lection and random mating

′ ′ ′ ′ A inferior dominant allele will be eliminated from the f (A)= p = f (AA) + 0.5 ∗ f (Aa) gene pool relatively quickly.

′ ′ ′ ′ f (a)= q = f (aa) + 0.5 ∗ f (Aa) ′ ′ ∆p = p − p; ∆q = q − q

7.6 Selection can act on different genotypes In frequency-dependent selection, the fitness of a pheno- If a flour beetle population has two alleles (+ being dom- type depends on its frequency in the population: inant and viable while - is recessive and lethal). The • positive FDS: phenotype increases if it becomes genotype fitness will play out as follows: most common in the population w(+/+) w(+/−) w(−/−) • negative FDS: phenotype declines if it becomes 1 1 0 most common in the population One can predict how the frequencies of the alleles will change: 7.4.1 Selection coefficient s Generation +/+ +/- -/- p = q = wpop f(+) f(−) 7.6.3 Selection favoring heterozygotes The larger the selection coefficient s is the faster allele 0 0 1000 0 0.5 0.5 frequencies change. w (relative reproductive suc- 1 (HWE) 250 500 250 0.5 0.5 1 Heterozygote superiority (overdominance) leads to a sta- cess) is inversely correlated to s: 1 (Selection) 250 500 0 0.6 0.33 0.75 ble equilibrium. 2 (Selection) 436 436 0 0.75 0.25 0.89 w = 1 − s

7.5 Calculating changes in genotype and allele 7.6.1 Selection against recessive alleles frequencies due to selection The section against a recessive lethal allele is less and frequency of first allele: p = f(A) = 0.5 less efficient the rarer it becomes. frequency of second allele: q = f(a) = 0.5 fitness of genotypes: w(AA) = 1.0; w(Aa) = 0.7; w(aa) = 0.6

1) Calculate average fitness w for the whole popula- tion before selection:

2 2 w = p ∗ w(AA) + 2pq ∗ w(Aa)+ q ∗ w(AA) 2) Calculate genotype frequencies after selection:

2 ′ p ∗ w(AA) ′ 2pq ∗ w(Aa) f (AA)= ; f (Aa)= w w 2 ′ q ∗ w(aa) f (aa)= w 7.6.4 Selection against heterozygotes • Neutral: more common, fate determined by drift of hitchhiking Heterozygote inferiority (underdominance) leads to a un- • Negative: very common, selected against, level in stable equilibrium. population determined by mutation-selection bal- ance After selection the frequency of different fitness levels be- tween mutation is different from the initial mutations.

7.7.4 Mutation-Selection Balance After a population reaches a fitness peak a balance will be apparent between • the rate at which new deleterious mutations occur and • the rate at which these deleterious mutations are selected against (negative selection)

µ 7.7.2 Induced mutagenesis qˆ = r s 7.7 Mutation Induced mutagenesis is a stress-induced reaction to in- qˆ = equilibrium frequency of mutant allele µ = rate of mutation to mutant allele New mutations can arise by erroneous DNA replication terrupted DNA replication due to UV-irradiation dam- age, polymerase-inhibition by antibiotics, etc. It leads s = coefficient of selection against mutant allele to a transient increase of mutability through error-prone polymerases, increased DNA uptake / recombination or Examples of autosomal recessive alleles: SMA activation of “sleeping” prophages in genome (spinal muscular atrophy) causes loss of muscle control through progressive atrophy (lower extremities first). Loss-of-function mutations lead to gradual of mo- tor neuron cells in parts of spinal chord. A high mutation rate to the loss-of-function allele leads to the maintenance 7.7.3 Mutation rate estimation of SMA at mutation-selection balance. Cystic fibrosis can cause various complications with the Mutations most frequent cause of death being lung problems (at µ = 80%). One in 25 people carries one copy of the reces- Time ∗ Genetic target sive disease allele in CFTR. The CFTR is use by Salmonella typhi to invade gut cells which causes Mutation is the ultimate source of . typhoid fever. The invasion can only happen in the Over time more and more alleles are added to a gene wild type CFTR but not with the cystic fibrosis causing 7.7.1 Mutation fitness effects pool. In the short run the effect may be weak, but it can mutation. Meaning that there is a trade-off in CFTR • Positive: rare, selected fro increase slowly cause substantial change in the long-term. between protection against causal agent of typhoid fever and causing cystic fibrosis in homozygous carriers. 7.9 Migration 7.12.1 Linkage equilibrium The frequency of any haplotype can be calculated by mul- For dominant lethal alleles the mutation - selection bal- Migration homogenizes allele frequencies across popula- tiplying the frequencies of the constituent alleles. ance looks as follows tions if not opposed by other forces of evolution such as selection. FST values measure the degree to which sepa- qˆ = µ g = f(AB)= f(A)∗f(B); g = f(Ab)= f(A)∗f(b) rate populations are genetically distinct due to absence AB Ab of gene flow. g = f(aB)= f(a) ∗ f(B); g = f(ab)= f(a) ∗ f(b) 7.8 Genetic Drift aB ab With g being the genotype frequency, one can calculate • random survival of alleles (sampling errors) the coefficient of linkage disequilibrium D: • most powerful in small populations 7.10 Evolutionary change • strongest when natural selection is weakest − The forces that cause evolutionary change: D = gABgab gAbgaB • cannot produce Forces that create variation in evolving populations: Small population/sample sizes magnify the effects of Linkage happens due to relative physical location on • mutation change hence replication in experimental science. , but linkage equilibrium and disequilibrium • recombination are characteristics of populations: Forces that determine the fate of variation: • In a population in linkage equilibrium the frequen- • cies of B and b alleles are the same on A- and a- 7.8.1 Heterozygosity is reduced over time selection • genetic drift bearing chromosomes (and vice versa). • The probability that any given allele will drift to fixation • migration In a population in linkage disequilibrium the fre- is equal to its frequency in the population: • (indirectly: non-random mating, NRM) quencies of B and b alleles are different on A- vs. a-bearing chromosomes (and vice versa). x P (fix)= 2N 7.11 Selfing & Inbreeding Deviations from HW expectations suggests that one of where x = total number of allele copies and three mechanisms causing linkage disequilibrium is at N = # of diploid individuals in population Selfing (self-fertilization) reduces heterozygote frequency. work: Inbreeding can depress average fitness. • selection on multi-locus genotypes • genetic drift 7.8.2 Effective population size (N ) E • population admixture (i.e. migration) Effective population sizes are especially sensitive to un- 7.12 Evolution at multiple loci: linkage equal sex ratios in populations. The higher the inequality disequilibrium the lower the NE and the higher the effect of genetic drift. Extension of Hardy-Weinberg analysis to two loci 4 ∗ Nmales ∗ Nfemales through tracking of not only allele but also chromosome NE = Nmales + Nfemales frequencies. There is genetic linkage between two loci if they are on the same non-recombining stretch of a chromosome 7.8.3 Chance, Determinism, and Evolution a in nuclear chromosomes of sexual diploids • Mutational input is largely random loci remain together after meiotic crossing • Evolution by selection is non-random b in non-recombining organisms / • Evolution by genetic drift is random loci are on the same single chromosome → The power of drift is inversely proportional to the If two loci are linked, selection on one locus can affect power of selection. the evolutionary fate of the other (a hitchhiking-effect)