Population Genetics

Joe Felsenstein

GENOME 453, Winter 2004

Population Genetics – p.1/47

Godfrey Harold Hardy (1877-1947) and Wilhelm Weinberg (1862-1937)

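A Hardy-Weinberg calculation

Genotype:    AA      Aa      aa
Number:      5       2       3
Frequency:   0.50    0.20    0.30

Gamete frequencies:
A: 0.50 + (1/2) 0.20 = 0.6
a: (1/2) 0.20 + 0.30 = 0.4

Random union of gametes:

           0.6 A      0.4 a
0.6 A      0.36 AA    0.24 Aa
0.4 a      0.24 Aa    0.16 aa

Result: 0.36 AA, 0.48 Aa, 0.16 aa

Gametes from these genotypes: all of the AA gametes and 1/2 of the Aa gametes carry A, giving 0.6 A again; 1/2 of the Aa gametes and all of the aa gametes carry a, giving 0.4 a.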
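The calculation above can be retraced in a few lines of Python. This is a minimal sketch added for illustration, not part of the original slides; the function names `allele_frequencies` and `hardy_weinberg` are made up here.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (frequency of A, frequency of a) among the gametes of these parents."""
    total = n_AA + n_Aa + n_aa
    p = (n_AA + 0.5 * n_Aa) / total   # all AA gametes plus half of the Aa gametes
    return p, 1.0 - p


def hardy_weinberg(p):
    """Genotype frequencies (AA, Aa, aa) after random union of gametes."""
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


p, q = allele_frequencies(5, 2, 3)
AA, Aa, aa = hardy_weinberg(p)
print(round(p, 2), round(q, 2))                   # gene frequencies
print(round(AA, 2), round(Aa, 2), round(aa, 2))   # genotype frequencies
```

Feeding the resulting genotype frequencies back through the first step returns the same gene frequencies, which is the point of the slide.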

Calculating the gene frequency (two ways)

Suppose that we have 200 individuals: 83 AA, 62 Aa, 55 aa

Method 1. Calculate what fraction of gametes bear A:

Genotype   Number   Genotype frequency   Fraction of gametes
AA         83       0.415                all A
Aa         62       0.31                 1/2 A, 1/2 a
aa         55       0.275                all a

Gametes: 0.415 + (1/2) 0.31 = 0.57 A and (1/2) 0.31 + 0.275 = 0.43 a

Calculating the gene frequency (two ways)

Suppose that we have 200 individuals: 83 AA, 62 Aa, 55 aa

Method 2. Calculate what fraction of genes in the parents are A:

Genotype   Number   A's   a's
AA         83       166   0
Aa         62       62    62
aa         55       0     110
Total      200      228   172

228 + 172 = 400 genes in all, so 228/400 = 0.57 A and 172/400 = 0.43 a
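Both methods are easy to check side by side. The following is a small sketch written for this purpose (variable names are illustrative only); it uses the counts given above.

```python
counts = {"AA": 83, "Aa": 62, "aa": 55}
n = sum(counts.values())  # 200 individuals

# Method 1: genotype frequencies, then the fraction of gametes bearing A.
freq = {genotype: c / n for genotype, c in counts.items()}
p_method1 = freq["AA"] + 0.5 * freq["Aa"]

# Method 2: count gene copies directly (each individual carries two).
copies_A = 2 * counts["AA"] + counts["Aa"]
copies_a = 2 * counts["aa"] + counts["Aa"]
p_method2 = copies_A / (copies_A + copies_a)

print(copies_A, copies_a)       # 228 172
print(p_method1, p_method2)     # the two methods agree
```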

The process of natural selection at one locus

[Figure: the life cycle followed across generations (gametes → zygotes → gametes → zygotes → gametes → ...), annotated "genotypes are lethal in this case".]

A numerical example of natural selection

Genotypes:                                              AA       Aa       aa
Relative fitnesses (assume these are viabilities):      1        1        0.7

Initial gene frequency of A = 0.2

Initial genotype frequencies (newborns,
from Hardy−Weinberg):                                   0.04     0.32     0.64
Multiply by the relative viabilities:                   x 1      x 1      x 0.7
Survivors:                                              0.04  +  0.32  +  0.448  =  Total: 0.808
Genotype frequencies among the survivors
(divide by the total):                                  0.0495   0.396    0.554

Gene frequency:
A: 0.0495 + 0.5 x 0.396 = 0.2475
a: 0.554 + 0.5 x 0.396 = 0.7525

Genotype frequencies (among newborns):                  0.0613   0.3725   0.5663

The algebra of natural selection

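Genotype:             AA              Aa               aa
Frequency:            $p^2$           $2pq$            $q^2$
Relative fitnesses:   $w_{AA}$        $w_{Aa}$         $w_{aa}$
After selection:      $p^2 w_{AA}$    $2pq\,w_{Aa}$    $q^2 w_{aa}$

Note that these don't add up to 1.

New gene frequency is then (adding up A bearers and dividing by everybody):

\[
p' = \frac{p^2 w_{AA} + \tfrac{1}{2}\, 2pq\, w_{Aa}}{p^2 w_{AA} + 2pq\, w_{Aa} + q^2 w_{aa}}
\]

\[
p' = p\,\frac{p\, w_{AA} + q\, w_{Aa}}{p^2 w_{AA} + 2pq\, w_{Aa} + q^2 w_{aa}} = p\,\frac{\bar{w}_A}{\bar{w}}
\]

Here $\bar{w}_A = p\, w_{AA} + q\, w_{Aa}$ is the mean fitness of A, and $\bar{w} = p^2 w_{AA} + 2pq\, w_{Aa} + q^2 w_{aa}$ is the mean fitness of everybody.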
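As a check, the recursion (new gene frequency = old frequency times the mean fitness of A, divided by the mean fitness of everybody) can be applied to the numerical example with fitnesses 1, 1, 0.7 and an initial gene frequency of 0.2. This is a minimal sketch; the function name `select_one_generation` is hypothetical.

```python
def select_one_generation(p, w_AA, w_Aa, w_aa):
    """One generation of viability selection; returns (new p, mean fitness)."""
    q = 1.0 - p
    w_A = p * w_AA + q * w_Aa                               # mean fitness of A
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa  # mean fitness of everybody
    return p * w_A / w_bar, w_bar


p_next, w_bar = select_one_generation(0.2, 1.0, 1.0, 0.7)
print(round(w_bar, 3), round(p_next, 4))   # total of survivors, new frequency of A
```

Calling the function repeatedly on its own output follows the gene frequency over further generations.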

Is selection effective?

Suppose (relative) fitnesses are:

AA          Aa      aa
(1+s)^2     1+s     1

So in this example each change of a to A multiplies the fitness by (1+s), so that it increases it by a fraction s.

The time for gene frequency change, in generations, turns out to be:

            change of gene frequencies
s           0.01 − 0.1    0.1 − 0.5    0.5 − 0.9    0.9 − 0.99
1           3.46          3.17         3.17         3.46
0.1         25.16         23.05        23.05        25.16
0.01        240.99        220.82       220.82       240.99
0.001       2399.09       2198.32      2198.32      2399.09

[Figure: gene frequency of A (0 to 1) plotted against generations.]
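The generation counts can be computed directly. With fitnesses (1+s)^2 : (1+s) : 1, the ratio p/(1−p) is multiplied by (1+s) every generation; this step is a derivation added here from the selection recursion, not something stated on the slide. The time needed is then the log of the change in that ratio divided by log(1+s). A sketch, with `generations_needed` as a made-up helper name:

```python
import math


def generations_needed(p_start, p_end, s):
    """Generations for A to go from p_start to p_end when fitnesses are
    (1+s)^2 : (1+s) : 1, so the odds p/(1-p) grow by (1+s) per generation."""
    odds_start = p_start / (1.0 - p_start)
    odds_end = p_end / (1.0 - p_end)
    return math.log(odds_end / odds_start) / math.log(1.0 + s)


ranges = ((0.01, 0.1), (0.1, 0.5), (0.5, 0.9), (0.9, 0.99))
for s in (1, 0.1, 0.01, 0.001):
    row = [generations_needed(a, b, s) for a, b in ranges]
    print(s, [round(t, 2) for t in row])
```

For small s the times scale almost exactly as 1/s, which is why weak selection is slow but still effective over thousands of generations.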

An experimental selection curve

Rare alleles occur mostly in heterozygotes

This shows a population in Hardy−Weinberg equilibrium at gene frequencies of 0.9 A : 0.1 a

Genotype frequencies: 0.81 AA : 0.18 Aa : 0.01 aa

Note that of the 20 copies of a (per 100 individuals), 18 of them, or 18 / 20 = 0.9 of them, are in Aa genotypes
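The same count can be written in general terms. As a short worked equation (added here for illustration, with p and q the gene frequencies of A and a): each heterozygote carries one copy of a and each aa homozygote carries two, so the fraction of copies of a that sit in heterozygotes is

```latex
\[
\frac{2pq}{2pq + 2q^2} = \frac{p}{p + q} = p ,
\qquad\text{here}\quad \frac{0.18}{0.18 + 2(0.01)} = \frac{0.18}{0.20} = 0.9 .
\]
```

So the rarer a is (the closer p is to 1), the larger the share of its copies found in heterozygotes.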

Overdominance and stable equilibrium

Genotype:   AA       Aa    aa
Fitness:    1 − s    1     1 − t

When A is rare, most A's are in Aa, and most a's are in aa

The average fitness of A−bearing genotypes is then nearly 1

The average fitness of a−bearing genotypes is then nearly 1−t

So A will increase in frequency when rare

When a is rare, most a's are in Aa, and most A's are in AA

The average fitness of a−bearing genotypes is then nearly 1

The average fitness of A−bearing genotypes is then nearly 1−s

So a will increase in frequency when rare
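The argument can be checked numerically by iterating one-locus viability selection from both ends. This is a sketch under stated assumptions: the values s = 0.2 and t = 0.1 are illustrative and not from the slides, `next_p` is a made-up helper, and t/(s+t) is the standard equilibrium for these fitnesses rather than a value given on the slide.

```python
def next_p(p, w_AA, w_Aa, w_aa):
    """Gene frequency of A after one generation of viability selection."""
    q = 1.0 - p
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return p * (p * w_AA + q * w_Aa) / w_bar


s, t = 0.2, 0.1                     # illustrative values only
final = []
for p0 in (0.01, 0.99):             # first A rare, then a rare
    p = p0
    for _generation in range(500):
        p = next_p(p, 1 - s, 1.0, 1 - t)
    final.append(p)

print(final, t / (s + t))           # both runs approach the same interior frequency
```

Whichever allele starts out rare increases, and both runs settle at the same intermediate gene frequency.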

[Figure: axis showing the gene frequency of A, from 0 to 1.]

Underdominance and unstable equilibrium

Genotype:   AA     Aa    aa
Fitness:    1+s    1     1+t

When A is rare, most A's are in Aa, and most a's are in aa

The average fitness of A−bearing genotypes is then nearly 1

The average fitness of a−bearing genotypes is then nearly 1+t

So A will decrease in frequency when rare

When a is rare, most a's are in Aa, and most A's are in AA

The average fitness of a−bearing genotypes is then nearly 1

The average fitness of A−bearing genotypes is then nearly 1+s

So a will decrease in frequency when rare
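One way to see where the dividing point lies is to write out the change in gene frequency. This is a sketch added for illustration, assuming the usual one-locus viability selection recursion, with p and q = 1 − p the gene frequencies of A and a and with w-bar the mean fitness:

```latex
\[
\Delta p = p' - p
= \frac{pq\,\bigl[\,p\,(w_{AA} - w_{Aa}) + q\,(w_{Aa} - w_{aa})\,\bigr]}{\bar{w}}
= \frac{pq\,(ps - qt)}{\bar{w}}
\quad\text{for fitnesses } 1+s,\; 1,\; 1+t .
\]
```

The change is zero at p = t/(s+t), negative below that point and positive above it, so the gene frequency moves away from the equilibrium toward 0 or 1.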

[Figure: axis showing the gene frequency of A, from 0 to 1.]

Fitness surfaces (adaptive landscapes)

[Figure: two panels plotting mean fitness $\bar{w}$ against gene frequency p (0 to 1). Overdominance: stable equilibrium. Underdominance: unstable equilibrium. Each panel is annotated "(gene frequency changes)".]

Is all for the best in this best of all possible worlds?

Can you explain the underdominance result in terms of rare alleles being mostly in heterozygotes?
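The two fitness surfaces can be tabulated to see where the equilibria sit on them. A minimal sketch, assuming illustrative values s = 0.2 and t = 0.1 (not from the slides) and a made-up helper `mean_fitness`:

```python
def mean_fitness(p, w_AA, w_Aa, w_aa):
    """Mean fitness of the population when the gene frequency of A is p."""
    q = 1.0 - p
    return p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa


s, t = 0.2, 0.1                                   # illustrative values only
grid = [i / 300 for i in range(301)]              # p from 0 to 1

over = [mean_fitness(p, 1 - s, 1.0, 1 - t) for p in grid]
under = [mean_fitness(p, 1 + s, 1.0, 1 + t) for p in grid]

p_peak = grid[over.index(max(over))]              # top of the overdominant surface
p_valley = grid[under.index(min(under))]          # bottom of the underdominant surface
print(p_peak, p_valley, t / (s + t))
```

With overdominance the equilibrium sits at the top of the surface; with underdominance it sits at the bottom, and a population on either side climbs away from it, possibly toward the lower of the two end points.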

[Figure: gene frequency (0 to 1) plotted against time in generations (0 to 11).]

Distribution of gene frequencies with drift

[Figure: a series of panels at successive times, each on a gene frequency axis from 0 to 1.]

Note that although the individual populations wander, their average hardly moves (not at all when we have infinitely many populations)
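This behaviour is easy to reproduce in a small simulation. The sketch below assumes a Wright-Fisher style sampling scheme, which the slides do not name, and its population size, run length and seed are arbitrary illustrative choices; `drift` is a made-up function name.

```python
import random


def drift(p0, n_copies, generations, rng):
    """Follow one population: each generation, n_copies gene copies are drawn
    at random using the previous generation's gene frequency."""
    p = p0
    for _generation in range(generations):
        copies_A = sum(1 for _copy in range(n_copies) if rng.random() < p)
        p = copies_A / n_copies
    return p


rng = random.Random(453)                 # fixed seed so the run is repeatable
finals = [drift(0.5, 40, 10, rng) for _population in range(1000)]

mean_p = sum(finals) / len(finals)
spread = max(finals) - min(finals)
print(round(mean_p, 3), round(spread, 3))   # average stays near 0.5; populations spread out
```

Individual populations end up far apart, while the average over the 1000 populations stays close to the starting frequency.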

A cline (named by Julian Huxley)

[Figure: gene frequency (0 to 1) plotted against geographic position, with curves for no migration, some migration, and more migration.]

A famous common-garden experiment

Clausen, Keck and Hiesey’s (1949) common-garden experiment in Achillea lanulosa

Heavy metal

House sparrows

Mutation rates

Coat color mutants in mice. From Schlager G. and M. M. Dickie. 1967. Spontaneous mutations and mutation rates in the house mouse. Genetics 57: 319-330

Locus        Gametes tested   No. of mutations   Rate
Nonagouti    67,395           3                  44.5 × 10^-6
Brown        919,619          3                  3.3 × 10^-6
Albino       150,391          5                  33.2 × 10^-6
Dilute       839,447          10                 11.9 × 10^-6
Leaden       243,444          4                  16.4 × 10^-6
Total        2,220,376        25                 11.2 × 10^-6
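Each rate in the table is the number of mutations divided by the number of gametes tested. A short sketch of that arithmetic, using the counts as listed (the variable names are illustrative):

```python
# (gametes tested, mutations observed), as listed in the table
data = {
    "Nonagouti": (67_395, 3),
    "Brown": (919_619, 3),
    "Albino": (150_391, 5),
    "Dilute": (839_447, 10),
    "Leaden": (243_444, 4),
}

rates = {locus: mutations / gametes for locus, (gametes, mutations) in data.items()}
for locus, rate in rates.items():
    print(f"{locus:10s} {rate * 1e6:5.1f} x 10^-6")

pooled = 25 / 2_220_376               # totals as stated in the table
print(f"{'Total':10s} {pooled * 1e6:5.1f} x 10^-6")
```

The per-locus rates differ by an order of magnitude, so the pooled figure is only a rough average.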

Mutation rates in humans

Forward vs. back mutations

Why mutants inactivating a functional gene will be more frequent than back mutations

[Figure: the gene.]

12 places can mutate to nonfunctionality

Only one place can mutate back to function

Function can sometimes be restored by a "second site" mutation, too

A sequence space

For sequences of length 1000, there are 3 × 1000 = 3000 "neighbors" one step away in sequence space. But there are 4^1000 sequences, which is about 10^602 in all! No two of them are more than 1000 steps apart. Hard to draw such a space.

How do we ever evolve? Wouldn't it be impossible to find one of the tiny fraction of possible sequences that would be even marginally functional? The answer seems to be that the sequences are clustered. An example of such clustering is the English language, as illustrated by a popular word game:

W O R D
W O R E
G O R E
G O N E
G E N E

But the word BCGH cannot be made into an English word.

Also, only a tiny fraction of all 456,976 four−letter words are English words. But they are clustered, so that it is possible to "evolve" from one to another through intermediates.
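The counts on this slide, and the rule of the word game, can be verified in a few lines. A small sketch; `one_step_apart` is a made-up helper that tests only whether two words differ at exactly one position, not whether they are English words.

```python
import math

length = 1000
neighbors = 3 * length                       # 3 alternative bases at each site
log10_sequences = length * math.log10(4)     # 4**1000 is about 10**602
four_letter_words = 26 ** 4                  # all four-letter strings


def one_step_apart(a, b):
    """True if two equal-length words differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


ladder = ["WORD", "WORE", "GORE", "GONE", "GENE"]
valid = all(one_step_apart(a, b) for a, b in zip(ladder, ladder[1:]))

print(neighbors, round(log10_sequences), four_letter_words, valid)
# 3000 602 456976 True
```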

Mutation as an evolutionary force

If we have two alleles A and a, and the mutation rate from A to a is 10^-6 and the mutation rate back is the same:

[Figure: gene frequency (0 to 1, with 0.5 marked) plotted against time, from 0 to 1 million generations.]

Mutation is critical in introducing new alleles but is very slow in changing their frequencies
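How slow this is can be worked out exactly. With the same rate u in both directions, each generation p' = p(1 − u) + (1 − p)u, so the distance of p from 0.5 shrinks by a factor (1 − 2u) per generation. The sketch below assumes a starting frequency of 1.0 for A, which is an illustrative choice rather than a value from the slide; `frequency_after` is a made-up name.

```python
def frequency_after(p0, u, generations):
    """Frequency of A after the given number of generations when A -> a and
    a -> A both mutate at rate u (no selection, no drift)."""
    return 0.5 + (p0 - 0.5) * (1.0 - 2.0 * u) ** generations


u = 1e-6
for t in (1_000, 100_000, 1_000_000):
    print(t, round(frequency_after(1.0, u, t), 4))
```

Even after a million generations the frequency has covered only most of the way to the eventual value of 0.5.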

Estimation of a human mutation rate

By an equilibrium calculation. Huntington’s disease. Dominant. Does not express itself until after age 40. 1/100,000 of people of European ancestry have the gene. Reduction in fitness maybe 2%.

If the frequency is q, then 2q(1 − q) of everyone are heterozygotes. 0.02 of these die. Half of each heterozygote's copies are the Huntington’s allele. So as the frequency of people with the gene is ≈ 1/100,000, the fraction of all copies that are mutations that are eliminated is

0.00001 × 1/2 × 0.02 ≈ 10^-7

If we are at equilibrium between mutation and selection, this is also the fraction of copies that have a new mutation.

Similar calculations can be done with recessive alleles.

How it was done

This projection was produced using the prosper style in LaTeX, using LaTeX to make a .dvi file, using dvips to turn this into a PostScript file, using ps2pdf to make a PDF file, and displaying the slides in Adobe Acrobat Reader.

Result: nice slides using freeware.
