Population Genetics of Polymorphism and Divergence

Stanley A. Sawyer*,† and Daniel L. Hartl†

*Department of Mathematics, Washington University, St. Louis, Missouri 63130, †Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110

November 3, 1992

ABSTRACT

Frequencies of mutant sites are modeled as a Poisson random field in two species that share a sufficiently recent common ancestor. The selective effect of the new alleles can be favorable, neutral, or detrimental. The model is applied to the sample configurations of nucleotides in the alcohol dehydrogenase gene (Adh) in Drosophila simulans and D. yakuba (McDonald and Kreitman 1991, Nature 351: 652–654). Assuming a synonymous mutation rate of 1.5 × 10^−8 per site per year and 10 generations per year, we obtain estimates for the effective population size (N_e = 6.5 × 10^6), the species divergence time (t_div = 3.74 Myr), and an average selection coefficient (σ = 1.53 × 10^−6 per generation for advantageous or mildly detrimental replacements), although it is conceivable that only two of the amino acid replacements were selected and the rest neutral. The analysis, which includes a sampling theory for the independent infinite sites model with selection, also suggests the estimate that the number of amino acids in the enzyme that are susceptible to favorable mutation is in the range 2–23 out of 257 total possible codon positions at any one time. The approach provides a theoretical basis for the use of a 2 × 2 contingency table to compare fixed differences and polymorphic sites with silent sites and amino acid replacements.

It has been more than 25 years since Lewontin and Hubby (1966) first demonstrated high levels of molecular polymorphism in Drosophila pseudoobscura. This finding had two strong immediate effects on evolutionary genetics: it stimulated molecular studies of many other organisms, and it led to a vigorous theoretical debate about the significance of the observed polymorphisms (Lewontin 1991). The experimental studies soon came to a consensus in demonstrating widespread molecular polymorphism in numerous species of plants, animals, and microorganisms. The theoretical debate was not so quickly resolved. One viewpoint (Kimura 1968, 1983) held that most observed molecular variation within and among species is essentially selectively neutral, with at most negligible effects on survival and reproduction. Opposed was the classical Darwinian view that molecular polymorphism is the raw material from which natural selection fashions evolutionary progress, and that the newly observed molecular variation was unlikely to be any different (Lewontin 1974). The two viewpoints could not have been more at odds, and a great controversy ensued. To a large extent the issue has been clouded by inadequate data (Lewontin 1974, 1991). Observations of natural populations are snapshots of particular places and times, and the resulting inferences about the long-term fate of molecular polymorphisms can be challenged by neutralists and selectionists alike. By the same token, laboratory experiments capable of detecting selection coefficients as small as are likely to be important in nature are currently impractical (Hartl and Dykhuizen 1981, Hartl 1989), although some large effects have been documented (see Powers et al. 1991 for a review).

In the 1980s, the increasing use of DNA sequencing in evolutionary genetics gave some hope that the impasse could be overcome. Direct examination of genes, rather than the electrophoretic mobility of gene products, yields vast amounts of information consisting of hundreds or thousands of nucleotides. The data are also of a different quality, since the DNA sequences are unambiguous and contain both synonymous nucleotide differences and differences that change amino acids. To the extent that the synonymous differences are subjected to weaker selective effects than amino acid differences, comparisons between the two types of polymorphisms can serve as a basis of inference. Synonymous polymorphisms are more common than amino acid polymorphisms (Kreitman 1983), and also appear to be more weakly affected by selection (Sawyer, Dykhuizen, and Hartl 1987).

With data from only one species, the level of synonymous and replacement polymorphism must be substantial in order for statistical analysis to have enough power to detect selection (Sawyer, Dykhuizen, and Hartl 1987; Hartl and Sawyer 1991). Most eukaryotic genes are not sufficiently polymorphic to allow this approach. An alternative approach, pioneered by Hudson, Kreitman, and Aguadé (1987), is based on comparing polymorphisms within species with fixed differences between species. This approach has been applied to the Drosophila fourth chromosome (Berry, Ajioka, and Kreitman 1991) as well as to the tip of the X chromosome (Begun and Aquadro 1991), both of which are regions of reduced recombination. The level of polymorphism in these regions is also reduced, and the analysis suggests strongly that the reduction is the result of genetic hitchhiking associated with periodic selective fixations.

Comparison of molecular variation within and between species is also the crux of a statistical test proposed by McDonald and Kreitman (1991a). The test is for homogeneity of entries in a 2 × 2 contingency table based on aligned DNA sequences. The rows in the contingency table are the numbers of replacement or synonymous nucleotide differences, and the columns are either the numbers of fixed differences between species or else of polymorphic sites within species. Here polymorphic sites are defined as sites that are polymorphic within one or more of the species, and fixed differences are defined as sites that are monomorphic (fixed) within each species but differ between species. The term silent refers to nucleotide differences in codons that do not alter the amino acid, and replacement refers to nucleotide differences within codons that do alter the amino acid. The McDonald-Kreitman test compares the number of silent and replacement polymorphic sites with the number of silent and replacement fixed differences. When 30 aligned DNA sequences from the alcohol dehydrogenase (Adh) locus of three species of Drosophila were compared (McDonald and Kreitman 1991a), there were too few polymorphic replacement sites (P = 0.007, two-sided Fisher exact test). McDonald and Kreitman (1991a) argue that the most likely reason for the discrepancy is that some of the amino acid differences were fixed as a result of positive selection acting on replacement mutations. The possibility that the fixed differences could have resulted from a combination of slightly deleterious alleles (Ohta 1973), coupled with a dramatically changing population size, was also considered by McDonald and Kreitman (1991a) but considered implausible because this would seem to require extraordinarily fine tuning among a large number of independent parameters.

Although the McDonald-Kreitman test has considerable intuitive appeal, little quantitative theory exists for the comparison of intraspecific polymorphism with interspecific divergence in the presence of selection. In this paper we present such a theory. Among other things, it addresses the question of whether the imbalance in the Adh contingency table could have resulted from the random fixation of mildly deleterious alleles over an extremely long time in a population of constant size, rather than fixations of advantageous alleles in a shorter period of time. The theory also provides an estimate of the average amount of selection required to produce the discrepancy observed, as well as an estimate of the rate at which favorable mutations occur (or, equivalently, an estimate of the average number of amino acids in the protein that are susceptible to a favorable mutation at any one time). Several objections to the details of the implementation of the McDonald-Kreitman test have been raised (Graur and Li 1991, Whittam and Nei 1991), and these are also addressed briefly.

Rationale and Results of the Analysis

The first step in our method is to analyze the sample configurations of the nucleotides occurring at synonymous sites in the aligned DNA sequences under the assumption that the synonymous variation is selectively neutral. This information is used to estimate the mutation rate at silent sites and the divergence time between pairs of species. The divergence time is critical because, if the divergence time between species is sufficiently long, then conceivably all of the fixed amino acid differences between species could be due to the fixation of mildly deleterious alleles, and the significance of the McDonald-Kreitman contingency table might be an artifact of saturation at silent sites. Using the estimated values of the silent mutation rate and the divergence time, the numbers of synonymous polymorphic sites and fixed differences predicted from the neutral configuration theory are compared with the observed numbers. These estimates fit the observed Adh data very closely for all three pairwise species comparisons, which suggests that the configuration distributions at synonymous sites are roughly consistent with an equilibrium neutral model.

... between sites. There is considerable linkage disequilibrium in Adh around the Fast versus Slow electrophoretic polymorphism in D. melanogaster (but not in D. simulans and D. yakuba). Possible balancing or clinal selection on this polymorphism may not only affect nucleotide configurations in D. melanogaster, but also may not be appropriate for the model of genic selection that we apply below. Among the three species for which McDonald and Kreitman (1991a) have Adh sequences, we are most confident in applying the analysis to the D. simulans versus D. yakuba comparison. The resulting analysis of the joint nucleotide configurations at silent sites for D. simulans and D. yakuba leads to the following estimates for the scaled silent mutation rate µ_s (summed over synonymous sites)
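
The homogeneity test on the 2 × 2 contingency table described in the introduction can be illustrated with a short sketch of a two-sided Fisher exact test. This is only an illustration under stated assumptions, not the authors' procedure: the function name fisher_exact_two_sided is hypothetical, and the four counts are the ones usually quoted for the McDonald and Kreitman (1991a) Adh table; they do not appear in the text above and should be treated as an assumption. The two-sided P value is taken as the total probability of all tables with the same margins that are no more probable than the observed one.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2 x 2 table [[a, b], [c, d]].

    Hypothetical helper for illustration. Rows are replacement and
    synonymous differences; columns are fixed and polymorphic sites.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # holding all row and column totals fixed.
        return Fraction(comb(row1, x) * comb(row2, col1 - x), total)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of every table at least as extreme as the
    # observed one (exact arithmetic avoids floating-point ties).
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_observed
    )
    return float(p_value)


# Assumed counts (not given in the text above):
# replacement: 7 fixed, 2 polymorphic; synonymous: 17 fixed, 42 polymorphic.
p_value = fisher_exact_two_sided(7, 2, 17, 42)
print(f"two-sided Fisher exact P = {p_value:.4f}")
```

With these assumed counts the P value rounds to 0.007, in line with the figure quoted above. Exact rational arithmetic is used so that the comparison between table probabilities does not depend on a floating-point tolerance.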
