
Copyright © 1992 by the Genetics Society of America
Genetics 132: 1161-1176 (December, 1992)

Population Genetics of Polymorphism and Divergence

Stanley A. Sawyer*,† and Daniel L. Hartl†

*Department of Mathematics, Washington University, St. Louis, Missouri 63130 and †Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110

Manuscript received March 7, 1992
Accepted for publication August 12, 1992

ABSTRACT

Frequencies of mutant sites are modeled as a Poisson random field in two species that share a sufficiently recent common ancestor. The selective effect of the new alleles can be favorable, neutral, or detrimental. The model is applied to the sample configurations of nucleotides in the alcohol dehydrogenase gene (Adh) in Drosophila simulans and Drosophila yakuba. Assuming a synonymous mutation rate of 1.5 × 10^-8 per site per year and 10 generations per year, we obtain estimates for the effective population size (N_e = 6.5 × 10^6), the species divergence time (t_div = 3.74 million years), and an average selection coefficient (σ = 1.53 × 10^-6 per generation for advantageous or mildly detrimental replacements), although it is conceivable that only two of the amino acid replacements were selected and the rest neutral. The analysis, which includes a sampling theory for the independent infinite sites model with selection, also suggests the estimate that the number of amino acids in the enzyme that are susceptible to favorable mutation is in the range 2-23 at any one time. The approach provides a theoretical basis for the use of a 2 × 2 contingency table to compare fixed differences and polymorphic sites with silent sites and amino acid replacements.

It has been more than 25 years since LEWONTIN and HUBBY (1966) first demonstrated high levels of molecular polymorphism in Drosophila pseudoobscura. This finding had two strong immediate effects on evolutionary genetics: it stimulated molecular studies of many other organisms, and it led to a vigorous theoretical debate about the significance of the observed polymorphisms (LEWONTIN 1991). The experimental studies soon came to a consensus in demonstrating widespread molecular polymorphism in numerous species of plants, animals, and microorganisms. The theoretical debate was not so quickly resolved. One viewpoint (KIMURA 1968, 1983) held that most observed molecular variation within and among species is essentially selectively neutral, with at most negligible effects on survival and reproduction. Opposed was the classical Darwinian view that molecular polymorphism is the raw material from which natural selection fashions evolutionary progress, and that the newly observed molecular variation was unlikely to be any different (LEWONTIN 1974). The two viewpoints could not have been more at odds, and a great controversy ensued.

To a large extent the issue has been clouded by inadequate data (LEWONTIN 1974, 1991). Observations of natural populations are snapshots of particular places and times, and the resulting inferences about the long-term fate of molecular polymorphisms can be challenged by neutralists and selectionists alike. By the same token, laboratory experiments capable of detecting selection coefficients as small as are likely to be important in nature are currently impractical (HARTL and DYKHUIZEN 1981; HARTL 1989), although some large effects have been documented [see POWERS et al. (1991) for a review].

In the 1980s, the increasing use of DNA sequencing in evolutionary genetics gave some hope that the impasse could be overcome. Direct examination of genes, rather than the electrophoretic mobility of gene products, yields vast amounts of information consisting of hundreds or thousands of nucleotides. The data are also of a different quality, since the DNA sequences are unambiguous and contain both synonymous nucleotide differences and differences that change amino acids. To the extent that synonymous differences are subjected to weaker selective effects than amino acid differences, comparisons between the two types of polymorphisms can serve as a basis of inference. Synonymous polymorphisms are more common than amino acid polymorphisms (KREITMAN 1983), and also appear to be more weakly affected by selection (SAWYER, DYKHUIZEN and HARTL 1987).

With data from only one species, the level of synonymous and replacement polymorphism must be substantial in order for statistical analysis to have enough power to detect selection (SAWYER, DYKHUIZEN and HARTL 1987; HARTL and SAWYER 1991). Most eukaryotic genes are not sufficiently polymorphic to allow this approach. An alternative approach, pioneered by HUDSON, KREITMAN and AGUADÉ (1987), is based on comparing polymorphisms within species with fixed differences between species. This approach has been applied to the Drosophila fourth chromosome (BERRY, AJIOKA and KREITMAN 1991) as well as to the tip of the X chromosome (BEGUN and AQUADRO 1991), both of which are regions of reduced recombination. The level of polymorphism in these regions is also reduced, and the analysis suggests strongly that the reduction is the result of a genetic hitchhiking associated with periodic selective fixations.

Comparison of molecular variation within and between species is also the crux of a statistical test proposed by MCDONALD and KREITMAN (1991a). The test is for homogeneity of entries in a 2 × 2 contingency table based on aligned DNA sequences. The rows in the contingency table are the numbers of replacement or synonymous nucleotide differences, and the columns are either the numbers of fixed differences between species or else of polymorphic sites within species. Here polymorphic sites are defined as sites that are polymorphic within one or more of the species, and fixed differences are defined as sites that are monomorphic (fixed) within each species but differ between species. The term silent refers to nucleotide differences in codons that do not alter the amino acid, and replacement refers to nucleotide differences within codons that do alter the amino acid. The McDonald-Kreitman test compares the number of silent and replacement polymorphic sites with the number of silent and replacement fixed differences. When 30 aligned DNA sequences from the alcohol dehydrogenase (Adh) locus of three species of Drosophila were compared (MCDONALD and KREITMAN 1991a), there were too few polymorphic replacement sites (P = 0.007, two-sided Fisher exact test). MCDONALD and KREITMAN (1991a) argue that the most likely reason for the discrepancy is that some of the amino acid differences were fixed as a result of positive selection acting on replacement mutations. The possibility that the fixed differences could have resulted from a combination of slightly deleterious alleles (OHTA 1973), coupled with a dramatically changing population size, was also considered by MCDONALD and KREITMAN (1991a) but considered implausible because this would seem to require extraordinarily fine tuning among a large number of independent parameters.

Although the McDonald-Kreitman test has considerable intuitive appeal, little quantitative theory exists for the comparison of intraspecific polymorphism with [...] provides an estimate of the average amount of selection required to produce the discrepancy observed, as well as an estimate of the rate at which favorable mutations occur (or, equivalently, an estimate of the average number of amino acids in the protein that are susceptible to a favorable mutation at any one time). Several objections to the details of the implementation of the McDonald-Kreitman test have been raised (GRAUR and LI 1991; WHITTAM and NEI 1991), and these are also addressed briefly.

RATIONALE AND RESULTS OF THE ANALYSIS

The first step in our method is to analyze the sample configurations of the nucleotides occurring at synonymous sites in the aligned DNA sequences under the assumption that the synonymous variation is selectively neutral. This information is used to estimate the mutation rate at silent sites and the divergence time between pairs of species. The divergence time is critical because, if the divergence time between species is sufficiently long, then conceivably all of the fixed amino acid differences between species could be due to the fixation of mildly deleterious alleles, and the significance of the McDonald-Kreitman contingency table might be an artifact of saturation at silent sites. Using the estimated values of the silent mutation rate and the divergence time, the numbers of synonymous polymorphic sites and fixed differences predicted from the neutral configuration theory are compared with the observed numbers. These estimates fit the observed Adh data very closely for all three pairwise species comparisons, which suggests that the configuration distributions at synonymous sites are roughly consistent with an equilibrium neutral model.

The second step is to develop equations for the expected number of polymorphic sites and fixed differences between a pair of species in terms of the magnitude and direction of selection, the mutation rate to new alleles having a given (constant) selective effect, and the divergence time. From these equations we estimate the amount of selection needed to explain the observed deficiency or excess in the number of replacement polymorphisms. We also estimate the rate of new mutations resulting in amino acid replacements (or, equivalently, the number of amino acid sites in the protein product at which favorable or mildly deleterious substitutions are possible at any one time).
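
The 2 × 2 comparison of fixed differences and polymorphic sites for replacement and silent changes, described in the introduction, can be illustrated with a short calculation. The following Python sketch computes a two-sided Fisher exact P value directly from the hypergeometric distribution. It is only an illustration under stated assumptions: the function name is hypothetical, and the counts (7 fixed and 2 polymorphic replacement sites, 17 fixed and 42 polymorphic synonymous sites) are those usually quoted from MCDONALD and KREITMAN (1991a). They are not given in the text above and should be checked against that paper.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]].

    Hypothetical helper for illustration. Rows are taken to be
    (replacement, synonymous) and columns (fixed, polymorphic).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k,
        # given that all row and column totals are held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum over every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


# Assumed Adh counts (see the caveat above): replacement row, then synonymous row.
p_value = fisher_exact_two_sided(7, 2, 17, 42)
print(f"two-sided Fisher exact P = {p_value:.4f}")
```

With the assumed counts, the result should be close to the P = 0.007 quoted above. Summing over all tables no more probable than the observed one is one common convention for a two-sided exact test; other conventions can give slightly different values.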