
J Med Genet: first published as 10.1136/jmg.8.4.444 on 1 December 1971.

Annotation

Journal of Medical Genetics (1971). 8, 444.

Polymorphism and Protein Evolution: The Neutral Mutation-Random Drift Hypothesis

HARRY HARRIS

From the Galton Laboratory, University College London, Wolfson House, London NW1

Received 28 May 1971.

Enzyme and Protein Diversity

In recent years many examples of what are conveniently referred to as enzyme and protein polymorphisms have been discovered in the course of surveys of human populations and of naturally occurring populations of other animal species. The term polymorphism is used in this context for any situation where members of a population can be sharply classified into several distinct phenotypes in terms of particular characteristics of the enzyme or protein, and where at least two of the phenotypes have an appreciable incidence (greater than 2%). Most of these polymorphisms can be attributed to the occurrence of two or more alleles each coding for a structurally distinct form of a polypeptide chain in the particular enzyme or protein. It is thought that the structural difference between the polypeptides usually amounts to no more than a single amino-acid substitution, and that the allelic difference originated in a single mutational event involving the change of only one base for another in the sequence of several hundred or thousand bases in the DNA of the particular gene. But direct evidence for this has so far only been obtained in a limited number of polymorphisms, and there are certainly some exceptions where the polypeptide products of the two common alleles differ by more than one amino acid, eg, the sheep haemoglobins A and B (Boyer et al, 1967).

Electrophoretic surveys of arbitrarily chosen enzymes and proteins have been carried out in various naturally occurring animal populations to see how often such polymorphisms occur. A number of very different species have been studied in this way. They include man (Harris, 1966 and 1969), Drosophila pseudoobscura (Hubby and Lewontin, 1966; Lewontin and Hubby, 1966; Prakash, Lewontin, and Hubby, 1969), mouse (Ruddle et al, 1969; Selander, Hunt, and Yang, 1969; Selander and Yang, 1969), and the horseshoe or king crab Limulus polyphemus (Selander et al, 1970). The results indicate that two or more relatively common alleles giving rise to electrophoretic differences may exist at some 25-40% of loci coding for enzyme and protein structure.

The frequencies of the common alleles which make up the different polymorphisms vary considerably from case to case. However a useful index of the extent of the phenomenon in a particular population is obtained by estimating the average heterozygosity per locus. This is the sum of the proportion of individuals found to be heterozygous at each locus divided by the total number of loci studied, including of course both 'polymorphic' and 'non-polymorphic' loci. Studies on some 26 arbitrarily chosen loci in European and negro populations gave values in each population of just over 0.05 for the average heterozygosity per locus which could be detected by enzyme electrophoresis (Harris, 1970). However it was of interest that the contributions of different loci to this average sometimes varied quite widely between the two ethnic groups. Since it seems probable that at most only about one third of all possible structural changes in an enzyme produced by mutation would be detected electrophoretically, the results suggest that the true value for average heterozygosity per locus may be around 0.16. Put another way, one might expect that any single individual in these populations would be heterozygous at perhaps 16% of gene loci coding for enzyme structure. Less systematic data on some 30 other enzyme loci in man appear to lead to essentially the same estimate.

Similar or higher estimates of average heterozygosity per locus have been obtained from electrophoretic studies of different enzyme and other proteins in naturally occurring populations of Drosophila pseudoobscura, D. persimilis, Mus musculus, Peromyscus polionotus, and Limulus polyphemus (Selander et al, 1970).

It should be pointed out that these investigations have been largely confined to soluble enzymes and other proteins which can be examined electrophoretically. It is possible, for example, that structural proteins or membrane-bound enzymes may differ in this respect, though there is as yet little evidence one way or another on this point. Also of course only a very small number of different enzymes and proteins have been investigated compared with the many thousands that actually occur in an organism. So the estimates of the fraction of polymorphic loci or of average heterozygosity per locus must be regarded as very rough and tentative. Nevertheless it is clear that in each of several very different species enzyme and protein polymorphism is a not infrequent phenomenon. It is certainly very much more extensive than was generally thought possible only a few years ago.

Besides the alleles giving rise to these common polymorphic differences many more alleles determining distinctive variants have also been found to occur when particular enzymes or proteins have been examined by suitable procedures in sufficiently large numbers of individuals. Human adult haemoglobin α and β loci (Lehmann and Carrell, 1969), and the enzyme glucose-6-phosphate dehydrogenase (Kirkman, 1971) have for various reasons been studied from this point of view in more detail, and in very many more individuals than any other proteins, and at each of the loci concerned some 50 or more rare variants determined by different rare alleles have already been discovered. And the rate at which new examples are currently being reported suggests that many more remain to be identified. Other enzymes and proteins have been surveyed less extensively, but nevertheless in a number of cases (eg, serum albumin, transferrin, phosphoglucomutase, phosphohexose isomerase, placental alkaline phosphatase, NADH diaphorase, etc) a series of different variants each evidently due to a different rare allele at the structural locus concerned have been found. In general it is now beginning to appear that as more individuals are examined for a particular enzyme or protein the number of different rare variants that are discovered progressively increases. Some of these rare variants give rise in either homozygous or heterozygous states to overt clinical abnormality, but many appear to be relatively harmless. Furthermore the great majority of rare variants found in population surveys do not appear, when family investigations are carried out, to be the product of fresh mutations occurring in the immediately preceding generation. The occasional exceptions have usually been variants (such as unstable haemoglobins) where overt clinical abnormality is seen in the heterozygous state.

It is usual to regard any allele with a frequency in the population of less than 10⁻² as 'rare'. In fact the majority of rare alleles which have so far been found appear to have frequencies much less than this. In surveys involving 2000 to 10,000 individuals particular 'rare' alleles are often found in only one or two individuals. So their individual frequencies appear to be generally less than 10⁻³ and usually perhaps less than 10⁻⁴. However, because quite a number of different 'rare' alleles can evidently occur at any one locus, they may contribute appreciably to individual heterozygosity (Hopkinson and Harris, 1971). Rough estimates derived from electrophoretic surveys of enzymes in human populations suggest that any single individual may be heterozygous for one or another rare allele at perhaps 0.3% of loci coding for enzyme structure.

Mutation, Selection, and Drift

Thus it now seems likely that at most gene loci coding for enzyme or protein structure many different alleles occur among the members of natural populations. At any particular locus the majority are probably very rare, but there must of course always be one allele which is very common, and often as we have seen there may be two.

The alleles which are present in populations today may be presumed to have arisen by separate mutations in individuals in earlier generations, some perhaps relatively recently, others in the more remote past. And in terms of classical theory one would expect that in general their incidence and distribution today will reflect the operation of natural selection over the course of many previous generations on the flux of mutant alleles being continuously generated by fresh mutations at a low but steady rate.

Natural selection will tend to eliminate from the population mutant alleles which have relatively deleterious effects on individuals who carry them, and tend to cause the spread of those which are advantageous. Since one would expect that a single random alteration in a complex structure like an enzyme protein is more often likely to impair than to improve its function, very many more of the mutants that occur at any particular locus are likely to be detrimental than advantageous. Thus the conclusion that the great majority of variants due to alleles at a single locus are rare is in general keeping with this expectation.

But the discovery that at an appreciable proportion of loci two common alleles occur rather than just one was rather surprising, and its interpretation has engendered a great deal of discussion and controversy.

What can be called the classical view is that such polymorphisms are the consequences of differential selection. Many polymorphisms are thought to represent stable, balanced equilibria due to heterozygous advantage, frequency-dependent selection, or other processes. Others however may represent intermediary stages in gene evolution in which one allele is progressively replacing another on its way to fixation. Such situations are sometimes called 'transient polymorphisms'. But whatever the details of a particular situation it is usually assumed that the occurrence of two common alleles at any particular locus must have come about because of differential selection on the phenotypic differences they produce.

A quite different view of the matter has however recently been put forward, notably by Kimura (Kimura, 1968; Kimura and Ohta, 1971). This suggests that the enzyme and protein variants that make up these polymorphisms are selectively neutral and that the occurrence of two common alleles at many different loci is a simple consequence of random genetic drift. If correct, this hypothesis would seem to call for a considerable reappraisal of many long held concepts in population genetics and in evolutionary theory.

Molecular Evolution

The strongest evidence for Kimura's hypothesis comes from the mathematical analysis of the rates of molecular evolution of certain proteins such as cytochrome C and haemoglobin whose amino-acid sequences have been determined over a wide range of species.

For example, the amino-acid sequence of the cytochrome C obtained from some 30 different species has now been determined (Nolan and Margoliash, 1968). In the sequence of the 100 or so amino acids which occur in each of these proteins, about 30 positions appear to be invariant, and the others differ to varying extents in the amino-acid substituents which occur. These amino-acid differences may be assumed to have arisen during the course of evolution by single base change mutations, and by comparison of the amino-acid differences between each pair of proteins it is possible to estimate the minimal number of mutations which would have been required for their divergence from their nearest common ancestor. From such data it is then possible to construct the most likely evolutionary tree which would have given rise with the minimum number of mutational changes to the amino-acid sequences in the proteins as they are found to occur in the different species today (Fitch and Margoliash, 1967). The evolutionary tree so constructed turns out to resemble quite closely the phylogenetic relationships obtained in the orthodox way using morphological and other standard taxonomic criteria. In this case however the phylogenetic tree is entirely deduced from the amino-acid sequences of a single polypeptide chain and can therefore be taken to represent the evolution of a single gene.

Using estimates derived from palaeontology of the time that has elapsed since various groups of species diverged from a common ancestor, it is possible from such data to estimate and compare the rates of evolution by amino-acid substitution in particular proteins and in different lines of evolutionary descent (Kimura, 1969; King and Jukes, 1969). Thus for cytochrome C the average rate of amino-acid substitution per amino-acid site per year appears to have been about 4 × 10⁻¹⁰. Not surprisingly the rate varies from protein to protein. For example, the rate for haemoglobin appears to have been about 2½ times greater than for cytochrome C, and the rate for fibrinopeptide A about 4 times that for haemoglobin.

But it is claimed that for any one protein, the rate of evolutionary change by amino-acid substitution has been remarkably uniform over widely divergent lines of descent (Kimura, 1969). Thus the average rate of amino-acid substitution in haemoglobin is estimated as about 10⁻⁹ per amino-acid site per year, and it appears that this rate has been essentially the same in the evolutionary lines leading for example to the α haemoglobin chains of the carp and of man since they diverged from a common ancestor some 350-400 million years ago; in the lines leading to the α or β chains of various mammals such as man, mouse, rabbit, horse, pig, and cow since they diverged from a common ancestor some 80 million years ago; and indeed in the lines leading to the human α and β chains since they originated by gene duplication at a time estimated to have been about 450 million years ago.

Kimura argues that this apparent constancy of evolutionary rate in a protein can only be plausibly accounted for by supposing that the evolutionary changes are due to the fixation in different species of particular mutations which, while resulting in amino-acid substitutions in the protein, cause no significant change in its functional properties, and which are therefore selectively neutral. The term non-Darwinian evolution has sometimes been used to describe such a process (King and Jukes, 1969).

Kimura has shown that if one defines the rate, k, of mutant substitution in evolution as the long term average of the number of mutants that are substituted per unit time, then under the neutral mutation-random drift hypothesis, k = u, where u is the mutation rate per gamete for neutral mutants at this locus per unit time (Kimura, 1968). It follows that the apparent uniformity of rate of amino-acid substitution per year in a particular polypeptide chain is readily accounted for, provided that the substitutions are indeed selectively neutral and that the rate of mutation of the gene to such neutral alleles has been the same in different species and in different lines of descent.

Only a proportion of all the possible mutants that may arise are assumed to be neutral, and the differences in rates of amino-acid substitution in different proteins such as haemoglobin and cytochrome C can be explained by supposing that a different fraction of mutants are neutral in each case, according to the structural and functional characteristics of the different proteins.

An intriguing and unexpected consequence of the hypothesis is that for any given protein the neutral mutation rate per year should be the same in different species. But mutation rates are usually considered in terms of the rate per gene per generation rather than the rate per gene per year. Since generation times vary widely from species to species, it seems necessary on this hypothesis to assume that the mutation rate per generation is proportional to generation time. Thus if in the mouse the generation time is taken as 6 months and in man as 20 years, the hypothesis implies that the mutation rate per gene per generation is about 40 times greater in man than in mouse.

The hypothesis that the differences in amino-acid sequence of a particular protein in different species are simply the consequence of neutral mutations which have been fixed by random genetic drift is of course exactly the opposite of the classical view which would ascribe the observed evolutionary changes in protein structure to the operation of natural selection. On this view the structure of a particular protein in a given species has been selected because it is best adapted to meet the particular features of the external environment in which the species lives, and also to meet the special internal environment which depends on the activities of all the other genes present.

According to Kimura and Ohta (1971) the rate of mutant substitution in evolution by natural selection is given by

    k = 4Nₑs₁u

where Nₑ is the effective population number of the species, s₁ is the selective advantage of the mutant, and u is the rate at which advantageous mutants are produced per gamete per unit time. They argue that in order for the rate of amino-acid substitution to have remained constant per year over diverse lines such as the lines leading to carp and human haemoglobins from a common ancestor, it would be necessary to assume under natural selection that the three different parameters Nₑ, s₁, and u were so adjusted that their product remained constant. And they consider that this is highly implausible, particularly over lines leading to such species as carp and man, which have been separate for some 350-400 million years, during which time many marked differences at the phenotypic level have evolved.

In this connexion it should be pointed out that Kimura is not arguing that natural selection has been of no importance at all in evolution. He is however saying that surrounding any adaptive changes that may occur because of natural selection, there is a very great deal of random noise from neutral change, and that the differences in amino-acid sequence of a particular protein in different species are largely, if not entirely, the product of such neutral substitutions.

Polymorphism: A Phase of Molecular Evolution?

Now whether these differences between species are the products of neutral mutations, or whether they are the result of natural selection, it is clear that they must have occurred in any given case by the progressive replacement of a pre-existing gene by a particular mutant which has spread through the species and eventually become fixed as the characteristic form. So at any one time one might expect to find evidence for such partially completed replacements occurring. The obvious candidates for this are of course the enzyme and protein polymorphisms.

A very large number of different alleles may in principle be generated by separate mutations within the confines of a single gene. Thus from a typical gene containing a DNA sequence say 900 bases long and coding for a polypeptide chain of 300 amino acids, as many as 2700 different alleles may be generated by single base change mutations alone, since each base could be altered to one of three others in different mutational events. From what is known about the genetic code and about the amino-acid composition of proteins one would expect that some 70% or so of these mutant alleles would determine a structurally altered protein which differs from the original by a single amino-acid substitution. An indeterminate number of other mutations involving deletions, duplications, or other kinds of rearrangement of parts of the base sequence of the gene may also occur. Kimura suggests that although many of the possible structural variants of a protein which may be generated by random mutation are detrimental and therefore tend to be eliminated by natural selection, there will be some which are effectively neutral in the sense that they do not produce any functional change in the organism which would affect its viability or fertility and hence its ability to contribute offspring to the next generation. The persistence of such neutral mutants in subsequent generations will be largely a matter of chance, and the majority will in fact be lost. But very occasionally one of them will spread by random drift and eventually become the predominant allelic form. Such a process would however require a very large number of generations before it were complete (about 4Nₑ generations), and so for quite long periods of time there would be two common alleles co-existing in the species.

The argument is supported by an elegant mathematical analysis (Kimura and Ohta, 1971) which appears to show that the degree of heterozygosity resulting from polymorphism which has been observed in different animal populations is of a magnitude consistent with that expected from the rates of amino-acid substitution in protein evolution, on the assumption that both the intraspecific polymorphic differences and the interspecific amino-acid substitutions are selectively neutral. Thus the hypothesis suggests that protein polymorphism and protein evolution should be regarded not as separate phenomena, but as two aspects of a single process, that is random frequency drift of neutral mutants in finite populations.

Selective Differences

The crucial question of course is whether or not selective differences between the different phenotypes which make up the enzyme and protein polymorphisms actually exist. Such selective differences if they occur should in general be manifest by differences between the phenotypes in the extent to which they contribute offspring to the next generation. The differences might arise for example because the phenotypes differ in their relative viabilities in early life or in their relative fertilities when adult.

In its simplest form Kimura's hypothesis appears at first sight to be incorrect, since there seems no doubt that at least certain polymorphisms involve significant differences in the extent to which different phenotypes contribute to the next generation. The best known example is the sickle-cell haemoglobin polymorphism in African populations (Allison, 1964). Here it has been shown that the heterozygotes are less susceptible than other individuals to the more serious consequences of malaria, so that in conditions where malaria is endemic they have a better chance of surviving to adult life and so becoming parents. The selective difference that this confers is evidently the reason for the high incidence that this particular mutant has come to assume in Africa, despite its lethal effects in homozygotes who develop the severe disease, sickle-cell anaemia. Another probable example of such a selective differential is provided by the glucose-6-phosphate dehydrogenase polymorphism in Mediterranean populations. Here certain genotypes are predisposed to develop the disease favism and this must limit at least to some extent their average contribution to the next generation. The effect is thought to be balanced as in the sickle-cell case by a better viability of heterozygotes under malarial conditions (Luzzatto, Usanga, and Reddy, 1969).

However, it would I think be argued by Kimura and his colleagues that these are special cases and that their occurrence does not necessarily destroy the general argument that most enzyme polymorphisms are selectively neutral. And it is true that in the majority of the enzyme and protein polymorphisms that are known to occur in man and in other species there is as yet little direct evidence to suggest that the several genotypes differ in either viability or effective fertility.

However, selectionists can reasonably contend that only extremely small selective differences would have been needed to produce and subsequently maintain most polymorphisms, and that the selective differences which can be detected in such cases as the sickle-cell and glucose-6-phosphate dehydrogenase polymorphisms simply represent extreme examples of the general rule. In order to detect directly the very small selective differences which might in most cases be expected it would certainly in humans be necessary to carry out population and demographic studies of a very considerable size and complexity, and so far only a few relatively limited investigations of this sort have been attempted. The same general problem arises in analogous studies in naturally occurring animal populations. Thus, in general, the absence of direct evidence for viability or fertility differences between individuals of the several phenotypes in most polymorphisms, though consistent with the hypothesis of neutrality, is not at present particularly strong evidence in its favour.

An important source of difficulty in such investigations is that in general one would expect that selection will be directed at complex phenotypic characteristics dependent on many different enzymes and proteins acting together. It may therefore be necessary to compare the relative fitnesses of multiple combinations of phenotypes of different enzymes and proteins, rather than rely simply on comparisons between the common phenotypes for each enzyme or protein taken separately. This is likely to enhance greatly the difficulty of obtaining unambiguous evidence for selective differences in terms of viability or effective fertility.

Functional Differences

However, even small selective differences in viability or fertility if they occur must be the consequence of functional differences between the particular enzymes or proteins determined by the common alleles involved in a polymorphism. Thus, while on the neutralist hypothesis the enzyme products of the common alleles should be functionally the same, on the selectionist hypothesis they should be in some degree functionally different.

One way of trying to assess such possible functional differences is to measure the level of activity of the enzyme in the different phenotypes. In theory a very large number of different variants of a particular enzyme protein may be generated by separate mutations and one might expect that while some of them will not be associated with any change in activity, others will show such a change. So one can argue that on the neutralist hypothesis the enzyme variants that make up the common polymorphisms should usually be variants which show no change in activity, whereas on the selectionist view differences in activity should frequently be found.

In fact differences in enzyme activity have been demonstrated in a number of polymorphisms originally detected by electrophoresis. Red-cell acid phosphatase in man is a good example. The enzyme appears to be peculiar to the red cell and differs in its structure and properties from the acid phosphatases found in other tissues. In European populations several electrophoretically distinct phenotypes are seen and it has been shown that they are due to the occurrence of three common alleles determining structurally distinct forms of the enzyme protein (Hopkinson, Spencer, and Harris, 1963 and 1964). But the differences in structure which cause the electrophoretic differences also result in differences in activity. On average the level of acid phosphatase activity in red cells of homozygous type B individuals is about 50% greater than in type A individuals, and the heterozygous type BA individuals show intermediate levels. Similar differences are found in the other types.

The precise role of this particular acid phosphatase in red cell metabolism is not known, and there are no obvious differences in health or viability between individuals of the different phenotypes. But it would seem rash to assume that the differences in activity are not reflected in some way in red cell metabolism and that they therefore play no part at all in determining the biological fitness of individuals of the different types.

Similar activity differences have been demonstrated in other enzyme polymorphisms originally detected electrophoretically. Besides this there are a number of polymorphisms which were originally identified precisely because of such quantitative differences. I have attempted to list all the enzyme polymorphisms so far detected in major human populations in the Table and to classify them according to whether or not quantitative differences in activity have been demonstrated between the common phenotypes. Out of the 23 gene loci involved, evidence for quantitative differences has been obtained in 16. In most of the other cases the point has not yet been closely examined. So such quantitative differences appear to be a not uncommon phenomenon among the enzyme polymorphisms.

TABLE
ENZYME POLYMORPHISMS IN MAN

Polymorphisms where Quantitative Differences between the Common Phenotypes have been Found:
    Glucose-6-phosphate dehydrogenase
    Red-cell acid phosphatase
    Phosphogluconate dehydrogenase
    Adenylate kinase
    Placental alkaline phosphatase
    Peptidase A
    Peptidase C
    Galactose-1-phosphate uridyl transferase
    Glutathione reductase
    Liver acetyl transferase
    Red cell NADase
    Salivary amylase
    Serum cholinesterase
        Locus E1
        Locus E2
    Alcohol dehydrogenase
        Locus ADH2
        Locus ADH3

Polymorphisms where as yet no Quantitative Differences have been Reported:
    Adenosine deaminase
    Peptidase D (prolidase)
    Pancreatic amylase
    Pepsinogen
    Glutamate-oxalate transaminase
    Phosphoglucomutase
        Locus PGM1
        Locus PGM3

It should be noted that enzyme activity determined by assay in vitro using standard techniques is usually likely to be a very crude and somewhat insensitive indicator of activity in vivo, because the assay has generally to be carried out with quite unphysiological substrate concentrations and in an ionic environment often very different from that obtaining intracellularly. So subtle differences which may be important in vivo may often be missed.

The finding that activity differences are by no means uncommon tends to favour the view that the polymorphisms are the consequence of natural selection rather than of random drift. However, the argument is far from conclusive because in the absence in most cases of any clear understanding of how such differences might affect the organism as a whole, it can still be argued that the quantitative differences observed may cause no significant selective differential so that the alleles concerned are effectively neutral.

The question of possible functional differences between the structurally distinct versions of particular proteins such as cytochrome C and haemoglobin in different species is also important to the argument, since in its simplest form the neutral mutation-random drift hypothesis of molecular evolution implies that the amino-acid differences which they exhibit do not have any significant effect on their functional properties. In the case of cytochrome C the failure to observe significant differences in reaction rates of bovine cytochrome C oxidase with the cytochrome Cs obtained from a variety of species and differing from bovine cytochrome C by up to 27 amino acids out of a sequence of 104 (L. Smith cited by Margoliash, Barlow, and Byers, 1970) seemed to argue in favour of the view that these amino-acid substitutions were neutral in their effects. But more recently evidence has been obtained for differences in their ion binding properties which may be relevant to mitochondrial ion transport, and possibly of important functional and selective significance (Margoliash et al, 1970).

In the case of haemoglobin, functional activity as expressed for example in its oxygen affinity depends both on the structure of the protein and on the particular ionic environment in the red cell which may also differ from species to species. An interesting example which illustrates the difficulty of assuming that the structural differences in the protein are of no functional significance is provided by the demonstration that the oxygen affinity of sheep haemoglobin freed from small molecules is similar to that of human haemoglobin only when the latter is complexed with 2,3-diphosphoglycerate, a substance which is virtually absent from the red cells of ungulates, but which is present in most other animals including man where it has been shown to be of major importance in the regulation of oxygen release by haemoglobin (Benesch, Benesch, and Yu, 1968; Benesch and Benesch, 1969).

In its simplest form Kimura's hypothesis would suggest that if one could entirely replace the haemoglobin present in human red cells by haemoglobin obtained from say the mouse or the rabbit, or even the carp, no functional disability would result. Such an experiment is not of course feasible, but essentially the same kind of comparison may be provided in the particular cases where the long-term evolutionary products of a gene duplication occur together in a single species. For example Kimura (1969) has argued that the rate of amino-acid substitution in the evolution of the α and β chains of human haemoglobin since they originated from a common ancestral gene by duplication has been the same as the rates of amino-acid substitution occurring in the subsequent divergence of evolutionary lines leading to structurally different α chains in different species, and also in the lines leading to the structurally different β chains. This is considered as strong evidence for the view that the various amino-acid differences between these polypeptide chains are the consequence of neutral mutations. If so one might have expected that the α and β chains of haemoglobin would be functionally interchangeable. That this is not the case however is readily seen from a comparison of the properties of the normal Hb A tetramer α₂β₂ and the abnormal tetramer Hb H which has four normal β chains but no α chain (Jones et al, 1959) and which is formed in individuals in whom α chain synthesis is specifically restricted (Weatherall, 1965). Hb H is very much less stable than Hb A and readily precipitates. It has an oxygen affinity 10 times greater than that of Hb A, and its oxygen dissociation curve shows no evidence of haem-haem interaction (Benesch et al, 1961). This and other evidence makes it clear that the molecular evolution of the α and β chains has been shaped to a significant degree by natural selection, and that the emergence of the tetrameric form of haemoglobin with two α and two β chains is not purely a fortuitous consequence of neutral mutations and random drift.

Slightly more than 50% of the homologous sites in the human α and β chains differ in their amino-acid substituents and in addition there are several sites represented in one chain but not in the other which are presumably the consequence of the fixation of mutations causing deletion or additions to the ancestral DNA sequences. It remains to be determined to what extent the specific differences at the various sites contribute to the marked overall difference in the functional properties of the two chains. If it should turn out that only a small proportion of the substitutions were significant in this respect then the neutral mutant-random drift hypothesis would not be too seriously eroded. But most protein chemists would probably regard this result as unlikely.

The β and γ haemoglobin chains which are presumed to have derived from a somewhat more recent gene duplication raise another interesting point.

However, although the arguments which have been put forward are still in many respects inconclusive, they have served to clarify some of the key issues involved in the controversy and have exposed a number of quite basic problems which had tended to become somewhat neglected.

Clearly much more extensive and critical data needs to be collected, and there is obviously also a need for the development of new and more sensitive methods for investigating from the functional point of view both the enzyme and protein polymorphic variation within species, and the differences between species.

REFERENCES

Allison, A. C. (1964). Polymorphism and natural selection in human populations. Cold Spring Harbor Symposia on Quantitative Biology, 29, 137-149.
They are both formed in the normal individual but Benesch, R. and Benesch, R. E. (1969). Intracellular organic at different times. Synthesis of y chains occurs phosphates as regulators of oxygen release by haemoglobin. Nature, 221, 618-622. in fetal life and gives place to ,B chain synthesis at Benesch, R., Benesch, R. E., and Yu, C. I. (1968). Reciprocal the end of gestation, so that while the principal binding of oxygen and diphosphoglycerate by human . Proceedings of the National Academy of Sciences of the United States haemoglobin in the red cells of the fetus is Hb F of America, 59, 526-532. (a2y2) in postnatal life it is Hb A (cr2#2). The two Benesch, R. E., Ranney, H. M., Benesch, R., and Smith, G. M. chains differ in about one quarter of their amino- (1961). The chemistry of the Bohr effect. II. Some properties of hemoglobin H. Journal of Biological Chemistry, 236, 2926- acid substituents. If these differences are all to 2929. be attributed to neutral substitutions, then it would Boyer, S. H., Crosby, E. F., Thurmon, T. F., Noyes, A. N., Fuller, G. F., Leslie, S. E., Shepard, M. K., and Herndon, C. N. (1969). seem necessary to conclude that the development A and A2 in New World primates: comparative shift from y to , chain synthesis at birth is also variation and its evolutionary implications. Science, 166, 1428- copyright. fortuitous and of no functional 1431. significance. Boyer, S. H., Hathaway, P., Pascasio, F., Bordley, J., Orton, C., and Naughton, M. A. (1967). Differences in the amino acid sequences Concluding Remarks of tryptic peptides from three sheep hemoglobin chains. Journal of Biological Chemistry, 242, 2211-2232. The neutral mutation-random drift explanation Clarke, B. (1970a). Selective constraints on amino-acid substitu- of polymorphism and protein evolution brings to- tions during the evolution of proteins. Nature, 228, 159-160. Clarke, B. (1970b). Darwinian evolution of proteins. 
Science, gether areas of biology as diverse as the mathe- 168, 1009-1011.

matical theory of population genetics, the nature Fitch, W. M. and Margoliash, E. (1967). Construction of phylo- http://jmg.bmj.com/ of structure-function of and genetic trees. Science, 155, 279-284. relationships enzymes Fitch, W. M. and Markowitz, E. (1970). An improved method for proteins, the interpretation of field studies of determining codon variability in a gene and its application to the polymorphic variation, and the significance of rate of fixation of mutations in evolution. Biochemical Genetics, 4, 579-593. species differences. It calls for a new look at many Harris, H. (1966). Enzyme polymorphisms in man. Proceedings of long held concepts in each of these different fields, the Royal Society. 164B, 298-310. Harris, H. (1969). Enzyme and protein polymorphism in human and not surprisingly it has provoked considerable populations. British Medical Bulletin, 25, 5-13. controversy. Harris, H. (1970). The Principles of Human Biochemical Genetics. A variety ofother arguments besides those already North-Holland, Amsterdam. on September 27, 2021 by guest. Protected Hopkinson, D. A. and Harris, H. (1971). Recentwork on isozymes in mentioned have been deployed both for and against man. Annual Review of Genetics, 5. (In press.) the hypothesis. See for example, Boyer et al Hopkinson, D. A., Spencer, N., and Harris, H. (1963). Red cell acid phosphatase variants: a new human polymorphism. Nature, (1969); King and Jukes (1969); Clarke (1970a 199,969-971. and b); Smith (1970a and b); Richmond Hopkinson, D. A., Spencer, N., and Harris, H. (1964). Genetical (1970); Fitch and studies on human red cell acid phosphatase. American Journal of Markowitz (1970); Jukes and Human Genetics, 16, 141-154. King (1971); Kimura and Ohta (1971). These Hubby, J. L. and Lewontin, R. C. (1966). A molecular approach to arguments cite phenomena of many different kinds the study of genic heterozygosity in natural populations. I. 
The number of alleles at different loci in Drosophila pseudoob- derived from a wide range of species. But critical scura. Genetics, 54, 577-594. data by which the hypothesis can be assessed is still Jukes, T. H. and King, J. L. (1971). Deleterious mutations and neutral substitutions. Nature, 231, 114-115. very limited, and it is often far from clear whether a Jones, R. T., Schroeder, W. A., Balog, J. E., and Vinograd, J. R. particular example should be regarded as a special (1959). Gross structure of hemoglobin H. Journal of the case or as something which illustrates a general American Chemical Society, 81, 3161. Kimura, M. (1968). Evolutionary rate at the molecular level. rule. Nature, 217, 624-626. J Med Genet: first published as 10.1136/jmg.8.4.444 on 1 December 1971. Downloaded from 452 Harry Harris Kimura, M. (1969). The rate of molecular evolution considered Prakash, S., Lewontin, R. C., and Hubby, J. L. (1969). A molecu- from the standpoint of population genetics. Proceedings of the lar approach to the study of genic heterozygosity in natural National Academy of the United States of America, 63, 1181-1188. populations. IV. Patterns of genic variation in central, marginal Kimura, M. and Ohta, T. (1971). Protein polymorphism as a phase and isolated populations of Drosophila pseudoobscura. Genetics, of molecular evolution. Nature, 229, 467-469. 61, 841-858. King, J. T. and Jukes, T. H. (1969). Non-Darwinian evolution. Richmond, R. C. (1970). Non-Darwinian evolution: a critique. Science, 164, 788-798. Nature, 225, 1025-1028. Kirkman, H. N. (1971). Glucose-6-phosphate dehydrogenase. Ruddle, F. H., Roderick, T. H., Shows, T. B., Weigl, P. G., Advances in Human Genetics, 2. Plenum Press, New York. (In Chipman, R. K., and Anderson, P. K. (1969). Measurement of press.) genetic heterogeneity by means of enzyme polymorphisms. Lehmann, H. and Carrell, R. W. (1969). Variations in the structure Journal of , 60, 321-322. of human haemoglobin. 
British Medical Bulletin, 25, 14-23. Selander, R. K., Hunt, W. G., and Yang, S. Y. (1969). Protein polymorphism and genic heterozygosity in two European sub- Lewontin, R. C. and Hubby, J. L. (1966). A molecular approach to species of the house mouse. Evolution, 22, 379-390. the study of genic heterozygosity in natural populations. II. Selander, R. K. and Yang, S. Y. (1969). Protein polymorphism and Amount of variation and degree of heterozygosity in natural genic heterozygosity in a wild population of the house mouse populations of Drosophila pseudoobscura. Genetics, 54, 595-609. (Mus musculus). Genetics, 63, 653-667. Luzzatto, L., Usanga, E. A., and Reddy, S. (1969). Glucose-6- Selander, R. K., Yang, S. Y., Lewontin, R. C., and Johnson, W. E. phosphate dehydrogenase deficient red cells: resistance to infec- (1970). in the horseshoe crab (Limulus poly- tion by malarial parasites. Science, 164, 839-842. phemus), a phylogenetic 'relic'. Evolution, 24,402-414. Margoliash, E., Barlow, G. H., and Byers, V. (1970). Differential Smith, J. M. (1970a). The causes of polymorphism. Symposia of binding properties of cytochrome C: possible relevance for the Zoological Society ofLondon, No. 26, 371-383. mitochondrial ion transport. Nature, 228, 723-726. Smith, J. M. (1970b). Population size polymorphism and the rate Nolan, C. and Margoliash, E. (1968). Comparative aspects of ofnon-Darwinian evolution. American Naturalist, 104,231-237. primary structures of proteins. Annual Review of Biochemistry, Weatherall, D. J. (1965). The Thalassaemia Syndromes. Blackwell 37, 727-790. Scientific, Oxford. copyright. http://jmg.bmj.com/ on September 27, 2021 by guest. Protected