<<

Copyright  2001 by the Genetics Society of America

Evidence for a Period of Directional Selection Following Duplication in a Neurally Expressed Locus of Triosephosphate Isomerase

T. J. S. Merritt and J. M. Quattro Department of Biological Sciences, Program in Marine Science, Baruch Institute and School of the Environment, University of South Carolina, Columbia, South Carolina 29208 Manuscript received May 15, 2001 Accepted for publication July 9, 2001

ABSTRACT A striking correlation between neural expression and high net negative charge in some teleost isozymes led to the interesting, yet untested, suggestion that negative charge represents an adaptation (via ) to the neural environment. We examine the of the triosephosphate isomerase (TPI) gene family in fishes for periods of positive selection. Teleost fish express two TPI , including a generally expressed, neutrally charged isozyme and a neurally expressed, negatively charged isozyme; more primitive fish express only a single, generally expressed TPI isozyme. The TPI gene phylogeny constructed from sequences isolated from two teleosts, a single acipenseriform, and other TPI sequences from the databases, supports a single event early in the evolution of bony fishes. Compari- sons between inferred ancestral TPI sequences indicate that the neural TPI isozyme evolved through a period of positive selection resulting in the biased accumulation of negatively charged amino acids. Further, the number of nucleotide changes required for the observed amino acid substitutions suggests that selection acted on the overall charge of the and not on specific key amino acids.

ISSUE-SPECIFIC isozymes, multiple molecular forms Morizot and Schmidt 1990). Most neural isozymes Tof a specific enzyme within individuals, are funda- have not been as well characterized biochemically as mental components of cellular diversity, representing LDH-C, but a striking number have a high net negative specialization of proteins for specific functions and charge (e.g., Burger et al. 1963a,b; Shaklee et al. 1973; distinct cellular environments. For this reason, re- Champion and Whit 1976; Fisher and Whitt 1978; searching isozyme evolution provides insight into the Fisher et al. 1980). Fisher et al. (1980) hypothesized mechanisms through which biochemical and biological that high net negative charge might represent an adap- complexity has evolved. Several authors hypothesize tation to the unique biochemistry of the neural environ- that functional specialization within gene families is the ment. result of natural selection (e.g., Li 1983; Ohta 1991; Recently, positive selection has been shown to have Hughes 1994), and the unique biochemical properties been involved in adaptive radiation within a number of of highly specialized isozymes within gene families cir- gene families (e.g., Tanaka and Nei 1989; Hughes and cumstantially support this selection hypothesis. For ex- Hughes 1993; Zhang et al. 1998; Duda and Palumbi ample, within the lactate dehydrogenase (LDH: EC 1999; Bishop et al. 2000; Willett 2000). In these cases, 1.1.1.27) gene family, Ldh-C results from a relatively gene duplication was shown to be followed by a period of recent duplication of the Ldh-B locus in teleost fishes positive selection, a reflection of functional divergence (Quattro et al. 1993). Although the duplication was within the gene family. Functional divergence of the relatively recent, LDH-C is expressed predominantly in neural, and generally expressed, fish LDH isozymes sug- the eye and brain and has pH optima, reaction kinetics, gests that they might have evolved through a similar and thermal stability distinct from LDH-B (Whitt 1970; period of positive selection, perhaps subsequent to gene Shaklee et al. 1973). These unique attributes possibly duplication. This would be especially interesting in that represent adaptation of LDH-C to expression in the all examples of gene duplication and positive selection distinct, highly aerobic, neural environment (Whitt to date have involved reproductive isolation mechanisms, 1970). ligand binding, or recognition of non-self. If positive Neural isozymes such as the teleost LDH-C are widely selection was involved in the functional divergence of distributed among vertebrates (e.g., Penhoet et al. 1966; the neural isozymes this would be the first example re- Fisher et al. 1980; Marangos and Schmechel 1987; ported involving duplication, and specialization to a uni- que environment within an organism, of a housekeeping protein. Triosephosphate isomerase (TPI: E.C. 5.3.1.1) is a di- Corresponding author: Joseph M. Quattro, Department of Biological Sciences, University of South Carolina, Columbia, South Carolina meric glycolytic enzyme that catalyzes the interconver- 29208. E-mail: [email protected] son of dihydroxyacetone phosphate and glyceraldehyde-

Genetics 159: 689–697 (October 2001) 690 T. J. S. Merritt and J. M. Quattro

3-phosphate. Two TPI proteins are expressed in teleost directional selection and the specific accumulation of fish: a neutrally charged, generally expressed isozyme negatively charged amino acids following the TPI gene and a negatively charged neural isozyme (Pontier and duplication. Hart 1981; Morizot and Schmidt 1990); all other jawed vertebrates (gnathostomes), including the sar- copterygian fish, the coelocanth, express a single TPI MATERIALS AND METHODS protein (Kolb et al. 1974; Maquat et al. 1985; Straus TPI cloning and sequencing: Acipenser brevirostrum (shortnose and Gilbert 1985; Old and Mohrenweiser 1988; sturgeon) tissues were obtained from R. Chapman, South Car- Cheng et al. 1990; Kuraku et al. 1999). In addition, olina Department of Natural Resources (Charleston, SC); Xi- only a single TPI is expressed in the lancelet, a primitive phophorus maculatus (southern platy) and Danio rerio (zebra- fish) samples were purchased from local pet stores. Tissues chordate (Nikoh et al. 1997), and in two agnathid verte- were dissected from single fish and processed immediately or brates, lamprey and hagfish (Kuraku et al. 1999). Sev- stored at Ϫ70Њ. Total RNA was purified from tissue using a eral phylogenetic studies of TPI sequences suggest that commercially available kit (RNAeasy, QIAGEN, Chatsworth, a TPI gene phylogeny is recoverable and likely well CA). Complementary DNA (cDNA) was synthesized from total supported (Nikoh et al. 1997; Kuraku et al. 1999). The RNA according to the manufacturer’s instructions (Superscript Preamplification System, GIBCO BRL, Gaithersburg, MD). apparent simplicity of the TPI gene family makes it an Degenerate oligonucleotide primers were designed from attractive system with which to investigate the role of an alignment of vertebrate TPI sequences in GenBank: selection in the evolution of neurally expressed iso- TPI14F, 5Ј-GTN GGN GGN AAY TGG AAR ATG-3Ј; TPI16F, zymes. 5Ј-GGN AAY TGG AAR ATG AAY GG-3Ј; TPI225R, 5Ј-CAN Ј Ј Recognition that gene families evolve episodically, with ARR AAN CCR TCN AYR TC-3 ; and TPI235R, 5 -WWN TCN ACR AAY TCN GGY TT-3Ј. periods of positive selection followed by periods of puri- Degenerate positions are represented by the following ambi- fying selection, has been crucial to the demonstration guity codes: N ϭ A, G, C, T; Y ϭ C, T; R ϭ A, G; W ϭ A, T. of the role of selection in the diversification of gene Numbers refer to the amino acid position populated by the families. Positive selection can result in the rate of non- 3Ј base of each primer; a preliminary ClustalW (Thompson exceeding that of synonymous et al. 1994) alignment of protein sequences from GenBank was used as a reference. substitution, whereas purifying selection generally re- Combinations of these four oligonucleotide primers were sults in the rate of synonymous substitution exceeding used to amplify a 681- to 687-bp segment of the TPI cDNA the nonsynonymous rate. The period of positive selec- by the polymerase chain reaction (PCR; Saiki et al. 1988). PCR tion following gene duplication may be relatively brief, was carried out for 40 cycles under the following conditions: Њ Њ and if gene sequences are compared across too long a denaturation at 95 for 1 min, annealing at 48 for 1 min, and extension at 72Њ for 1 min. In some cases nested reamplifica- period of evolutionary time, evidence for positive selec- tions were necessary to obtain sufficient PCR product for clon- tion might be obscured by subsequent substitutions ing. In these cases, a sample of the first amplification product (Hughes 1994). Given that many gene duplications are (using primers 14F and 235R) was diluted 1:100 and used as quite old, comparison of modern gene sequences might template for a secondary amplification (using primers 16F and not detect evidence for positive selection. Comparisons 225R) under identical PCR conditions. PCR products were cloned into pGEM T-vector (Promega, Madison, WI) and se- of inferred ancestral sequences, internal nodes on a quenced manually. Dideoxy DNA sequencing was performed gene tree that bracket a period of interest, circumvent with Sequenase (USB Biochemicals) and [35S]dATP. At least this problem by isolating periods of evolutionary time three independent clones per PCR fragment were sequenced (e.g., Messier and Stewart 1997; Zhang et al. 1998). on both strands. This approach requires that taxa be selected such that Individual TPI loci were targeted for PCR amplification by exploiting the tissue-specific expression of each isozyme (Pon- ancestral sequences tightly bracket the period of evolu- tier and Hart 1981; Morizot and Schmidt 1990). TPI-B was tionary interest. amplified from liver cDNA from both zebrafish and platy. TPI- Any changes in selective pressures acting on the two A and TPI-B were amplified from zebrafish eye or ovary cDNA. TPI isozymes presumably occurred directly following Sequence analysis of individual clones allowed us to distin- duplication of the Tpi locus. The presence of a second guish the two loci. Cloning of degenerate PCR products from platy eye, brain, or ovary cDNA yielded only TPI-B clones. Tpi locus in teleost fishes, and only in teleost fishes However, direct sequencing of amplification products from despite the wide variety of chordates that have been platy eye or ovary cDNA allowed portions of the TPI-A se- surveyed, strongly suggests that the two fish isozymes quence to be read as “background,” secondary bands on the are the products of a gene duplication sometime during sequencing gel, behind the strong signal of the TPI-B se- the radiation of higher fishes. To bracket this period quence. TPI-A specific primers were designed from these re- gions. The extreme 5Ј and 3Ј coding and untranslated regions in the evolution of the TPI locus we have sequenced of each cDNA was amplified using the method of rapid ampli- the entire coding region of the single TPI gene from fication of cDNA ends (RACE; Frohman 1990). RACE ampli- the shortnose sturgeon, a primitive actinopterygian fish, fications used gene-specific primers designed from the initial and two TPI , each from two teleost species, the gene fragment. Conditions for the PCR, cloning, and screen- zebrafish and the southern platy. Comparisons of ances- ing were as described above. Isoelectric point (pI) values were calculated for each predicted amino acid sequence using the tral TPI nucleotide and amino acid sequences con- EMBL isoelectric point service (http://www.embl-heidelberg. structed using these sequences and other vertebrate TPI de/cgi/pi-wrapper.pl). sequences from the data banks indicate a period of Phylogenetic analyses: In addition to the five fish TPI se- Directional Selection on Teleost Neural Isozymes 691 quences reported here, the following gnathostome sequences tides of 3Ј untranslated sequence. The cDNA open read- were obtained from GenBank and used in phylogenetic ing frame codes for a putative 248-amino-acid protein analyses: (GenBank accession no. M10036), rhesus mon- key (X08023), rat (L36250), mouse (L31793), and chicken with a predicted pI of 6.5 (Figure 1). PCR products (M11941). A previous analysis of TPI gene evolution revealed from zebrafish eye cDNA included two cDNAs: the first no evidence for a gene duplication prior to the divergence identical to that found in liver samples and the second of agnathid and gnathostome vertebrates; thus the single TPI consisting of 1028 nucleotides, including an ATG start cDNAs isolated from agnathans (lamprey, AB025327; hagfish, codon, a TGA stop codon, 79 nucleotides of 5Ј untrans- AB025322) were used to root phylogenetic trees describing Ј TPI evolution in gnathostome vertebrates. lated sequence, and 202 nucleotides of 3 untranslated Amino acid sequences were aligned using the ClustalW mul- sequence. The open reading frame of this second cDNA tiple alignment program (Thompson et al. 1994). Minor ad- codes for a 248-amino-acid protein with a predicted pI justments were made to the alignment manually, resulting in of 4.7 (Figure 1). Both sequences are similar to TPI an overall alignment of 246-amino-acid positions. Nucleotide sequences in GenBank. On the basis of the tissues from sequences were then aligned using the amino acid alignment as a guide. A phylogeny was constructed by the neighbor- which they were isolated, the cDNAs were tentatively joining (NJ) method (Saitou and Nei 1987) using the algorithm identified as, respectively, the TPI-B and TPI-A proteins implemented in PAUP* (version 4.0 b2; Swofford 1999). Posi- (herein referred to as DrTPI-B and DrTPI-A) reported tions in which alignment forced gaps within the sequences were by Pontier and Hart (1981). The predicted pI values removed because of uncertain homology. Missing data were re- for both proteins are consistent with Pontier and Hart’s stricted to the outgroup (agnathans); thus these sites were retained and treated as missing data in phylogenetic analyses. (1981) observation of a neutrally charged, generally Ingroup (gnathostome) relationships were estimated first and expressed isozyme and a negatively charged neural iso- this topology used as a backbone constraint in subsequent zyme. analyses (e.g., see Swofford et al. 1996). To confirm that the two cDNAs we characterized cor- Pairwise distances were calculated from the first and second responded to the tissue-specific isozymes reported by position nucleotides using the Kimura two-parameter model (Kimura 1980). Pairwise proportional distances were calcu- Pontier and Hart (1981) we designed oligonucleotide lated from the amino acid sequences. Bootstrapping (Felsen- primers specific for each gene and used RT-PCR ampli- stein 1985) was used to evaluate the degree of support for fication to assay for the presence of message across a particular groupings in the NJ analysis. variety of tissues in zebrafish. Primers specific for the Test for positive selection: Ancestral sequences were recon- DrTPI-B gene amplified a product from every tissue exam- structed using a distance-based method (Zhang et al. 1997) by first inferring the amino acid sequences and then inferring ined (eye, brain, ovary, liver, gut, and muscle), whereas the nucleotide sequences under the restriction of the ancestral primers specific for the DrTPI-A gene amplified a prod- amino acid sequences. Reconstructions used the ANC-GENE uct only from eye, brain, and ovary (PCR products were computer program (Zhang et al. 1998) with the Jones, Taylor, sequenced and verified as DrTPI-B or DrTPI-A). This pat- and Thorton (JTT; Jones et al. 1992) model of amino acid tern of message expression matches the pattern of protein substitution. Only the TPI sequences from the jawed verte- brates were used in the sequence reconstructions because of expression reported by Pontier and Hart (1981). missing data in the lamprey and hagfish sequences. The first PCR products from southern platy tissue samples were three amino acids of each sequence could not be inferred similar to those from zebrafish. A single cDNA, con- because of gaps in the ClustalW alignment of extant TPI se- sisting of 874 nucleotides, including an ATG start codon, quences. TAA stop codon, 45 nucleotides of 5Ј untranslated se- The reconstructed sequences were used to compute the Ј number of synonymous (s) and nonsynonymous (n) substitu- quence, and 85 nucleotides of 3 untranslated sequence, tions per branch using proportional differences as imple- was isolated from liver samples (hereafter XmTPI-B). The mented in the BN-BS computer program (Zhang et al. 1998). cDNA open reading frame codes for a putative 247- Reconstruction of ancestral sequences and estimation of the amino-acid protein with a predicted pI of 7.7 (Figure number of substitutions using a Poisson model for amino acid 1). Sequence analysis of PCR products from platy eye substitution, the Jukes-Cantor model of distance estimation (Jukes and Cantor 1969), or a combination of either of these revealed two cDNAs: the first identical to that found in methods with the JTT model or proportional distances yielded liver samples and the second consisting of 929 nucleo- results essentially identical to those reported here. The num- tides, including an ATG start codon, TAA stop codon, ber of potential nonsynonymous sites (N) and synonymous 90 nucleotides of 5Ј untranslated sequence, and 95 nu- sites (S) were also calculated for the TPI sequences using cleotides of 3Ј untranslated sequence (hereafter XmTPI- the BN-BS program. The ratio of transitions to transversions, required for calculation of s, n, S, and N, was calculated using A). The open reading frame of this second cDNA codes MEGA (version 1.0; Kumar et al. 1993). for a 247-amino-acid protein with a predicted pI of 4.4 Sequences reported in this article have been submitted to (Figure 1). Again, both sequences show high similarities GenBank (accession nos: AF387818–AF387822). to TPI sequences in GenBank, and the predicted pI values of the two platy TPI proteins are consistent with previous observations (Morizot and Schmidt 1990) of RESULTS a neutrally charged, generally expressed isozyme and a Cloning: PCR products isolated from zebrafish liver negatively charged neural isozyme. cDNA consistently yielded a cDNA of 897 nucleotides, We used the same strategy that we used in confirming including an ATG start codon, TAA stop codon, 80 the identity of the two zebrafish TPI cDNAs to confirm nucleotides of 5Ј untranslated sequence, and 70 nucleo- that the two cDNAs we characterized from platy corre- 692 T. J. S. Merritt and J. M. Quattro

Figure 1.—Amino acid se- quences for the fish TPI genes reported here and amino acid sequences for the inferred an- cestral teleost (T), ancestral TPI-A (A), and ancestral TPI-B (B) genes. Negative amino acids in the neural proteins (ex- tant and inferred) are shown in boldface. Protein surface amino acids are shaded. Amino acids were considered to be on the protein surface if they had an accessible surface area Ն1 A˚ 2 as determined by the Surface computer program (Collabor- ative Computational Project 1994) using the chicken TPI pro- tein as a reference structure. Predicted pI values are shown at the 3Ј end of each protein.

sponded to the tissue-specific isozymes reported by Mor- ing an ATG start codon, TGA stop codon, 38 nucleotides izot and Schmidt (1990): we designed oligonucleotide of 5Ј untranslated sequence, and 216 nucleotides of 3Ј primers specific for each gene and used RT-PCR ampli- untranslated sequence. The cDNA open reading frame fication to assay for the presence of message across a codes for a 249-amino-acid protein (herein referred to variety of tissues in platy. Primers specific for the XmTPI- as AbTPI; Figure 1). Interestingly, sequence analysis of B gene amplified a product from every tissue examined several sturgeon 3Ј RACE products revealed a second (eye, brain, ovary, liver, gut, and muscle), whereas prim- TPI cDNA. We sequenced 144 bp of the 3Ј end of coding ers specific for the XmTPI-A gene amplified a product region of this cDNA (including the stop codon) and only from eye, brain, and ovary (PCR products were 100 bp of the 3Ј untranslated sequence. The coding sequenced and verified as XmTPI-B or XmTPI-A). This regions of the two sequences differ by 5% at the nucleo- pattern of message expression matches the pattern of tide level (7 changes in 133 bp) with no nonsynonymous TPI message expression we found in zebrafish and the changes (including termination codons). The 3Ј un- pattern of protein expression reported by Morizot and translated sequences, however, differ by 50%, implying Schmidt (1990). that the gene sequences are not likely alleles at the same PCR products from shortnose sturgeon samples con- locus, but represent separate loci. In contrast, over this tained one cDNA consisting of 1063 nucleotides, includ- same portion of the coding region, the two zebrafish Directional Selection on Teleost Neural Isozymes 693

phyly of the four teleost TPI sequences and, within this group, the two TPI-A and two TPI-B proteins. Bootstrap analysis provided no support for grouping either the two zebrafish TPI genes or the two platy genes (these sequences did not group in any of the bootstrap trees). This indicates that the pair of TPI sequences observed in teleost fish result from a gene duplication event that occurred sometime early in the radiation of ray-finned fishes, after the separation of the Acipenseriform and ,(mya; Grande and Bemis 1991 200ف) teleost fishes ;mya 100ف) but before the radiation of teleost fishes Patterson 1993). Rooting the ingroup taxa with the two agnathid se- quences places the root between actinopterygian fishes and tetrapods as would be expected, given currently held views on vertebrate phylogeny. Monophyly of the actinopterygian fishes (placement of the sturgeon TPI with the teleost TPI sequences), however, is only weakly supported using both nucleotide and amino acid char- acters. We are inclined to accept this topology in that other placements of the sturgeon TPI sequences within the TPI tree would require a greater number of gene duplication and loss events than required by this topol- ogy (see discussion). Figure 2.—Phylogenetic analysis of vertebrate TPI se- Reconstructing ancestral sequences and statistical quences. (A) Neighbor-joining phylogeny summarizing in- tests for selection: In the absence of selective pressure, ferred evolutionary relationships among vertebrate TPI se- the ratio of observed nonsynonymous changes (n)to quences. The tree was rooted with hagfish and lamprey TPI synonymous changes (s) will equal the ratio of potential sequences (see materials and methods). Numbers along each branch represent the proportion of 1000 bootstrap repli- nonsynonymous (N) and synonymous (S) changes cates supporting that node (analysis of TPI nucleotide se- (Zhang et al. 1997). Positive, diversifying selection can quences above, amino acid sequences below). (B) Estimated lead to an increase in n/s relative to N/S, whereas purify- number of synonymous and nonsynonymous changes along ing selection can lead to a decrease in n/s relative to branches of the TPI gene tree. Numbers above each branch N/S (Zhang et al. 1997). n and s were calculated directly represent the number of nonsynonymous (n) and synonymous (s) substitutions calculated for each branch of the tree, shown for each branch of the gene tree by direct pairwise as n/s. ** indicates the only comparison in which n/s is signifi- comparisons of all nodes using the methods of Zhang cantly greater than N/S (Fisher’s exact test, P ϭ 0.0005). T, et al. (1998). Ancestral sequences (internal nodes) were B, and A indicate the nodes corresponding to the ancestral calculated from extant sequences (terminal nodes). The single teleost TPI sequence, ancestral TPI-B sequence, and posterior probability of the inferred ancestral states ancestral TPI-A sequence, respectively; see text and Figure 1 for details. ranged from 98 to 99% for the amino acids and from 83 to 98% for the nucleotides. Using a ratio of transitions to transversions of 1.1 (calculated from the data set), N and S were calculated to be 536 and 196, respectively TPI sequences differ by 26% (35 changes) and the two (N/S ϭ 2.7). The n/s ratio exceeds 2.7 along three platy TPI sequences by 25% (33 changes). Shortnose branches of the tree (Figure 2B): the branch leading sturgeon are thought to be recent polyploids including to the ancestral single teleost gene (F-T branch), the a recent genome duplication event unique to the species branch leading to the ancestral B gene (T-B branch), (Blacklidge and Bidwell 1993), which explains the and the branch leading to the ancestral A gene (T-A close sequence similarity between the two cDNAs. Given branch). Using Fisher’s exact test, the difference be- the lack of amino acid differences between the sturgeon tween the n/s and N/S values was statistically significant TPI proteins, we have based our analysis of TPI evolution only along the T-A branch (P ϭ 0.005). on the single completely sequenced TPI coding region Four sites of the inferred T, B,orA sequences (amino from sturgeon. acids 27, 90, 194, and 242; Figure 1) were deemed am- Phylogenetic analyses: Neighbor-joining analyses of biguous because the most likely amino acid was less chordate TPI nucleotide sequences produce a tree (Fig- than twice as probable as the next most likely amino ure 2A) whose topology is consistent with previously acid (e.g., Zhang et al. 1998). If these four sites are reported TPI gene trees (Nikoh et al. 1997; Kuraku et excluded from the analysis, n/s still significantly exceeds al. 1999). Bootstrap analysis strongly supported mono- N/S (P ϭ 0.003) along the T-A branch. Additionally, 694 T. J. S. Merritt and J. M. Quattro

TABLE 1 served for the rest of the tree (10/7 vs. 121/21; P ϭ Number of amino acid substitutions that alter or 0.014); substitutions to a negative amino acid occur conserve a given amino acid property along significantly more often along the T-A branch than branches of the TPI phylogeny across the rest of the tree.

Reconstructed changes along branch(s) DISCUSSION Property T-A Tree P Gene phylogeny and selective pressures: Phylogenetic analysis of vertebrate TPI cDNA sequences confirms that Charge the TPI gene duplicated early in the evolution of teleost Conserve 12(10) 133(121) fish and indicates that the two TPI isozymes found in Alter 8(7) 21(21) 0.007(0.014) Polarity zebrafish and southern platy are products of this dupli- Conserve 15(13) 110(101) cation. Monophyly of the actinopterygian fish sequences Alter 5(4) 44(42) 0.484(0.426) (placement of the sturgeon TPI sequence with the tele- Size ost TPI sequences) is only weakly supported by both Conserve 14(11) 97(90) the TPI nucleotide and amino acid data sets (Figure Alter 6(6) 57(53) 0.363(0.556) 2A). Placement of the sturgeon TPI elsewhere in the Branches are as defined in Figure 1; tree refers to summed tree, however, would require a greater number of dupli- changes along all branches, excluding T-A. P is the result of cations and would lose events to reconcile the gene tree a Fisher’s exact test that the observed number of changes topology with currently accepted views on vertebrate within properties is homogeneous across branch designations. phylogeny. For example, placement of the sturgeon TPI Results of a similar analysis excluding four ambiguously recon- structed amino acid positions (see Figure 1 legend for details) outside of both tetrapod and teleost sequences would are shown in parentheses. require duplication of the TPI gene prior to the radia- tion of jawed vertebrates, a subsequent loss of one locus in the sturgeon lineage, either a single loss of the other ancestral sequences were reconstructed using alterna- locus prior to the divergence of teleosts and tetrapods tive placements of the sturgeon sequence (e.g., basal or two independent losses, one in teleosts and the other to all other gnathostome TPI sequences). Because the in tetrapods, and, finally, duplication of the TPI gene methods of Zhang et al. (1997) assume a basal polytomy, in teleosts. Other placements of the sturgeon sequence the reconstructed sequences, and n and s, were unaf- require similar numbers of additional duplication and fected, and inference of n and s, n/s still significantly loss events. Given that the topology shown in Figure exceeds N/S (P ϭ 0.005) along the branch leading to 2A requires the fewest number of duplication and loss the ancestral TPI-A. events to explain the pattern of expression of TPI across Nonrandom amino acid substitution: Different pat- taxa we are inclined to accept this topology, even though terns of amino acid substitution across two evolutionary bootstrap support for placement of the sturgeon se- periods suggest differences in selective pressures be- quence is low. In any event, reconstruction of ancestral tween those periods. Predicted amino acid sequences sequences (below) and inference of patterns of sequence (Figure 1) were used to compare the amino acid substitu- change over evolutionary time rely on an unrooted tree. tions along the T-A branch with those across the rest of Placement of the sturgeon TPI sequence as sister to the the TPI gene tree (Table 1). Amino acids were grouped teleost TPI sequences, with the tetrapod TPI sequences by physical and chemical properties (using the criterion basal to all other gnathostome sequences, does not change of Grantham 1974) to determine whether observed the results of this analysis. amino acid substitutions conserved or altered charge, LDH-C, the negatively charged LDH neural isozyme size, or polarity. Of the 20 amino acid changes between of teleost fishes, results from a gene duplication that nodes T and A, 8 result in a negatively charged amino occurred during roughly this same time period (Quat- acid. No changes along this branch resulted in positively tro et al. 1993, 1995; Stock et al. 1997). Studies of the charged amino acids. There are 154 amino acid changes evolution of the Hox (Amores et al. 1998) and Dlx across the rest of the TPI gene tree; 21 of these are to (Stock et al. 1996; Neidert et al. 2001) genes also sug- or from negative amino acids (data not shown). These gest duplications in both these gene families early in ratios, 12/8 and 133/21, are significantly different (Fish- the radiation of ray-finned fishes. The duplication of these er’s exact test, P ϭ 0.007; Table 1). Excluding the four last two gene families early in the evolution of higher ambiguous ancestral sites (see above), 17 amino acid fishes is thought to result from duplication of entire chro- changes occur along the T-A branch, 7 of which are to mosomes or perhaps the entire genome (Stock et al. negative amino acids; 142 changes occur across the rest 1996; Amores et al. 1998; and Neidert et al. 2001). It of the tree, 21 of which are to or from negative or posi- is possible that the TPI and LDH neural isozymes result tive amino acids. The ratio of changes along the T-A from this large-scale duplication event. branch is still significantly different from the ratio ob- Comparison of the observed number of nonsynony- Directional Selection on Teleost Neural Isozymes 695 mous and synonymous changes (n and s; Figure 2B) likely to alter the three-dimensional structure of the with the expected number of nonsynonymous and syn- protein and to disrupt protein function than changes onymous changes (N and S) across each branch of the across the surface. That the changes do not involve the TPI gene tree shows that, in general, n/s is less than active site or dimerization interface is also expected for N/S. This indicates that the TPI gene has evolved under a protein evolving within functional constraints, and the purifying selection throughout much of its evolutionary TPI-A protein does form dimers and function at least history, as is expected for a protein coding sequence. well enough to allow in vitro staining (Pontier and However, the ratio of n/s is significantly greater than Hart 1981). This apparent avoidance of change at criti- N/S across the branch between the ancestral single cal amino acids is also consistent with theoretical models teleost TPI protein (T) and the ancestral TPI-A protein that suggest that following duplication daughter genes (A). This suggests that, following duplication of the do not radically change their function but act to special- ancestral gene, TPI-A evolved through a period of posi- ize within the broader function of the parent gene (e.g., tive selection. Along the terminal TPI-A branches n/s Hughes 1994). is again less than N/S, indicating a return to purifying During the course of these analyses we observed a selection. pattern of nucleotide substitution along the T-A branch Pattern of amino acid substitution: The high net nega- that is more difficult to explain than the aforemen- tive charge predicted for the TPI-A ancestral protein tioned periods of positive and directional selection. In- results from the number of changes (eight) along the terestingly, all changes to negatively charged amino T-A branch from amino acids with positive or neutral acids along the T-A branch require single nucleotide net charges to amino acids with negative net charges changes (e.g., a lysine-to-glutamic acid substitution re- (Figure 1). Eight substitutions that alter amino acid quires a single nucleotide change), suggesting that charge is particularly striking in that charge changes amino acid substitution might have been a function of are generally rare in protein evolution (Peetz et al. the number of evolutionary steps (nucleotide changes) 1986; Xia and Li 1998). Accumulation of negatively required during the evolution of the neural isozymes. charged amino acids could have resulted from an overall To test if this apparent correlation was due simply to acceleration in the rate of amino acid change during chance, we calculated the expected number of muta- positive selection along this branch (diversifying selec- tional events required to yield the observed charge pro- tion) or, more interestingly, from an acceleration spe- file, assuming that amino acid substitution occurred at cifically in the rate of change to negative amino acids random along the surface of the protein. One hundred along this branch (directional selection). sets of eight amino acids were created by random sam- To distinguish between diversifying and directional pling without replacement from the pool of all non- selection in TPI-A evolution, the amino acid substitu- negative surface amino acids of the TPI-T ancestral pro- tions occurring along the T-A branch were compared tein (104 possible amino acids; Figure 1). The minimum with all other substitutions across the TPI gene tree. number of evolutionary steps required to change each Forty percent of the amino acid substitutions along the amino acid within a set to a negative amino acid was then T-A branch (8 of 20) were from either neutral or positive calculated. For example, lysine requires two nucleotide amino acids to negative amino acids. There are no changes to convert into an aspartic acid, but only one changes along this branch from negative amino acids to to convert into a glutamic acid; therefore, a lysine re- neutrally or positively charged residues. In comparison, quires only one step to become a negative amino acid. significantly fewer substitutions (14%, or 21 of 154) The minimum number for each amino acid was then along the other branches are to or from negative amino summed within each 8-amino-acid set to give a set total acids. When substitutions are grouped by other physico- (with a possible range from 8 to 16 steps). Across the chemical characteristics (size or polarity), the amino 100 replicates, the average number of evolutionary steps with a range of 8–14 steps. Only a ,11ف acid changes along the T-A are indistinguishable from required was those along the other branches. These two points indi- single replicate required the minimum of 8 steps. It is cate accelerated accumulation of negative amino acids unlikely then (P ϭ 0.01) that substitutions requiring by the TPI-A protein during a period of positive selec- the observed number of evolutionary steps would have tion. The TPI-A gene evolved through a period of direc- occurred due to chance, indicating that the substitu- tional selection, resulting in a more negatively charged tions to negative amino acids observed along the T-A protein. branch are correlated with the number of evolutionary The amino acid substitutions observed along the T-A steps required to make those substitutions. branch occur across the surface of the protein (Figure This is not surprising if selection acted to fix any 1) and avoid the catalytic center and dimer interface substitution to a negative amino acid across the surface residues (as identified by Wierenga et al. 1991). That of the protein; amino acid substitutions that require the the changes should occur across the surface of the pro- fewest possible number of steps (one) are most likely tein is not unexpected; internal changes, especially those to occur and, therefore, should be observed more often involving changes in amino acid charge, would be more than those requiring a greater number of steps. This 696 T. J. S. Merritt and J. M. Quattro correlation between observed amino acid substitution LITERATURE CITED and number of evolutionary steps is surprising, however, Amores, A., A. Force, Y.-L. Yan, L. Joly, C. Amemiya et al., 1998 if selection acted to fix negative amino acids at specific Zebrafish hox clusters and vertebrate genome evolution. Science key points across the protein; this would require that 282 (27): 1711–1714. Bishop, J. G., A. M. Dean and T. Mitchell-Olds, 2000 Rapid evolu- key positions across the protein all happened to require tion in plant chitinases: molecular targets of selection in plant- only a single evolutionary step to change to a negative pathogen coevolution. Proc. Natl. Acad. Sci. USA 97: 5322–5327. amino acid. This analysis, then, suggests that following Blacklidge, K. H., and C. A. Bidwell, 1993 Three ploidy levels indicated by genome quantification in Acipenseriformes of North the duplication of the TPI gene selection acted on the America. J. Hered. 84: 427–430. overall net charge of the TPI-A protein and not on key Burger, A., M. Eppenberger, U. Wiesmann and R. Richterich, amino acids. Evolution of the TPI-A protein seems to 1963a Isozyme der Creatin-Kinase. Helv. Physiol. Acta 21: C6– C10. have followed one of the “shortest paths” to overall high Burger, A., R. Richerich and H. Aebi, 1963b Die Heterogenita¨t net negative charge. der Creatin-Kinase. Biochem. Z. 339: 305–314. That selection acted on the overall charge, and not Champion, M. J., and G. S. Whitt, 1976 Differential gene expression in multilocus isozyme systems of the developing green sunfish. on key amino acids, is also suggested by slight differ- J. Exp. Zool. 196: 263–282. ences in the negative amino acids present in DrTPI-A Cheng, J., L. M. Mielnicki, S. C. Pruitt and L. E. Maquat, 1990 and XmTPI-A (Figure 1). While both neural proteins Nucleotide sequence of murine triosephosphate isomerase cDNA. Nucleic Acids Res. 18: 4261. share a strongly net negative charge and the majority Collaborative Computational Project, 1994 Surface. Acta Crys- of negative amino acids, there are negative amino acids tallogr. 50: 760–763. that are unique to the neural isozyme of either species. Duda, T. F., Jr., and S. R. Palumbi, 1999 Molecular genetics of ecological diversification: duplication and rapid evolution of The selective constraints that are present (purifying se- toxin genes of the venomous gastropod Conus. Proc. Natl. Acad. lection predominates along each terminal node; Figure Sci. USA 96: 6820–6823. 2B) seem to conserve the overall charge of the molecule, Felsenstein, J., 1985 Confidence limits on phylogenies: an ap- proach using the bootstrap. Evolution 39: 783–791. but allow some variation in the amino acids that contrib- Fisher, S. E., and G. S. Whitt, 1978 Evolution of isozyme loci and ute to the charge. This is in direct contrast to cases such their differential tissue expression. Creatine kinase as a model as the opsin gene family (reviewed in Yokoyama 1997) system. J. Mol. Evol. 12: 25–55. Fisher, S. E., J. B. Shaklee, S. D. Ferris and G. S. Whitt, 1980 Evo- in which selection appears to act at specific amino acid lution of five isozyme systems in the chordates. Genetica 52: sites promoting specific substitutions. 73–85. Analysis of vertebrate TPI gene coding regions sup- Frohman, M. A., 1990 RACE: rapid amplification of complementary DNA ends, pp. 28–38 in PCR Protocols: A Guide to Methods and ports earlier proposals that the high net negative charge Applications, edited by M. A. Innis,D.H.Gelfand,J.J.Sninsky of neural isozymes is a protein level adaptation. We and T. J. White. Academic Press, New York. have not attempted to address the exact nature of the Grande, L., and W. E. Bemis, 1991 Osteology and phylogenetic relationships of fossil and recent paddlefish (polydontidae) with selective pressures involved, but the large number of comments on the interrelationships of the acipenseriformes. J. gene families that include a negatively charged neural Vert. Paleo. Memoir 1: 1–121. isozyme suggests that these pressures are a general fea- Grantham, R., 1974 Amino acid difference formula to help explain protein evolution. Science 185: 862–864. ture of the neural environment, not a phenomenon spe- Hughes, A. L., 1994 The evolution of functionally novel proteins cific to the TPI gene family. The physiological or bio- after gene duplication. Proc. R. Soc. Lond. Ser. B 256: 119–124. chemical advantage of negative neural isozymes is not Hughes, A., and M. K. Hughes, 1993 Adaptive evolution in the rat olfactory receptor gene family. J. Mol. Evol. 36: 249–254. immediately apparent. However, a high intracellular, Jones, D. T., W. R. Taylor and J. M. Thorton, 1992 The rapid relative to extracellular, concentration of negatively generation of mutation data matrices from protein sequences. charged organic molecules is necessary for maintenance Comput. Appl. Biosci. 8: 275–282. Jukes, T. H., and C. R. Cantor, 1969 Evolution of protein mole- of the resting potential of neurons (Nicholls et al. cules, pp. 21–123 in Mammalian Protein Metabolism, edited by H. N. 1992). Accumulation of a large net negative charge by Munro. Academic Press, New York. multiple neural proteins might in some way be associ- Kimura, M., 1980 A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide ated with this requirement. This is consistent with the sequences. J. Mol. Evol. 16: 111–120. conclusion that selection appears to have acted on the Kolb, E., J. I. Harris and J. Bridgen, 1974 Triosephosphate iso- overall net charge of the protein, not on changes at merase from the coelacanth. An approach to the rapid determina- tion of an amino acid sequence with small amounts of material. specific amino acids. Further examination of other gene Biochem. J. 137(2): 185–197. families that include neural isozymes may shed further Kumar, S., K. Tamura and M. Nei, 1993 MEGA: molecular evolu- light on this topic. tionary genetics analysis, version 1.0. University Park: Pennsylva- nia State University. We thank David Stock, Joseph Staton, Robbie Young, Jacqueline Kuraku, S., D. Hoshiyama, K. Katoh, H. Suga and T. Miyata, 1999 Litzgus, Jim Grady, Robert Friedman, and Austin Hughes for helpful Monophyly of lampreys and hagfish supported by nuclear DNA- comments on this manuscript, and Lirong Shi for technical assistance. coded genes. J. Mol. Evol. 49: 729–735. We are also grateful to Lukasz Lebioda for assistance with the amino Li, W.-H., 1983 Evolution of duplicate genes and pseudogenes, pp. 14–37 in Evolution of Genes and Proteins, edited by M. Nei and acid surface area analysis. This work was supported by the Research R. K. Koehn. Sinauer Associates, Sunderland, MA. and Productive Scholarship Fund of the University of South Carolina, Maquat, L. E., R. Chilcote and P. M. Ryan, 1985 Human triose- the Cooperative Institute for Fisheries Molecular Biology [FISHTEC; phosphate isomerase cDNA and protein structure. Studies of NOAA/NMFS (RT/F-1)], and the National Science Foundation triosephosphate isomerase deficiency in man. J. Biol. Chem. 260: (OCE-9814172). 3748–3753. Directional Selection on Teleost Neural Isozymes 697

Marangos, P. J., and D. E. Schmechel, 1987 Neuron specific eno- dehydrogenase isozymes: the molecular and genetic basis for the lase, a clinically useful marker for neurons and neuroendocrine unique eye and liver LDHs of teleost fishes. Exp. Zool. 185: cells. Annu. Rev. Neurosci. 10: 269–295. 217–240. Messier, W., and C.-B. Stewart, 1997 Episodic adaptive evolution Stock, D. W., D. L. Ellies, Z. Zhao, M. Ekker and F. H. Ruddle, of primate lysozymes. Nature 385: 151–154. 1996 The evolution of the vertebrate Dlx gene family. Proc. Morizot, D. C., and M. E. Schmidt, 1990 Starch gel electrophoresis Natl. Acad. Sci. USA 93: 10858–10863. and histochemical visualization of proteins, pp. 23–80 in Applica- Stock, D. W., J. M. Quattro, G. S. Whitt and D. A. Powers, 1997 tions of Electrophoresis and Isoelectric Focusing in Fisheries Management, Lactate dehydrogenase (LDH) gene duplication during chordate edited by D. H. Whitmore. CRC Press, Boca Raton, FL. evolution: the cDNA sequence of the LDH of the tunicate Styela Neidert, A. H., V. Virupannavar, G. W. Hoover and J. A. Lange- plicata. Mol. Biol. Evol. 14: 1273–1284. land, 2001 Lamprey Dlx genes and early vertebrate evolution. Straus, D., and W. Gilbert, 1985 Genetic engineering in the pre- Proc. Natl. Acad. Sci. USA 98 (4): 1665–1670. cambrian: structure of the chicken triosephosphate isomerase Nicholls, J. G., A. R. Martin and B. G. Wallace, 1992 From Neuron gene. Mol. Cell. Biol. 5: 3497–3506. to Brain: A Cellular and Molecular Approach to the Function of the Swofford, D. L., 1999 PAUP* Phylogenetic Analysis Using Parsi- mony (*and Other Methods, version 4.0b). Sinauer Associates, Nervous System, Ed. 3. Sinauer Associates, Sunderland, MA. Sunderland, MA. Nikoh, N, N. Iwabe, K. Kuma, M. Ohno, Y. Sugiyama et al., 1997 Swofford, D. L., G. J. Olsen, P. J. Waddell and D. M. Hillis, An estimate of divergence time of parazoa and eumetazoa and 1996 Phylogenetic inference, pp. 407–514 in Molecular Systematics, that of cephalochordata and vertebrata by aldolase and triose- Chap. 11, edited by D. M. Hillis,C.Moritz and B. K. Mable. phosphate isomerase clocks. J. Mol. Evol. 45: 97–106. Sinauer Associates, Sunderland, MA. Ohta, T., 1991 Multigene families and the evolution of complexity. Tanaka, T., and M. Nei, 1989 Positive Darwinian selection observed J. Mol. Evol. 33: 34–41. at the variable region genes of immunoglobins. Proc. Natl. Acad. Old, S. E., and H. W. Mohrenweiser, 1988 Nucleotide sequence of Sci. USA 76: 4521–4525. the triosephosphate isomerase gene from Macaca mulatta. Nucleic Thompson, J. D., D. G. Higgins and T. J. Gibson, 1994 CLUSTAL Acids Res. 16: 9055. W: improving the sensitivity of progressive multiple sequence Patterson, C., 1993 An overview of the early fossil record of acan- alignment through sequence weighting, position specific gap pen- thomorphs. Bull. Mar. Sci. 52 (1): 29–59. alties and weight matrix choice. Nucleic Acids Res. 22 (22): 4673– Peetz, E. W., G. Thomsomn and P. W. Hedrick, 1986 Charge 4680. changes in protein evolution. Mol. Biol. Evol. 3: 84–94. Whitt, G. S., 1970 Developmental genetics of the lactate dehydroge- Penhoet,E.,T.Rajkumar and W. J. Rutter, 1966 Multiple forms nase isozymes of fish. J. Exp. Zool. 175: 1–36. of fructose diphosphate aldolase in mammalian tissues. Proc. Wierenga, R. K., M. E. M. Noble, G. Vriend, S. Nauche and W. G. J. Natl. Acad. Sci. USA 56 (4): 1275–1282. Hol, 1991 Refined 1.83 A˚ structure of trypanosomal triosephos- Pontier, P. J., and N. H. Hart, 1981 Developmental expression of phate isomerase crystallized in the presence of 2.4 M ammonium glucose and triose phosphate isomerase genes in teleost fishes sulphate. J. Mol. Biol. 220: 995–1015. (Brachydanio). J. Exp. Zool. 217: 53–71. Willett, C. S., 2000 Evidence for directional selection acting on Quattro, J. M., H. A. Woods and D. A. Powers, 1993 Sequence pheromone-binding proteins in the genus Choristoneura. Mol. analysis of teleost retina-specific lactate dehydrogenase C: evolu- Biol. Evol. 17: 553–562. tionary implications for the vertebrate lactate dehydrogenase Xia, X., and W.-H. Li, 1998 What amino acid properties affect pro- gene family. Proc. Natl. Acad. Sci. USA 90: 242–246. tein evolution? J. Mol. Evol. 47: 557–564. Quattro, J. M., D. D. Pollock, M. Powell, H. A. Woods and D.A. Yokoyama, S., 1997 Molecular genetic basis of adaptive selection: Powers, 1995 Evolutionary relations among vertebrate muscle- examples from color vision in vertebrates. Annu. Rev. Genet. 31: 315–336. type lactate dehydrogenases. Mol. Mar. Biol. Biotech. 4: 224–231. Zhang, J., S. Kumar and M. Nei, 1997 Small-sample tests of episodic Saiki, R. K., D. H. Gelfand, S. Stoeffel, S. J. Scharf, R. Higuchi evolution: a case study of primate lysozymes. Mol. Biol. Evol. 14: et al., 1988 Primer directed enzymatic amplification of DNA 1335–1338. with a thermostable DNA polymerase. Science 239: 487–491. Zhang, J., H. F. Rosenberg and M. Nei, 1998 Positive Darwinian Saitou, N., and M. Nei, 1987 The neighbor-joining method: a new selection after gene duplication in primate ribonuclease genes. method for reconstructing phylogenetic trees. Mol. Biol. Evol. Proc. Natl. Acad. Sci. USA 95: 3708–3713. 4: 406–425. Shaklee, J. B., K. L. Kepes and G. S. Whitt, 1973 Specialized lactate Communicating editor: S. Yokoyama