Developmental Biology 300 (2006) 282–292 www.elsevier.com/locate/ydbio

Intermediary metabolism in sea urchin: The first inferences from the genome sequence

Manisha Goel a, Arcady Mushegian a,b,⁎

a Stowers Institute for Medical Research, 1000 E. 50th St., Kansas City, MO 64110, USA
b Department of Microbiology, Molecular Genetics, and Immunology, University of Kansas Medical Center, Kansas City, KS 66160, USA

Received for publication 8 May 2006; revised 10 August 2006; accepted 16 August 2006
Available online 22 August 2006

Abstract

The genome sequence of the purple sea urchin Strongylocentrotus purpuratus recently became available. We report the results of functional annotation and initial analysis of more than 2300 proteins predicted to be involved in metabolite transport and enzymatic conversion in sea urchin. The comparison of various reconstructed biosynthetic and catabolic pathways in sea urchin to those known in other genomes suggests the overall similarity of the sea urchin metabolism to that of the vertebrates, with relatively small but non-trivial differences from both vertebrates and protostomes. There are several examples of two parallel, non-orthologous solutions for the same molecular function in sea urchin, in contrast with the other completely sequenced metazoans that tend to contain just one version of the same function. There are also genes that appear to be close phylogenetic neighbors of plant or bacterial homologs, as opposed to homologs in other Metazoa. The evolutionary and functional significance of these variations is discussed. © 2006 Elsevier Inc. All rights reserved.

Keywords: S. purpuratus; Sea urchin; Comparative genomics; Metabolic pathways; Thiaminase; Urea cycle; Cholesterol; Phosphoglycerate mutase

⁎ Corresponding author. Stowers Institute for Medical Research, 1000 E. 50th St., Kansas City, MO 64110, USA. Fax: +1 816 926 2041. E-mail address: [email protected] (A. Mushegian).

0012-1606/$ - see front matter © 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.ydbio.2006.08.030
M. Goel, A. Mushegian / Developmental Biology 300 (2006) 282–292

Introduction

Echinoderms, in particular the California purple sea urchin Strongylocentrotus purpuratus, have been favorite model organisms for developmental biology and gene regulation studies (Oliveri and Davidson, 2004; Cameron et al., 2004). At the same time, very little is known about the intermediary metabolism of any echinoderm and about genes that code for enzymes and transporters in sea urchin. Nutritional requirements of echinoderms are not well known, as the diets used for cultivating sea urchins tend to be rich and undefined, and the limited understanding of energy metabolism during the embryonic development of sea urchin predates the large-scale sequencing era (Czihak et al., 1975).

The genome sequence of S. purpuratus allows one to examine sea urchin metabolism from the comparative point of view. Metabolic pathways have been studied or inferred in protostomes, for which complete or near-complete sequences are available from the nematode C. elegans and a range of insects, and in completely sequenced deuterostomes, i.e., sea squirts, as well as several vertebrates from fish to human. Two plant genomes, many fungal and bacterial genomes, and large EST collections from diverse eukaryotes are also available. Many of the pathways biochemically studied or computationally reconstructed in these species may be conserved in sea urchin as well.

Earlier analysis of sea urchin EST sequence conservation has indicated that sequences of sea urchin tend to be closer to their homologs in other deuterostomes than to protostomes, in agreement with the consensus tree of metazoan evolution (Poustka et al., 2003). On the other hand, peculiarities of the lifestyle and diet of echinoderms might have resulted in selection pressure to possess biochemical functions that are present in invertebrates or even in non-metazoan organisms.

In this work, we started to analyze the complement of predicted metabolic enzymes and transporters encoded by the sea urchin genome. The first aspect of the analysis is functional: we would like to understand better the physiology of sea urchin. By looking for sea urchin orthologs of proteins which are part of known metabolic pathways, we can assert the presence of these pathways in the genome, which would indicate the biosynthetic and catabolic capacities of sea urchin. The absence of any particular pathway may indicate that the corresponding nutrient is available in their natural diets or may suggest that an alternative pathway is employed by sea urchin.

The other aspect of analysis is evolutionary: we can compare the putative pathways in sea urchin with those of other organisms and discern the picture of gains and losses of genes, functions, and metabolic circuits over the phylogeny of metazoan animals and other eukaryotes.

Methods

Eukaryotic clusters of orthologous groups, KOGs (Tatusov et al., 2003), were selected as a reference resource for annotating metabolism-related gene products of S. purpuratus. We selected all KOGs (981 families in total) belonging to the following categories: [C] energy production and conversion; [E] amino acid transport and metabolism; [F] nucleotide transport and metabolism; [G] carbohydrate transport and metabolism; [H] coenzyme transport and metabolism; [I] lipid transport and metabolism; [P] inorganic ion transport and metabolism; [Q] secondary metabolites biosynthesis, transport, and catabolism. The multiple sequence alignments of these KOGs were converted to their respective HMMs by the hmmbuild module of HMMer (Eddy, 1998). These HMMs were used to search through a database of 28,944 sea urchin predicted proteins, and sea urchin sequences matching the HMMs with a conservative E-value <10^-15 were collected. At this cutoff, the sum of false negatives and false positives appeared to be minimized (data not shown). For KOGs that did not match any predicted protein sequences at this cutoff, we did a tblastn search against the sea urchin assembly, using the consensus sequences of each KOG as the query. Some of the matches which show a high sequence identity to E. coli sequences were discarded as possible contaminations.

This highly automated, efficient protocol allowed us to quickly select the majority of sea urchin products that shared similarity with the metabolism-related proteins in at least three eukaryotes (definition of a KOG). All matches were examined to remove spurious alignments (typically, less-than-full-length matches owing to compositional bias in HMM and/or target protein). To correct for the false negatives, all proteins that did not pass the cutoff were compared, using the RPS-BLAST program (Marchler-Bauer et al., 2002), to the CDD database from NCBI (Marchler-Bauer et al., 2005) and analyzed for the presence of significant matches (E-value threshold of 10^-5) to KOGs or unicellular organism-specific COGs associated with intermediary metabolism. Proteins were then annotated based on sequence similarity to sequences with known function and on the presence of characteristic domains. This allowed us to annotate functionally about 2300 predicted genes in the sea urchin genome assembly. At a later step in the analysis, when a particular function was reported missing in sea urchin (see Results), we used the known proteins with the appropriate activity as queries in a PSI-BLAST search of the NR database, built the checkpoint profiles (Altschul and Koonin, 1998), and performed additional searches of the sea urchin protein set with these profiles.

In addition to the KOG/COG, NR, and CDD databases, we compared sea urchin gene products to 15,852 predicted protein sequences of the urochordate C. intestinalis genome (http://genome.jgi-psf.org/ciona4/ciona4.download.html) and to the EST databases at NCBI.

In a fraction of cases, sequences of the predicted metabolism-related proteins were incomplete. It is possible that some of them represent pseudogenes, but we expect that a substantial portion of these partial predictions will be resolved as complete genes in the process of S. purpuratus genome assembly.

The ABC transporters, many of which are involved in uptake of metabolic precursors, as well as cytochrome P450 family members are examined in more detail (Goldstone et al., 2006).

Phylogenetic trees were constructed using the neighbor-joining method as implemented in the NEIGHBOR program of the PHYLIP package, with 100 bootstrap replicates (Felsenstein, 1989). Multiple alignments were created using ClustalW (Thompson et al., 1994). The annotated genes were mapped onto metabolic pathways as represented in KEGG (Kanehisa et al., 2006) and MetaCyc (Krieger et al., 2004). The mapping of the metabolic pathways annotated in sea urchin to the KEGG database and their comparison to those known in human, fly, and worm are shown in Table 1.

Results and discussion

Overview

The 2297 annotated metabolism-related proteins represent about 8% of total predicted genes for the sea urchin genome. In addition, nearly the same number of proteins (2296) is predicted to have either enzymatic or transport functions, but with broad or unknown specificities (the “R” category in the COG/KOG system) (Fig. 1). The distribution of homologs of the 2297 metabolism-related sea urchin proteins by category is shown in Table 2. We compared these numbers to the counts of the human, nematode and fruit fly proteins included in each respective category of the KOG database and to the proportion of genes in each category in all these species (the predicted number of genes for each genome, except sea urchin, is from Table 23 in Lander et al., 2001). Examination of Table 2 indicates that, at the level of major functional categories of metabolic enzymes and transporters, the sea urchin proteome most closely resembles the protein set of humans, in both the absolute number of proteins assigned to each category and the fraction of such proteins within the complete protein set. The model protostomes tend to have similar absolute numbers of proteins in all functional categories, but the proportion of the metabolism-related proteins in nematode and fruit fly exceeds the numbers for sea urchin and human by 50–150%. Some of this excess is a consequence of the lower total number of genes in invertebrates and a comparable number of genes dedicated to maintaining essentially similar pathways of intermediary metabolism. In certain cases, however, the excess of genes in a particular category may indicate genuine differences in gene repertoire correlated with the lifestyle. For example, the larger proportion of proteins involved in carbohydrate metabolism and transport in flies may reflect the composition of their plant-based diet, whereas the similarity of the sea urchin protein distribution to that of humans may be a result of their more omnivorous lifestyle.

There are 117 metabolism-related KOGs that are found in some of the completely sequenced eukaryotes, but do not have orthologs in the sea urchin genome assembly. The distribution of these KOGs in four major clades of eukaryotes (plants, fungi, and two metazoan clades) is shown in Table 3. There are 12 distinct phyletic patterns characterizing such KOGs (Table 4). Seven of these patterns, including the three most common ones and accounting for more than 91% of genes missing from sea urchin but found in other eukaryotes, are easily explained either by a single loss of an ancestral gene in one evolutionary lineage or by ancient lineage-specific gene gain (Supplementary Table 1A). The largest group among the phyletic patterns is the KOGs present only in plants and fungi: these genes were either lost or never gained by ancestral metazoans. A major subset of these (48 KOGs) belongs to two functional categories (amino acid and coenzyme metabolism), reflecting the inability of metazoans to synthesize these compounds and the essential
requirement for these as nutrients in the diet.

Table 1
Mapping of sea urchin metabolism-related proteins on KEGG pathways, in comparison to the number of proteins known to be part of each pathway from human, C. elegans, and Drosophila

Pathway | KEGG Map ID | Human | C. elegans | Drosophila | Sea urchin
Carbohydrate metabolism
Glycolysis/Gluconeogenesis | 00010 | 28 | 20 | 24 | 23
Citrate cycle (TCA cycle) | 00020 | 21 | 20 | 19 | 19
Pentose phosphate pathway | 00030 | 16 | 16 | 14 | 16
Pentose and glucuronate interconversions | 00040 | 7 | 5 | 5 | 6
Fructose and mannose metabolism | 00051 | 20 | 16 | 16 | 19
Galactose metabolism | 00052 | 16 | 9 | 9 | 11
Ascorbate and aldarate metabolism | 00053 | 2 | 3 | 4 | 3
Starch and sucrose metabolism | 00500 | 24 | 13 | 15 | 15
Aminosugars metabolism | 00530 | 21 | 10 | 12 | 11
Nucleotide sugars metabolism | 00520 | 7 | 7 | 5 | 5
Pyruvate metabolism | 00620 | 23 | 15 | 18 | 18
Glyoxylate and dicarboxylate metabolism | 00630 | 11 | 6 | 6 | 9
Propanoate metabolism | 00640 | 22 | 14 | 13 | 17
Butanoate metabolism | 00650 | 25 | 17 | 19 | 15
C5-branched dibasic acid metabolism | 00660 | 2 | 2 | – | 2
Inositol metabolism | 00031 | 2 | – | 2 | 2
Inositol phosphate metabolism | 00562 | 21 | 9 | 12 | 8
Energy metabolism
Oxidative phosphorylation | 00190 | 108 | 87 | 91 | 42
ATP synthesis | 00193 | 29 | 24 | 28 | 21
Photosynthesis | 00195 | – | – | – | 1
Carbon fixation | 00710 | 12 | 13 | 11 | 12
Reductive carboxylate cycle | 00720 | 7 | 8 | 6 | 6
Methane metabolism | 00680 | 5 | 6 | 5 | 5
Nitrogen metabolism | 00910 | 9 | 11 | 6 | 10
Sulfur metabolism | 00920 | 10 | 4 | 2 | 2
Lipid metabolism
Fatty acid biosynthesis | 00061 | 10 | 8 | 5 | 5
Fatty acid elongation in mitochondria | 00062 | 8 | 4 | 6 | 5
Fatty acid metabolism | 00071 | 23 | 14 | 15 | 17
Synthesis and degradation of ketone bodies | 00072 | 6 | 5 | 6 | 4
Biosynthesis of steroids | 00100 | 17 | 4 | 7 | 14
Bile acid biosynthesis | 00120 | 15 | 8 | 8 | 10
C21-steroid hormone metabolism | 00140 | 9 | – | – | 2
Androgen and estrogen metabolism | 00150 | 15 | 3 | 3 | 9
Glycerolipid metabolism | 00561 | 16 | 7 | 9 | 14
Glycerophospholipid metabolism | 00564 | 29 | 17 | 22 | 22
Arachidonic acid metabolism | 00590 | 27 | 6 | 6 | 4
Linoleic acid metabolism | 00591 | 10 | 2 | 2 | 4
Amino acid metabolism
Glutamate metabolism | 00251 | 24 | 21 | 20 | 16
Alanine and aspartate metabolism | 00252 | 19 | 13 | 15 | 12
Glycine, serine and threonine metabolism | 00260 | 33 | 17 | 16 | 24
Methionine metabolism | 00271 | 9 | 6 | 6 | 6
Cysteine metabolism | 00272 | 8 | 7 | 6 | 5
Valine, leucine, and isoleucine degradation | 00280 | 33 | 21 | 20 | 21
Valine, leucine, and isoleucine biosynthesis | 00290 | 7 | 7 | 6 | 3
Lysine biosynthesis | 00300 | 5 | – | – | 3
Lysine degradation | 00310 | 25 | 12 | 15 | 15
Arginine and proline metabolism | 00330 | 32 | 14 | 19 | 20
Histidine metabolism | 00340 | 17 | 6 | 6 | 5
Tyrosine metabolism | 00350 | 24 | 13 | 14 | 15
Phenylalanine metabolism | 00360 | 10 | 6 | 6 | 7
Tryptophan metabolism | 00380 | 25 | 17 | 19 | 14
Phenylalanine, tyrosine, and tryptophan biosynthesis | 00400 | 7 | 7 | 7 | 4
Urea cycle and metabolism of amino groups | 00220 | 19 | 8 | 9 | 15
Nucleotide metabolism
Purine metabolism | 00230 | 96 | 64 | 61 | 35
Pyrimidine metabolism | 00240 | 72 | 60 | 58 | 23
Other amino acids
Beta-Alanine metabolism | 00410 | 0 | 9 | 9 | 9
Taurine and hypotaurine metabolism | 00430 | 16 | 3 | 4 | 3
Aminophosphonate metabolism | 00440 | 5 | 5 | 6 | 4
Selenoamino acid metabolism | 00450 | 5 | 11 | 8 | 6
Cyanoamino acid metabolism | 00460 | 11 | 4 | 3 | 3
D-glutamine and D-glutamate metabolism | 00471 | 2 | 2 | 1 | 2
D-arginine and D-ornithine metabolism | 00472 | 1 | 1 | – | 1
Glutathione metabolism | 00480 | 10 | 9 | 9 | 9
Cofactors and vitamins
Thiamine metabolism | 00730 | 3 | – | – | 3
Riboflavin metabolism | 00740 | 6 | 2 | 4 | 4
Vitamin B6 metabolism | 00750 | 5 | 4 | 2 | 4
Nicotinate and nicotinamide metabolism | 00760 | 18 | 6 | 7 | 9
Pantothenate and CoA biosynthesis | 00770 | 12 | 6 | 6 | 6
Biotin metabolism | 00780 | 6 | – | 2 | 1
Folate biosynthesis | 00790 | 9 | 6 | 9 | 13
One carbon pool by folate | 00670 | 17 | 12 | 10 | 9
Retinol metabolism | 00830 | 3 | – | – | 2
Porphyrin and chlorophyll metabolism | 00860 | 15 | 5 | 12 | 11
Ubiquinone biosynthesis | 00130 | 13 | 12 | 12 | 4

The number of proteins as part of each pathway, for Human, C. elegans, and Drosophila, is taken from the KEGG database.

There were also five patterns, comprising 10 KOGs, which must be explained by multiple evolutionary events each, i.e., either by multiple
losses in different evolutionary lineages or perhaps by horizontal gene transfer (see examples below).

There were 21 KOGs with representatives in sea urchin but not in humans (Supplementary Table 1B). Most interestingly, six of these KOGs are also missing from completely sequenced genomes of other Metazoa (Table 5a). These genes must have an intriguing evolutionary history, which either includes multiple gene losses in both protostomes and deuterostomes (but not in sea urchin) or horizontal transfer into the echinoderm lineage from outside the metazoan clade (Table 5b).

Complete reconstruction of the pathways and metabolic fluxes through them is beyond the scope of this initial analysis. In the rest of this paper, however, we present examples of findings that clarify the status of individual molecular functions or selected pathways in sea urchin.

Fig. 1. The distribution of metabolism-related proteins of S. purpuratus in various functional categories (see Methods).

Cholesterol biosynthesis

Addition of cholesterol to animals' diet is often considered beneficial for sea urchin cultivation. We were able, however, to identify all genes of the trunk mevalonate pathway and of cholesterol synthesis from mevalonate in S. purpuratus (Supplementary Tables 2A and B). Expression of some of these genes has been reported earlier, in the course of analysis of S. purpuratus ESTs (Poustka et al., 2003). The presence of the complete cholesterol biosynthetic pathway indicates that sea urchin must be capable of cholesterol biosynthesis at least under some conditions. Synthesis of small amounts of cholesterol has been reported in another sea urchin species, Echinus esculentus (Smith and Goad, 1974).

Urea metabolism

Sea urchin is an ammonotelic animal: unlike terrestrial animals, it does not excrete urea as the nitrogenous waste (Ruppert et al., 2003). Nonetheless, all enzymes of the urea cycle are found in the genome of sea urchin (Supplementary Table 2C), as well as in genomes of other marine organisms (Armbrust et al., 2004). This suggests that urea has other functions in marine, if not in all, animals. One such function may be osmotic homeostasis, which requires a distinct type of glutamine-dependent carbamoyl-phosphate synthetases, CPS-III (Withers, 1998). SPU_002174 is a candidate for CPS-III function in sea urchin. In the phylogenetic tree, this sequence appears to be equidistant from the known CPS-III and from the alternative, paralogous ammonia-dependent CPS-I sequences,

Table 2
Distribution of metabolism-related KOGs (proteins) identified in S. purpuratus, in other invertebrates and vertebrate representatives is shown according to KOG functional categories

Each cell gives: No. of KOGs (proteins) (% of total predicted protein repertoire). The sea urchin column lists KOGs present in the KOG database that were identified in sea urchin; the remaining columns list how many of these KOGs (proteins) are present in each species.

Functional categories as described in KOG database | Sea urchin | Human | C. elegans | Drosophila
[C] Energy production and conversion | 148 (330) (1.14%) | 148 (456) (1.43%) | 143 (329) (1.80%) | 144 (292) (2.19%)
[E] Amino acid transport and metabolism | 107 (369) (1.27%) | 103 (484) (1.51%) | 96 (268) (1.47%) | 98 (429) (3.22%)
[F] Nucleotide transport and metabolism | 55 (125) (0.43%) | 54 (169) (0.53%) | 49 (87) (0.48%) | 50 (86) (0.64%)
[G] Carbohydrate transport and metabolism | 126 (527) (1.82%) | 126 (498) (1.56%) | 115 (471) (2.58%) | 123 (315) (2.36%)
[H] Coenzyme transport and metabolism | 52 (91) (0.31%) | 52 (116) (0.36%) | 40 (62) (0.34%) | 47 (66) (0.49%)
[I] Lipid transport and metabolism | 140 (390) (1.35%) | 138 (501) (1.57%) | 128 (389) (2.13%) | 121 (329) (2.47%)
[P] Inorganic ion transport and metabolism | 77 (296) (1.02%) | 77 (485) (1.52%) | 74 (317) (1.74%) | 74 (218) (1.63%)
[Q] Secondary metabolites biosynthesis, transport and catabolism | 32 (189) (0.65%) | 32 (227) (0.71%) | 29 (187) (1.02%) | 28 (177) (1.33%)
Total, metabolism-related | 715 (2297) (7.94%) | 703 (2820) (8.81%) | 648 (2051) (11.22%) | 657 (1842) (13.81%)
Total no. of predicted genes | 28,944 | 32,000 | 18,266 | 13,338

The number of proteins is also represented as a percentage of the total predicted protein repertoire of each genome.
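The percentages in Table 2 are simple ratios of protein counts to the total predicted gene number of each genome. The sketch below recomputes a few of them and the relative excess discussed in the Overview. The counts are transcribed from Table 2; the function names (`percent_of_proteome`, `relative_excess`, `category_share`) are illustrative only and are not part of the original analysis.

```python
# Totals transcribed from Table 2:
# species -> (metabolism-related proteins, total predicted genes).
TABLE2_TOTALS = {
    "sea urchin": (2297, 28944),
    "human": (2820, 32000),
    "C. elegans": (2051, 18266),
    "Drosophila": (1842, 13338),
}

# Protein counts for category [E] (amino acid transport and metabolism), Table 2.
TABLE2_CATEGORY_E = {
    "sea urchin": 369,
    "human": 484,
    "C. elegans": 268,
    "Drosophila": 429,
}


def percent_of_proteome(n_proteins, n_genes):
    """Share of the predicted protein repertoire, in percent."""
    return 100.0 * n_proteins / n_genes


def relative_excess(value, reference):
    """Percent by which `value` exceeds `reference`."""
    return 100.0 * (value - reference) / reference


def category_share(counts, species):
    """Percent of a species' predicted genes that fall in one category."""
    return percent_of_proteome(counts[species], TABLE2_TOTALS[species][1])


# Overall share of metabolism-related proteins per genome.
for species, (n_proteins, n_genes) in TABLE2_TOTALS.items():
    print(f"{species}: {percent_of_proteome(n_proteins, n_genes):.2f}%")

# Excess of the fly over the sea urchin proportion in category [E].
fly_e = category_share(TABLE2_CATEGORY_E, "Drosophila")
urchin_e = category_share(TABLE2_CATEGORY_E, "sea urchin")
print(f"[E] excess, Drosophila vs sea urchin: {relative_excess(fly_e, urchin_e):.0f}%")
```

Recomputed values can differ from the printed table in the last digit because of rounding.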

but the presence of a signature cysteine residue in position 292 (Hong et al., 1994) suggests that SPU_002174 may have a CPS-III-like activity. The presence of this gene indicates that sea urchin may use urea as a balancing osmolyte.

We have also identified several predicted urea transporters. SPU_004849 and SPU_010793 represent two such genes with significant protein sequence similarity to solute carrier family 14 members, which are found in vertebrates and play a role in urea transport (Shayakul and Hediger, 2004). The two conserved motifs diagnostic of the known urea transporters (Bagnasco, 2003) appear to be modified in these proteins. Interestingly, SPU_011497 and SPU_004180 are gene fragments similar to, respectively, the N-terminus and C-terminus of the fungal Dur3p urea transporter. An EST database search with the sea urchin Dur3p-like protein finds matches to proteins only from fungi. Thus, the sea urchin genome may have a functionally and evolutionarily unusual repertoire of urea transporters.

Malate synthase and isocitrate lyase are two enzymes of the glyoxylate pathway, which is observed in bacteria, yeast, fungi and plants. In yeast, there are two differently regulated malate synthase homologs. One, Mls1p, is involved in the glyoxylate shunt, and the other, Dal7p, appears to have a role in sequestering the glyoxylate byproduct resulting from allantoin degradation, a role that does not involve isocitrate lyase. Nematodes are the only known metazoan clade encoding isocitrate lyase and malate synthase as a single bifunctional protein expressed in larvae in a tissue- and stage-specific manner. The glyoxylate cycle may help convert fat into carbohydrates during worm embryogenesis (Liu et al., 1995). SPU_004444 and SPU_005315 show significant sequence similarity to bacterial and plant malate synthases, but an isocitrate lyase-like gene could not be identified in the genome of sea urchin. The same pattern of presence and absence is also observed in the fishes Tetraodon spp. and Danio rerio. The lack of the adjoining enzyme of the glyoxylate shunt in fishes and sea urchins may suggest a role for malate synthase in allantoin/urea metabolism, similar to its yeast homolog Dal7p.

Table 3
Distribution of 117 metabolism-related KOGs for which no significant matches were found in S. purpuratus genome assembly

Species columns give the no. of these KOGs (proteins) present in each genome.

Functional categories as described in KOG database | No. of KOGs absent in sea urchin | Human | C. elegans | Drosophila | S. cerevisiae | Arabidopsis
[C] Energy production and conversion | 22 | 13 (55) | 18 (32) | 15 (20) | 12 (18) | 18 (53)
[E] Amino acid transport and metabolism | 39 | 3 (9) | 3 (3) | 2 (2) | 37 (49) | 37 (91)
[F] Nucleotide transport and metabolism | 7 | 2 (3) | 2 (3) | 2 (4) | 7 (12) | 6 (23)
[G] Carbohydrate transport and metabolism | 10 | 3 (9) | 3 (10) | 3 (6) | 8 (23) | 8 (66)
[H] Coenzyme transport and metabolism | 15 | 0 (0) | 0 (0) | 0 (0) | 15 (31) | 15 (24)
[I] Lipid transport and metabolism | 9 | 3 (8) | 4 (29) | 2 (3) | 9 (12) | 7 (40)
[P] Inorganic ion transport and metabolism | 19 | 8 (27) | 5 (10) | 6 (10) | 17 (28) | 15 (154)
[Q] Secondary metabolites biosynthesis, transport and catabolism | 5 | 3 (8) | 3 (9) | 2 (4) | 4 (9) | 5 (32)
Total, metabolism-related | 117 | 31 (109) | 34 (88) | 28 (43) | 102 (166) | 104 (452)

Thiaminase

Several sea urchin proteins show higher similarity to plant, fungal, or bacterial proteins, or to proteins from protostomes, than to human homologs (Table 6). One such protein, SPU_001327, has significant sequence similarity to the bacterial TenA protein (COG0819). This protein is found in an operon with thiazole biosynthetic genes in bacteria like B. subtilis, and a homologous domain is fused with another thiamin biosynthesis domain, ThiD, in S. cerevisiae.

Table 4
Phyletic patterns for 117 KOGs absent in S. purpuratus genome

Pattern | No. of KOGs | Most plausible evolutionary explanation
AY–– | 69 | Ancestral eukaryotic gene lost or never gained in Metazoa
AYPH | 13 | Ancestral eukaryotic gene lost in echinoderms
––PH | 4 | Ancestral metazoan gene lost in echinoderms
A–PH | 8 | Ancestral eukaryotic gene lost in fungi and echinoderms
AYP– | 9 | Ancestral eukaryotic gene lost in deuterostomes
–YPH | 1 | Ancestral eukaryotic gene lost or never gained in plants
AY–H | 3 | Ancestral eukaryotic gene lost in protostomes
–Y–– | 5 | Fungal invention
–Y–H | 1 | Multiple losses or horizontal gene transfer
–YP– | 2 | Gain in animal/fungal lineage, loss in deuterostomes
A––H | 1 | Multiple losses or horizontal gene transfer
A–P– | 1 | Loss/divergence in subset of fungi and in deuterostomes

A=Arabidopsis. Y=yeast. P=protostomes (either/both C. elegans, Drosophila). H=human.

Table 5a
Phyletic pattern of 21 KOGs absent in human but present in S. purpuratus

Pattern | No. of KOGs | Most plausible explanation
–P–Y | 1 | Loss in higher deuterostomes
A––Y | 6 | Loss in multiple lineages?
AP–– | 1 | Loss in higher deuterostomes and yeast
AP–Y | 13 | Loss in higher deuterostomes

A=Arabidopsis. P=protostomes (either/both C. elegans, Drosophila). H=human. Y=yeast.
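A phyletic pattern, as used in Table 4, is a presence/absence string over a fixed list of clades. The following minimal sketch shows how such strings can be derived and tallied. The counts in `TABLE4_COUNTS` are transcribed from Table 4, with an ASCII hyphen standing in for the dash. The function names and the toy KOG identifiers are hypothetical and are not taken from the original analysis.

```python
from collections import Counter

# Column order used in Table 4: Arabidopsis, yeast, protostomes, human.
CLADES = ("A", "Y", "P", "H")

# Counts transcribed from Table 4 (KOGs absent from the sea urchin assembly).
TABLE4_COUNTS = {
    "AY--": 69, "AYPH": 13, "--PH": 4, "A-PH": 8,
    "AYP-": 9, "-YPH": 1, "AY-H": 3, "-Y--": 5,
    "-Y-H": 1, "-YP-": 2, "A--H": 1, "A-P-": 1,
}


def phyletic_pattern(present, clades=CLADES):
    """Return the pattern string, with '-' for clades that lack the KOG."""
    return "".join(c if c in present else "-" for c in clades)


def tally_patterns(presence_by_kog):
    """Count how many KOGs share each phyletic pattern."""
    return Counter(phyletic_pattern(p) for p in presence_by_kog.values())


# Toy input with made-up KOG identifiers, for illustration only.
toy_presence = {
    "KOG_example_1": {"A", "Y"},
    "KOG_example_2": {"A", "Y"},
    "KOG_example_3": {"P", "H"},
}
print(tally_patterns(toy_presence))

# Consistency check against the text: 12 patterns covering 117 KOGs.
print(len(TABLE4_COUNTS), sum(TABLE4_COUNTS.values()))
```

The same helper can be reused for Table 5a by passing the column order given in that table's legend, `("A", "P", "H", "Y")`, as the `clades` argument.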

Table 5b
Putative functions of 6 KOGs present in S. purpuratus but absent in C. elegans, Drosophila, and human

KOG | SPU matches | Putative function
KOG1397 | SPU_09168 | Ca2+/H+ antiporter VCX1
KOG1481 | SPU_26470 |
KOG2254 | SPU_08762 | Predicted endo-1,3-beta-glucanase
KOG2263 | SPU_25537 | Methionine synthase II (cobalamin-independent)
KOG2348 | SPU_04180, SPU_11497 | Urea transporter
KOG2495 | SPU_26684 | NADH-dehydrogenase (ubiquinone)

The structure of B. subtilis TenA protein recently became available, and the thiaminase II activity, i.e., hydrolysis of thiamin to HMP and thiazole moieties, was shown for this protein (Toms et al., 2005). Thiaminase activity has been reported in bacteria, plants, and some marine animals, though in the latter it appeared to belong to a different type, thiaminase I, which uses various organic molecules, rather than polarized water, as nucleophiles in the catalysis (Campobasso et al., 1998). There is no sequence similarity between TenA and a known thiaminase I from B. thiaminolyticus. On the other hand, SPU_001327 and B. subtilis thiaminase II are clearly related, and all residues implicated in catalysis and in substrate binding by TenA are conserved in the sea urchin homolog. SPU_001327 is thus proposed to be a thiaminase II-like protein, and, given the lack of other enzymes of the thiamine biosynthesis pathway in sea urchin, we believe that this protein represents a part of some hitherto undefined pathway of thiamine salvage. Why such an activity would be important for some marine organisms remains to be answered: one possibility may be the excess of vitamin B1 in their diet.

EST database searches indicate the presence of SPU_001327 homologs in other metazoans, mainly from aquatic habitats, such as the mollusk A. irradians and the bony fishes I. punctatus and D. rerio, but also in the soil nematode H. glycines. Type II thiaminase may have been present in the ancestral metazoan but was later lost in some, notably terrestrial, lineages. To investigate the evolutionary history of Type II thiaminases in more detail, we constructed the phylogenetic tree of SPU_001327 homologs (Fig. 2). Remarkably, eukaryotes do not form a single clade in the tree: fungal sequences appear intermingled with a subset of proteobacterial homologs, perhaps reflecting the mitochondrial endosymbiotic origin of the TenA homologs in fungi, whereas the sequences from deuterostomes form a separate cluster, joining firmicutes and Bacteroides sp. in a well-supported clade. This fact, together with the disparate position of many proteobacteria, suggests that the evolutionary history of Type II thiaminases should be explained by multiple gains and losses, perhaps including two independent acquisitions of TenA by eukaryotes, by symbiogenesis in fungi and by a separate horizontal transfer from another bacterium to the metazoan lineage, followed by losses in terrestrial organisms.

Phosphoglycerate mutase

Phosphoglycerate mutase (PGM) interconverts 3-phosphoglyceric acid and 2-phosphoglyceric acid, an essential step in the triose part of the glycolytic pathway. At least two different types of PGMs are known, those which require 2,3-bisphosphoglycerate as the cofactor (cofactor-dependent type; dPGM) and those which can function independently of this cofactor (iPGM). The two PGMs have distinct phyletic patterns: dPGMs are distributed in vertebrates, fungi, and some bacteria, and iPGMs are found in higher plants, some invertebrates, algae, and a distinct subset of bacteria (Jedrzejas, 2000a). The two forms exist simultaneously in a few bacteria, for example, E. coli, selected other gammaproteobacteria and B. subtilis. Members of the iPGM and dPGM families belong, respectively, to the alkaline and acid phosphatase superfamilies, which do not share sequence similarity and have distinct three-dimensional folds.

Unexpectedly for a metazoan, the sea urchin genome encodes both these PGMs: SPU_010377 belongs to the iPGM family (KOG4513, with known members from plants and nematodes) and SPU_028416 to dPGM (KOG0235, found in yeast, human and Drosophila).

Table 6
Sea urchin predicted proteins showing higher similarity to bacteria/plants/fungi than to other metazoans

SPU_id | Conserved domain | Putative function | GenBank ID | Taxonomic origin of the closest tree neighbor
SPU_001929 | pfam03016 | Exostosin family protein | 15237602 | Arabidopsis thaliana
SPU_002810 | pfam01425 | Amidase | 1707728 | Sulfolobus solfataricus
SPU_002895 | COG3321 | Polyketide synthase | 75914739 | Cystobacter fuscus
SPU_004172 | COG0520 | Selenocysteine lyase | 89284317 | Tetrahymena thermophila
SPU_005666 | pfam01425 | Amidase | 1707728 | Sulfolobus solfataricus
SPU_005483 | COG0673 | Myo-inositol-2-dehydrogenase | 91081323 | Tribolium castaneum
SPU_009841 | COG0462 | Phosphoribosyl pyrophosphate synthetase | 4902877 | Spinacia oleracea
SPU_010831 | COG1501 | Glycosidase | 77975771 | Yersinia frederiksenii
SPU_017354 | pfam01425 | Amidase | 1707728 | Sulfolobus solfataricus
SPU_019708 | COG0369 | Sulfite reductase | 66799961 | Dictyostelium discoideum
SPU_020324 | pfam01425 | Amidase | 1707728 | Sulfolobus solfataricus
SPU_021037 | COG0436 | Aspartate aminotransferase | 40062700 | Uncultured bacterium
SPU_021638 | pfam00933 | Beta-D-xylosidase | 67930785 | Solibacter usitatus
SPU_021817 | COG0605 | Superoxide dismutase | 62514078 | Lactobacillus casei
SPU_025192 | COG2303 | Choline dehydrogenase | 85710688 | Erythrobacter sp.
SPU_027832 | COG2303 | Choline dehydrogenase | 85710688 | Erythrobacter sp.
SPU_027926 | COG0451 | Predicted epimerase | 506333 | Halocynthia roretzi
SPU_028395 | COG3321 | Polyketide synthase | 63086968 | Stigmatella aurantiaca

Fig. 2. The phylogenetic tree of putative thiaminase II sequences from S. purpuratus and similar sequences from other metazoans, bacteria, fungi, and archaea. The bootstrap values are also depicted along the edges. D. rerio, I. punctatus, A. irradians, and A. pectinifera sequences are derived from EST database, and the corresponding translated protein sequence was used for the phylogenetic reconstruction. Metazoan cluster is shaded yellow with S. purpuratus shown in red. Archaea are shown in orange cluster, actinobacteria in pink, firmicutes in blue, cyanobacteria in cyan, chloroflexi in deep green, and yeast are in light green. Proteobacteria are indicated in blue font and bacteroidetes in black (bold). Species abbreviations and the accession numbers are—Gka (G. kaustophilus, 56378995), Bsu (B. subtilis, 61680561), Oih (O. iheyensis, 22776321), Spn (S. pneumoniae, 14972196), Bth (B. thuringiensis, 75762115), Cbe (C. beijerincki, 82747980), Hpy (H. pylori, 4155800), Myc (Mycobacterium, 108797337), Rxy (R. xylanophilus, 108805573), Sto (S. tokodaii, 15622038), Art (Arthrobacter, 66967410), Cef (C. efficiens, 23493431), Bli (B. linens, 62423088), Cau (C. aurantiacus, 76261288), Aci (Acinetobacter, 49532392), Ret (R. etli, 86355907), Pab (P. abyssi, 5458517), Pae (P. aerophilum, 99032645), Ppu (P. putida, 82736443), Rru (R. rubrum, 83593360), Vib (Vibrio, 75856887), Ppr (P. profundum, 90414040), Sul (Sulfitobacter, 83954395), Bme (B. melitensis, 17983766), Hin (H. influenzae, 68057160), Mgr (M. grisea, 39939910), Afu (A. fumigatus, 71001452), Gze (G. zeae, 46116572), Spo (S. pombe, 3810842), Cup (C. upsaliensis, 57242672), Ava (A. variabilis, 75906246), Npu (N. punctiforme, 23124277), Ter (T. erythraeum, 110165682), Dps (D. psychrophila, 50877399), Sel (S. elongatus, 56750464), Pma (P. marinus, 33860915), Spu (S. purpuratus, SPU_001327), Ape (A. pectinifera, 93337679), Air (A. irradians, 56938699), Ipu (I. punctatus, 18393981), Dre (D. rerio, 70898138), Uma (U. 
maydis, 71019069), Bthe (B. thetaiotaomicron, 29340460), Sce (S. cerevisiae, 6324517).
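
The abbreviation list in the caption of Fig. 2 follows a regular "Abbr (species, accession)" pattern, so it can be turned into a lookup table when matching tree labels to database records. The following is a minimal Python sketch under that assumption; the function name parse_legend and the shortened LEGEND string (a few entries copied from the caption) are illustrative and are not part of the original analysis.

```python
import re

# A few entries copied from the Fig. 2 caption (shortened for illustration).
LEGEND = (
    "Gka (G. kaustophilus, 56378995), Bsu (B. subtilis, 61680561), "
    "Spu (S. purpuratus, SPU_001327), Dre (D. rerio, 70898138)"
)

# Assumed entry pattern: abbreviation, then "(species, accession)".
ENTRY = re.compile(r"(\w+) \(([^,()]+), ([^()]+)\)")


def parse_legend(text):
    """Map each abbreviation to a (species, accession) pair."""
    return {
        abbr: (species.strip(), accession.strip())
        for abbr, species, accession in ENTRY.findall(text)
    }


legend = parse_legend(LEGEND)
print(legend["Spu"])
```

The same function could be applied to the full caption text; entries that do not follow the assumed pattern would simply be skipped rather than raising an error.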

The high-resolution structure of iPGM from B. stearothermophilus is known (Jedrzejas et al., 2000b), and all amino acids implicated in catalysis are conserved in SPU_010377 as well. EST database search finds a similar sequence in a primitive chordate, the lancelet Branchiostoma floridae, but not in the extensively covered genome of the urochordate C. intestinalis. SPU_028416 contains all predicted catalytic residues typical of dPGM. ESTs related to SPU_028416 are observed in lancelet and in sea squirt, as well as in a primitive metazoan, Nematostella vectensis (Cnidaria). Thus, cofactor-dependent PGM appears to be omnipresent in the animal kingdom (with the exception of nematodes) and must have been present in the last common ancestor of Metazoa. On the contrary, iPGM in animals is found only in two deuterostomes, sea urchin and lancelet, and in one clade of protostomes (nematodes). If that enzyme was also present in the ancestral Metazoa, its evolutionary history must have included multiple gene losses from protostome clades and then again from the vertebrate lineage. An alternative scenario may consist of horizontal gene transfer of iPGM into one protostome lineage and into early deuterostomes (or between these two lineages).

4-Coumarate:CoA ligase-like genes

SPU_009289 and SPU_021869 show significant sequence similarity to 4-Coumarate:CoA ligases from plants. These proteins catalyze an important step in the phenylpropanoid pathway, involved in the production of flavonoids and lignin in plants. There are 14 such family members in the completely sequenced Arabidopsis thaliana genome, at least four of which are catalytically active on the phenylpropanoid pathway intermediates (Costa et al., 2005). The putative proteins of sea urchin contain the conserved BoxI motif that plays a role in AMP binding (Schneider et al., 2003), but show modifications in the other conserved motif, BoxII, and in substrate-binding pocket residues.

The nearest metazoan relatives of plant 4-Coumarate:CoA ligases are firefly-type luciferases, acyl-CoA synthetases, and long-chain-fatty-acid-CoA ligases. Sea urchin proteins, however, are phylogenetically closer to plant 4-Coumarate:CoA ligases than to these metazoan enzymes (data not shown). The evolutionary origin and function of these two proteins remain unclear.

Laccases

Laccases are copper-dependent oxidases that target various organic compounds, notably phenol derivatives. Laccase-like proteins are known in plants, fungi, and bacteria, where they play roles in pigment metabolism, cell wall lignification, and copper homeostasis, and in insects, where they are needed for sclerotization of the cuticle (Nakamura and Go, 2005). There are 10 genes in sea urchin with significant sequence similarity to insect laccases, three of which are partial gene predictions. The conserved histidines and a cysteine characterized in insect laccases (Dittmer et al., 2004) are conserved in the full-length sea urchin laccases. Interestingly, insect laccases are unique in having a methionine in the T1 copper center, which is replaced by phenylalanine or leucine in plants and fungi. Sea urchin laccase-like proteins have either methionine or leucine at this position, a replacement that should influence the redox potential of the enzyme (Dittmer et al., 2004). Thus, it appears that the redox potential range, and perhaps the biological functions, of sea urchin laccase-like proteins may be broader than in either insects or plants. The EST database search with sea urchin laccase proteins shows the presence of similar proteins in some urochordates (M. tectiformis), molluscs (L. stagnalis), and cnidarians (Hydra), as well as in a variety of arthropods including crabs and lobsters. This widespread distribution of laccases in various invertebrates, and the absence of orthologous genes in vertebrates, make it quite likely that ancestral Metazoa had laccase-like protein(s) which were lost in the vertebrate lineage.

Phytochelatin synthases

Phytochelatin synthases (PCS) catalyze the synthesis of the heavy metal-binding peptides phytochelatins from glutathione. PCS-like proteins first became known from plants and fungi, but have recently been identified in bacteria (e.g. Nostoc sp.) and in invertebrates (e.g. C. elegans). The prokaryotic PCS-like polypeptides are about half as long as their eukaryotic counterparts, missing the variable C-terminal domain but retaining the catalytically active N-terminus with a known configuration of catalytic residues (Rea et al., 2004; Vivares et al., 2005). SPU_022493, SPU_004431, and SPU_023132 represent three such prokaryotic-style PCS in sea urchin. SPU_004431 and SPU_023132 appear to lack introns, which is not very typical for sea urchin genes (15% of genes in total are intronless, many of which are lineage-specific expansions; Raible et al., 2006; Sea urchin consortium, submitted for publication).

Sequence comparison suggested the presence of phytochelatin synthases in the C. intestinalis genome (Canestro et al., 2003). EST database searches revealed the presence of phytochelatin synthase-like protein sequences in acoelomate bilaterians like Schmidtea (Platyhelminthes), in protostomes like earthworm, and in Cnidaria (Hydra and Nematostella), but not in vertebrates or in arthropods thus far. It is therefore plausible that these genes have undergone independent gene loss in the latter two lineages.

Amino acid biosynthesis

While plants and bacteria can synthesize all 20 translationally encoded amino acids, most metazoans can synthesize only some of them and depend on the presence of the other, “essential” amino acids in the diet. A review of amino acid biosynthesis capabilities of various groups of organisms (Fitzgerald and Szmant, 1997) suggests that primitive Metazoa (for example, Cnidaria) can synthesize most amino acids, similarly to plants and bacteria, and that the list of essential amino acids varies widely among Metazoa and even among species of nematodes or molluscs. Vertebrates (fishes, chicken, and humans), however, seem to share the requirement for the same ten essential amino acids in their diets, which may be a result of gene/pathway loss that coincided with the emergence of the vertebrate clade. Our analysis of enzymes of amino acid metabolism in sea urchin suggests that the list of essential and non-essential amino acids in S. purpuratus is basically identical to the nutritional requirements of humans and other vertebrates, up to a few presumable artifacts of genome assembly that result in a small number of missing enzymes (Table 1). This suggests the same set of ten essential amino acids for the ancestor of all deuterostomes. At the level of individual pathways, however, there are several notable modifications in amino acid metabolism in sea urchin as opposed to vertebrates.

SPU_020491 encodes a putative threonine dehydratase-like enzyme. Threonine dehydratase catalyzes the breakdown of threonine to ammonia and 2-oxobutanoate, an intermediate in the isoleucine biosynthetic pathway. There are two different types of threonine dehydratases, both best known from E. coli: the biosynthetic IlvA (COG1171) and the catabolic Tdc. These enzymes share similar N-terminal parts, but the biosynthetic type has a longer variable C-terminus, which is thought to interact with the allosteric effector isoleucine (Taillon et al., 1988). The sea urchin protein shows similarity to the C-terminal region of C. glutamicum IlvA, but, notwithstanding this hint that the sea urchin enzyme is of the biosynthetic type, we have not been able to identify any additional enzymes of the isoleucine biosynthesis pathway.

SPU_026470 is similar to cysteine synthase from fungi (COG0031/KOG1481). Cysteine synthase catalyzes the second step of cysteine biosynthesis from serine, condensing O-acetylserine with sulfide to give L-cysteine and acetic acid. The orthologs of the first enzyme of this pathway, serine acetyltransferase, are not found in the sea urchin genome. In vertebrates and S. cerevisiae, cysteine is synthesized exclusively through an alternate route, the cystathionine pathway, starting from methionine, although an enzyme with dual activity of O-acetylhomoserine sulfhydrylase and O-acetylserine sulfhydrylase is known in yeast (Ono et al., 1999). Other fungi and plants are able to synthesize cysteine via both pathways. Enzymes of the cystathionine pathway, namely cystathionine β-synthase (SPU_017058, SPU_024701) and cystathionine γ-lyase (SPU_003623, SPU_025100), are found in the sea urchin genome. This may indicate that sea urchin produces cysteine via the cystathionine pathway, as in vertebrates, and uses cysteine synthase of the fungal type in a different capacity, such as synthesis of other non-protein amino acids (Noji et al., 1993).

Methionine biosynthesis by transfer of a methyl group to homocysteine can be catalyzed by two different enzymes. The only form of methionine synthase found in mammals is type I, which requires the presence of the cofactor cobalamin (vitamin B12). Bacteria can have either type I, or a different, cobalamin-independent type II enzyme, or both, whereas plants have only the type II enzyme. Interestingly, the sea urchin genome appears to encode both types of enzymes. SPU_014251 is a type I methionine synthase (5-methyltetrahydrofolate homocysteine methyltransferase) gene, while SPU_025537 shows significant similarity to methionine synthase type II (5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferase) from various proteobacteria and archaea. Type II methionine synthases in some bacteria and plants are proteins that have evolved by internal duplication, subsequently losing the substrate-binding motifs in the noncatalytic N-terminal domain (Pejchal and Ludwig, 2005). Many proteobacteria and archaea, however, encode a functional methionine synthase similar to the C-terminal domain of plant/bacterial methionine synthase II (Zhou et al., 1999). The zinc-binding amino acids, as determined from the crystal structure of T. maritima methionine synthase, are conserved in the predicted sea urchin methionine synthase II (Pejchal and Ludwig, 2005). EST database searches suggest the presence of similar protein sequences in other fungi, nematodes (G. pallida), and in cephalochordata (B. floridae). These modifications in the metabolism of cysteine and methionine may represent connections to the pathways of bitter-tasting, sulfur-containing amino acids known in another genus of sea urchins, Hemicentrotus (Murata and Sata, 2000).

Glutamate synthase is a protein of about 2200 amino acids, found in bacteria, plants, arthropods, and nematodes, catalyzing the conversion of glutamine to glutamate using either ferredoxin, NADPH, or NADH as cofactors. This enzyme is missing from the vertebrate lineage, where a similar reaction can be catalyzed by a different enzyme, glutamine-hydrolyzing asparagine synthase, which converts glutamine to glutamate while also making asparagine from aspartate. A complete glutamate synthase could not be identified in the current sea urchin assembly, but three predicted peptides, SPU_013104 (about 200 amino acids), SPU_016142 (950 amino acids), and SPU_016259 (271 amino acids), show strong similarity to various portions of glutamate synthase from Drosophila and C. elegans. This suggests the presence of NADH-dependent glutamate synthase in sea urchin and may move the loss of this gene to a time point after the echinoderm split from other deuterostomes.

Concluding remarks

We describe the initial outline of the intermediary metabolism in echinoderms emerging from the analysis of the completely sequenced genome of S. purpuratus. Annotation of the 2297 proteins involved in metabolism-related functions and their mapping to known metabolic pathways suggest that metabolism in sea urchin is broadly and significantly similar to that in humans and other vertebrates. At the same time, there are several gene products that set echinoderms apart from vertebrates and suggest an evolutionary history of differential gene gain and loss. Several enzymes, such as phytochelatin synthase and cofactor-independent phosphoglycerate mutase, apparently have been lost in higher vertebrates but are retained in sea urchin. A number of proteins involved in cysteine and methionine metabolism distinguish sea urchin from most other metazoans. Finally, a bacterial TenA-like protein with predicted thiaminase II activity suggests a novel vitamin B1 salvage pathway and raises a strong possibility of horizontal gene transfer between bacteria and eukaryotes.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.ydbio.2006.08.030.

References

Altschul, S.F., Koonin, E.V., 1998. Iterated profile searches with PSI-BLAST: a tool for discovery in protein databases. Trends Biochem. Sci. 23 (11), 444–447.
Armbrust, E.V., Berges, J.A., Bowler, C., Green, B.R., Martinez, D., Putnam, N.H., Zhou, S., Allen, A.E., Apt, K.E., Bechner, M., Brzezinski, M.A., Chaal, B.K., Chiovitti, A., Davis, A.K., Demarest, M.S., Detter, J.C., Glavina, T., Goodstein, D., Hadi, M.Z., Hellsten, U., Hildebrand, M., Jenkins, B.D., Jurka, J., Kapitonov, V.V., Kroger, N., Lau, W.W., Lane, T.W., Larimer, F.W., Lippmeier, J.C., Lucas, S., Medina, M., Montsant, A., Obornik, M., Parker, M.S., Palenik, B., Pazour, G.J., Richardson, P.M., Rynearson, T.A., Saito, M.A., Schwartz, D.C., Thamatrakoln, K., Valentin, K., Vardi, A., Wilkerson, F.P., Rokhsar, D.S., 2004. The genome of the diatom Thalassiosira pseudonana: ecology, evolution, and metabolism. Science 306, 79–86.
Bagnasco, S.M., 2003. Gene structure of urea transporters. Am. J. Physiol., Renal Physiol. 284, F3–F10.
Canestro, C., Bassham, S., Postlethwait, J.H., 2003. Seeing chordate evolution through the Ciona genome sequence. Genome Biol. 4, 208.
Cameron, R.A., Rast, J.P., Brown, C.T., 2004. Genomic resources for the study of sea urchin development. Methods Cell Biol. 74, 733–757.
Campobasso, N., Costello, C.A., Kinsland, C., Begley, T.P., Ealick, S.E., 1998. Crystal structure of thiaminase-I from Bacillus thiaminolyticus at 2.0 Å resolution. Biochemistry 37, 15981–15989.
Costa, M.A., Bedgar, D.L., Moinuddin, S.G., Kim, K.W., Cardenas, C.L., Cochrane, F.C., Shockey, J.M., Helms, G.L., Amakura, Y., Takahashi, H., Milhollan, J.K., Davin, L.B., Browse, J., Lewis, N.G., 2005. Characterization in vitro and in vivo of the putative multigene 4-coumarate:CoA ligase network in Arabidopsis: syringyl lignin and sinapate/sinapyl alcohol derivative formation. Phytochemistry 17, 2072–2091.
Czihak, G., Peter, R., Baltzer, F., 1975. The Sea Urchin Embryo: Biochemistry and Morphogenesis. Springer-Verlag, Berlin.
Dittmer, N.T., Suderman, R.J., Jiang, H., Zhu, Y.C., Gorman, M.J., Kramer, K.J., Kanost, M.R., 2004. Characterization of cDNAs encoding putative laccase-like multicopper oxidases and developmental expression in the tobacco hornworm, Manduca sexta, and the malaria mosquito, Anopheles gambiae. Insect Biochem. Mol. Biol. 34, 29–41.
Eddy, S.R., 1998. Profile hidden Markov models. Bioinformatics 14, 755–763.
Felsenstein, J., 1989. PHYLIP: Phylogeny inference package (Version 3.2). Cladistics 5, 164–166.
Fitzgerald, L.M., Szmant, A.M., 1997. Biosynthesis of ‘essential’ amino acids by scleractinian corals. Biochem. J. 322, 213–221.
Goldstone, J.V., Hamdoun, A., Cole, B.J., Howard-Ashby, M., Scally, M., Dean, M., Epel, D., Hahn, M.E., Stegeman, J.J., 2006. The chemical defensome: environmental sensing and response genes in the Strongylocentrotus purpuratus genome. Dev. Biol. 300, 366–384.
Hong, J., Salo, W.L., Lusty, C.J., Anderson, P.M., 1994. Carbamyl phosphate synthetase III, an evolutionary intermediate in the transition between glutamine-dependent and ammonia-dependent carbamyl phosphate synthetases. J. Mol. Biol. 243, 131–140.
Jedrzejas, M.J., 2000. Structure, function, and evolution of phosphoglycerate mutases: comparison with fructose-2,6-bisphosphatase, acid phosphatase, and alkaline phosphatase. Prog. Biophys. Mol. Biol. 73, 263–287.
Jedrzejas, M.J., Chander, M., Setlow, P., Krishnasamy, G., 2000. Structure and mechanism of action of a novel phosphoglycerate mutase from Bacillus stearothermophilus. EMBO J. 19, 1419–1431.
Kanehisa, M., Goto, S., Hattori, M., Aoki-Kinoshita, K.F., Itoh, M., Kawashima, S., Katayama, T., Araki, M., Hirakawa, M., 2006. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 34, D354–D357.
Krieger, C.J., Zhang, P., Mueller, L.A., Wang, A., Paley, S., Arnaud, M., Pick, J., Rhee, S.Y., Karp, P.D., 2004. MetaCyc: a multiorganism database of metabolic pathways and enzymes. Nucleic Acids Res. 32, D438–D442.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., Funke, R., Gage, D., Harris, K., Heaford, A., Howland, J., Kann, L., Lehoczky, J., LeVine, R., McEwan, P., McKernan, K., Meldrim, J., Mesirov, J.P., Miranda, C., Morris, W., Naylor, J., Raymond, C., Rosetti, M., Santos, R., Sheridan, A., Sougnez, C., Stange-Thomann, N., Stojanovic, N., Subramanian, A., Wyman, D., Rogers, J., Sulston, J., Ainscough, R., Beck, S., Bentley, D., Burton, J., Clee, C., Carter, N., Coulson, A., Deadman, R., Deloukas, P., Dunham, A., Dunham, I., Durbin, R., French, L., Grafham, D., Gregory, S., Hubbard, T., Humphray, S., Hunt, A., Jones, M., Lloyd, C., McMurray, A., Matthews, L., Mercer, S., Milne, S., Mullikin, J.C., Mungall, A., Plumb, R., Ross, M., Shownkeen, R., Sims, S., Waterston, R.H., Wilson, R.K., Hillier, L.W., McPherson, J.D., Marra, M.A., Mardis, E.R., Fulton, L.A., Chinwalla, A.T., Pepin, K.H., Gish, W.R., Chissoe, S.L., Wendl, M.C., Delehaunty, K.D., Miner, T.L., Delehaunty, A., Kramer, J.B., Cook, L.L., Fulton, R.S., Johnson, D.L., Minx, P.J., Clifton, S.W., Hawkins, T., Branscomb, E., Predki, P., Richardson, P., Wenning, S., Slezak, T., Doggett, N., Cheng, J.F., Olsen, A., Lucas, S., Elkin, C., Uberbacher, E., Frazier, M., Gibbs, R.A., Muzny, D.M., Scherer, S.E., Bouck, J.B., Sodergren, E.J., Worley, K.C., Rives, C.M., Gorrell, J.H., Metzker, M.L., Naylor, S.L., Kucherlapati, R.S., Nelson, D.L., Weinstock, G.M., Sakaki, Y., Fujiyama, A., Hattori, M., Yada, T., Toyoda, A., Itoh, T., Kawagoe, C., Watanabe, H., Totoki, Y., Taylor, T., Weissenbach, J., Heilig, R., Saurin, W., Artiguenave, F., Brottier, P., Bruls, T., Pelletier, E., Robert, C., Wincker, P., Smith, D.R., Doucette-Stamm, L., Rubenfield, M., Weinstock, K., Lee, H.M., Dubois, J., Rosenthal, A., Platzer, M., Nyakatura, G., Taudien, S., Rump, A., Yang, H., Yu, J., Wang, J., Huang, G., Gu, J., Hood, L., Rowen, L., Madan, A., Qin, S., Davis, R.W., Federspiel, N.A., Abola, A.P., Proctor, M.J., Myers, R.M., Schmutz, J., Dickson, M., Grimwood, J., Cox, D.R., Olson, M.V., Kaul, R., Raymond, C., Shimizu, N., Kawasaki, K., Minoshima, S., Evans, G.A., Athanasiou, M., Schultz, R., Roe, B.A., Chen, F., Pan, H., Ramser, J., Lehrach, H., Reinhardt, R., McCombie, W.R., de la Bastide, M., Dedhia, N., Blocker, H., Hornischer, K., Nordsiek, G., Agarwala, R., Aravind, L., Bailey, J.A., Bateman, A., Batzoglou, S., Birney, E., Bork, P., Brown, D.G., Burge, C.B., Cerutti, L., Chen, H.C., Church, D., Clamp, M., Copley, R.R., Doerks, T., Eddy, S.R., Eichler, E.E., Furey, T.S., Galagan, J., Gilbert, J.G., Harmon, C., Hayashizaki, Y., Haussler, D., Hermjakob, H., Hokamp, K., Jang, W., Johnson, L.S., Jones, T.A., Kasif, S., Kaspryzk, A., Kennedy, S., Kent, W.J., Kitts, P., Koonin, E.V., Korf, I., Kulp, D., Lancet, D., Lowe, T.M., McLysaght, A., Mikkelsen, T., Moran, J.V., Mulder, N., Pollara, V.J., Ponting, C.P., Schuler, G., Schultz, J., Slater, G., Smit, A.F., Stupka, E., Szustakowski, J., Thierry-Mieg, D., Thierry-Mieg, J., Wagner, L., Wallis, J., Wheeler, R., Williams, A., Wolf, Y.I., Wolfe, K.H., Yang, S.P., Yeh, R.F., Collins, F., Guyer, M.S., Peterson, J., Felsenfeld, A., Wetterstrand, K.A., Patrinos, A., Morgan, M.J., de Jong, P., Catanese, J.J., Osoegawa, K., Shizuya, H., Choi, S., Chen, Y.J., International Human Genome Sequencing Consortium, 2001. Initial sequencing and analysis of the human genome. Nature 409, 860–921.
Liu, F., Thatcher, J.D., Barral, J.M., Epstein, H.F., 1995. Bifunctional glyoxylate cycle protein of Caenorhabditis elegans: a developmentally regulated protein of intestine and muscle. Dev. Biol. 169, 399–414.
Marchler-Bauer, A., Panchenko, A.R., Shoemaker, B.A., Thiessen, P.A., Geer, L.Y., Bryant, S.H., 2002. CDD: a database of conserved domain alignments with links to domain three-dimensional structure. Nucleic Acids Res. 30, 281–283.
Marchler-Bauer, A., Anderson, J.B., Cherukuri, P.F., DeWeese-Scott, C., Geer, L.Y., Gwadz, M., He, S., Hurwitz, D.I., Jackson, J.D., Ke, Z., Lanczycki, C.J., Liebert, C.A., Liu, C., Lu, F., Marchler, G.H., Mullokandov, M., Shoemaker, B.A., Simonyan, V., Song, J.S., Thiessen, P.A., Yamashita, R.A., Yin, J.J., Zhang, D., Bryant, S.H., 2005. CDD: a conserved domain database for protein classification. Nucleic Acids Res. 33, D192–D196.
Murata, Y., Sata, N.U., 2000. Isolation and structure of pulcherrimine, a novel bitter-tasting amino acid, from the sea urchin (Hemicentrotus pulcherrimus) ovaries. J. Agric. Food Chem. 48, 5557–5560.
Nakamura, K., Go, N., 2005. Function and molecular evolution of multicopper blue proteins. Cell. Mol. Life Sci. 62, 2050–2066.
Noji, M., Murakoshi, I., Saito, K., 1993. Evidence for identity of beta-pyrazolealanine synthase with cysteine synthase in watermelon: formation of beta-pyrazole-alanine by cloned cysteine synthase in vitro and in vivo. Biochem. Biophys. Res. Commun. 197, 1111–1117.
Oliveri, P., Davidson, E.H., 2004. Gene regulatory network analysis in sea urchin embryos. Methods Cell Biol. 74, 775–794.
Ono, B.I., Hazu, T., Yoshida, S., Kawato, T., Shinoda, S., Brzvwczy, J., Paszewski, A., 1999. Cysteine biosynthesis in Saccharomyces cerevisiae: a new outlook on pathway and regulation. Yeast 15, 1365–1375.
Pejchal, R., Ludwig, M.L., 2005. Cobalamin-independent methionine synthase (MetE): a face-to-face double barrel that evolved by gene duplication. PLoS Biol. 3 (2), e31.
Poustka, A.J., Groth, D., Hennig, S., Thamm, S., Cameron, A., Beck, A., Reinhardt, R., Herwig, R., Panopoulou, G., Lehrach, H., 2003. Generation, annotation, evolutionary analysis, and database integration of 20,000 unique sea urchin EST clusters. Genome Res. 13, 2736–2746.
Raible, F., Tessmar-Raible, K., Arboleda, E., Kaller, T., Bork, P., Arendt, D., Arnone, M.I., 2006. Opsins and clusters of sensory G-protein coupled receptors in sea urchin genome. Dev. Biol. 300, 461–475.
Rea, P.A., Vatamaniuk, O.K., Rigden, D.J., 2004. Weeds, worms, and more. Papain's long-lost cousin, phytochelatin synthase. Plant Physiol. 136, 2463–2474.
Ruppert, E.E., Richard, S.F., Barnes, R.D., 2003. In: Cole, Brooks (Ed.), The Invertebrate Zoology: A Functional Evolutionary Approach, 7th ed.
Schneider, K., Hovel, K., Witzel, K., Hamberger, B., Schomburg, D., Kombrink, E., Stuible, H.P., 2003. The substrate specificity-determining amino acid code of 4-coumarate:CoA ligase. Proc. Natl. Acad. Sci. U. S. A. 100, 8601–8606.
Sea urchin consortium, submitted for publication. The genome of the sea urchin Strongylocentrotus purpuratus. Science.
Shayakul, C., Hediger, M.A., 2004. The SLC14 gene family of urea transporters. Pflugers Arch. 447, 603–609.

Smith, A.G., Goad, L.J., 1974. Sterol biosynthesis by the sea urchin Echinus esculentus. Biochem. J. 142, 421–427.
Taillon, B.E., Little, R., Lawther, R.P., 1988. Analysis of the functional domains of biosynthetic threonine deaminase by comparison of the amino acid sequences of three wild-type alleles to the amino acid sequence of biodegradative threonine deaminase. Gene 63, 245–252.
Tatusov, R.L., Fedorova, N.D., Jackson, J.D., Jacobs, A.R., Kiryutin, B., Koonin, E.V., Krylov, D.M., Mazumder, R., Mekhedov, S.L., Nikolskaya, A.N., Rao, B.S., Smirnov, S., Sverdlov, A.V., Vasudevan, S., Wolf, Y.I., Yin, J.J., Natale, D.A., 2003. The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4, 41.
Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.
Toms, A.V., Haas, A.L., Park, J.H., Begley, T.P., Ealick, S.E., 2005. Structural characterization of the regulatory proteins TenA and TenI from Bacillus subtilis and identification of TenA as a thiaminase II. Biochemistry 44, 2319–2329.
Vivares, D., Arnoux, P., Pignol, D., 2005. A papain-like enzyme at work: native and acyl-enzyme intermediate structures in phytochelatin synthesis. Proc. Natl. Acad. Sci. U. S. A. 102, 18848–18853.
Withers, P.C., 1998. Urea: diverse functions of a ‘waste’ product. Clin. Exp. Pharmacol. Physiol. 25, 722–727.
Zhou, Z.S., Peariso, K., Penner-Hahn, J.E., Matthews, R.G., 1999. Identification of the zinc ligands in cobalamin-independent methionine synthase (MetE) from Escherichia coli. Biochemistry 38, 15915–15926.