CORE Metadata, citation and similar papers at core.ac.uk

Provided by Elsevier - Publisher Connector

Available online at www.sciencedirect.com

Developmental Biology 315 (2008) 567–578 www.elsevier.com/developmentalbiology

Genomes & Developmental Control Evolutionary analysis of the cis-regulatory region of the spicule matrix gene SM50 in strongylocentrotid sea urchins ⁎ Jenna Walters a, Elaine Binkley a, Ralph Haygood b, Laura A. Romano a,

a Department of Biology, Denison University, Granville, OH 43023, USA b Department of Biology, Duke University, Durham, NC 27708, USA Received for publication 10 April 2007; revised 7 January 2008; accepted 7 January 2008 Available online 18 January 2008

Abstract

An evolutionary analysis of transcriptional regulation is essential to understanding the molecular basis of phenotypic diversity. The is an ideal system in which to explore the functional consequence of variation in cis-regulatory sequences. We are particularly interested in the evolution of genes involved in the patterning and synthesis of its larval skeleton. This study focuses on the cis-regulatory region of SM50, which has already been characterized to a considerable extent in the purple sea urchin, purpuratus. We have isolated the cis-regulatory region from 15 individuals of S. purpuratus as well as seven closely related species in the family Strongylocentrotidae. We have performed a variety of statistical tests and present evidence that the cis-regulatory elements upstream of the SM50 gene have been subject to positive selection along the lineage leading to S. purpuratus. In addition, we have performed electrophoretic mobility shift assays (EMSAs) and demonstrate that nucleotide substitutions within Element C affect the ability of nuclear proteins to bind to this cis-regulatory element among members of the family Strongylocentrotidae. We speculate that such changes in SM50 and other genes could accumulate to produce altered patterns of gene expression with functional consequences during skeleton formation. © 2008 Elsevier Inc. All rights reserved.

Keywords: Cis-regulatory; Evolution; Non-coding; Sea urchin; Skeletogenesis; SM50; Spicule matrix gene; Strongylocentrotidae; Strongylocentrotus purpuratus; Transcription

Introduction et al., 2006). Patterns of such substitutions between species and of nucleotide polymorphisms within species hold clues about The morphology of an organism is the outcome of its the evolution of phenotypic diversity through changes in development, which is controlled by a complex network of transcriptional regulation (reviewed by Wray et al., 2003). genes. The transcription of these genes is regulated by non- The sea urchin is an ideal model system in which to study the coding sequences known as cis-regulatory elements that are evolution of transcriptional regulation because the gene regu- typically located upstream of the coding sequence, but may be latory network underlying its development is being recon- located downstream of the coding sequence or even within an structed and already includes N50 genes involved in endo- intron. Proteins known as transcription factors interact with mesodermal specification (Davidson et al., 2002). Moreover, cis-regulatory elements to specify the level, timing, and spatial there is a considerable amount of information regarding the expression of genes. Mutations in cis-regulatory elements have cis-regulatory elements that allow these genes to interact been shown to produce a different pattern of gene expression directly or indirectly with one another. Our study focuses on a with functional consequences during development. For exam- gene known as SM50, which encodes a spicule matrix protein ple, a few nucleotide substitutions in a cis-regulatory element that is essential for development of the larval skeleton (Benson upstream of the yellow gene account for the presence or absence et al., 1987; Sucov et al., 1987; George et al., 1991). of wing spots in different species of fruit flies (Prud'homme The larval skeleton is primarily composed of calcium car- bonate (∼95%) but also contains magnesium carbonate (∼5%) ⁎ Corresponding author. Fax: +1 740 587 5634. and more than 40 spicule matrix proteins (0.1%) (e.g. Okazaki E-mail address: [email protected] (L.A. Romano). and Inoue, 1976; Benson et al., 1986; Killian and Wilt, 1996).

0012-1606/$ - see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.ydbio.2008.01.007 568 J. Walters et al. / Developmental Biology 315 (2008) 567–578

For many species of indirect-developing sea urchins, the larval skeleton is derived from primary mesenchyme cells (PMCs) that ingress into the blastocoel at the onset of gastrulation. The PMCs migrate through the blastocoel by means of filopodia that eventually fuse with one another to form a syncytium that extends around the archenteron. The onset of spiculogenesis is marked by the deposition of amorphous calcium carbonate within the extracellular space that is enclosed by the filopodial protrusions of the PMC syncytium (Beniash et al., 1997). SM50 and a variety of other proteins are secreted into this space and may serve to stabilize amorphous calcium carbonate before it is converted to a crystalline form (e.g. Kitajima and Urakami, 2000; Urry et al., 2000; Ingersoll et al., 2003). The first indication of biomineralization is the pair of triradiate spicules that are formed by two ventrolateral clusters of PMCs at the mid-gastrula Fig. 1. Overview of the cis-regulatory region of SM50 in S. purpuratus. There stage of development. Short-range signals from the overlying are at least four cis-regulatory elements in the region of DNA extending from −440 to +120 with respect to the transcriptional start site (+1) of the SM50 gene ectoderm influence the rate and orientation of spicule growth by (Sucov et al., 1987). Element C is particularly important because it is responsible regulating the expression of genes involved in biomineralization for limiting transcription to the primary mesenchymal cells (PMCs), which are (Ettensohn and Malinda, 1993; Guss and Ettensohn, 1997; responsible for synthesis of the larval skeleton (Makabe et al., 1995). Elements Duloquin et al., 2007). Spicule growth occurs in a stereotypical A, D4, and D5 are positively-acting (+) elements that enhance transcription in the manner to produce a larval skeleton that is distinctive for each PMCs (Makabe et al., 1995). species of sea urchin (Hyman, 1955). Interestingly, treatment of embryos with antisense oligonucleotides directed against SM50 closely related species within the family Strongylocentrotidae inhibits spicule elongation resulting in a larval skeleton with an primarily in order to evaluate whether natural selection has aberrant morphology (Peled-Kamar et al., 2002). influenced the evolution of this DNA sequence and, if so, in A recent analysis of the genome indicates that SM50 belongs what matter. The evolutionary relationships among strongylo- to a family of 16 genes that presumably arose via a series of centrotids have been firmly established using mtDNA duplications during the evolution of (Livingston sequences (Biermann et al., 2003)(Fig. 2). In particular, we et al., 2006). This hypothesis is supported by the observation that aimed to provide a detailed analysis of genetic variation in S. some of the spicule matrix genes, such as those belonging to the purpuratus (which is the only species in which there has been a SM30 subfamily, are clustered within the genome. SM50 functional analysis of the cis-regulatory region of SM50), and a transcripts, which are subject to alternative splicing, gradually less complex picture of genetic variation among the other accumulate in the PMCs with peak levels of expression at ∼50 h strongylocentrotids. We have also examined the impact of of development in Strongylocentrotus purpuratus (Lee et al., variation on the ability of transcription factors to bind to the 1999). At this stage of development, SM50 expression is DNA. Such information regarding SM50 and other genes primarily localized to the tips of the postoral rods (Guss and should help elucidate the evolutionary dynamics of non-coding Ettensohn, 1997). The region extending from −440 to +120 with sequences and the functional significance of mutations in cis- respect to the transcriptional start site is sufficient to confer an regulatory elements with regard to anatomy, physiology, and accurate pattern of SM50 expression (Sucov et al., 1987)(Fig. 1). behavior (reviewed by Wray, 2007). There have been only a few There is evidence that the distal portion of this region extending cases in which the evolution of non-coding sequences has been from −440 to −200 influences the amplitude of expression but analyzed in sea urchins (e.g. Balhoff and Wray, 2005; Cameron discrete elements have not yet been identified (Makabe et al., et al., 2005). The paucity of information on this topic can be 1995). Systematic mutagenesis of the proximal portion extend- attributed to the fact that there are few genes for which there is ing from −200 to +105 identified four cis-regulatory elements detailed knowledge regarding their transcriptional regulation. (Makabe et al., 1995). Element C is responsible for directing Moreover, it is difficult to predict the effect of specific SM50 expression to the skeletogenic lineage but has no tran- mutations in non-coding sequences due to the lack of a scriptional activity on its own. Other positively-acting elements straightforward “genetic code” for cis-regulatory elements. such as A, D4, and D5 act in conjunction with Element C to Thus, this study provides valuable insight into the evolution of activate transcription in the PMCs. Element C presumably transcriptional regulation. serves as a binding site for one or more activators rather than a repressor since ectopic expression was not observed when the Materials and methods DNA sequence was mutated (Makabe et al., 1995). At least one protein bound to Element C is SpHnf6 although there is evi- Amplification, cloning, and sequencing dence that multiple proteins are capable of binding to this 25 bp Genomic DNA from 15 individuals of S. purpuratus (collected in the site at varying developmental stages (Otim et al., unpublished). vicinity of Santa Barbara, CA) was kindly provided by J. Balhoff (Duke In this study, we have isolated the cis-regulatory region of University). Genomic DNA for one individual from each of seven closely- SM50 from 15 individuals of S. purpuratus as well as seven related species in the family Strongylocentrotidae (Allocentrotus fragilis, J. Walters et al. / Developmental Biology 315 (2008) 567–578 569

replicate simulations of the neutral coalescent in a stable, panmictic population using the ms and sample_stats programs (http://home.uchicago.edu/~rhudson1)

to assess the statistical significance of Tajima's D (DT), Fu and Li's D (DFL), and Fay and Wu's H (Hudson, 2002). (The sample_stats program does not

extract DFL and therefore, we added this functionality.)

Statistical tests to examine positive selection on cis-regulatory elements upstream of SM50

We counted substituted and polymorphic sites and performed McDonald– Kreitman (MK) and Hudson–Kreitman–Aguade (HKA) tests using our Python software; the results of the MK test are displayed in Table 2. We fitted single- nucleotide substitution models by likelihood maximization using HyPhy (http:// www.hyphy.org), taking the best of 100 fits starting from random points to guard against local maxima of the likelihood function (Kosakovsky Pond and Muse, 2005). Our models are equivalent to the HKY85 model (Hasegawa et al., 1985)

Table 1 Polymorphism in non-coding sequence upstream of SM50 in S. purpuratus Statistic Value Ungapped sites a 1376 Fig. 2. Evolutionary relationships of sea urchins belonging to the family Polymorphic sites 271 Strongylocentrotidae. The evolutionary relationships of 10 closely related Biallelic sites 242 species of sea urchins were already established through maximum likelihood Triallelic sites 28 b analysis of mtDNA sequences (Biermann et al., 2003). This analysis supported Singleton sites 154 c the inclusion of Allocentrotus fragilis, pulcherrimus, and Pseu- Watterson's θ per site 0.0497 d − e docentrotus depressus in the Strongylocentotus. In this study, we isolated Tajima's D (DT) 1.69 the cis-regulatory region of SM50 from every species in this phylogeny with the f exception of Pseudocentrotus depressus and Strongylocentrotus nudus. S. droebachiensis S. pallidus A.fragilis Ungapped sites 1187 1186 1175 Polymorphic sites 237 237 232 Hemicentrotus pulcherrimus, Strongylocentrotus franciscanus, Strongylocen- Hudson's ρ per site g 0.0212 0.0192 0.0174 d h h h trotus intermedius, Strongylocentrotus polyacanthus, Strongylocentrotus palli- Fu and Li's D (DFL) −3.07 −3.13 −2.87 dus, and Strongylocentrotus droebachiensis) was also provided by J. Balhoff Fay and Wu's H d −6.05 i −9.08 i −21.5 i (Duke University). Primers based on the published SM50 sequence (Accession # a Nucleotide sites with no insertion or deletion in any of our 30 S. purpuratus AH001102) were designed to amplify a ∼1.5 kb fragment of non-coding sequences or one S. droebachiensis sequence. Gapped sites are discarded from sequence by PCR (Sigma GenoSys). The 5′ primer was: AGTTTGGTGGA- these analyses. GACGATTC while the 3′ primer was: CGTAGTATGCTGGGCAGTC. PCR b Polymorphic sites where the rarest allele is present in only one of our 30 was conducted using 2.5 μl of each 10 μM primer, 0.2 μl of High Fidelity S. purpuratus sequences. Platinum Taq Polymerase (Invitrogen), 2.5 μl of 10× buffer, 1 μlof50mM c Watterson's estimator of 4×effective population size×neutral mutation rate MgCl , 0.5 μl of 10 mM dNTPs (Invitrogen), and b1 μg of genomic DNA. The 2 per site (Hein et al., 2005). This locus is probably not evolving neutrally, so θ is reactions were incubated in a Mastercycler Gradient (Eppendorf) under the merely a summary of the amount of polymorphism. following conditions: 1 cycle of 3 min at 94 °C; 34 cycles of 30 s at 94 °C for d D , D , and H summarize the frequency spectrum of polymorphic sites denaturation, 1 min at 50–57 °C for annealing, and 2 min at 72 °C for extension; T FL (Hein et al., 2005). For each of these statistics, the expected value at a neutrally 1 cycle of 10 min at 72 °C. The PCR products were separated by gel evolving locus in a stable, panmictic population is 0. The statistical significance electrophoresis and visualized with ethidium bromide. Bands of the expected of an observed deviation from 0 is assessed by coalescent simulation with size were excised from the gel and purified using the MinElute gel purification specified number of ungapped sites, number of polymorphic sites, and kit (Qiagen). The purified DNA was inserted into pGEM-T vector (Promega), ρ (Hudson, 2002). All else being equal, the higher ρ, the more significant the which was then used to transform competent DH5α cells (Invitrogen). Plasmid deviation (Hein et al., 2005). Estimation of ρ is subject to error, so we assessed DNA was recovered from bacteria using the Wizard Plus SV Miniprep Kit significance at not only the average estimated value of ρ=0.0193 but also the (Promega) and submitted to the Plant Microbe Genomic Facility at The Ohio most conservative value of ρ=0. State University (OSU) for bi-directional sequencing (i.e. in both the 5′ and 3′ e p=0.0002 at ρ=0.0193; p=0.024 at ρ=0. directions). Multiple samples were submitted for each of the 15 individuals to f ρ, D , and H involve an outgroup for inferring which allele at a ensure that both alleles were represented in our analyses, and that the sequence FL polymorphic site is ancestral. We considered three outgroups in turn: for each allele was based on more than one clone. Sequences were assembled S. droebachiensis, S. pallidus, and A. fragilis, which are the closest relatives with Sequencher 4.2.2 and aligned with ClustalX 1.83 using default parameters. of S. purpuratus and almost equally diverged from it at this locus. Adding an outgroup sequence to our 30 S. purpuratus sequences changes the numbers of Statistical tests to examine directional selection on the cis-regulatory ungapped and polymorphic sites in the alignment. region of SM50 g Hudson's estimator of 4×effective population size×recombination rate between adjacent sites (Hudson, 2001). All of the statistics displayed in Table 1 and Fig. 7 were computed using h S. droebachiensis: pb0.0001 at ρ=0.0193; p=0.0028 at ρ=0. S. pallidus: software that we wrote in Python within the BioPython framework (http://www. p b0.0001 at ρ =0.0193; p =0.0023 at ρ =0. A. fragilis: p b0.0001 at biopython.org). (This software is available upon request to R.H.) We also used ρ=0.0193; p=0.0054 at ρ=0. the exhap and maxhap programs (http://home.uchicago.edu/~rhudson1)to i S. droebachiensis: pN0.10 at ρ=0.0193 and at ρ=0. S. pallidus: pN0.10 at compute Hudson's rho (Hudson, 2001). We conducted and analyzed 100,000 ρ=0.0193 and at ρ=0. A. fragilis: p=0.085 at ρ=0.0193; pN0.10 at ρ=0. 570 J. Walters et al. / Developmental Biology 315 (2008) 567–578 modulated within binding sites relative to other nucleotide sites in the same way shown). One of the polymorphic sites occurs within Element that Zhang et al.'s (2005) preferred models are modulated at non-synonymous D , which is responsible for enhancing transcription provided sites relative to synonymous sites (Wong and Nielsen, 2004; Haygood et al., 4 2007). (HyPhy Batch Language code is available upon request to R.H.) We that protein is bound to Element C. We performed an tested the fit of the alternate model relative to that of the null model using a electrophoretic mobility shift assay (EMSA) to determine if likelihood ratio test conservatively approximated as a chi-squared test with one substitutions at this site have any effect on protein binding. We degree of freedom (Zhang et al., 2005). found that protein is capable of binding to Element D4 regardless if there is an A or a C at position 5 (data not Electrophoretic mobility shift assay (EMSA) shown). That is, there is no apparent difference in binding affinity between the two alleles. Polymorphic sites were also Biotin-labeled and unlabeled oligonucleotides corresponding to both alleles detected in Elements C and A; we decided that it was not worth of Element D4 in S. purpruatus were obtained (Sigma GenoSys). Allele #1 was: AGTCTTGCACTCGGCCCAGGGTTACGCCTGT and allele #2 was: AGTC- investigating their functional significance. TTGCACTCGGCACAGGGTTACGCCTGT. Oligonucleotides corresponding to Element C were also obtained (Sigma GenoSys). The DNA sequence Interspecific differences in the cis-regulatory region of SM50 for S. franciscanus, S. intermedius, S. polyacanthus, S. pallidus,and and impact on protein binding S. droebachiensis (“Ancestral”) was: AGTCGTGAATGCATCGATCCCATT- TCTTCT. The DNA sequence for S. purpuratus (“Derived #1′) was: AGTC- GTGAATGCATCGATCTCATTTCTTCT. The DNA sequence for A. fragilis We amplified a ∼1.5 kb fragment of the SM50 locus (“Derived #2) was: AGTCGTGAATGCGTCGATCCCATTTCTTCT. Finally, from every member of the family Strongylocentrotidae (one the DNA sequence for H. pulcherrimus (“Derived #3) was: AGTCGTGAATG- individual per species) with the exception of Pseudocentrotus CATCAATCCCATTTCTTCT. Assays were performed using the LightShift depressus and Strongylocentrotus nudus (Fig. 2). We found Chemiluminescent EMSA Kit (Pierce). Reactions contained 1× binding buffer, 50 ng/μl Poly (dI·dC), 10 fmol of biotin-labeled oligo, 2 μg of nuclear extract, that the cis-regulatory region of SM50, which extends from and unlabeled oligo at 5, 50, 200, or 500-fold excess. Nuclear extract from S. −440 to +120, contains 66 differences and 12 indels ranging purpuratus embryos at ∼38 h of development was kindly provided by Titus in size from 1 to 5 bp (Fig. 3). We found that Element C is Brown (Caltech). The reactions were incubated at room temperature for 20 min. conserved among S. franciscanus, S. intermedius, S. poly- – The protein DNA complexes were then separated on a 5% polyacrylamide gel acanthus, S. pallidus, and S. droebachiensis. We refer to this (Bio-Rad) by electrophoresis and transferred to a positive nylon membrane “ ” (Pierce). The membrane was cross-linked at 120 mJ/cm2 for 1 min, probed with sequence as ancestral since it is shared by the majority of a streptavidin–HRP conjugate, and incubated with luminol. The biolumines- species including S. franciscanus and therefore, is likely cence was detected with X-ray film (Pierce) and the intensity of the resulting representative of the common ancestor. Interestingly, we bands was quantified using ImageJ 1.37 software. The time of exposure varied detected differences in Element C in the remaining three between 1 and 25 s. species, S. purpuratus, A. fragilis, and H. pulcherrimus, despite the fact that this particular cis-regulatory element is Results essential for localization of SM50 expression to the PMCs. Several differences (and even one indel) also occur in Elements Polymorphism in the cis-regulatory region of SM50 and impact A, D4, and D5. on protein binding We decided to focus our attention on the functional con- sequence of interspecific differences in Element C. First, we We amplified a ∼1.5 kb fragment of the SM50 locus from 15 performed EMSAs to determine if there is a difference between individuals of S. purpuratus by PCR. This DNA fragment the binding affinity of the ancestral sequence and derived included the region extending from −440 to +120, which sequence #1, which corresponds to S. purpuratus. In particular, was previously shown to be sufficient for transcription in we performed reciprocal experiments in which we tested the S. purpuratus (Sucov et al., 1987). Upon alignment of the 30 ability of derived sequence #1 to compete with biotin-labeled alleles, we found that this particular region of the SM50 locus probe (corresponding to the ancestral sequence), as well as the contained 88 polymorphic sites and eight indels (insertions ability of the ancestral sequence to compete with biotin-labeled and/or deletions) ranging in size from 1 to 21 bp (data not probe (corresponding to derived sequence #1) for protein binding. As a control, we excluded protein from the reaction mixture (Figs. 4A, B lane 1) and then added nuclear extract to Table 2 demonstrate a shift in the position of the biotin-labeled probe McDonald–Kreitman tests of binding sites versus non-binding sites in the cis- (Figs. 4A, B lane 2). We then tested increasing concentrations regulatory region of SM50 in S. purpuratus (ranging from 5- to 500-fold) of competitor that corresponded to – Polymorphic Substituted sites a the same sequence as the probe (Figs. 4A, B lanes 3 6) before sites testing increasing concentrations (ranging from 5- to 500-fold) S. droebachiensis S. pallidus A. fragilis of competitor that corresponded to the alternate sequence (Figs. Binding sites 3 6 8 7 4A, B lanes 7–10). In all cases, we analyzed the uppermost Other sites 78 8 10 8 band on the gel, which corresponds to a specific interaction a As in Table 1, we considered three outgroups in turn: S. droebachiensis, since it was consistently diminished by the addition of excess S. pallidus, and A. fragilis. Combining the “polymorphic sites” column with any one of the “substituted sites” columns yields a 2×2 contingency table. The competitor. We observed a second band using 38-h nuclear p-values (G-test with Williams correction for continuity) are p=0.00031 for extract although the addition of competitor did not consistently S. droebachiensis, p=0.00004 for S. pallidus, and p=0.00007 for A. fragilis. diminish its intensity. After quantification of the band intensities J. Walters et al. / Developmental Biology 315 (2008) 567–578 571

Fig. 3. Interspecific differences in the cis-regulatory region of SM50. ∼1.5 kb of non-coding sequence upstream of the SM50 gene was amplified from S. purpuratus as well as S. franciscanus, H. pulcherrimus, S. intermedius, S. polyacanthus, A. fragilis, S. pallidus, and S. droebachiensis. An alignment of the region of DNA extending from −440 to +120 is shown here. Element D4 is shown in blue, Element D5 is shown in green, Element C is shown in red, and Element A is shown in orange. Note that S. purpuratus is represented by the published SM50 sequence (Accession # AH001102). 572 J. Walters et al. / Developmental Biology 315 (2008) 567–578

Fig. 4. Binding affinity of the ancestral sequence versus derived sequence #1 (S. purpuratus) for Element C. An EMSA was performed to test the ability of derived sequence #1 (S. purpuratus) to compete with biotin-labeled probe corresponding to the ancestral sequence (A), and the ability of the ancestral sequence to compete with biotin-labeled probe corresponding to derived sequence #1 (B). In both cases, lane 1 corresponds to probe incubated without nuclear extract while lane 2 corresponds to probe incubated with nuclear extract in the absence of competitor. Competitor corresponding to the same sequence as the probe was used to demonstrate the specificity of the protein–DNA interaction (lanes 3–6). Competitor corresponding to the alternate sequence was used to assess the difference in binding affinity between the ancestral sequence and derived sequence #1 (lanes 7–10). The intensity of the uppermost band in lanes 2–10 is displayed as a bar graph (C, D). In addition, the intensity of the uppermost band in lanes 2–10 was plotted against concentration of the competitor (ranging from 5- to 500-fold). The dark blue diamonds correspond to the ancestral sequence competed with the ancestral sequence, the light blue squares correspond to the ancestral sequence competed with derived sequence #1, the dark green triangles correspond to derived sequence #1 competed with derived sequence #1, and the light green circles correspond to derived sequence #1 competed with the ancestral sequence (E). Finally, the ratio of band intensity obtained for each of the two assays was plotted against concentration of the competitor. The dark blue diamonds correspond to the ancestral sequence competed with derived sequence #1, and the dark green squares correspond to the reciprocal assay (F).

(Figs. 4C, D), we found that the ancestral sequence has a lower Evidence for directional selection on the cis-regulatory region binding affinity than derived sequence #1 (Fig. 4E). In of SM50 particular, the difference in binding affinity was as much as ∼9-fold (Fig. 4F). We performed similar experiments to We analyzed the amount and pattern of polymorphism in the compare the binding affinity of the ancestral sequence versus cis-regulatory region of SM50 in S. purpuratus (Table 1). The derived sequence #2 (which corresponds to A. fragilis) and most striking feature is the abundance of polymorphic sites derived sequence #3 (which corresponds to H. pulcherrimus). where the frequency of the rarer/rarest allele is very low. The Once again, the results of these assays indicate that the ancestral abundance of such sites is substantially greater than expected at sequence is less effective at protein binding than derived a neutrally evolving locus in a stable, panmictic population. sequences (Figs. 5E and 6E). In both cases, the difference in This is reflected in the significantly negative values of Tajima's binding affinity was as much as ∼9-fold as observed for derived D (DT) and Fu and Li's D (DFL). Note that we relied upon a sequence #1 (Figs. 5F and 6F). It is important to note that we single sequence for each outgroup (S. droebachiensis, A. relied on a nuclear protein extract for S. purpuratus for each fragilis,orS. pallidus) as in a study by Balhoff and Wray, assay since this reagent is too difficult and time-consuming to 2005. However, our conclusions are strengthened by the fact prepare for each species in our analysis. As described in that we obtain essentially the same results regardless of which Villinski et al., 2005, it is our assumption that the spectrum of species is used as the outgroup for a particular analysis, and that proteins does not vary considerably between these closely we obtain essentially the same results for different analyses (e.g. related species. DT and DFL). Other than sequencing errors, which we have J. Walters et al. / Developmental Biology 315 (2008) 567–578 573

Fig. 5. Binding affinity of the ancestral sequence versus derived sequence #2 (A. fragilis) for Element C. An EMSA was performed to test the ability of derived sequence #2 (A. fragilis) to compete with biotin-labeled probe corresponding to the ancestral sequence (A), and the ability of the ancestral sequence to compete with biotin-labeled probe corresponding to derived sequence #2 (B). In both cases, lane 1 corresponds to probe incubated without nuclear extract while lane 2 corresponds to probe incubated with nuclear extract in the absence of competitor. Competitor corresponding to the same sequence as the probe was used to demonstrate the specificity of the protein–DNA interaction (lanes 3–6). Competitor corresponding to the alternate sequence was used to assess the difference in binding affinity between the ancestral sequence and derived sequence #2 (lanes 7–10). The intensity of the uppermost band in lanes 2–10 is displayed as a bar graph (C, D). In addition, the intensity of the uppermost band in lanes 2–10 was plotted against concentration of the competitor (ranging from 5- to 500-fold). The dark blue diamonds correspond to the ancestral sequence competed with the ancestral sequence, the light blue squares correspond to the ancestral sequence competed with derived sequence #2, the dark red triangles correspond to derived sequence #2 competed with derived sequence #2, and the light red circles correspond to derived sequence #2 competed with the ancestral sequence (E). Finally, the ratio of band intensity obtained for each of the two assays was plotted against concentration of the competitor. The dark blue diamonds correspond to the ancestral sequence competed with derived sequence #2, and the dark red squares correspond to the reciprocal assay (F). taken pains to avoid, the most plausible explanations for an rarest allele appears to be ancestral. This is reflected in the excess of extreme-frequency polymorphic sites are population negative values of Fay and Wu's H (Hein et al., 2005). expansion and directional selection (Hein et al., 2005). The former seems unlikely because there is no such excess at several Evidence for positive selection on cis-regulatory elements other loci that have been analyzed (e.g. Balhoff and Wray, upstream of SM50 2005). Moreover, the excess is not uniform across the cis- regulatory region of SM50 itself (Fig. 7). The excess of The cis-regulatory region of SM50 as characterized by extreme-frequency polymorphic sites is concentrated in the Sucov et al. (1987) contains four known cis-regulatory elements region that is ∼150–400 bp upstream of the transcriptional start (Fig. 1). We compared variation within these “binding sites” to site, which is distal to the cis-regulatory region characterized by variation at “other nucleotide sites” in the cis-regulatory region Sucov et al. (1987). A plausible hypothesis is that this region of SM50 using two approaches. (We believe that “binding contains functional DNA sequences such as cis-regulatory sites” versus “other nucleotide sites” is a reasonable proxy for elements that are or have been under directional selection but “functional” versus “non-functional” although it is certainly have not been characterized yet. Such cis-regulatory elements possible that there may be “other nucleotide sites” with func- might regulate later phases of SM50 expression (Richardson tions that have not yet been characterized). One approach cen- et al., 1989). The directional selection might be negative (i.e. tered on the ratio of polymorphism within S. purpuratus to preserving ancestral alleles), positive (i.e. promoting derived divergence between S. purpuratus and close relatives such as alleles), or a combination of both. A role for positive selection is S. droebachiensis. If the region were evolving homogeneously, suggested by the fact that at some polymorphic sites, the rarer/ the ratio would not be expected to differ appreciably between 574 J. Walters et al. / Developmental Biology 315 (2008) 567–578

Fig. 6. Binding affinity of the ancestral sequence versus derived sequence #3 (H. pulcherrimus) for Element C. An EMSA was performed to test the ability of derived sequence #3 (H. pulcherrimus) to compete with biotin-labeled probe corresponding to the ancestral sequence (A), and the ability of the ancestral sequence to compete with biotin-labeled probe corresponding to derived sequence #3 (B). In both cases, lane 1 corresponds to probe incubated without nuclear extract while lane 2 corresponds to probe incubated with nuclear extract in the absence of competitor. Competitor corresponding to the same sequence as the probe was used to demonstrate the specificity of the protein–DNA interaction (lanes 3–6). Competitor corresponding to the alternate sequence was used to assess the difference in binding affinity between the ancestral sequence and derived sequence #3 (lanes 7–10). The intensity of the uppermost band in lanes 2–10 is displayed as a bar graph (C, D). In addition, the intensity of the uppermost band in lanes 2–10 was plotted against concentration of the competitor (ranging from 5- to 500-fold). The dark blue diamonds correspond to the ancestral sequence competed with the ancestral sequence, the light blue squares correspond to the ancestral sequence competed with derived sequence #3, the orange triangles correspond to derived sequence #3 competed with derived sequence #3, and the yellow circles correspond to derived sequence #3 competed with the ancestral sequence (E). Finally, the ratio of band intensity obtained for each of the two assays was plotted against concentration of the competitor. The dark blue diamonds correspond to the ancestral sequence competed with derived sequence #3, and the orange squares correspond to the reciprocal assay (F). binding sites and other nucleotide sites. The standard tests of noted above, this seems unlikely for S. purpuratus. The pattern this null hypothesis, the McDonald–Kreitman (MK) and could also arise from balancing selection on non-binding sites, Hudson–Kreitman–Aguade (HKA) tests, concur in rejecting which may have post-embryonic functions, but this seems less it (Table 2). (Although the MK and HKA tests were developed likely than directional selection on binding sites, which are for analyzing synonymous versus non-synonymous variation known to be functional. in coding sequence (Hudson et al., 1987; McDonald and The other approach, which is relatively insensitive to Kreitman, 1991), they remain valid for analyzing functional demography and is particularly useful for assessing the role of versus non-functional variation in non-coding sequences positive selection, is to compare nucleotide substitution rates (Jenkins et al., 1995; Hahn, 2006).) Once again, similar results within binding sites to those at other nucleotide sites (Wong and were obtained regardless of whether S. droebachiensis, Nielsen, 2004; Zhang et al., 2005; Haygood et al., 2007). We A. fragilis,orS. pallidus was used as the outgroup. Relative fitted two models of the nucleotide substitution process to an to the homogeneous expectation, there is too much divergence alignment containing the published S. purpuratus sequence and/or not enough polymorphism within binding sites compared (GenBank #M16230) and our sequences from the other seven to other nucleotide sites. This pattern is expected if binding sites species in the family Strongylocentrotidae (Fig. 2). In the null are or have been subject to positive selection driving divergence model, binding sites are constrained to evolve no faster than and/or negative selection suppressing polymorphism, while other nucleotide sites, while in the alternate model, binding sites other nucleotide sites have evolved neutrally or at least under are free to evolve faster than other nucleotide sites along a substantially weaker selection. In principle, the pattern could designated “foreground” lineage. The null model accommo- arise from population expansion (Eyre-Walker, 2002), but as dates binding sites that evolve under negative selection along J. Walters et al. / Developmental Biology 315 (2008) 567–578 575

Fig. 7. Frequency spectrum of polymorphic sites in the non-coding sequence upstream of the SM50 gene in S. purpuratus. Tajima's D (DT) and Fu and Li's D (DFL) in 101-site windows. Each point represents one window, and its horizontal coordinate is the distance to the center of the window from the proximal end of the cis-regulatory region (+120). The region of DNA extending from −440 to +120 with respect to the transcriptional start site is sufficient for the endogenous pattern of SM50 expression and contains at least four cis-regulatory elements (Sucov et al., 1987). the “background” lineages but neutrally along the “foreground” We isolated the cis-regulatory region of SM50 from 15 lineage, so the contrast between the two models is sensitive to individuals of S. purpuratus and found several nucleotide poly- positive selection rather than mere relaxation of negative morphisms even within cis-regulatory elements. We also iso- selection (Zhang et al., 2005). We tested whether the alternate lated the cis-regulatory region of SM50 from seven closely model fit the data appreciably better than the null model, related species within the family Strongylocentrotidae (Fig. 2). designating each of the 13 lineages in the unrooted phylogeny Members of this family are the result of a recent speciation event as the “foreground” lineage in turn. After a conservative adjust- (∼15–20 mya) in the North Pacific and it appears that there has ment for multiple testing, the only significant result is with the been a gradual reduction in the complexity of the larval skeleton S. purpuratus lineage as the “foreground” lineage (p=0.00096; with increasing distance from the common ancestor (Biermann Bonferroni adjusted p=0.013). This suggests that binding sites et al., 2003). P. depressus, S. franciscanus, and S. nudus belong are or have been under positive selection on the S. purpuratus to one clade, while S. purpuratus belongs to a second clade that lineage. Of course, the binding sites have been characterized also includes H. pulcherrimus, S. intermedius, S. polyacanthus, only in S. purpuratus, and they may not function in the same A. fragilis, S. pallidus, and S. droebachiensis. Interestingly, the ways or even at all in the other species. clustering of species in the latter clade correlates with their geographical locations. H. pulcherrimus and S. intermedius are Discussion found in the western Pacific, while S. purpuratus, A. fragilis, S. pallidus, S. droebachiensis are found in the eastern Pacific. Changes in transcriptional regulation can have dramatic S. polyacanthus is found in the Aleutian Islands and is positioned consequences during the development of an organism and between these two groups of sea urchins. As might be expected, consequently, are predicted to be an important mechanism for there were many interspecific differences in the cis-regulatory the evolution of phenotypic diversity. We have taken advantage region of SM50. We decided to investigate the impact of changes of the sea urchin to explore the extent to which there is variation in Element C on protein binding because of the importance of in the cis-regulatory region of the SM50 and the impact of this this particular cis-regulatory element in spatial control of variation on protein binding. The region extending from −440 transcription. We performed a series of reciprocal experiments to +120 with respect to the transcriptional start site is sufficient to evaluate the difference between the “ancestral” sequence for the endogenous pattern of SM50 expression in embryos (corresponding to five of the eight species, including S. (Sucov et al., 1987). There are at least four cis-regulatory franciscanus) and three “derived” sequences (corresponding to elements within this region including Element C, which results S. purpuratus, A. fragilis, and H. pulcherrimus). Surprisingly, in localization of SM50 expression to the PMCs. Experiments the ancestral sequence consistently demonstrated a lower utilizing antisense oligonucleotides demonstrated that SM50 binding affinity for proteins within a nuclear extract that was is essential for elongation of the skeletal rods (Peled-Kamar prepared from S. purpuratus. The evolutionary significance of et al., 2002). this result remains unclear although we can speculate that such 576 J. Walters et al. / Developmental Biology 315 (2008) 567–578 changes in binding affinity could produce altered patterns of the family Strongylocentrotidae, we found evidence for positive gene expression during development of the larval skeleton. For selection on the binding sites along the lineage leading to example, a change in binding affinity could result in an increase S. purpuratus. Interestingly, an analysis of coding sequence or decrease in levels of transcription thereby affecting the within the mitochondrial genome revealed that this particular deposition of biomineral. It is even possible that such a change species is evolving much faster than its closest relatives (i.e. could result in preferential binding of a different transcription A. fragilis, S. droebachiensis, and S. pallidus)(Biermann et al., factor as in the case of a single nucleotide substitution that occurs 2003). We were surprised by our result since cis-regulatory in one of the repetitive sequence elements upstream of the elements are functionally constrained and therefore, should spec2a gene (Dayal et al., 2004). It will be interesting to extend typically be subject to negative selection (Wray et al., 2003). this study to genes that act “upstream” of SM50 as more Once again, the evolutionary significance of this result remains information becomes available regarding their transcriptional to be determined although it has been hypothesized that an regulation. Evolutionary changes in the cis-regulatory elements accelerated rate of evolution may be related to environmental upstream of other genes expressed by the PMCs such as VEGF- pressures such as climatic fluctuations over the past several R could have a more substantial impact on gene expression million years (Biermann et al., 2003). We speculate that this and consequently, the size and shape of the larval skeleton finding might also be relevant to the observation that the larval (Duloquin et al., 2007). There are many cases in which muta- skeleton of more derived species such as S. purpuratus is less tions in cis-regulatory regions have been shown to affect tran- complex than other members of the family Strongylocentrotidae scription and in some cases, these mutations are even associated (Biermann et al., 2003). Furthermore, it is possible that the with ecologically relevant traits such as the presence or absence SM50 protein is functionally redundant with one of the other of wing spots in fruit flies (Prud'homme et al., 2006), the skeletal 15 spicule matrix proteins that have been identified in pattern of stickleback fish (Shapiro et al., 2004), and resistance S. purpuratus and consequently, is less evolutionarily con- to malarial infection in humans (Tournamille et al., 1995). Thus, strained (Livingston et al., 2006). It will be interesting to see if it is possible that mutations in the cis-regulatory elements a similar trend (i.e. positive selection on binding sites) upstream of SM50 and other genes may be responsible for the exists among other genes involved in skeleton formation in rapid diversification of larval morphology, including skeletal S. purpuratus. Recent sequencing of the sea urchin genome will features, during the post-Paleozoic radiation of the Class undoubtedly facilitate such analyses (Sea Urchin Genome Echinoidea (Wray, 1992). Sequencing Consortium, 2006). In order to gain insight into the mode and intensity of natural Additional work is required for a comprehensive under- selection acting upon the cis-regulatory region of SM50,we standing of the molecular basis for differences in the size performed a variety of statistical tests. First, we analyzed intra- and shape of the larval skeleton. It has been postulated that specific variation to determine if there is any evidence of phenotypic novelty is rarely the product of change affecting a directional selection on this locus. Significantly negative values single gene. Rather, it is more likely to be the product of of DT and DFL indicate that there has been directional selection changes affecting the cis-regulatory “linkages” of multiple on a region of non-coding sequence that is located ∼150– genes and consequently, the overall architecture of the network 400 bp upstream of the transcriptional start site of the SM50 (Davidson, 2001). Therefore, this study should be extended to gene. There is a substantial excess of extreme-frequency poly- other genes expressed by PMCs (and the overlying ectoderm) morphic sites, suggesting the action of directional selection. to examine the co-evolution of developmental regulatory genes Moreover, there is a moderate excess of such sites where the and their downstream targets. We expect that there have been most common allele is the derived allele, suggesting the action dramatic changes in patterning cues and/or how these cues are of positive selection instead of, or in addition to, negative interpreted by the PMCs, which account for the morphological selection. Interestingly, preliminary evidence suggests that diversity exhibited by extant species. In addition, it will be additional cis-regulatory elements may be located upstream of useful to examine more distantly related species to determine Element D4 within the region that has been subject to directional if entirely different mechanisms of transcriptional regulation selection. In particular, the DNA sequence is extremely con- have evolved. In summary, this study provides important new served between Lytechinus and Strongylocentrotus in this insights into the extent to which there is variation in cis- region suggesting the presence of additional transcription factor regulatory elements upstream of a gene that is essential for binding sites (data not shown). Next, we compared variation embryonic development. Moreover, we explore the impact of within binding sites to variation within other nucleotide sites this variation on protein binding as well as the evolutionary using MK and HKA tests (Jenkins et al., 1995). These tests forces that may have led to the pattern of variation that exists suggest that there has been directional selection on the binding among members of the family Strongylocentrotidae. There sites although they fail to discriminate between positive have been relatively few studies in which non-coding and negative selection. Therefore, we took advantage of a sequences have been isolated from multiple, closely related maximum-likelihood method that was recently developed for species and scrutinized in so much detail. Yet such studies are analysis of non-coding (as opposed to coding) sequences (Wong essential for determining how evolutionary changes in and Nielsen, 2004; Haygood et al., 2007). This method is par- transcriptional regulation might accumulate to produce altered ticularly useful in detecting cases of strong positive selection. patterns of gene expression with functional consequences in By utilizing the DNA sequences obtained for other members of the embryo. J. Walters et al. / Developmental Biology 315 (2008) 567–578 577

Acknowledgments George, N.C., Killian, C.E., Wilt, F.H., 1991. Characterization and expression of a gene encoding a 30.6-kDa Strongylocentrotus purpuratus spicule matrix protein. Dev. Biol. 147, 334–342. We would like to acknowledge Titus Brown (CalTech) and Guss, K.A., Ettensohn, C.A., 1997. Skeletal morphogenesis in the sea urchin Takae Kiyama (MD Anderson) for providing us with nuclear embryo: regulation of primary mesenchyme gene expression and skeletal protein extract. We would also like to acknowledge our rod growth by ectoderm-derived cues. Development 124, 1899–1908. colleagues at Denison University (Warren Hauk, Peter Kuhl- Hahn, M.W., 2006. Detecting natural selection on cis-regulatory DNA. Genetica – man, Eric Liebl, Geoff Smith, Jeff Thompson, Christine 129, 7 18. Hasegawa, M., Kishino, H., Yano, T., 1985. Dating of the human–ape splitting Weingart, and Lina Yoo) as well as James Balhoff and Greg by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22, 160–174. Wray (Duke University) for their helpful advice regarding our Haygood, R., Fedrigo, O., Hansen, B., Yokoyama, K., Wray, G.A., 2007. project. This work was supported by the Anderson Endowment, Promoter regions of many neural- and nutrition-related genes have Fairchild Foundation, Laura C. Harris Fund, Howard Hughes experienced positive selection during human evolution. Nat. Genet. 39, – Medical Institute, and the Office of Provost at Denison 1140 1144. Hein, J., Schierup, M.H., Wiuf, C., 2005. Gene Genealogies, Variation and University. In addition, R.H. was supported by a National Evolution: A Primer in Coalescent Theory. Oxford Univ. Press, Oxford, UK. Science Foundation Postdoctoral Fellowship in Biological Hudson, R.R., 2001. Two-locus sampling distributions and their application. Informatics (Grant No. 0434655). Genetica 159, 1805–1817. Hudson, R.R., 2002. Generating samples under a Wright–Fisher neutral model – Appendix A. Supplementary data of genetic variation. Bioinformatics 18, 337 338. Hudson, R.R., Kreitman, M., Aguade, M., 1987. A test of neutral molecular evolution based on nucleotide data. Genetica 116, 153–159. Supplementary data associated with this article can be found, Hyman, L.H., 1955. The Invertebrates: Echinodermata. McGraw-Hill Inc., New in the online version, at doi:10.1016/j.ydbio.2008.01.007. York, NY. Ingersoll, E.P., McDonald, K.L., Wilt, F.H., 2003. Ultrastructural localization of spicule matrix proteins in normal and metalloproteinase inhibitor-treated sea References urchin primary mesenchyme cells. J. Exp. Zool. A. Comp. Exp. Biol. 300, 101–112. Balhoff, J.P., Wray, G.A., 2005. Evolutionary analysis of the well characterized Jenkins, D.L., Ortori, C.A., Brookfield, J.F.Y., 1995. A test for adaptive change endo16 promoter reveals substantial variation within functional sites. Proc. in DNA sequences controlling transcription. Proc. R. Soc. London B 261, Natl. Acad. Sci. U. S. A. 102, 8591–8596. 203–207. Beniash, E., Aizenberg, J., Addadi, L., Weiner, S., 1997. Amorphous calcium Killian, C.E., Wilt, F.H., 1996. Characterization of the proteins comprising the carbonate transforms into calcite during sea urchin larval spicule growth. integral matrix of Strongylocentrotus purpuratus embryonic spicules. Proc. R. Soc. London B 264, 461–465. J. Biol. Chem. 271, 9150–9159. Benson, S.C., Benson, N.C., Wilt, F.H., 1986. The organic matrix of the skeletal Kitajima, T., Urakami, H., 2000. Differential distribution of spicule matrix spicule of sea urchin embryos. J. Cell Biol. 102, 1878–1886. proteins in the sea urchin embryo skeleton. Dev. Growth Differ. 42, Benson, S., Sucov, H.M., Stephens, L., Davidson, E.H., Wilt, F.H., 1987. A 295–306. lineage-specific gene encoding a major matrix protein of the sea urchin Kosakovsky Pond, S.L., Muse, S.V., 2005. HyPhy: hypothesis testing using embryo spicule: I. Authentication of the cloned gene and its developmental phylogenies. In: Nielsen, R. (Ed.), Statistical Methods in Molecular expression. Dev. Biol. 120, 499–506. Evolution. Springer, New York, NY, pp. 125–181. Biermann, C.H., Kessing, B.D., Palumbi, S.R., 2003. Phylogeny and develop- Lee, Y.-H., Britten, R.J., Davidson, E.H., 1999. SM37, a skeletogenic gene of ment of marine model species: strongylocentrotid sea urchins. Evol. Dev. 5, the sea urchin embryo linked to the SM50 gene. Dev. Growth Differ. 41, 360–371. 303–312. Cameron, R.A., Chow, S.H., Berney, K., Chiu, T.-Z., Yuan, Q.-A., Kramer, A., Livingston, B.T., Killian, C.E., Wilt, F., Cameron, A., Landrum, M.J., Helguero, A., Ransick, A., Yun, M., Davidson, E.H., 2005. An evolutionary Ermolaeva, O., Sapojnikov, V., Maglott, D.R., Buchanan, A.M., Ettensohn, constraint: strongly disfavored class of change in DNA sequence during C.A., 2006. A genome-wide analysis of biomineralization-related proteins in divergence of cis-regulatory modules. Proc. Natl. Acad. Sci. U. S. A. 102, the sea urchin Strongylocentrotus purpuratus. Dev. Biol. 300, 335–348. 11769–11774. Makabe, K.W., Kirchhamer, C.V., Britten, R.J., Davidson, E.H., 1995. Cis- Davidson, E.H., 2001. Genomic Regulatory Systems: Development and regulatory control of the SM50 gene, an early marker of skeletogenic lineage Evolution. Academic Press, San Diego, CA. specification in the sea urchin embryo. Development 121, 1957–1970. Davidson, E.H., Rast, J.P., Oliveri, P., Ransick, A., Calestani, C., Yuh, C.-H., McDonald, J.H., Kreitman, M., 1991. Adaptive protein evolution at the Adh Minokawa, T., Amore, G., Hinman, V., Arenas-Mena, C., Otim, O., Brown, locus in Drosophila. Nature 351, 652–654. C.T., Livi, C.B., Lee, P.Y., Revilla, R., Schiltra, M.J., Clarke, P.J., Rust, A.G., Okazaki, K., Inoue, S., 1976. Crystal property of larval sea urchin spicules. Dev. Pan, Z., Arnone, M.I., Rowen, L., Cameron, R.A., McClay, D.R., Hood, L., Growth Differ. 18, 413–434. Bolouri, H., 2002. A provisional regulatory gene network for specification of Peled-Kamar, M., Hamilton, P., Wilt, F.H., 2002. Spicule matrix protein LSM34 endomesoderm in the sea urchin embryo. Dev. Biol. 246, 162–190. is essential for biomineralization of the sea urchin spicule. Exp. Cell Res. Dayal, S., Kiyama, T., Villinski, J.T., Zhang, N., Liang, S., Klein, W.H., 2004. 272, 56–61. Creation of cis-regulatory elements during sea urchin evolution by co-option Prud'homme, B., Gompel, N., Rokas, A., Kasner, V.A., Williams, T.M., Yeh, and optimization of a repetitive sequence adjacent to the spec2a gene. Dev. S.D., True, J.R., Carroll, S.B., 2006. Repeated morphological evolution Biol. 273, 436–453. through cis-regulatory changes in a pleiotropic gene. Nature 440, Duloquin, L., Lhomond, G., Gache, C., 2007. Localized VEGF signaling from 1050–1053. ectoderm to mesenchyme cells controls morphogenesis of the sea urchin Richardson, W., Kitajima, T., Wilt, F., Benson, S., 1989. Expression of an embryo skeleton. Development 134, 2293–2302. embryonic spicule matrix gene in calcified tissues of adult sea urchins. Dev. Ettensohn, C.A., Malinda, K.M., 1993. Size regulation and morphogenesis: a Biol. 132, 266–269. cellular analysis of skeletogenesis in the sea urchin embryo. Development Sea Urchin Genome Sequencing Consortium, 2006. The genome of the sea 119, 155–167. urchin Strongylocentrotus purpuratus. Science 314, 941–952. Eyre-Walker, A., 2002. Changing effective population size and the McDonald– Shapiro, M.D., Marks, M.E., Peichel, C.L., Blackman, B.K., Nereng, K.S., Kreitman test. Genetica 162, 2017–2024. Jonsson, B., Schluter, D., Kingsley, D.M., 2004. Genetic and developmental 578 J. Walters et al. / Developmental Biology 315 (2008) 567–578

basis of evolutionary pelvic reduction in threespine sticklebacks. Nature trotus franciscanus spec gene family encoding intracellular calcium-binding 428, 717–723. proteins. Dev. Genes Evol. 215, 410–422. Sucov, H.M., Benson, S., Robinson, J.J., Britten, R.J., Wilt, F., Davidson, E.H., Wong, W.S.W., Nielsen, R., 2004. Detecting selection in noncoding regions of 1987. A lineage-specific gene encoding a major matrix protein of the sea nucleotide sequences. Genetica 167, 949–958. urchin embryo spicule. II. Structure of the gene and derived sequence of the Wray, G.A., 1992. The evolution of larval morphology during the post- protein. Dev. Biol. 120, 507–519. Paleozoic radiation of echinoids. Paleobiology 18, 258–287. Tournamille, C., Colin, Y., Cartron, J.P., Le Van Kim, C., 1995. Disruption of a Wray, G.A., 2007. The evolutionary significance of cis-regulatory mutations. GATA motif in the Duffy gene promoter abolishes erythroid gene expression Nat. Rev., Genet. 8, 206–216. in Duffy-negative individuals. Nat. Genet. 10, 224–228. Wray, G.A., Hahn, M.W., Abouheif, E., Balhoff, J.P., Pizer, M., Rockman, M.V., Urry, L.A., Hamilton, P.C., Killian, C.E., Wilt, F.H., 2000. Expression of spicule Romano, L.A., 2003. The evolution of transcriptional regulation in matrix proteins in the sea urchin embryo during normal and experimentally eukaryotes. Mol. Biol. Evol. 20, 1377–1419. altered spiculogenesis. Dev. Biol. 225, 201–213. Zhang, J., Nielsen, R., Yang, Z., 2005. Evaluation of an improved branch-site Villinski, J.T., Kiyama, T., Dayal, S., Zhang, N., Liang, S., Klein, W.H., 2005. likelihood method for detecting positive selection at the molecular level. Structure, expression, and transcriptional regulation of the Strongylocen- Mol. Biol. Evol. 22, 2472–2479.