
Developmental Biology 300 (2006) 349–365 www.elsevier.com/locate/ydbio

The immune gene repertoire encoded in the purple sea urchin genome

Taku Hibino a,1, Mariano Loza-Coll a,1, Cynthia Messier a, Audrey J. Majeske b, Avis H. Cohen c, David P. Terwilliger b, Katherine M. Buckley b, Virginia Brockton b, Sham V. Nair d, Kevin Berney e, Sebastian D. Fugmann f, Michele K. Anderson g, Zeev Pancer h, R. Andrew Cameron e, L. Courtney Smith b,⁎, Jonathan P. Rast a,⁎

a Sunnybrook Research Institute and Department of Medical Biophysics, University of Toronto, 2075 Bayview Ave., Room S-126b, Toronto, Ontario, Canada M4N 3M5
b 340 Lisner Hall, Department of Biological Sciences, George Washington University, 2023 G St. NW, Washington, DC 20052, USA
c Department of Biology and the Institute of Systems Research, University of Maryland, College Park, MD 20742, USA
d Department of Biological Sciences, Macquarie University, Sydney, Australia
e Division of Biology, 156-29 California Institute of Technology, Pasadena, CA 91125, USA
f Laboratory of Cellular and Molecular Biology, National Institute on Aging, Baltimore, MD 21224, USA
g Sunnybrook Research Institute and Department of Immunology, University of Toronto, 2075 Bayview Ave., Room A-340, Toronto, Ontario, Canada M4N 3M5
h Center of Marine Biotechnology, UMBI, Columbus Center, Suite 236, 701 East Pratt St., Baltimore, MD 21202, USA

Received for publication 1 June 2006; revised 21 August 2006; accepted 28 August 2006
Available online 3 September 2006

Abstract

Echinoderms occupy a critical and largely unexplored phylogenetic vantage point from which to infer both the early evolution of bilaterian immunity and the underpinnings of the vertebrate adaptive immune system. Here we present an initial survey of the purple sea urchin genome for genes associated with immunity. An elaborate repertoire of potential immune receptors, regulators and effectors is present, including unprecedented expansions of innate pathogen recognition genes. These include a diverse array of 222 Toll-like receptor (TLR) genes and a coordinate expansion of directly associated signaling adaptors. Notably, a subset of sea urchin TLR genes encodes receptors with structural characteristics previously identified only in protostomes. A similarly expanded set of 203 NOD/NALP-like cytoplasmic recognition proteins is present. These genes have previously been identified only in vertebrates where they are represented in much lower numbers. Genes that mediate the alternative and lectin complement pathways are described, while gene homologues of the terminal pathway are not present. We have also identified several homologues of genes that function in jawed vertebrate adaptive immunity. The most striking of these is a gene cluster with similarity to the jawed vertebrate Recombination Activating Genes 1 and 2 (RAG1/2). Sea urchins are long-lived, complex organisms and these findings reveal an innate immune system of unprecedented complexity. Whether the presumably intense selective processes that molded these gene families also gave rise to novel immune mechanisms akin to adaptive systems remains to be seen. The genome sequence provides immediate opportunities to apply the advantages of the sea urchin model toward problems in developmental and evolutionary immunobiology. © 2006 Elsevier Inc. All rights reserved.

Keywords: Hematopoiesis; Developmental immunology; Innate immunity; TLR; RAG; NOD; Scavenger receptor; Complement; Cytokine

⁎ Corresponding authors. L.C. Smith is to be contacted at 340 Lisner Hall, Department of Biological Sciences, George Washington University, 2023 G St. NW, Washington, DC 20052, USA. J.P. Rast, Sunnybrook Research Institute and Department of Medical Biophysics, University of Toronto, 2075 Bayview Ave., Room S-126b, Toronto, Ontario, Canada M4N 3M5.
E-mail addresses: [email protected] (L.C. Smith), [email protected] (J.P. Rast).
1 These authors contributed equally to this work.

0012-1606/$ - see front matter © 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.ydbio.2006.08.065

Introduction

The classic work of Ilya Metchnikoff using echinoderm larvae and other invertebrates formed the basis of the field of cellular immunology more than a century ago (Metchnikoff, 1891). The fundamental observation behind this work was that mesenchymal cells with the ability to recognize non-self are ubiquitous among animals. The mechanisms responsible for non-self recognition can be classified into two broad categories: innate immunity, in which immune specificity is encoded directly in the germline, and acquired (adaptive) immunity, in which specificity is achieved by a combination of somatic diversification and selection. Since the early days of immunology the overwhelming focus of the field has been on the adaptive immune system of jawed vertebrates. The recent shift of attention to the more inclusive realm of innate immunity has brought immunology more in line with fields such as developmental biology and physiology where comparative approaches using invertebrate models are widely used. These comparative studies increasingly suggest that many fundamental molecular features of animal immunity are ancestral bilaterian traits (Hoffmann, 2003; Janeway and Medzhitov, 2002). Similar evidence extends this view to the developmental programs that specify mesodermal immune cells (e.g., Evans et al., 2003). Nonetheless, much more detailed information is needed to exclude the possibility that particular immune similarities arose by chance, convergent evolution (Friedman and Hughes, 2002) or for other reasons that do not reflect the primitive state of immune function.

Acquired immunity that is mediated by somatically rearranging T and B cell antigen receptors in the process of V(D)J recombination (i.e., T cell receptors [TCR] and immunoglobulins [Ig]) is found only in the jawed vertebrates (reviewed in Cannon et al., 2004a; Litman et al., 2005), although the underlying mechanisms and effectors from which these systems are built have been part of animal blood systems since long before the emergence of V(D)J recombination (Klein and Nikolaidis, 2005). Independent but analogous adaptive systems may be a more ubiquitous feature of immunity than was previously appreciated as exemplified by the recent discovery of potential somatic diversification systems in protostome invertebrates (Watson et al., 2005; Zhang et al., 2004) and a particularly elaborate system in cyclostomes (Alder et al., 2005; Pancer et al., 2004, 2005; Pancer and Cooper, 2006). These findings begin to blur the boundary between adaptive and innate immune systems (Cannon et al., 2004b; Flajnik and Du Pasquier, 2004; Loker et al., 2004) and highlight how little we know about animal immunity.

Our lack of knowledge stems largely from the fact that immune genes evolve at an extraordinarily rapid pace (Hughes, 1997; Murphy, 1993). Presumably this is driven by intense selection in the interplay between host and pathogen. As a consequence, finding immune gene homologues with standard molecular strategies and inferring primitive states is a difficult task. Genome sequences offer the best means for progress and their availability has broken long-standing barriers to advancement in this area. Not only do comprehensive genome sequences enable the identification of homologous genes with much lower sequence identity, they conversely allow more confident exclusion of immune mechanisms when gene homologues are absent. Perhaps most importantly, they provide a framework for the identification of novel immune genes based on structural and genomic considerations. In conjunction with experimental screens, the latter strategy is especially attractive since it can be a relatively unbiased approach.

Our current understanding of immunity comes from a small and scattered group of phyla. The great majority of direct evidence derives from studies in mammals and to a lesser degree from Drosophila melanogaster. Because chordates and arthropods are highly divergent and because immune systems are inherently labile, it is difficult to make meaningful generalizations about immunity given the large phylogenetic gaps in our data sources. The sea urchin genome fills an important gap and offers a particularly relevant external perspective of vertebrate immunity (Fig. 1A).

Immunity in the sea urchin has been investigated from both a cellular and molecular standpoint (reviewed in Gross et al., 1999; Rast et al., 2000; Smith et al., 2006). A variety of blood-like cells present in the fluid of the coelomic cavity, collectively known as coelomocytes, mediate immunity in adult sea urchins. These include several types of macrophage-like cells that are capable of phagocytosis, encapsulation and other related immune functions (Figs. 1B–D), granular cells, including red spherule cells that produce and degranulate echinochrome A, a naphthoquinone with antibacterial properties (Fig. 1E; Perry and Epel, 1981; Service and Wardlaw, 1984), and other cell types that may also function in wound healing, clotting and immunity (e.g., Figs. 1F, G). Sea urchin larvae also have cells that function in immune defense (Metchnikoff, 1891; Silva, 2000). These include blastocoelar cells (Tamboline and Burke, 1992) that are capable of phagocytosis and bacterial degradation (Figs. 1H, I; Hibino and Rast, unpublished), and granulated pigment cells that produce echinochrome A (Calestani et al., 2003; Gibson and Burke, 1987) and are thought to be involved in the defense of the larval ectoderm (reviewed in Smith, 2005). Pigment cells are also capable of phagocytosis and appear to participate in wound healing in the larva (Fig. 1J; Hibino and Rast, unpublished).

Here we present a survey of the genome sequence of the purple sea urchin, Strongylocentrotus purpuratus, for genes that are homologues of known immune receptors, regulators and effectors. We find that the diversity and number of innate receptors greatly exceed that which has been observed in animals characterized to date. Homologues of genes that were previously identified only in chordates, some of which have important functions in the adaptive immune system of jawed vertebrates, are present in the sea urchin genome. These findings greatly refine our understanding of the immune system of the common deuterostome ancestor. The sea urchin represents the first large and long-lived invertebrate to be the subject of a genome sequencing project. This initial survey reveals a complex immune system that is unlike any yet characterized, and suggests that major revisions are forthcoming for our understanding of animal immunity.

Materials and methods

Sequence database searches

The sea urchin genome sequence is derived from the sperm DNA of a single sea urchin specimen (Cameron et al., 2004). Several genomic data sets were used in our analysis: (1) the whole genome shotgun (WGS) sequence, (2) the 11/23/2004, 7/18/2005 and 3/16/2006 genome assemblies from the Baylor College of Medicine Human Genome Sequencing Center (BCM-HGSC, http://www.hgsc.bcm.tmc.edu/projects/seaurchin/) and (3) several sets of gene models including the GLEAN3 gene models (BCM-HGSC) and the NCBI Gnomon ab initio gene models (http://www.ncbi.nlm.nih.gov/genome/guide/sea_urchin/).

Fig. 1. Phylogeny of animal immunity and immune cells of the sea urchin. (A) A simplified overview of the phylogeny of bilaterian immunity. Verified and potential examples of immune somatic diversification systems are noted with an asterisk (*). The inferred acquisition of immune mechanisms is noted by red bars. These include (1) the use of TLRs, complement and PGRPs for immune recognition, (2) NLR-like genes (see text), (3) a VLR system of adaptive immunity in the cyclostomes, (4) V(D)J-mediated adaptive immunity in the gnathostomes. Taxa designations are as follows: B, Bilateria; P, Protostoma; D, Deuterostoma; Cn, Cnidaria; Ecd, Ecdysozoa including Drosophila melanogaster and Caenorhabditis elegans; Loph, Lophotrochozoa including the mollusk Biomphalaria glabrata; E, Echinodermata including the purple sea urchin, Strongylocentrotus purpuratus; H, Hemichordata; U, Urochordata including the tunicate Ciona intestinalis and the larvacean Oikopleura dioica; C, Cephalochordata, including Branchiostoma floridae; A, Agnatha including lamprey and hagfish; and G, Gnathostomata. (B–G) Immune cells of the adult sea urchin. (B–D) Phagocytic cells, (E) Red spherule cell, (F) vibratile cell (note flagellum), (G) colorless spherule cell. (H–J) Larval cells with immune function. (H, I) Blastocoelar cells with engulfed GFP labeled bacteria (Escherichia coli). (J) A pigment cell within the ectoderm with phagocytosed bacterium.

The strategy for identifying genes with potential immune and hematopoietic Sequence analysis functions in the S. purpuratus genome varied according to gene type. For genes with significant primary sequence conservation, BLASTP and TBLASTN from Gene models were analyzed for domain structure and features such as trans- the BLASTALL suite (Altschul et al., 1997) were used to search the various membrane regions and signal using the SMART suite (Letunic et al., gene model sets and the genome assemblies to identify potential homologues. 2004), the Sanger Institute PFAM site (http://www.sanger.ac.uk/Software/Pfam/) Many immune gene homologues are best distinguished by the possession of and local PFAM searches. Multiple alignments and phylogenetic analyses were characteristic combinations (e.g., TLR and NLR genes). For this made with the MEGA3 Phylogenetic analysis program suite (ver. 3.1; Kumar category of genes HMMER (Eddy, 1998) was used to identify PFAM domain et al., 2004). ClustalX (Thompson et al., 1997), BioEdit (Hall, 1999), PAUP* profile matches (Bateman et al., 2004) to gene models, open reading frame (Swafford, 2002) and a Bayesian method (Ronquist and Huelsenbeck, 2003) translations extracted using the GETORF program from the analysis EMBOSS were also used in the phylogenetic analysis of thioester proteins. In the case of suite (Rice et al., 2000) or full assembly translations. some TLR genes, pseudogenes were evident as introns in the gene models that Immune gene designations incorporated additional evidence from (1) upon closer examination encompassed coding regions that contained stop reciprocal BLAST analyses, (2) protein domain architecture that was not a codons or frame-shift mutations. These sequences were verified as determinant of the initial search, (3) tiling array expression data for confirmation pseudogenes by reference to primary trace chromatograms. 
or revision of gene structure (Samanta and Davidson, 2006), (4) signal and transmembrane region identification and (5) conserved structure relative to Expression analysis related sea urchin gene family members. For some genes RT-PCR and Rapid Amplification of cDNA Ends (RACE; GeneRacer kit; Invitrogen) were used to For some predicted genes, transcript prevalence was analyzed at key obtain transcript sequence flanking conserved regions. embryonic stages, in larvae and in adult tissues including coelomocytes. Real- 352 T. Hibino et al. / Developmental Biology 300 (2006) 349–365 time quantitative PCR was used for these measurements as described elsewhere Table 1 (Fugmann et al., 2006; Rast et al., 2002). Genes with relevance to immunity and other blood cell functions in the Strongylocentrotus purpuratus genome a Results and discussion Pattern recognition receptors and signaling Toll-like receptors (TLR) a 222 NACHT domain and leucine-rich repeat (NLR) proteins a 208 General considerations Scavenger receptor cysteine-rich (SRCR) a 218 (containing 1095 SRCR domains) A survey of the S. purpuratus genome sequence identified Peptidoglycan recognition proteins (PGRP) 5 more than 1000 gene models with relevance to immunity and Gram-negative binding proteins (GNBP) 3 other blood cell functions (summarized in Table 1; see Table 2 RIG-I/LGP2/MDA-5-like 12 MyD88-like 4 for a reference list explaining gene acronyms of key immune SARM-like 15 homologues that are listed in the text; detailed lists are provided Additional TIR containing 7 in Supporting Online Material Tables S1, S2, S3, S4, S5). ICE/-1-like 5 Although a number of genes that are expected to function in RIPK1/2/3/4-like 2 viral immunity are listed, we did not specifically search for CRADD- and ASC-like 2 NFkB/IRF pathway homologues Present generalized cellular protection functions related to viral interference. 
To a greater extent than for surveys of other Complement system gene types, the inability to detect many gene homologues may C3/4-like thioester containing 4 Factor B-like 3 reflect methodological limitations imposed by the high rates of C1q-like 7 divergence of immune system receptors and immune signaling Mannose-binding protein (MBP)-like 1 genes rather than their true absence. For this same reason novel Additional thioester-containing proteins 3 immune mechanisms, which may ultimately be some of the Effector genes most interesting aspects of sea urchin immunity, are likely to be MACPF containing (perforin-like) 22 missed in this type of survey, although the genome sequence Nitric oxide synthase (NOS) 3 will be critical to their identification. Peroxidase 3 Vertebrate adaptive immune system An exceptionally complex innate repertoire Ig superfamily models with similarity to non-rearranging 15 IGSF co-receptors b One of the more striking features of the genome with regard Rag1/2-like gene cluster 1 ∼ to immunity is an enormous expansion of three classes of innate Rag1-like pseudogenes 30 TdT/Polμ 1 immune recognition proteins comprising Toll-like receptors Additional components of the V(D)J recombination machinery (Ku70, Present (TLR), NACHT and leucine-rich repeat containing proteins Ku80, DNA IV, Xrrc4, Cerunnos, DNA-PKcs, Artemis) (NLR) and multidomain scavenger receptor cysteine-rich (SRCR) proteins (for comparison among animals see Table Cytokines and growth factors IL-1 receptor 1 3). Each of these gene classes participates in recognition of IL-1 receptor accessory proteins 2 potential pathogens through direct or indirect binding to Tumor necrosis factor (TNF) superfamilyc 4 pathogen associated molecular patterns (PAMPs). It is generally TNF receptor (TNFR) superfamily c 9 c hypothesized that a relatively small number of these innate TNFR adaptors 7 receptors is sufficient to detect a near comprehensive range of IL-17-like 30 IL-17 receptor-like 2 pathogens. 
Consequently, the many-fold amplification of these MIF-like 9 gene families in the sea urchin genome suggests that they are Flk-1/Flt-1/Flt-4 2 likely to function as part of a fundamentally different immune Tie1/2 1 mechanism. Key immune and hematopoietic transcriptional regulators c Homologues of the following transcription factors are present: Ets1/2, Toll-like receptors Erg/Fli-1, PU.1/Spi-B and C, Elf-1, Tel, GABP, GATA1-3, E2A/ TLR proteins are found throughout the animal kingdom and HEB/ITF-2, SCL, MITF/TFE3, Id-2, Hes-1and4, Hey/HERP, – have been shown to have immune function in both protostomes Runx1-3, EBF1-3, Pax-5, C/EBPa-b e, C/EBPg, Ikaros-like, Egr1-3, LKLF/EKLF, Gfi-1, Blimp-1, Tcf-1/LEF-1, IRF1-2, IRF4 and vertebrates (Beutler, 2003; Hoffmann, 2003; Medzhitov, and 8, c-Fos, c-JunB/D, ATF-2/ATF-7/CREB, XBP-1, NFE, c-rel/ 2001). Though the Toll genes also have a developmental NFkB/NFAT family, STAT5, RARa, RARb, SOX4-11-22-24, T-bet, function in insects, similar functions have not been found in FoxP1-4, SMAD1-5, Dead ringer, Oct1-2, c-myb, Lmo1/2, Fog1/2 other animal groups. Between one and 20 TLR genes were a Refer to Online Supplementary Material Table S1 (for full categorized list of found in previously characterized genomes (Roach et al., 2005). immunity-related genes), Table S2 (for full list of TLR genes), Table S3 (for full TLR proteins are composed of a solenoid-like ectodomain made list of NLR genes) and Table S4 (for full list of SRCR genes). b up of 20–25 leucine-rich repeat (LRR) motifs, a transmembrane Models show low-score reciprocal Blasting and domain architecture similarity with Ig/TCR and related non-rearranging IGSF proteins. Further region and a cytoplasmic Toll/interleukin-1 receptor (TIR) work is needed for validation of these models. domain (Bell et al., 2003). 
Immune recognition is mediated by c These genes are described in more detail in accompanying papers in this the LRR ectodomain and signaling is transduced through the issue of Developmental Biology (See text for references). T. Hibino et al. / Developmental Biology 300 (2006) 349–365 353

Table 2 Reference to key immune gene acronyms used throughout the text Acronym Description Functional association 185/333 Secreted sea urchin immune response gene family Immune response

α2m Alpha 2 macroglobulin Protease inhibitor AID Activation-induced Ig/TCR somatic diversification APOBEC Apolipoprotein B mRNA editing catalytic polypeptide Antiviral innate immunity and RNA editing ASC -associated speck-like protein containing a CARD NLR signaling Bf Factor B Complement pathway C1q Complement component C1q Complement pathway CSF-1 Colony stimulating factor-1 Growth factor CSF-1R/c-Kit Colony stimulating factor-1 receptor/Receptor tyrosine kinase c-kit Hematopoietic growth factor receptor (RTK) CRADD Caspase and RIP adaptor with NLR signaling GNBP Gram-negative binding protein Gram-negative bacteria recognition ICE Interleukin-converting enzyme Pro-interleukin1b processing IL-1 Interleukin-1 Inflammation IL-17 Interleukin-17 Inflammation IMCV Ips-1/MAVS/Cardif/VISA RIG-I signaling IRF Interferon regulatory factor Transcription factor Ku70 Autoantigen in sera of patient with autoimmune disease of 70 kDa DNA helicase Ku80 Autoantigen in sera of patient with autoimmune disease of 80 kDa dsT-DNAs and dsDNA breaks repair LGP2 Novel gene downstream of mouse Stat5b RIG-I family member MASP MBP-associated protease Complement pathway MBP Mannose-binding protein Complement pathway MDA-5 Melanoma differentiation-associated protein 5 RIG-I family member MHC Major histocompatibility complex Antigen presentation MIF Macrophage migration inhibitory factor Inflammation MyD88 Myeloid differentiation primary response gene 88 Immediate adaptor for TLR signaling NALP NACHT leucine-rich repeat and PYD containing protein Intracellular pathogen recognition NLR NACHT leucine-rich repeat protein Intracellular pathogen recognition NOD Nucleotide oligomerization domain protein Intracellular pathogen recognition NOS Nitric oxide synthase Synthesis of Nitric Oxide PDGF Platelet-derived growth factor Growth factor PDGFR Platelet-derived growth factor receptor Receptor for PDGF (RTK) Pelle/IRAK Interleukin-1 receptor-associated kinase TLR signal transduction PGRP Peptidoglycan 
recognition protein Bacterial recognition Polμ DNA polymerase μ Error prone polymerase RAG1/2 Recombination activating gene 1/2 V(D)J recombination RIG-I Retinoic-acid inducible gene I Double stranded RNA recognition RIPK Receptor (TNFRSF)-interacting serine–threonine kinase TNF signaling SARM Sterile alpha and TIR motif containing TLR signal transduction (?) SRCR Scavenger receptor cysteine-rich domain Innate immune recognition TAB2/3 TAK1-binding protein 2/3 NFκB activation TAK1 TGF-activated kinase 1 NFκB activation TCP Sea urchin thioester containing protein Complement TCR T-cell receptor MHC/antigen recognition TdT Terminal deoxynucleotide V(D)J recombination TICAM-1/TRIF TIR-containing adaptor molecule 1/TIR-containing inducing inf-β TLR signaling TICAM-2/TRAM TIR-containing adaptor molecule 2/TRIF-related adaptor molecule TLR signaling Tie-1/2 Tyr kinase with Ig and epidermal growth factor homology domains Angiogenesis/Hematopoiesis TIRAP/MAL TIR-containing adaptor protein/myelin and lymphocyte protein TLR signaling TLR Toll-like receptor Innate immune recognition TNF Tumor necrosis factor Inflammation TRAF TNF-receptor-associated factor TNFR and IL-1 signaling VEGF Vascular endothelial growth factor Growth factor VEGFR Vascular endothelial growth factor receptor Receptor for VEGF (RTK) Xrrc4 X-ray complementing defective repair in Chinese hamster cells Non-homologous end-joining dsDNA break repair

TIR domain to activate Rel/NFκB (homologues of these genome sequence; however, independent analysis of unassem- transcription factors are present in the sea urchin genome, bled WGS sequence indicates that the number of TLR genes is Tables 1 and S5) and, in mammals, IRF transcription factor no fewer than 150 per haplotype (1633 TLR-like TIR domains target genes (Akira et al., 2006). sequences were identified in 5× coverage of unassembled The S. purpuratus genome sequence contains 222 TLR gene sequence representing 357 unambiguously distinguishable models. Some gene models may represent sampling of two sequences). The members of this expansive TLR gene repertoire haplotypes from the single sea urchin that is the source of the can be divided into two broad categories by comparison of their 354 T. Hibino et al. / Developmental Biology 300 (2006) 349–365

Table 3 Multiplicity of innate immune recognition proteins in representative protostomes and deuterostomes D. melanogaster C. elegans S. purpuratus C. intestestinalis H. sapiens. TLR 9 1 222 3 10 (+1 ψ) NLR 0 0 203 0 a ∼20 SRCR b 14/7 3/1 1095/218 22/8 c 81/16 PGRP d 15 0 5 0 6 GNBP e 40330 a Although no canonical NLR genes have been described a few Ciona intestinalis gene models may be related. b For SRCR proteins the approximate number of SRCR domains in each species encoded in genes containing more than one SRCR domain/number of genes. c Estimate includes single SRCR domain genes. d Number of genes containing a PGRP (SMART) or an Amidase_2 (Pfam) domain. e Entries correspond to genes containing a Glyco_hydro_16 Pfam domain.

TIR domain sequences: a small set of 11 divergent genes like comprises nearly all the genes found in insects (e.g., toll (Group D, Fig. 2B; Table 4, Table S2) and a large lineage- itself) and other protostomes (e.g., the single Caenorhabditis specific expansion that includes 211 genes which fall roughly elegans TLR gene, tol-1) characterized by an ectodomain that into seven groups (Groups I–VII, Fig. 2B, Table 4, Table S2). is divided by an internal LRRCT-LRRNT pair (Rock et al., Animal TLR proteins can be categorized into two major 1998). structural types. The first, here referred to as vertebrate-like, Nearly all of the sea urchin TLR genes are intronless and includes all previously described deuterostome TLRs and a encode proteins with a structure typical of vertebrate-like TLRs minority of insect TLRs. These are characterized by the (Fig. 2A, S1A and Table S2). Among the 11 divergent genes, presence of specialized N-terminal and C-terminal LRR motifs three contain introns but are otherwise vertebrate-like (Table 4). (LRRNT and LRRCT) capping both ends of the internal LRR Remarkably, three divergent sea urchin TLR genes are of the solenoid. The second category, here referred to as protostome- protostome-like structure (Fig. 2A, Figs. S1B, C). Consistent

Fig. 2. Toll-like receptors, NOD/NALP-like proteins and associated signal transduction mediators encoded in the sea urchin genome. (A) TIR domain proteins of the sea urchin genome. (B) A neighbor-joining tree of sea urchin TLR gene TIR domain sequences. Cluster designations are used for reference in the text and Table 4. Blue shaded clades are the sea urchin lineage expansion that comprises the vast majority (95%) of genes. The gray shaded cluster contains the divergent protostome-like (red), “intron containing” (blue)- and “short” (green)-type genes. (C) NACHT and leucine-rich repeat (NLR) proteins and downstream transduction components present in the sea urchin genome. Domain representations are as in panel A. There are 203 NLR genes encoded in the genome sequence. The majority of these encode a single N-terminal DEATH domain, a central NACHT domain and C-terminal LRR motifs. Seven variant genes have two DEATH domains and one gene has a CARD and a DEATH domain (similar to mammalian Nod2). Note that all putative NLR signal transduction mediators contain DEATH instead of CARD domains (see text). T. Hibino et al. / Developmental Biology 300 (2006) 349–365 355

Table 4
Sea urchin TLR gene families

TLR subgroup [a]    No. genes (complete) [b]  Pseudogenes (%) [c]  No. LRRs [d]   No. AA per LRR [e]  Insertion [f]
I                   109 (75)                  31                   22.2 (21–25)   24.6 (21–31)        17, (18)/22
II                  28 (26)                   24                   23.6 (23–24)   24.7 (23–31)        9, 19/24
III                 31 (20)                   21                   23 (23±0)      25.2 (23–39)        4, 8, 18/23
IV                  14 (12)                   33                   22 (22±0)      24.7 (17–34)        17/22
V                   9 (6)                     40                   22 (22±0)      24.5 (21–31)        17/22
VI                  8 (7)                     0                    24 (24±0)      24.5 (22–32)        9, 19/24
VII                 5 (4)                     25                   22 (22±0)      24.7 (23–31)        8, 17/22
Orphan              7 (7)                     0                    21.6 (20–22)   24.7 (18–31)        17/22
Protostome-like     3 (3)                     0                    15 (14–17)     24.1 (21–27)        None
Short               5 (5)                     0                    8 (8±0)        24.3 (23–25)        None
Intron-containing   3 (1)                     0                    21             21                  18/21
Total               222 (166)                 26

[a] See Fig. 2B. Group D comprises protostome-like, Short and Intron subgroups.
[b] Complete genes include signal peptide through TIR to stop codon.
[c] Percent pseudogenes calculated from complete subset.
[d] Number of LRRs per TLR gene [Average (Range)].
[e] Number of amino acids per LRR [Average (Range)].
[f] Position of small inter- and intra-LRR insertions characteristic of vertebrate genes, designated by number of preceding LRR domains.

with this finding, the TIR regions of these three genes exhibit primary amino acid sequence identity and share a diagnostic shortening of the TIR domain C-terminal β-strand with the protostome genes (Fig. S2). This TLR structure has not been previously identified in deuterostomes. These analyses strongly suggest that genes of this structure were present in the common ancestor of modern Bilateria. The remaining five divergent sea urchin TLR genes encode an unusually short ectodomain (labeled “short” in Fig. 2A, S1D) and may have affinities with the protostome-like genes.

The vast majority of sea urchin TLR genes (211/222) are more similar to each other than to those of other animals, suggesting a gene expansion specific to the sea urchin lineage. Sequence diversity among these genes is greatest in the LRR ectodomain, consistent with an associated diversification of immune recognition specificity. Diversity among closely related genes takes three general forms: (1) individual amino acid substitutions and small insertion/deletions (Fig. S3A), (2) longer insertions between or within LRR motifs (Fig. S3B), and (3) less common insertions of whole LRR motifs (Fig. S3C). In vertebrates, insertions of the second type may affect unique PAMP recognition (Bell et al., 2003). Within phylogenetic clusters, particular LRRs can be highly variable and patterns of nucleotide substitution potentially indicate localized selection for binding specificity (e.g., Cluster 1B genes exhibit hypervariability in LRR 15–18). The variety of insertions identified in the major sea urchin TLR clades is not present in protostome-like and short TLR genes.

Sea urchin TLR sequences were first identified from coelomocyte cDNA where they are expressed in considerable diversity (Pancer, unpublished). Analysis of embryonic tiling array data for message prevalence (Samanta and Davidson, 2006) and our message prevalence measurements for major classes of sea urchin TLR genes indicate that expression of TLR genes is low or absent in embryos prior to the end of gastrulation. Expression then increases in the prism larvae and early pluteus (Messier and Rast, unpublished). Expression of the protostome-like TLR genes is undetectable in the embryo, whereas in the adult one type of protostome-like TLR is expressed in coelomocytes and the other in tube feet (Hibino and Rast, data not shown). These data are consistent with an immune function for at least one type of sea urchin TLR gene with protostome-like characteristics.

Many sea urchin TLR genes are encoded in tandem arrays in the genome. For example, Scaffoldi3328 and Scaffoldi909 encode seven and 13 TLR genes, respectively. In each case all genes are in the same transcriptional orientation. Further studies will clarify the larger genomic organization of the sea urchin TLR genes, but these findings are likely to have significant implications for both the evolutionary interpretation of this gene expansion and how these genes are regulated.

Other TIR-containing genes

TLR proteins transduce signals through the interactions of their TIR domains with cytoplasmic TIR-containing adaptor molecules (summarized in Fig. 3A; Akira et al., 2006; McGettrick and O'Neill, 2004; O'Neill et al., 2003). Twenty-six genes encoding potential TLR signaling adaptor proteins were identified (Fig. 2A). These include an orthologue of MyD88 and three additional genes with a MyD88-like domain structure, an orthologue of SARM, fourteen SARM-related genes, and seven genes encoding cytoplasmic TIR domain proteins of unknown affinity (Table S1; see Fig. 3 for an illustration of the functions of these genes). These findings suggest that in addition to the expansion of TLR genes, a modest expansion has taken place in TLR adaptor signaling proteins.

The presence of homologues of a complete complement of TLR signal transduction elements in the S. purpuratus genome, and the parallels in TLR signaling observed between vertebrates and insects (Akira et al., 2006), suggest that engagement of TLR proteins may similarly lead to the activation of rel/NFkB target genes in sea urchin cells. Moreover, the elevated multiplicity of the TIR domain containing adaptor proteins may act to partition immune activation among combinations of different TLR gene subsets.

SARM is another cytoplasmic adaptor molecule that in addition to a TIR domain contains SAM and Armadillo domains. Homologues of these genes have been found in C. elegans, D. melanogaster and vertebrates and, although their function is not well understood, they are thought to play a role in immune TLR signaling (Liberati et al., 2004; McGettrick and O'Neill, 2004; O'Neill et al., 2003). In addition, a recent report showed that the SARM homologue (tir-1) in C. elegans can also function in a TLR-independent manner to trigger an immune response (Couillault et al., 2004). In this regard, the multiplicity of SARM-related genes may represent an additional family of expanded innate immunity genes in the S. purpuratus genome and allow for a more complex modulation of TLR-mediated responses. Finally, although no clear homologues could be identified in the sea urchin genome for other vertebrate TIR domain adaptors (TIRAP/MAL, TICAM-1/TRIF and TICAM-2/TRAM), seven novel cytoplasmic TIR-containing genes are predicted in the S. purpuratus genome that could fulfill these adaptor functions (Table S1).
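The gene counts quoted for the TLR family and its adaptors can be cross-checked with simple arithmetic. The sketch below is illustrative only: the numbers are copied from Table 4 and from the adaptor list in the text, while the dictionary and function names are hypothetical. It also assumes that the 211 lineage-specific genes correspond to subgroups I–VII plus Orphan (that is, everything outside Group D), which is an inference from the table footnote rather than something the text states outright.

```python
# Counts copied from Table 4: subgroup -> (all genes, complete genes).
TLR_SUBGROUPS = {
    "I": (109, 75), "II": (28, 26), "III": (31, 20), "IV": (14, 12),
    "V": (9, 6), "VI": (8, 7), "VII": (5, 4), "Orphan": (7, 7),
    "Protostome-like": (3, 3), "Short": (5, 5), "Intron-containing": (3, 1),
}

# Group D per the table footnote (the divergent subgroups).
DIVERGENT_GROUP_D = {"Protostome-like", "Short", "Intron-containing"}

# Adaptor counts as listed in the text (category labels are shorthand).
TIR_ADAPTORS = {
    "MyD88 orthologue": 1,
    "MyD88-like": 3,
    "SARM orthologue": 1,
    "SARM-related": 14,
    "cytoplasmic TIR, unknown affinity": 7,
}


def tlr_totals(rows=TLR_SUBGROUPS):
    """Return (all genes, complete genes) summed over every subgroup."""
    total = sum(n for n, _ in rows.values())
    complete = sum(c for _, c in rows.values())
    return total, complete


def lineage_specific(rows=TLR_SUBGROUPS, divergent=DIVERGENT_GROUP_D):
    """Sum the genes outside Group D (assumed to be the sea urchin expansion)."""
    return sum(n for name, (n, _) in rows.items() if name not in divergent)


if __name__ == "__main__":
    total, complete = tlr_totals()
    print("TLR genes:", total, "complete:", complete)          # 222 and 166
    print("outside Group D:", lineage_specific(), "of", total)  # 211 of 222
    print("TIR adaptors:", sum(TIR_ADAPTORS.values()))          # 26
```

Under these assumptions the subgroup rows reproduce the table totals (222 genes, 166 complete), the 211/222 figure, and the twenty-six adaptor genes.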

Extensive multiplicity of NOD/NALP-like genes

NLR proteins perform a wide variety of functions, but a subclass that includes the vertebrate NOD and NALP proteins has been proposed to act as cytoplasmic pathogen recognition proteins (Inohara et al., 2005; Kufer et al., 2005). Mammalian NOD1 and NOD2 recognize degradation fragments from peptidoglycans through their LRR motifs (Girardin et al., 2005). These genes have previously been identified only in vertebrates and none have been identified in the Ciona intestinalis or protostome genome sequences. The mammalian NOD1/2 and NALP proteins contain N-terminal CARD and PYD domains, respectively, which are members of the Death-fold domain superfamily (Kohl and Grutter, 2004), the subtypes of which interact by forming subfamily-specific homotypic contacts. There are 203 predicted NLR genes in the sea urchin genome compared to only about 20 in vertebrate species. The structure of the large majority of the sea urchin NLR proteins consists of a central NACHT domain, an N-terminal DEATH domain (a member of the Death-fold domain superfamily) and C-terminal LRRs (Fig. 2C and Fig. S4). As with the TLR genes, phylogenetic analysis of the NACHT domain sequence indicates that the NLR genes cluster into closely related subfamilies (Fig. S5). Like the vertebrate NOD and NALP genes, the LRRs of the sea urchin NLR genes are encoded in a complex exon structure. Although this complexity is not accurately captured in the gene models, the presence of LRRs is evident in all clusters and in most NLR gene models. Remarkably, as observed in vertebrates (e.g., Girardin et al.,

Fig. 3. Hypothetical TLR (A) and NLR (B) signaling pathways in the purple sea urchin. Vertebrate gene products are in black and the corresponding sea urchin homologues (if present) are in blue. Broken arrows represent pathways in which one or more sea urchin homologues could not be identified. No sea urchin homologues for TIRAP/MAL, TICAM1/2 (dashed box) or orthologues of IRF-3, 5 or 7 could be identified, though other TIR-containing and Sp-IRF genes may fulfill their functions. The dual color boxes in panel B represent the consistent death-fold domain “swapping” between vertebrate and sea urchin homologues (inset) in otherwise identical domain architectures. The transduction cascades emanating from NLR proteins could therefore be mediated by homotypic interactions between DEATH as opposed to CARD domains. Similar domain “swapping” is seen in molecules involved in antiviral immunity (RIG-I/IMCV/IRF pathway). No homologue of IL-1 was identified in the sea urchin genome. However, cytokines are difficult to identify in this type of survey, and the presence of IL-1 receptor-like and Sp-ICE-like genes is consistent with the presence of an IL-1-like gene.

2003), our initial characterization of NLR expression in sea urchin identifies the gut as the major site of expression, at least for the NLR gene subsets assayed (Fig. S6; C. Messier and J. Rast, unpublished). This finding suggests that control of normal gut flora or responses to gut associated pathogenesis may be a driving force of NLR (and possibly TLR; Pancer and Cooper, 2006) gene expansion in the sea urchin.

The two major known consequences of NOD/NALP engagement by PAMPs in mammalian cells are the activation of the rel/NFkB pathway by members of the Receptor Interacting Protein Kinase (RIPK) family and the processing of interleukin-1 by the Interleukin Converting Enzyme (ICE or caspase-1). Both effects rely on homotypic interactions between the Death-fold domains (CARD and PYD) of engaged NOD/NALP proteins and those of the RIPKs, ICE and the adaptor molecules CRADD and ASC. Sea urchin homologues have been identified for all genes known to participate in this process (listed in Table S1 and Fig. 3B). Remarkably, throughout all levels of the NLR signal transduction, sea urchin proteins have DEATH domains whereas their vertebrate counterparts have CARD/PYD domains.

RIG-I-like genes

Vertebrate RIG-I is an RNA helicase that binds viral double-stranded RNA and initiates antiviral responses (Yoneyama et al., 2005) through the homotypic interaction of its CARD domain with that of downstream Ips-1/MAVS/Cardif/VISA (IMCV; Hiscott et al., 2006). Twelve genes coding for RNA helicase proteins homologous to RIG-I family members (which include MDA-5 and LGP2) and one homologue of IMCV were identified in the sea urchin genome. Of note, four of the predicted sea urchin RIG-I-like and the IMCV-like proteins also encode DEATH domains instead of CARD domains (see above).

Scavenger receptor proteins

Scavenger receptors comprise a structurally heterogeneous group of proteins often expressed on macrophages that function to recognize endogenous or microbial modified lipoproteins with polyanionic character (Mukhopadhyay and Gordon, 2004). A class of potential scavenger receptors is represented by five CD36-like genes. In vertebrates these proteins can recognize endogenous and bacterial diacylglycerides (Hoebe et al., 2005). A more notable second subclass of potential scavenger receptors contains multiple scavenger receptor cysteine-rich (SRCR) domains, which are present in proteins with widely varying functions including immune recognition (Sarrias et al., 2004). There are 218 sea urchin gene models that encode multiple SRCR scavenger receptor genes with a total of 1095 SRCR domains (Table S4; for a species comparison see Table 3). The distribution of signal peptides and transmembrane domains in these gene models suggests that many may represent fragments of larger genes; consequently, their collective complexity as measured by the number of SRCR domains they encode is likely to remain as estimated. The dynamic expression of the sea urchin SRCR genes by coelomocytes (Pancer, 2000; Pancer et al., 1999) suggests that their expanded occurrence in the genome may represent a form of immune diversity.

C-type lectin domain proteins

Lectins are involved in a variety of functions, either as single domain proteins or as domains within larger mosaic proteins (for review, see Drickamer and Taylor, 1993). Many small, secreted lectins function as opsonins. A search of the sea urchin genome identified 104 gene models that encode small C-type lectins composed of one or two domains with a variety of putative oligosaccharide binding motifs. A phylogenetic analysis of 77 lectin domains from these gene models reveals nine small subfamilies of three or more members (not shown). As an example, SpEchinoidin is a member of an 11 gene subfamily (Table S1) and is expressed exclusively by LPS-activated phagocytes (Multerer and Smith, 2004; Terwilliger et al., 2004). Although the other members of the SpEchinoidin-like gene family encode similar proteins, they show four different carbohydrate binding motifs implying diversified recognition functions. The multiplicity of gene models that encode similar small C-type lectins suggests that many may have important functions in immune recognition.

Other innate recognition proteins

Peptidoglycan recognition proteins (PGRPs) and Gram-negative binding proteins (GNBPs) recognize bacterial cell wall components. PGRPs can act in the recognition and sometimes degradation of peptidoglycans (PGN) (Zaidman-Remy et al., 2006). Five PGRP gene models are present in the sea urchin genome and, of these, two are potentially catalytic (i.e., they may function to degrade PGN either as part of an effector function or in a regulatory role) and one encodes a transmembrane domain and may act as an immune receptor. Three putative GNBP gene models were also identified. GNBPs bind LPS and β-1,3-glucan and have been described from insects (Kim et al., 2000). Although GNBP proteins have not been described from vertebrates, similar genes are present in the C. intestinalis genome (Azumi et al., 2003). The multiplicity of PGRP and GNBP is similar to that of other invertebrates, in contrast to the expanded gene classes described above (Table 3).

The complement system

The major functions of the complement system in higher vertebrates are to opsonize foreign cells and particles to augment phagocytosis, to lyse pathogenic cells, and to release anaphylatoxins that recruit phagocytic cells to the region of complement activation. This system is composed of about 35 serum and cell surface proteins (Volanakis, 1998) and constitutes an important element of innate immunity that is also intimately connected to the adaptive immune system. The complement cascade has three activation pathways (classical, alternative and lectin) that converge to activate the central component of the system, C3, and thereby initiate the terminal or lytic pathway.

Fig. 4. Overview of the sea urchin complement system. (A) Domain structure of four sea urchin complement genes. Sp-TCP1 was defined based on a conserved αβ cleavage site that is present in all C3/4/5 proteins and generates a two-chain structure upon proteolysis. Sp-TCP2, like C4 proteins, has an additional αγ cleavage site, which would give rise to a three chain structure. The three Sp-C3-related genes (Sp-C3, Sp-C3-2 and Sp-TCP1) have a catalytic histidine located C-terminal to the thioester site; however, this histidine is missing in Sp-TCP2. (B) Hypothetical model of sea urchin complement function. The collectins (MBP and C1q) bind specifically to microbial surface oligosaccharides. Although members of the MASP/C1r/C1s family have not been definitively identified in the sea urchin genome (dashed oval), we speculate that a protein with MASP-like function would activate the C3 paralogues and the alternative pathway. A positive feedback loop based on the autocatalytic activation of C3b/Bf complexes would then accelerate the C3b and C3b/Bf-mediated opsonization of the microbe, its recognition by specific macrophage complement receptors and, ultimately, its phagocytosis. Cleavage of sea urchin C3 to C3a and C3b is based on sequence alignment with vertebrate proteins and is not yet confirmed by experimental evidence.

The alternative pathway

In the alternative pathway of vertebrates, C3b binds to a target molecule upon cleavage from C3 and associates with factor B (Bf) to form the C3 convertase complex. This complex gives rise to an amplifying positive feedback loop that activates additional C3 molecules. The first identification of a complement system in an invertebrate was made in the purple sea urchin (Smith et al., 1996; reviewed in Smith, 2001) and consisted of one C3 and one factor B (Bf) homologue, which was considered minimally sufficient to function as an alternative pathway. We have identified four gene models that belong to the complement C3/4/5 family in the purple sea urchin genome (Fig. 4A, Table S1). Sp-C3 and Sp-C3-2 correspond to previously identified members (Al-Sharif et al., 1998; and J. Rast, unpublished) and are predominantly expressed in adult coelomocytes and during larval development, respectively. The newly identified members encode a third C3-like sequence, Sp-TCP1 (thioester containing protein) and a C4-like sequence, Sp-TCP2 (Fig. 4A). Four additional thioester containing proteins were identified in the genome; one appears to be an alpha-2-macroglobulin (α2m) homologue, but the others are too short to be convincingly classified as either α2m or complement proteins. A phylogenetic analysis indicates that Sp-C3 and Sp-C3-2 are homologues of the vertebrate C3/4/5 genes (Fig. S7). Both Sp-TCP1 and Sp-TCP2 fall outside of the C3/4/5 clade, and Sp-TCP1 shows affinity to the sea urchin α2m gene. The phylogenetic tree indicates that the vertebrate C3/4/5 sequences form a single clade, with urochordate C3 sequences as a sister-group. The structure of the tree suggests that the complement thioester gene family underwent independent gene duplications within different animal lineages.

The previously identified Sp-Bf is a modular protein with five short consensus repeats (SCR; as opposed to three that are typical for vertebrate Bf proteins; Fig. 4A), a von Willebrand factor A domain (vWFA) and a serine protease domain (SP; Smith et al., 1998). Two additional Sp-Bf paralogues are present in the genome and all have similar domain structure with five SCRs in Sp-Bf-2 and four SCRs in SpBf-3. Of note, C. intestinalis also has three Bf genes, although in addition to SCR domains, low density lipoprotein receptor (LDLR) domains are also present (Yoshizaki et al., 2005).

The lectin pathway

In vertebrates, the lectin pathway is activated when members of the collectin family, Mannose Binding Protein (MBP) and C1q (reviewed in Fujita et al., 2004), bind pathogens and activate the MASP (MBP associated serine protease), C1r and C1s proteases, which in turn cleave and activate C4 and C3. Domain searches of the sea urchin gene models identify five members of the collectin family, of which four are similar to C1q and one to MBP (Fig. 4A). Although variants of the MASP/C1r/C1s family were identified in C. intestinalis (Azumi et al., 2003; Nonaka and Yoshizaki, 2004), no such models could be found in the S. purpuratus genome. Nonetheless, the genomic structure of this gene family is complex and homologues may have escaped detection. Finally, a search for Ficolin-like genes revealed 46 gene models that contain single fibrinogen domains. Ficolins are collagen-fibrinogen domain proteins that are structurally analogous to collectins and activate the lectin complement pathway (Fujita et al., 2004). While no model has a clear N-terminal collagen domain, some models (e.g., SPU_012682, SPU_019295) have coiled-coil domain predictions that may facilitate oligomerization.

The terminal pathway

The terminal pathway in vertebrates is triggered by C5 convertase complexes and consists of the sequential oligomerization of a family of proteins with similar and complex domain architectures (C6, C7, C8 and C9), which ultimately leads to the formation of large lytic pores in pathogen membranes. The signature domain within these proteins is the Membrane Attack Complex/Perforin (MACPF) domain. Searches of the sea urchin

genome identify 22 gene models with a single MACPF domain (described below), but none has the appropriate domain composition or architecture that would identify them as members of a canonical terminal pathway.

Fig. 5. Sea urchin gene models with important homologues in the jawed vertebrate adaptive immune system. (A) Several gene models encode small Ig domain proteins with similar structure and low sequence identity to vertebrate genes of the adaptive immune system (TCR, Ig, MHCI, II) and other proteins that have been proposed as potentially related to non-rearranging forerunners of these proteins. Structures and gene numbers are illustrated. (B) The putative structures of the proteins encoded by the sea urchin Sp-Rag1L/Sp-Rag2L gene cluster. SpRag1L: Gray vertical stripes indicate putative N-terminal metal binding domain; black stripes indicate repeat region; dark blue box indicates nonamer binding-like region; core region (light blue) extends from nonamer binding region to near C-terminus; DDE represents conserved catalytic residues; red bars represent regions of similarity to vertebrate Rag1. Rag2: K, Kelch repeats of putative β-propeller region; PHD, Homology Domain. (C) Genomic organization of the Sp-Rag1/2L cluster. Decr (2,4-dienoyl CoA reductase 1-like) and rhpn (rhophilin-like) are flanking gene terminal exons, black dots represent poly-adenylation sites.

Complement receptors and regulatory proteins

Many of the vertebrate proteins that provide control for the specificity of pathogen opsonization by complement and regulate the positive feedback loop of the alternative pathway are composed of multiple SCR domains (three to more than thirty-five domains; Kim and Song, 2006). Initial searches of the genome for gene models with SCR domains identified 247 candidates, but the repetitive structure of some of the complement regulatory proteins, such as factor H or decay accelerating factor, made the identification of the sea urchin homologues difficult. In order to assign functional significance to these gene candidates for encoding complement regulatory activity, significant biochemical analysis will be required.

There are four types of complement receptors. Type 1 and type 2 receptors have multiple SCRs, type 3 complement receptors are composed of integrin heterodimers, and type 4 receptors have two Ig domains (see Fig. 5A; Helmy et al., 2006; Hollers et al., 1992). There are a number of gene models encoding alpha and beta integrins in the sea urchin genome and, although type 3 receptors have been characterized in Ciona, none of the sea urchin integrins have the correct signature to predict complement binding (Whittaker et al., 2006). It is expected that type 1 or 2 complement receptors may be present among the many sea urchin gene models with SCRs; however, the variability in numbers of SCRs in these receptors as found in different species makes predictions of the sea urchin homologues very difficult. This problem also holds true for type 4 receptors because over 500 gene models were identified that encode Ig domains (see below).

Complement function in the sea urchin

Although complement receptors have not been identified in the genome, their presence on coelomocytes and their involvement in phagocytosis of complement-opsonized particles have been demonstrated (Clow et al., 2004). Based on our current knowledge of the sea urchin complement system, a hypothetical model of complement function is presented in Fig. 4B. In the absence of a terminal pathway (see above), opsonization may in fact be the major complement function in the sea urchin. The proposed positive feedback loop that leads to augmented pathogen opsonization and enhanced pathogen detection is likely to exist in the sea urchin (Smith et al., 1999). It should be noted that this idea relies on the ability of phagocytes to detect opsonized pathogens via functionally implied complement receptors (Clow et al., 2004). The multiplicity of C3 and Bf genes suggests that multiple alternative pathways may function in the sea urchin, and the dynamic expression patterns of Sp-C3 and Sp-C3-2 suggest that the two pathways may work at different times in the life history of the organism.

The independent gene duplication events of complement proteins in both the chordate and echinoderm lineages highlight the importance and dynamic character of this microbial defense system in deuterostomes. Our findings suggest that the complement system likely originated with the lectin and alternative pathways and later incorporated the classical pathway with the acquisition of V(D)J-mediated adaptive immunity. The addition of the terminal pathway may have originated from an ancestral co-option of a perforin-like molecule into the complement system.

Immune effector proteins

An important class of immune proteins participates directly in mechanisms of pathogen elimination. One such protein family is the vertebrate perforins which form membrane pores as part of a mechanism of host defense and are characterized by the presence of a MACPF domain. Searches of the genome identified 22 gene models that contain a signal peptide, a single MACPF domain and exhibit an overall resemblance to perforin. The small proteins encoded by these genes can be readily clustered into eight subfamilies of which each exhibits sequence similarity both within and outside the MACPF domain (data not shown). The overall perforin-like domain architecture and multiplicity of these proteins suggest that they may have an immune function. A previous report (Haag et al., 1999) shows that apextrin, a homologue of these MACPF-containing genes, is present in the sea urchin Heliocidaris erythrogramma, and has expression and subcellular localization characteristics that are consistent with an immune function in protecting the embryo.

Other potential immune effector genes from the S. purpuratus genome include three peroxidase genes and three Nitric Oxide Synthase (NOS) genes (Table S1). While both families have many functions, the use of homologues of these genes in immunity is widespread in bilaterians and some of the sea urchin homologues are likely to perform immune functions.

Although their function is not understood, the sea urchin 185/333 genes are intensely upregulated in coelomocytes responding to challenge from bacteria and LPS (Nair et al., 2005; Rast et al., 2000). These genes encode a multigene family of modular proteins that have no homologues outside of sea urchins and were named based on matches to two unknown sequences expressed by coelomocytes. They show significant sequence diversity that includes extensive single nucleotide polymorphisms and small insertions/deletions, plus variations in the presence and absence of larger blocks of sequence from the second of two exons (Nair et al., 2005; Terwilliger et al., 2006). Based on their extreme diversity and expression in response to immune challenge, these small proteins are likely to have important functions within the immune response of the sea urchin. The survey of the sea urchin genome sequence identified ten gene models encoding 185/333 proteins, although other studies suggest that many more genes are likely present (Terwilliger et al., 2006; and Buckley and Smith, unpublished).

The sea urchin genome and the origins of jawed vertebrate adaptive immunity

The adaptive immune system of jawed vertebrates is complex, both in terms of the molecular mechanisms by which it generates immune diversity and the specialized cells that mediate its function. This system is restricted to jawed vertebrates and the lack of well-defined intermediates has been a long-standing problem in evolutionary biology. The purple sea urchin is the second deuterostome animal from outside of the jawed vertebrates for which a genome sequence has been obtained and this fills a large phylogenetic gap with significant relevance to the origins of lymphocyte functions and TCR/Ig-mediated immunity. We have analyzed the sea urchin genome sequence for genes with homology to the Ig domain proteins that mediate antigen binding (Ig, TCRs and MHC), the enzymes that carry out receptor rearrangement (Rag1/2, TdT, AID) and other molecules that function specifically in the vertebrate immune system such as cell surface markers and specialized signaling adaptors.

Ig-like domain proteins

Ig domains involved in immune functions diverge quickly (e.g., Hughes, 1997) and therefore, it is expected that even if homologues are present in the sea urchin genome they will be difficult to identify. Igs, TCRs and MHC proteins, along with a very limited set of other vertebrate proteins, are united in containing Ig C1-set domains (Du Pasquier et al., 2004). The sea urchin genome contains approximately 1500 Ig domains encoded in more than 500 gene models. A subset of these has characteristics that are expected of non-rearranging relatives of Ig/TCR/MHC. The most promising of these encode putative transmembrane receptors with IgVset-IgC1set ectodomains (Fig. 5A, Table S1). These show weak but relatively specific identity with Ig/TCR/MHC and with other genes that are hypothesized to be related to the vertebrate antigen receptors. Many additional molecules show weak similarity to members of the B7 family of lymphocyte co-receptors. Corroborative information, such as syntenic relationships that parallel those of vertebrate genes or functional similarities, will be required before any further conclusions can be drawn.

Genes with relevance to V(D)J recombination

Perhaps the most striking finding from the genome with regard to V(D)J-mediated immunity is the presence of a Recombination Activating Gene (RAG) 1 and 2-like cluster. The RAG genes in vertebrates encode an enzyme that is the primary mediator of Ig and TCR rearrangement (reviewed in Fugmann et al., 2000). The cluster has not been reported from any animals outside of the jawed vertebrates (Schluter and Marchalonis, 2003). A recent finding describes a new class of transposons with similarity to the catalytic region of RAG1 along with a group of apparent genes that show similarity to this region (Kapitonov and Jurka, 2005). The sea urchin genome contains approximately 30 elements with similarity to the RAG1 core region, but nearly all of these appear to be pseudogenes as they are fragmentary and often contain stop codons. One element (Sp-RAG1L), however, can be aligned to the RAG1 core region throughout its entire length and exhibits sequence identity also in the non-core regions. Downstream of Sp-RAG1L is a second gene with identical predicted structure and low sequence identity to RAG2 (Sp-RAG2L; Fig. 5B; Fugmann et al., 2006). The genomic organization of these genes (Fig. 5C) is identical to that of vertebrate RAG1/2, although the sea urchin genes have a more complex intron–exon structure. No functional or molecular evidence for a V(D)J recombination system has yet been identified in the sea urchin and the function of this RAG-like cluster is still unknown. Both RAG genes are co-expressed at low levels in embryos and adult coelomocytes, and the encoded proteins are able to form a specific RAG1/2 complex with each other, as well as RAG1/2 from selected vertebrates (Fugmann et al., 2006). The sea urchin function of these genes may be very different from that of their vertebrate counterparts, but its determination may have some bearing on how the vertebrate genes came to be utilized for V(D)J recombination.

Another gene of interest with respect to the origins of jawed vertebrate adaptive immunity is a homologue of Terminal deoxynucleotidyl Transferase (TdT). This lymphocyte-specific DNA polymerase increases antigen receptor diversity by adding non-templated nucleotides during the DNA joining phase of V(D)J recombination. Phylogenetic analysis indicates that the sea urchin gene (Sp-TDTl) is orthologous to the common ancestor of TdT and polymerase μ (Polμ, a member of the PolX family of error prone polymerases). Sp-TDTl encodes both an N-terminal BRCT domain and a PolX polymerase domain. Notably a similar gene was reported from C. intestinalis and homologues of this gene are found only in other deuterostomes. Besides RAG1/2 and TdT, all other factors essential for V(D)J recombination in vertebrates are DNA repair factors of the non-homologous end-joining (NHEJ) pathway: xrcc4, DNA ligase IV, Ku70, Ku80, DNA-PKcs, artemis, and cernunnos. As there is a homologue for each one of them in the sea urchin genome (Table 1, S1), it appears possible that the complete enzymatic machinery required for V(D)J recombination was present in a common deuterostome ancestor. This suggests that either the substrate for the V(D)J recombinase, namely split antigen receptor genes, was acquired later in an early jawed vertebrate, or that there is an as yet unidentified rearranging gene locus in the sea urchin genome.

Activation Induced Deaminase (AID) is a recently discovered key component of the mechanisms that further diversify the Ig repertoire in jawed vertebrates: Ig somatic mutation, class switching and Ig gene conversion (Honjo et al., 2005). This enzyme is closely related to the APOBEC proteins which are known to function in antiviral responses and RNA editing (Conticello et al., 2005). A search of the sea urchin gene models identified several genes that encode cytidine deaminases but phylogenetic analysis indicates that none of them belong to the AID/APOBEC gene family (not shown). Nonetheless, these genes are small, divergent, and rapidly evolving, so a homologue may not have been captured in the gene models.

Hematopoietic transcription regulators

Transcription networks are at the heart of genetic programs that control blood cell development, and similarities or differences among these networks provide a basis for homology determinations among hematopoietic programs of different animal phyla. This is particularly relevant for comparisons among deuterostomes aimed at understanding the origins of vertebrate immunity (Rothenberg and Pant, 2004). Although transcription factors in general are the subject of several other genome descriptions (Howard-Ashby et al., 2006a,b; Materna et al., 2006; Rizzo et al., 2006; Tu et al., 2006), we have surveyed the genome sequence for homologues of transcription regulators that are recognized as key mediators of hematopoiesis and immune function (Table 1, S5). Extensive domain comparisons are outside the scope of this paper; however, several notable conclusions can be immediately drawn. First, most families of transcription regulators represented by two to four family members in vertebrates are represented by a single member in the sea urchin genome. Second, there do not appear to be any transcription factor families important in hematopoiesis that are missing from the sea urchin genome, although there are important vertebrate subfamilies (e.g., IRF3) for which clear orthologues were not identified. Our preliminary findings indicate that homologues of many important vertebrate hematopoietic regulators are expressed in sea urchin coelomocytes including: SCL/TAL2, E2A/HEB/ITF2 and GATA1/2/3. Notably, transcription factors for which sea urchin homologues are present include a clear homologue of the PU.1/SpiB/SpiC subfamily of Ets transcription factors (see Rizzo et al., 2006 for a description of Ets factors) and a probable Ikaros/Helios/Aiolos/Eos-related gene, both of which have been identified only in deuterostomes. In vertebrates both of these transcription factor families are notable in their specialization to processes related to hematopoiesis and differentiated blood cell functions. Furthermore, both the PU.1-like and the Ikaros-related genes are also expressed in adult coelomocytes (Messier and Rast, unpublished). Collectively these results indicate that key combinations of crucial regulators in vertebrate hematopoiesis and immune functions may be ancestral to the deuterostome lineage.

Cytokines and immune

Members of most cytokine and chemokine families and their receptors were not detected in the sea urchin genome. These include members of the four-helix bundle/hematopoietin family; the IL-10 family; the IL-12 family; interferons; and the CXC, CX3C and CC chemokine families (see Table S1). Similar negative findings have been reported for the chemokines (Devries et al., 2006) and none of these gene classes have been identified from invertebrates. This could be due to a true absence of such genes in the echinoid lineage, or may reflect the difficulties of identifying these rapidly evolving genes, which tend to be small, have low protein domain complexity and complex intron–exon structure.

Members of other interleukin classes, however, are clearly present. Many of these genes have a well-defined domain structure and are thus more readily identified than the cytokine classes referred to above. Although no homologues of interleukin-1 could be identified, three gene models encode proteins that are related in sequence and domain structure to the IL-1 receptor and its accessory proteins (Table S1).

Aside from their well-known role in apoptosis, tumor necrosis factor (TNF) family proteins function in immune capacities that include macrophage activation and cell proliferation. Four TNF ligands and eight TNF receptors are present in the S. purpuratus genome (described in detail in Robertson et al., 2006).

We also identified a large family of genes that show sequence similarity and comparable domain architecture with the IL-17 family of interleukins, as well as two IL-17 receptor-like models. The IL-17 family, which has six members in vertebrates

(IL-17A-F), represents a distinct signaling system with a well- receptors from the vertebrate adaptive immune system, than documented pro-inflammatory function, mostly through the TLRs of other animals. Whether this applies also to the NLRs recruitment of neutrophils, and a proposed function in awaits further resolution of their putative pathogen recognition hematopoietic precursor cell maturation (Huang et al., 2004; regions, which have a complex intron–exon structure that is Kawaguchi et al., 2004; Kolls and Linden, 2004). The sea inadequately captured in the current set of gene models. urchin genome encodes about 30 of these genes in addition to Notably, the analysis of the sea urchin TLR repertoire the two receptor genes. Many of the ligands are linked in provides an opportunity to compare evolutionary variation tandem arrays. In addition, a moderately expanded family of among different regulatory levels of the innate recognition Macrophage Inhibitory Factor (MIF)-like genes is present in the system. Expansion is limited largely to the TLR genes and, to a S. purpuratus genome. MIF has a pro-inflammatory, macro- moderate degree their immediate adaptors, with the downstream phage-activating function in vertebrates and paracrine functions transduction components and transcription factors in typical both within and outside the immune system (Nishihira, 1998; numbers. Specificity of response in the system is thus likely to Orita et al., 2002). come from the spatiotemporal regulation of the repertoire as Homologues of several vertebrate Receptor Tyrosine manifested in cell type or the even more interesting possibility Kinases (RTKs) with well-described roles in immune cell of clonal expression. Although the large-scale genomic development and function are present in the S. purpuratus organization of the TLR genes remains to be determined, genome (Lepage et al., 2006). 
Of note, two homologues of the many genes are clustered and this may reflect aspects of their VEGFR1-3/Flk1/Flt1/Flt4 gene family (Sp-VEGFR-7 and 10) regulation. For example, the character of these oriented clusters and a homologue of Tie-1 and 2 (Sp-Tie1/2) are present. QPCR immediately suggests locus control mechanisms by which the measurements show that both VEGFR-like genes are expressed number of expressed TLR genes could be randomly limited, in coelomocytes (Messier and Rast, unpublished), as is Tie1/2 perhaps enabling a clonal expression system analogous to that (Smith et al., 1996), which is also expressed during gastrulation of the jawed vertebrate immune system. Combinatorial in the embryonic secondary mesoderm (Messier and Rast, expression could generate tremendous diversity in such a unpublished). Notably, no homologues of PDGFR/c-Kit/Flt3/ system. CFS-1R family could be identified. The sea urchin genome lends a fresh viewpoint toward long- standing questions of the evolutionary origins of the jawed Conclusions vertebrate adaptive immune system. The function of the sea urchin RAG1/2-like genes is unclear and further understanding The survey presented here is an initial overview of what the of the cluster is likely to illuminate and constrain theories of the genome sequence conveys about immunity in the sea urchin, origins of V(D)J recombination. A variety of immunoglobulin and by inference, immunity in our invertebrate and vertebrate domain proteins with similarities to genes of the jawed ancestors. The findings immediately modify and clarify our vertebrate adaptive immune system are present. As is typical understanding of invertebrate immunity and of the evolutionary for this class of genes, primary sequence identities are very low. origins of our own immune system. 
One of the more striking Nonetheless, additional information such as syntenic organiza- findings from the genome is the enormous complexity of innate tion, which has been useful in suggesting paralogue inter- receptor classes that in other animals are represented by relationships within vertebrates (Du Pasquier et al., 2004) and comparatively modest gene families. The TLR and NLR gene functional data may clarify the relationships of these genes to families are each represented by more than an order of those of the jawed vertebrate system. magnitude more gene members than are described in other A causal explanation for the complexity of the sea urchin invertebrates or vertebrates. This expansion may reflect an immune system is not likely to be simple, although there are inflated recognition capacity, the usage of which is otherwise many possible characters of this echinoderm that can be similar to that of animals with smaller innate repertoires. postulated as contributing factors. These include a complex life Alternatively, it may reflect an essentially different mode of history, an intricate water vascular system and a long lifespan gene function. The latter view is supported by the apparent (e.g., a closely related sympatric congener of the purple sea dynamic evolutionary history of most of the sea urchin TLR urchin, Strongylocentrotus franciscanus can live to well over genes as reflected in a propensity for rapid duplication, 100 years; Ebert and Southon, 2003). Localized expression of diversification of LRR sequence and the high proportion of the NLR genes suggests that gut associated microbial flora may pseudogenes. Further analyses will determine whether positive be a driving force behind expansion for this gene family. 
selection is important in this process, but the contrast between Whether the parallel expansion of TLR genes is a coincidence nucleotide conservation within the TIR domain and diversifi- or is driven by the same pressures remains to be determined. cation in specific regions of the LRR sequence within Whatever the cause for innate receptor expansion, it seems clear subfamilies suggests that this is the case. This contrasts with from the sequence characteristics of these genes that they are similar analyses from vertebrate TLRs (Roach et al., 2005) and used in fundamentally different ways than their counterparts in insect innate recognition receptors (Jiggins and Hurst, 2003), other organisms. As we explore these findings and those from the evolution of which is dominated by purifying selection. other emerging genome sequences, the view of invertebrate From an evolutionary perspective, the TLR genes may behave immunity as a monolithic and evolutionarily static system is more like the members of multigene families encoding antigen likely to rapidly disappear. T. Hibino et al. / Developmental Biology 300 (2006) 349–365 363

Acknowledgments

J.P.R. is supported by grants from the Canadian Institutes of Health Research (CIHR) and the National Science and Engineering Research Council of Canada (NSERC). L.C.S. is supported by a grant from the National Science Foundation (MCB04-24235). This work was in part supported by the Intramural Research Program of the National Institute on Aging/National Institutes of Health to S.D.F. T.H. is supported by a Fellowship for Research Abroad from the Uehara Memorial Foundation. We are grateful to D.J. Philpott, S.E. Girardin and D. Filipp (University of Toronto), G.W. Litman (University of South Florida), L. Du Pasquier (University of Basel) and E.V. Rothenberg (Caltech) for enlightening discussions.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.ydbio.2006.08.065.

References

Akira, S., Uematsu, S., Takeuchi, O., 2006. Pathogen recognition and innate immunity. Cell 124, 783–801.
Alder, M.N., Rogozin, I.B., Iyer, L.M., Glazko, G.V., Cooper, M.D., Pancer, Z., 2005. Diversity and function of adaptive immune receptors in a jawless vertebrate. Science 310, 1970–1973.
Al-Sharif, W.Z., Sunyer, J.O., Lambris, J.D., Smith, L.C., 1998. Sea urchin coelomocytes specifically express a homologue of the complement component C3. J. Immunol. 160, 2983–2997.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J., 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Azumi, K., De Santis, R., De Tomaso, A., Rigoutsos, I., Yoshizaki, F., Pinto, M.R., Marino, R., Shida, K., Ikeda, M., Arai, M., Inoue, Y., Shimizu, T., Satoh, N., Rokhsar, D.S., Du Pasquier, L., Kasahara, M., Satake, M., Nonaka, M., 2003. Genomic analysis of immunity in a Urochordate and the emergence of the vertebrate immune system: “waiting for Godot”. Immunogenetics 55, 570–581.
Bateman, A., Coin, L., Durbin, R., Finn, R.D., Hollich, V., Griffiths-Jones, S., Khanna, A., Marshall, M., Moxon, S., Sonnhammer, E.L., Studholme, D.J., Yeats, C., Eddy, S.R., 2004. The Pfam protein families database. Nucleic Acids Res. 32, D138–D141 (Database issue).
Bell, J.K., Mullen, G.E., Leifer, C.A., Mazzoni, A., Davies, D.R., Segal, D.M., 2003. Leucine-rich repeats and pathogen recognition in Toll-like receptors. Trends Immunol. 24, 528–533.
Beutler, B., 2003. Innate immune responses to microbial poisons: discovery and function of the Toll-like receptors. Annu. Rev. Pharmacol. Toxicol. 43, 609–628.
Calestani, C., Rast, J.P., Davidson, E.H., 2003. Isolation of pigment cell specific genes in the sea urchin embryo by differential macroarray screening. Development 130, 4587–4596.
Cameron, R.A., Rast, J.P., Brown, C.T., 2004. Genomic resources for the study of sea urchin development. Methods Cell Biol. 74, 733–757.
Cannon, J.P., Haire, R.N., Rast, J.P., Litman, G.W., 2004a. The phylogenetic origins of the antigen-binding receptors and somatic diversification mechanisms. Immunol. Rev. 200, 12–22.
Cannon, J.P., Haire, R.N., Schnitker, N., Mueller, M.G., Litman, G.W., 2004b. Individual protochordates have unique immune-type receptor repertoires. Curr. Biol. 14, R465–R466.
Clow, L.A., Raftos, D.A., Gross, P.S., Smith, L.C., 2004. The sea urchin complement homologue, SpC3, functions as an opsonin. J. Exp. Biol. 207, 2147–2155.
Conticello, S.G., Thomas, C.J., Petersen-Mahrt, S.K., Neuberger, M.S., 2005. Evolution of the AID/APOBEC family of polynucleotide (deoxy)cytidine deaminases. Mol. Biol. Evol. 22, 367–377.
Couillault, C., Pujol, N., Reboul, J., Sabatier, L., Guichou, J.F., Kohara, Y., Ewbank, J.J., 2004. TLR-independent control of innate immunity in Caenorhabditis elegans by the TIR domain adaptor protein TIR-1, an ortholog of human SARM. Nat. Immunol. 5, 488–494.
Devries, M.E., Kelvin, A.A., Xu, L., Ran, L., Robinson, J., Kelvin, D.J., 2006. Defining the origins and evolution of the chemokine/chemokine receptor system. J. Immunol. 176, 401–415.
Drickamer, K., Taylor, M.E., 1993. Biology of animal lectins. Annu. Rev. Cell Biol. 9, 237–264.
Du Pasquier, L., Zucchetti, I., De Santis, R., 2004. Immunoglobulin superfamily receptors in protochordates: before RAG time. Immunol. Rev. 198, 233–248.
Ebert, T.A., Southon, J.R., 2003. Red sea urchins (Strongylocentrotus franciscanus) can live over 100 years: confirmation with A-bomb 14carbon. Fish. Bull. 101, 915–922.
Eddy, S.R., 1998. Profile hidden Markov models. Bioinformatics 14, 755–763.
Evans, C.J., Hartenstein, V., Banerjee, U., 2003. Thicker than blood: conserved mechanisms in Drosophila and vertebrate hematopoiesis. Dev. Cell 5, 673–690.
Flajnik, M.F., Du Pasquier, L., 2004. Evolution of innate and adaptive immunity: can we draw a line? Trends Immunol. 25, 640–644.
Friedman, R., Hughes, A.L., 2002. Molecular evolution of the NF-kappaB signaling system. Immunogenetics 53, 964–974.
Fugmann, S.D., Lee, A.I., Shockett, P.E., Villey, I.J., Schatz, D.G., 2000. The RAG proteins and V(D)J recombination: complexes, ends, and transposition. Annu. Rev. Immunol. 18, 495–527.
Fugmann, S.D., Messier, C., Novack, L.A., Cameron, R.A., Rast, J.P., 2006. An ancient evolutionary origin of the Rag1/2 gene locus. Proc. Natl. Acad. Sci. U. S. A. 103, 3728–3733.
Fujita, T., Matsushita, M., Endo, Y., 2004. The lectin-complement pathway—Its role in innate immunity and evolution. Immunol. Rev. 198, 185–202.
Gibson, A.W., Burke, R.D., 1987. Migratory and invasive behavior of pigment cells in normal and animalized sea urchin embryos. Exp. Cell Res. 173, 546–557.
Girardin, S.E., Boneca, I.G., Carneiro, L.A., Antignac, A., Jehanno, M., Viala, J., Tedin, K., Taha, M.K., Labigne, A., Zahringer, U., Coyle, A.J., DiStefano, P.S., Bertin, J., Sansonetti, P.J., Philpott, D.J., 2003. Nod1 detects a unique muropeptide from gram-negative bacterial peptidoglycan. Science 300, 1584–1587.
Girardin, S.E., Jehanno, M., Mengin-Lecreulx, D., Sansonetti, P.J., Alzari, P.M., Philpott, D.J., 2005. Identification of the critical residues involved in peptidoglycan detection by Nod1. J. Biol. Chem. 280, 38648–38656.
Gross, P.S., Al-Sharif, W.Z., Clow, L.A., Smith, L.C., 1999. Echinoderm immunity and the evolution of the complement system. Dev. Comp. Immunol. 23, 429–442.
Haag, E.S., Sly, B.J., Andrews, M.E., Raff, R.A., 1999. Apextrin, a novel extracellular protein associated with larval ectoderm evolution in Heliocidaris erythrogramma. Dev. Biol. 211, 77–87.
Hall, T.A., 1999. BioEdit: a user friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl. Acids Symp. Ser. 41, 95–98.
Helmy, K.Y., Katschke, K.J., Gorgani, N.N., Kljavin, N.M., Elliott, J.M., Diehl, L., Scales, S.J., Ghilardi, N., van Lookeren Campagne, M., 2006. CRIg: a macrophage complement receptor required for phagocytosis of circulating pathogens. Cell 124, 915–927.
Hiscott, J., Rongtuan, L., Peyman, N., Paz, S., 2006. MasterCARD: a priceless link to innate immunity. Trends Mol. Med. 12, 53–56.
Hoebe, K., Georgel, P., Rutschmann, S., Du, X., Mudd, S., Crozat, K., Sovath, S., Shamel, L., Hartung, T., Zahringer, U., Beutler, B., 2005. CD36 is a sensor of diacylglycerides. Nature 433, 523–527.
Hoffmann, J.A., 2003. The immune response of Drosophila. Nature 426, 33–38.
Holers, V.M., Kinoshita, T., Molina, H., 1992. The evolution of mouse and human complement C3-binding proteins: divergence of form but conservation of function. Immunol. Today 13, 231–236.

Honjo, T., Nagaoka, H., Shinkura, R., Muramatsu, M., 2005. AID to overcome the limitations of genomic information. Nat. Immunol. 6, 655–661.
Howard-Ashby, M., Materna, S.C., Brown, C.T., Chen, L., Cameron, R.A., Davidson, E.H., 2006a. Identification and characterization of homeobox transcription factor genes in S. purpuratus, and their expression in embryonic development. Dev. Biol. DBIO-06-337 (Sea Urchin Genome Issue).
Howard-Ashby, M., Materna, S.C., Brown, C.T., Chen, L., Cameron, R.A., Davidson, E.H., 2006b. Gene families encoding transcription factor expressed in early development of Strongylocentrotus purpuratus. Dev. Biol. DBIO-06-507 (Sea Urchin Genome Issue).
Huang, S.H., Frydas, S., Kempuraj, D., Barbacane, R.C., Grilli, A., Boucher, W., Letourneau, R., Madhappan, B., Papadopoulou, N., Verna, N., De Lutiis, M.A., Iezzi, T., Riccioni, G., Theoharides, T.C., Conti, P., 2004. Interleukin-17 and the interleukin-17 family member network. Allergy Asthma Proc. 25, 17–21.
Hughes, A.L., 1997. Rapid evolution of immunoglobulin superfamily C2 domains expressed in immune system cells. Mol. Biol. Evol. 14, 1–5.
Inohara, Chamaillard, McDonald, C., Nunez, G., 2005. NOD-LRR proteins: role in host–microbial interactions and inflammatory disease. Annu. Rev. Biochem. 74, 355–383.
Janeway Jr., C.A., Medzhitov, R., 2002. Innate immune recognition. Annu. Rev. Immunol. 20, 197–216.
Jiggins, F.M., Hurst, G.D., 2003. The evolution of parasite recognition genes in the innate immune system: purifying selection on Drosophila melanogaster peptidoglycan recognition proteins. J. Mol. Evol. 57, 598–605.
Kapitonov, V.V., Jurka, J., 2005. RAG1 core and V(D)J recombination signal sequences were derived from transib transposons. PLoS Biol. 3, e181.
Kawaguchi, M., Adachi, M., Oda, N., Kokubu, F., Huang, S.K., 2004. IL-17 cytokine family. J. Allergy Clin. Immunol. 114, 1265–1273.
Kim, D.D., Song, W.C., 2006. Membrane complement regulatory proteins. Clin. Immunol. 118, 127–136.
Kim, Y.S., Ryu, J.H., Han, S.J., Choi, K.H., Nam, K.B., Jang, I.H., Lemaitre, B., Brey, P.T., Lee, W.J., 2000. Gram-negative bacteria-binding protein, a pattern recognition receptor for lipopolysaccharide and beta-1,3-glucan that mediates the signaling for the induction of innate immune genes in Drosophila melanogaster cells. J. Biol. Chem. 275, 32721–32727.
Klein, J., Nikolaidis, N., 2005. The descent of the antibody-based immune system by gradual evolution. Proc. Natl. Acad. Sci. U. S. A. 102, 169–174.
Kohl, A., Grutter, M.G., 2004. Fire and death: the pyrin domain joins the death-domain superfamily. C. R., Biol. 327, 1077–1086.
Kolls, J.K., Linden, A., 2004. Interleukin-17 family members and inflammation. Immunity 21, 467–476.
Kufer, T.A., Fritz, J.H., Philpott, D.J., 2005. NACHT-LRR proteins (NLRs) in bacterial infection and immunity. Trends Microbiol. 13, 381–388.
Kumar, S., Tamura, K., Nei, M., 2004. MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinform. 5, 150–163.
Lepage, T., Lapraz, F., Röttinger, E., Duboc, V., Range, R., Duloquin, L., Walton, K., Wu, S.-Y., Bradham, C., Whittaker, C.A., Loza-Coll, M., Hibino, T., Wilson, K., Poustka, A., McClay, D., Angerer, L., Gache, C., 2006. Genes for receptor tyrosine kinases and TGF-β signaling pathways encoded in the sea urchin genome. Dev. Biol. DBIO-06-458 (Sea Urchin Genome Issue).
Letunic, I., Copley, R.R., Schmidt, S., Ciccarelli, F.D., Doerks, T., Schultz, J., Ponting, C.P., Bork, P., 2004. SMART 4.0: towards genomic data integration. Nucleic Acids Res. 32, D142–D144.
Liberati, N.T., Fitzgerald, K.A., Kim, D.H., Feinbaum, R., Golenbock, D.T., Ausubel, F.M., 2004. Requirement for a conserved Toll/interleukin-1 resistance domain protein in the Caenorhabditis elegans immune response. Proc. Natl. Acad. Sci. U. S. A. 101, 6593–6598.
Litman, G.W., Cannon, J.P., Dishaw, L.J., 2005. Reconstructing immune phylogeny: new perspectives. Nat. Rev., Immunol. 5, 866–879.
Loker, E.S., Adema, C.M., Zhang, S.M., Kepler, T.B., 2004. Invertebrate immune systems—Not homogeneous, not simple, not well understood. Immunol. Rev. 198, 10–24.
Materna, S.C., Howard-Ashby, M., Gray, R.F., Davidson, E.H., 2006. The C2H2 zinc finger genes of Strongylocentrotus purpuratus and their expression in embryonic development. Dev. Biol. DBIO-06-506 (Sea Urchin Genome Issue).
McGettrick, A.F., O'Neill, L.A., 2004. The expanding family of MyD88-like adaptors in Toll-like receptor signal transduction. Mol. Immunol. 41, 577–582.
Medzhitov, R., 2001. Toll-like receptors and innate immunity. Nat. Rev., Immunol. 1, 135–145.
Metchnikoff, I., 1891. Lectures on the Comparative Pathology of Inflammation Delivered at The Pasteur Institute in 1891. Dover, New York.
Mukhopadhyay, S., Gordon, S., 2004. The role of scavenger receptors in pathogen recognition and innate immunity. Immunobiology 209, 39–49.
Multerer, K.A., Smith, L.C., 2004. Two cDNAs from the purple sea urchin, Strongylocentrotus purpuratus, encoding mosaic proteins with domains found in factor H, factor I, and complement components C6 and C7. Immunogenetics 56, 89–106.
Murphy, P.M., 1993. Molecular mimicry and the generation of host defense protein diversity. Cell 72, 823–826.
Nair, S.V., Del Valle, H., Gross, P.S., Terwilliger, D.P., Smith, L.C., 2005. Macroarray analysis of coelomocyte gene expression in response to LPS in the sea urchin. Identification of unexpected immune diversity in an invertebrate. Physiol. Genomics 22, 33–47.
Nishihira, J., 1998. Novel pathophysiological aspects of macrophage migration inhibitory factor (review). Int. J. Mol. Med. 2, 17–28.
Nonaka, M., Yoshizaki, F., 2004. Primitive complement system of invertebrates. Immunol. Rev. 198, 203–215.
O'Neill, L.A., Fitzgerald, K.A., Bowie, A.G., 2003. The Toll-IL-1 receptor adaptor family grows to five members. Trends Immunol. 24, 286–290.
Orita, M., Yamamoto, S., Katayama, N., Fujita, S., 2002. Macrophage migration inhibitory factor and the discovery of tautomerase inhibitors. Curr. Pharm. Des. 8, 1297–1317.
Pancer, Z., 2000. Dynamic expression of multiple scavenger receptor cysteine-rich genes in coelomocytes of the purple sea urchin. Proc. Natl. Acad. Sci. U. S. A. 97, 13156–13161.
Pancer, Z., Cooper, M.D., 2006. The evolution of adaptive immunity. Annu. Rev. Immunol. 24, 497–518.
Pancer, Z., Rast, J.P., Davidson, E.H., 1999. Origins of immunity: transcription factors and homologues of effector genes of the vertebrate immune system expressed in sea urchin coelomocytes. Immunogenetics 49, 773–786.
Pancer, Z., Amemiya, C.T., Ehrhardt, G.R., Ceitlin, J., Gartland, G.L., Cooper, M.D., 2004. Somatic diversification of variable lymphocyte receptors in the agnathan sea lamprey. Nature 430, 174–180.
Pancer, Z., Saha, N.R., Kasamatsu, J., Suzuki, T., Amemiya, C.T., Kasahara, M., Cooper, M.D., 2005. Variable lymphocyte receptors in hagfish. Proc. Natl. Acad. Sci. U. S. A. 102, 9224–9229.
Perry, G., Epel, D., 1981. Ca2+-stimulated production of H2O2 from naphthoquinone oxidation in Arbacia eggs. Exp. Cell Res. 134, 65–72.
Rast, J.P., Pancer, Z., Davidson, E.H., 2000. New approaches towards an understanding of deuterostome immunity. Curr. Top. Microbiol. Immunol. 248, 3–16.
Rast, J.P., Cameron, R.A., Poustka, A.J., Davidson, E.H., 2002. brachyury Target genes in the early sea urchin embryo isolated by differential macroarray screening. Dev. Biol. 246, 191–208.
Rice, P., Longden, I., Bleasby, A., 2000. EMBOSS: the European molecular biology open software suite. Trends Genet. 16, 276–277.
Rizzo, F., Fernandez-Serra, M., Squarzoni, P., Archimandritis, A., Arnone, M.I., 2006. Identification and developmental expression of the ets gene family in the sea urchin (Strongylocentrotus purpuratus). Dev. Biol. DBIO-06-470 (Sea Urchin Genome Issue).
Roach, J.C., Glusman, G., Rowen, L., Kaur, A., Purcell, M.K., Smith, K.D., Hood, L.E., Aderem, A., 2005. The evolution of vertebrate Toll-like receptors. Proc. Natl. Acad. Sci. U. S. A. 102, 9577–9582.
Robertson, A.J., Croce, J., Carbonneau, S., Voronina, E., Miranda, E., McClay, D.R., Coffman, J.A., 2006. The genomic underpinnings of apoptosis in Strongylocentrotus purpuratus. Dev. Biol. DBIO-06-404 (Sea Urchin Genome Issue).
Rock, F.L., Hardiman, G., Timans, J.C., Kastelein, R.A., Bazan, J.F., 1998. A family of human receptors structurally related to Drosophila Toll. Proc. Natl. Acad. Sci. U. S. A. 95, 588–593.

Ronquist, F., Huelsenbeck, J.P., 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574.
Rothenberg, E.V., Pant, R., 2004. Origins of lymphocyte developmental programs: transcription factor evidence. Semin. Immunol. 16, 227–238.
Samanta, M.P., Davidson, E.H., 2006. Embryo Transcriptome Science. Genome Issue.
Sarrias, M.R., Gronlund, J., Padilla, O., Madsen, J., Holmskov, U., Lozano, F., 2004. The scavenger receptor cysteine-rich (SRCR) domain: an ancient and highly conserved protein module of the innate immune system. Crit. Rev. Immunol. 24, 1–37.
Schluter, S.F., Marchalonis, J.J., 2003. Cloning of shark RAG2 and characterization of the RAG1/RAG2 gene locus. FASEB J. 17, 470–472.
Service, M., Wardlaw, A.C., 1984. Echinochrome-A as a bactericidal substance in the coelomic fluid of Echinus esculentus (L). Comp. Biochem. Physiol. 79B, 161–165.
Silva, J.R., 2000. The onset of phagocytosis and identity in the embryo of Lytechinus variegatus. Dev. Comp. Immunol. 24, 733–739.
Smith, L.C., 2001. The complement system in sea urchins. In: Beck, G., Sugumaran, M., Cooper, E. (Eds.), Phylogenetic Perspectives on the Vertebrate Immune Systems: Advances in Experimental Medicine and Biology, vol. 484. Kluwer Academic/Plenum Publishing Co, New York, NY, pp. 363–372.
Smith, L.C., 2005. Host responses to bacteria: innate immunity in invertebrates. In: McFall-Ngai, M., Ruby, B.H.N. (Eds.), Advances in Molecular and Cellular Microbiology, vol. 10. Cambridge Univ. Press, pp. 293–320.
Smith, L.C., Chang, L., Britten, R.J., Davidson, E.H., 1996. Sea urchin genes expressed in activated coelomocytes are identified by expressed sequence tags. Complement homologues and other putative immune response genes suggest immune system homology within the deuterostomes. J. Immunol. 156, 593–602.
Smith, L.C., Shih, C.S., Dachenhausen, S.G., 1998. Coelomocytes express SpBf, a homologue of factor B, the second component in the sea urchin complement system. J. Immunol. 161, 6784–6793.
Smith, L.C., Azumi, K., Nonaka, M., 1999. Complement systems in invertebrates. The ancient alternative and lectin pathways. Immunopharmacology 42, 107–120.
Smith, L.C., Rast, J.P., Brockton, V., Terwilliger, D.P., Nair, S.V., Buckley, K.M., Majeske, A.J., 2006. The sea urchin immune system. Invertebr. Surviv. J. 3, 25–39.
Swofford, D., 2002. PAUP*: Phylogenetic Analysis Using Parsimony (and other methods) 4.0 Beta. Sinauer Associates, Sunderland, MA.
Tamboline, C.R., Burke, R.D., 1992. Secondary mesenchyme of the sea urchin embryo: ontogeny of blastocoelar cells. J. Exp. Zool. 262, 51–60.
Terwilliger, D.P., Clow, L.A., Gross, P.S., Smith, L.C., 2004. Constitutive expression and alternative splicing of the exons encoding SCRs in Sp152, the sea urchin homologue of complement factor B. Implications on the evolution of the Bf/C2 gene family. Immunogenetics 56, 531–543.
Terwilliger, D.P., Buckley, K.M., Mehta, D., Moorjani, P.G., Smith, L.C., 2006. Unexpected diversity displayed in cDNAs expressed by the immune cells of the purple sea urchin, Strongylocentrotus purpuratus. Physiol. Genomics 26, 134–144.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
Tu, Q., Brown, C.T., Davidson, E.H., Oliveri, P., 2006. Sea urchin forkhead gene family: phylogeny and embryo expression. Dev. Biol. DBIO-06-475 (Sea Urchin Genome Issue).
Volanakis, J.E., 1998. Overview of the complement system. Immunol. Med. 20, 9–32.
Watson, F.L., Puttmann-Holgado, R., Thomas, F., Lamar, D.L., Hughes, M., Kondo, M., Rebel, V.I., Schmucker, D., 2005. Extensive diversity of Ig-superfamily proteins in the immune system of insects. Science 309, 1874–1878.
Whittaker, C.A., Bergeron, K.-F., Whittle, J., Brandhorst, B.P., Burke, R.D., Hynes, R.O., 2006. The echinoderm adhesome. Dev. Biol. DBIO-06-350 (Sea Urchin Genome Issue).
Yoneyama, M., Kikuchi, M., Matsumoto, K., Imaizumi, T., Miyagishi, M., Taira, K., Foy, E., Loo, Y.M., Gale Jr., M., Akira, S., Yonehara, S., Kato, A., Fujita, T., 2005. Shared and unique functions of the DExD/H-box helicases RIG-I, MDA5, and LGP2 in antiviral innate immunity. J. Immunol. 175, 2851–2858.
Yoshizaki, F.Y., Ikawa, S., Satake, M., Satoh, N., Nonaka, M., 2005. Structure and the evolutionary implication of the triplicated complement factor B genes of a urochordate ascidian, Ciona intestinalis. Immunogenetics 56, 930–942.
Zaidman-Remy, A., Herve, M., Poidevin, M., Pili-Floury, S., Kim, M.S., Blanot, D., Oh, B.H., Ueda, R., Mengin-Lecreulx, D., Lemaitre, B., 2006. The Drosophila amidase PGRP-LB modulates the immune response to bacterial infection. Immunity 24, 463–473.
Zhang, S.M., Adema, C.M., Kepler, T.B., Loker, E.S., 2004. Diversification of Ig superfamily genes in an invertebrate. Science 305, 251–254.