The 185/333 Gene Family Is a Rapidly Diversifying Host-Defense Gene Cluster in the Purple Sea Urchin Strongylocentrotus purpuratus
doi:10.1016/j.jmb.2008.04.037 J. Mol. Biol. (2008) 379, 912–928
Available online at www.sciencedirect.com

K. M. Buckley1, S. Munshaw2, T. B. Kepler2* and L. C. Smith1*

1Department of Biological Sciences, George Washington University, Washington, DC 20052, USA
2Center for Computational Immunology, Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27705, USA

Received 31 December 2007; received in revised form 11 April 2008; accepted 15 April 2008
Available online 22 April 2008

The genome of the purple sea urchin contains numerous large gene families with putative immunological functions. One gene family, known as 185/333, is characterized by extraordinary molecular diversity resulting from single nucleotide polymorphisms and the presence or the absence of 27 large blocks of sequences known as elements. The mosaic composition of elements, known as element patterns, that is present within the members of this gene family is encoded entirely in the second of two exons. Many of the elements correspond to one of six types of repeats that are present throughout the genes. The sequence diversity and variation in element patterns led us to investigate the evolution of the 185/333 gene family. The work presented here suggests that the element patterns are the result of both recombination and duplication and/or deletion of intragenic repeats. Each element is composed of a limited number of similar but distinct sequences, and their distribution among the 185/333 genes suggests frequent recombination within this gene family. Phylogenetic analyses of five 185/333 elements and two regions of the intron were performed using two tests: incongruence length difference and incongruence permutation. Results indicated that each pair of sequence segments was incongruent, suggesting that recombination occurs frequently along the length of the genes, including both the intron and the second exon, and that recombination is not restricted to intact elements. Paradoxically, the high level of similarity among the elements indicated that the 185/333 genes appear to be the result of a recent diversification. These results add to the growing body of evidence suggesting that invertebrate immune systems are not simple and static, but are dynamic and highly complex, and may employ group-specific mechanisms for diversification.

© 2008 Elsevier Ltd. All rights reserved.

Keywords: echinoderm; molecular evolution; innate immunology; repeats; invertebrate

Edited by J. Karn

*Corresponding authors. E-mail addresses: [email protected]; [email protected].

Abbreviations used: TLR, toll-like receptor; LRR, leucine-rich repeat; qPCR, quantitative polymerase chain reaction; EST, expressed sequence tag; 5′ UTR, 5′ untranslated region; BAC, bacterial artificial chromosome; MP, maximum parsimony; ML, maximum likelihood; ILD, incongruence length difference; IP, incongruence permutation; TCR, T-cell receptor; FREP, fibrinogen-related protein; TNT, Tree Analysis Using New Technology.

0022-2836/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.

Introduction

Complex molecularly diverse immune responses have been traditionally believed to be limited to higher vertebrates. However, recent evidence has demonstrated that invertebrate immune responses may also be highly diversified1 and are often encoded within large gene families.2–11 The recently sequenced genome of the purple sea urchin Strongylocentrotus purpuratus12 contains a number of large immune-related gene families, many with considerably more members than their vertebrate homologues.13 These gene families encode toll-like receptors (TLRs), NACHT domain and leucine-rich repeat (NLR) proteins, scavenger receptor cysteine-rich domains, and C-type lectins. The 185/333 gene family is another example of a large diverse gene family that is putatively involved in the sea urchin immune response.14 Results from quantitative polymerase chain reaction (qPCR) analysis of genomic DNA suggest that the 185/333 gene family consists of between 80 and 120 alleles.15 The genes are closely linked, are flanked by dinucleotide and trinucleotide repeats,14 and are highly expressed in response to immunological challenge with either whole bacteria,16,17 lipopolysaccharide,16,18 β-1,3-glucan, double-stranded RNA,18 or peptidoglycan (D.A. Raftos, unpublished data). Sea urchins seem to be able to discriminate among these pathogen signatures through as yet unknown mechanisms and express unique suites of 185/333 genes in response to challenge.18,19 The 185/333 transcripts constitute 6.45% of a cDNA library constructed from bacterially activated coelomocytes, as opposed to 0.086% in a nonactivated library, a 75-fold increase.16

Although the function of the 185/333 proteins remains unknown, they localize to the cell surface of a subset of the coelomocytes (immune cells) of the sea urchin and may be involved in the formation of syncytia to immobilize invading pathogens.20 Analysis of 185/333 protein expression by two-dimensional Western blot analysis suggests that distinct suites of proteins are expressed in response to lipopolysaccharide compared to peptidoglycan, and that individual sea urchins can express >200 unique proteins (D.A. Raftos, unpublished data). These protein data therefore support the previous observations of transcript and gene diversity14,15,18 and emphasize the putative role of the 185/333 gene family in the S. purpuratus immune response.

In addition to the striking increase in expression following immune challenge, the 185/333 sequences are intriguing. Originally discovered as 60% of the expressed sequence tags (ESTs) from an arrayed cDNA library screened with a subtracted probe to identify sequences that were upregulated in response to immune challenge, the 185/333 transcripts are homologous to only two previously isolated S. purpuratus ESTs after which the genes are named (DD185, GenBank accession no. AF228877; EST333, GenBank accession no. R62011).16 Alignment of the 185/333 mRNAs requires the insertion of large gaps, creating blocks of similar sequences known as elements.15,16 The elements are variably present or absent in different mRNAs, which have been used to define specific element patterns. Analysis of 185/333 genes indicates that the variation in transcript element patterns is likely the result of variations in element patterns encoded by many genes, rather than the result of extensive alternative splicing among a few genes.14,15 The genes have two exons. The first encodes a hydrophobic leader, and the second encodes the remainder of the open reading

been used to align the 185/333 sequences and to define elements based on the locations of gaps within the alignment, as well as the edges of the repeats.14

The 185/333 sequence diversity is extremely high. From 16 S. purpuratus individuals, 872 185/333 sequences (183 genes and 689 transcripts) have been analyzed, of which 475 are unique, encoding 323 proteins with 37 different element patterns.14,15,18 Sequence diversity is the result of variation in element patterns, as well as point mutations and small indels. No identical gene sequences are shared among different animals, indicating that the nucleotide diversity occurs not only within the 185/333 gene family of individual sea urchins but also within the S. purpuratus population.

An initial analysis of 290 ESTs containing the 5′ untranslated region (5′ UTR) and leader sequence revealed that identical 5′ UTRs were associated with different leader sequences and vice versa.16 This observation was further supported by a more thorough analysis of 185/333 genes. From three individual sea urchins, 121 unique genes were identified from 171 clones, even though each of the 27 elements had a small number of related but distinct element sequences (average = 11 unique sequences/element).14 Unique genes were composed of different combinations of the element sequences in a patchwork structure. This mosaic nature of the gene sequences, in addition to the diversifying pressure on the 185/333 genes that results from their predicted immunological role,16 prompted further investigation of the evolution of this unique gene family.

The results presented here suggest that the highly diversified 185/333 gene family is subject to frequent recombination, gene duplication, and gene deletion. Because gaps introduced into sequence alignments to define the element patterns complicate phylogenetic analysis of full-length gene sequences, the evolution of this gene family was analyzed from the perspective of the repeats within the genes. Phylogenetic analysis suggests that the repeats have arisen as a result of intragenic repeat duplication and/or deletion, recombination, and point mutations. Short simple repeats located within the larger repeats may facilitate gene diversification through unknown mechanisms.21 Incongruent phylogenetic histories of a variety of elements and analysis of the distribution of element sequences across the genes suggest that the genes undergo frequent recombination, which is likely to be a mechanism for generating diversity within the gene family. Within this framework of gene diversity, however, there is a paradox of remarkably conserved element sequences, suggesting that the divergence from the last common ancestral gene for the extant 185/333 sequences occurred