
Elephant shark sequence reveals unique insights into the evolutionary history of vertebrate genes: A comparative analysis of the protocadherin cluster

Wei-Ping Yu†‡, Vikneswari Rajasegaran†, Kenneth Yew†, Wai-lin Loh†, Boon-Hui Tay§, Chris T. Amemiya¶, Sydney Brenner‡§, and Byrappa Venkatesh‡§

†Gene Regulation Laboratory, National Neuroscience Institute, 11 Jalan Tan Tock Seng, Singapore 308433; §Institute of Molecular and Cell Biology, Agency for Science, Technology, and Research, Biopolis, Singapore 138673; and ¶Benaroya Research Institute at Virginia Mason, Seattle, WA 98101

Contributed by Sydney Brenner, January 14, 2008 (sent for review November 7, 2007)

Cartilaginous fishes are the oldest living phylogenetic group of jawed vertebrates. Here, we demonstrate the value of cartilaginous fish sequences in reconstructing the evolutionary history of vertebrate genomes by sequencing the protocadherin cluster in the relatively small genome (910 Mb) of the elephant shark (Callorhinchus milii). Human and coelacanth contain a single protocadherin cluster with 53 and 49 genes, respectively, that are organized in three subclusters, Pcdhα, Pcdhβ, and Pcdhγ, whereas the duplicated protocadherin clusters in fugu and zebrafish contain >77 and 107 genes, respectively, that are organized in Pcdhα and Pcdhγ subclusters. By contrast, the elephant shark contains a single protocadherin cluster with 47 genes organized in four subclusters (Pcdhδ, Pcdhε, Pcdhμ, and Pcdhν). By comparison with elephant shark sequences, we discovered a Pcdhδ subcluster in teleost fishes, coelacanth, Xenopus, and chicken. Our results suggest that the protocadherin cluster in the ancestral jawed vertebrate contained more subclusters than modern vertebrates, and the evolution of the protocadherin cluster is characterized by lineage-specific differential loss of entire subclusters of genes. In contrast to teleost fish and mammalian protocadherin genes that have undergone gene conversion events, elephant shark protocadherin genes have experienced very little gene conversion. The syntenic block of genes in the elephant shark protocadherin locus is well conserved in human but disrupted in fugu. Thus, the elephant shark genome appears to be less prone to rearrangements compared with teleost fish genomes. The small and "stable" genome of the elephant shark is a valuable reference for understanding the evolution of vertebrate genomes.

ancestral jawed vertebrate | clustered protocadherins | cartilaginous fish | Callorhinchus milii | gene conversion

Author contributions: W.-P.Y., S.B., and B.V. designed research; W.-P.Y., V.R., K.Y., W.-l.L., and B.-H.T. performed research; C.T.A. contributed new reagents/analytic tools; W.-P.Y., S.B., and B.V. analyzed data; and W.-P.Y. and B.V. wrote the paper.

The authors declare no conflict of interest.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. EF693945, EU267079, and EU267080).

‡To whom correspondence may be addressed. E-mail: weiping[email protected], backhill@hotmail.co.uk, or [email protected].

This article contains supporting information online at www.pnas.org/cgi/content/full/0800398105/DC1.

© 2008 by The National Academy of Sciences of the USA

www.pnas.org/cgi/doi/10.1073/pnas.0800398105
PNAS | March 11, 2008 | vol. 105 | no. 10 | 3819–3824
EVOLUTION

Reconstructing the evolutionary history of vertebrate genomes, and in particular the human genome, is a major goal of vertebrate comparative genomics. Efforts are underway to reconstruct the evolutionary history of the human genome at the nucleotide level and at the level of large-scale chromosomal rearrangements (1) with the objective of predicting ancestral genomes with high accuracy. The accuracy of the reconstruction of ancestral genomes is greatly influenced by the choice and number of genomes compared. Although a comparison of genomes from all major vertebrate groups is desirable, it is also important to compare genomes from the most distant branches of the phylogenetic tree. Here, we show how the genome sequence from the oldest living phylogenetic group of jawed vertebrates, the cartilaginous fishes, can dramatically change our inferences of the state of the ancestral vertebrate genome.

The protocadherin gene cluster is one of the most evolutionarily dynamic loci in vertebrates. It is composed of tandem arrays of paralogous genes that are highly susceptible to lineage-specific gene losses, tandem duplications, and gene conversion (2–7). These clustered genes are distinct from nonclustered protocadherin genes by possessing a large coding exon that codes for the entire extracellular domain, the transmembrane domain, and a small portion of the cytoplasmic domain, whereas the extracellular domain of the nonclustered protocadherins is usually encoded by multiple exons (8). Protocadherin cluster proteins are cadherin-like cell adhesion molecules that are highly enriched on synaptic membranes (9). These proteins are believed to be a vertebrate innovation that accompanied the emergence of the neural tube and the elaborate central nervous system and have been hypothesized to provide a molecular code for specifying the enormous diversity in neuronal connectivity (3, 9, 10). No clustered protocadherin gene has been identified in invertebrate genomes (11–13).

The protocadherin gene cluster has been characterized in several bony vertebrates (Osteichthyes), including human, coelacanth, and teleost fishes. Human and coelacanth protocadherin cluster genes are organized in three tandem subclusters, designated as Pcdhα, Pcdhβ, and Pcdhγ (3, 4). Each subcluster contains multiple "variable" exons (≈2.4 kb each) that are transcribed from independent promoters. Each variable exon codes for an extracellular domain comprising six repeats of a calcium-binding ectodomain (EC1–6), a transmembrane domain, and a part of the cytoplasmic domain. In addition to variable exons, Pcdhα and -γ subclusters contain three constant exons at the 3′ end of the respective subcluster, to which each variable exon is independently spliced. The constant region exons code for the major part of the cytoplasmic domain that is shared by genes in each subcluster. Human and coelacanth protocadherin clusters contain a total of 53 and 49 genes, respectively. Because of the "fish-specific" whole-genome duplication, teleost fishes such as fugu and zebrafish contain two unlinked protocadherin clusters, designated Pcdh1 and Pcdh2. Although the fugu Pcdh1 cluster is highly degenerate, Pcdh2 clusters in both fishes have undergone lineage-specific repeated tandem duplications and gene losses, giving rise to >107 genes in zebrafish (2, 5, 7) and >77 genes in fugu (6). However, these teleost protocadherin genes are orthologous to only Pcdhα and -γ subcluster genes in mammals and coelacanth. Another interesting feature of teleost protocadherin cluster genes is that they have experienced extensive regional gene conversion events resulting in nucleotide identity of >99% in the 3′ region of variable exons among paralogs (5, 6). Based on the organization, gene content, and phylogenetic relationships of protocadherin clusters in bony vertebrates, the ancestral vertebrate protocadherin cluster is predicted to contain either two (Pcdhα and -γ) or three subclusters (Pcdhα, -β, and -γ) that have subsequently undergone lineage-specific gene losses and gene duplications, including the whole-locus duplication in the teleost fish lineage (6).

The living jawed vertebrates (gnathostomes) fall into two major monophyletic groups, the Chondrichthyes (cartilaginous fishes that include elasmobranchs and holocephalians) and the Osteichthyes (bony vertebrates, which include ray-finned fishes, coelacanths, lungfishes, and tetrapods) that shared a common ancestor ≈450 million years ago. The protocadherin cluster has been characterized in several bony vertebrates. Here, we report the sequencing and characterization of the protocadherin cluster from a holocephalian cartilaginous fish, the elephant shark (Callorhinchus milii). The elephant shark possesses a relatively small genome (≈910 Mb) among the cartilaginous fishes and is therefore attractive for genome sequencing and comparative analysis (14).

[Figure 1 graphic: subclusters Pcdhδ, Pcdhε, Pcdhμ, and Pcdhν with their variable and constant (CX1, CX2, CX2b, CX3) exons; BAC clones 49C8, 166L19, and 176N9.]

Fig. 1. Genomic organization of the elephant shark protocadherin cluster. Constant exons at the end of each subcluster are shown in red. Variable exons in the same paralog subgroup or intersubcluster subgroup are shown in the same color. Only a single pseudogene (1) was identified. CX2b is an alternatively spliced "constant" exon. The positions and identification numbers of BAC clones sequenced are shown below.

three constant exons, whereas the second subcluster is comprised of only variable exons (21 in all), each of which is transcribed independently as single exon genes similar to the coelacanth and mammalian β subcluster genes. The third and fourth subclusters consist of 8 and 17 variable exons, respectively, and three constant exons each. In addition, the fourth subcluster contains an alternative constant exon between the second and third constant exons. The splice sites of the constant exons of the three subclusters are shown in SI Appendix, SI Fig. 6. We designated the four subclusters as the Pcdhδ, Pcdhε, Pcdhμ, and Pcdhν subclusters, respectively (Fig. 1).

To determine the evolutionary relationships between individual protocadherin (Pcdh) genes in the elephant shark, we aligned the amino acid sequences of all variable exons using ClustalX and generated a phylogenetic tree using the Neighbor-joining method. Sequences of only the first three ectodomains (EC1–3) were used, because the C-terminal ectodomains (EC4–6) have been shown to be susceptible to extensive gene conversion in